* BUG: spinlock recursion on CPU#0
@ 2010-10-21 14:12 Alexander Stein
2010-10-21 16:44 ` Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Alexander Stein @ 2010-10-21 14:12 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
I tried a demo app which results in a kernel BUG, the backtrace is as follows:
> BUG: spinlock recursion on CPU#0, demogui/507
> lock: c3e15590, .magic: dead4ead, .owner: demogui/507, .owner_cpu: 0
> [<c002a858>] (unwind_backtrace+0x0/0xec) from [<c01609ac>]
> (do_raw_spin_lock+0x48/0xac) [<c01609ac>] (do_raw_spin_lock+0x48/0xac)
> from [<c002bec4>] (adjust_pte+0x54/0x94) [<c002bec4>]
> (adjust_pte+0x54/0x94) from [<c002bfa4>] (make_coherent+0xa0/0xe4)
> [<c002bfa4>] (make_coherent+0xa0/0xe4) from [<c002c088>]
> (update_mmu_cache+0xa0/0xac) [<c002c088>] (update_mmu_cache+0xa0/0xac)
> from [<c007bfdc>] (__do_fault+0x334/0x414) [<c007bfdc>]
> (__do_fault+0x334/0x414) from [<c007d594>] (handle_mm_fault+0x108/0x2c8)
> [<c007d594>] (handle_mm_fault+0x108/0x2c8) from [<c002b658>]
> (__do_page_fault+0x6c/0xb8) [<c002b658>] (__do_page_fault+0x6c/0xb8) from
> [<c002b878>] (do_page_fault+0xb4/0x15c) [<c002b878>]
> (do_page_fault+0xb4/0x15c) from [<c0025264>] (do_DataAbort+0x34/0x94)
> [<c0025264>] (do_DataAbort+0x34/0x94) from [<c0025da0>]
> (ret_from_exception+0x0/0x10) Exception stack(0xc3e75fb0 to 0xc3e75ff8)
> 5fa0: 40fff000 000040a8 00000003
> 4089d3d0 5fc0: 0005e3a0 00000001 4089d3d0 0005c940 ffff0fc0 0005e480
> ffff0fc0 be81bda8 5fe0: 0005c940 be81bbc0 404798c0 404760ec 20000010
> ffffffff
The used version is a Linux-2.6.35.7+ (local patches) running on an
AT19SAM9263 (ARM926EJ-S). My test program is a QT application I cannot share.
After sme searching i found this patch:
http://permalink.gmane.org/gmane.linux.ports.arm.kernel/79676
I didn't try any of the demos in this patch, but I applied the changes and the
BUG didn't occur anymore.
Now I'm wondering what's the status of this patch and is this the right
approach to handle this problem.
Best regards
Alexander
^ permalink raw reply [flat|nested] 8+ messages in thread
* BUG: spinlock recursion on CPU#0
2010-10-21 14:12 BUG: spinlock recursion on CPU#0 Alexander Stein
@ 2010-10-21 16:44 ` Mika Westerberg
2010-10-21 17:09 ` [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte() Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-21 16:44 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 21, 2010 at 04:12:28PM +0200, Alexander Stein wrote:
[...]
> After sme searching i found this patch:
> http://permalink.gmane.org/gmane.linux.ports.arm.kernel/79676
> I didn't try any of the demos in this patch, but I applied the changes and the
> BUG didn't occur anymore.
> Now I'm wondering what's the status of this patch and is this the right
> approach to handle this problem.
I didn't receive any comments on the patch so it ended up only in
my local tree.
The problem seems to reproduce with the latest mainline also so I
guess it is time to resend the patch. Will do that in a minute.
Regards,
MW
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-21 16:44 ` Mika Westerberg
@ 2010-10-21 17:09 ` Mika Westerberg
2010-10-22 6:28 ` Baruch Siach
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-21 17:09 UTC (permalink / raw)
To: linux-arm-kernel
When running following code in a machine which has VIVT caches and
USE_SPLIT_PTLOCKS is not defined:
fd = open("/etc/passwd", O_RDONLY);
addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
v = *((int *)addr);
we will hang in spinlock recursion in the page fault handler:
BUG: spinlock recursion on CPU#0, mmap_test/717
lock: c5e295d8, .magic: dead4ead, .owner: mmap_test/717,
.owner_cpu: 0
[<c0026604>] (unwind_backtrace+0x0/0xec)
[<c014ee48>] (do_raw_spin_lock+0x40/0x140)
[<c0027f68>] (update_mmu_cache+0x208/0x250)
[<c0079db4>] (__do_fault+0x320/0x3ec)
[<c007af7c>] (handle_mm_fault+0x2f0/0x6d8)
[<c0027834>] (do_page_fault+0xdc/0x1cc)
[<c00202d0>] (do_DataAbort+0x34/0x94)
Same thing can be achieved by running:
# useradd dummy
This comes from the fact that when USE_SPLIT_PTLOCKS is not defined,
the only lock protecting the page tables is mm->page_table_lock
which is already locked before update_mmu_cache() is called.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
---
arch/arm/mm/fault-armv.c | 28 ++++++++++++++++++++++++++--
1 files changed, 26 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 9b906de..56036ff 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -65,6 +65,30 @@ static int do_adjust_pte(struct vm_area_struct *vma, unsigned long address,
return ret;
}
+#if USE_SPLIT_PTLOCKS
+/*
+ * If we are using split PTE locks, then we need to take the page
+ * lock here. Otherwise we are using shared mm->page_table_lock
+ * which is already locked, thus cannot take it.
+ */
+static inline void do_pte_lock(spinlock_t *ptl)
+{
+ /*
+ * Use nested version here to indicate that we are already
+ * holding one similar spinlock.
+ */
+ spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+}
+
+static inline void do_pte_unlock(spinlock_t *ptl)
+{
+ spin_unlock(ptl);
+}
+#else /* !USE_SPLIT_PTLOCKS */
+static inline void do_pte_lock(spinlock_t *ptl) {}
+static inline void do_pte_unlock(spinlock_t *ptl) {}
+#endif /* USE_SPLIT_PTLOCKS */
+
static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
unsigned long pfn)
{
@@ -89,11 +113,11 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
*/
ptl = pte_lockptr(vma->vm_mm, pmd);
pte = pte_offset_map_nested(pmd, address);
- spin_lock(ptl);
+ do_pte_lock(ptl);
ret = do_adjust_pte(vma, address, pfn, pte);
- spin_unlock(ptl);
+ do_pte_unlock(ptl);
pte_unmap_nested(pte);
return ret;
--
1.5.6.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-21 17:09 ` [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte() Mika Westerberg
@ 2010-10-22 6:28 ` Baruch Siach
2010-10-22 6:38 ` Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Baruch Siach @ 2010-10-22 6:28 UTC (permalink / raw)
To: linux-arm-kernel
Hi Mika,
On Thu, Oct 21, 2010 at 08:09:42PM +0300, Mika Westerberg wrote:
> When running following code in a machine which has VIVT caches and
> USE_SPLIT_PTLOCKS is not defined:
>
> fd = open("/etc/passwd", O_RDONLY);
> addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
> addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
>
> v = *((int *)addr);
>
> we will hang in spinlock recursion in the page fault handler:
>
> BUG: spinlock recursion on CPU#0, mmap_test/717
[snip]
Do you have any idea when was this bug introduced? Does it affect already
release kernels other than .36?
baruch
> Same thing can be achieved by running:
>
> # useradd dummy
>
> This comes from the fact that when USE_SPLIT_PTLOCKS is not defined,
> the only lock protecting the page tables is mm->page_table_lock
> which is already locked before update_mmu_cache() is called.
>
> Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
> ---
> arch/arm/mm/fault-armv.c | 28 ++++++++++++++++++++++++++--
> 1 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
> index 9b906de..56036ff 100644
> --- a/arch/arm/mm/fault-armv.c
> +++ b/arch/arm/mm/fault-armv.c
> @@ -65,6 +65,30 @@ static int do_adjust_pte(struct vm_area_struct *vma, unsigned long address,
> return ret;
> }
>
> +#if USE_SPLIT_PTLOCKS
> +/*
> + * If we are using split PTE locks, then we need to take the page
> + * lock here. Otherwise we are using shared mm->page_table_lock
> + * which is already locked, thus cannot take it.
> + */
> +static inline void do_pte_lock(spinlock_t *ptl)
> +{
> + /*
> + * Use nested version here to indicate that we are already
> + * holding one similar spinlock.
> + */
> + spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
> +}
> +
> +static inline void do_pte_unlock(spinlock_t *ptl)
> +{
> + spin_unlock(ptl);
> +}
> +#else /* !USE_SPLIT_PTLOCKS */
> +static inline void do_pte_lock(spinlock_t *ptl) {}
> +static inline void do_pte_unlock(spinlock_t *ptl) {}
> +#endif /* USE_SPLIT_PTLOCKS */
> +
> static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
> unsigned long pfn)
> {
> @@ -89,11 +113,11 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
> */
> ptl = pte_lockptr(vma->vm_mm, pmd);
> pte = pte_offset_map_nested(pmd, address);
> - spin_lock(ptl);
> + do_pte_lock(ptl);
>
> ret = do_adjust_pte(vma, address, pfn, pte);
>
> - spin_unlock(ptl);
> + do_pte_unlock(ptl);
> pte_unmap_nested(pte);
>
> return ret;
> --
> 1.5.6.5
--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 6:28 ` Baruch Siach
@ 2010-10-22 6:38 ` Mika Westerberg
2010-10-22 6:42 ` Baruch Siach
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-22 6:38 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:
[...]
>
> Do you have any idea when was this bug introduced? Does it affect already
> release kernels other than .36?
I've seen it at least on .35. The locking code itself was introduced
in commit:
56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)
Regards,
MW
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 6:38 ` Mika Westerberg
@ 2010-10-22 6:42 ` Baruch Siach
2010-10-22 7:08 ` Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Baruch Siach @ 2010-10-22 6:42 UTC (permalink / raw)
To: linux-arm-kernel
Hi Mika,
On Fri, Oct 22, 2010 at 09:38:45AM +0300, Mika Westerberg wrote:
> On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:
> [...]
> >
> > Do you have any idea when was this bug introduced? Does it affect already
> > release kernels other than .36?
>
> I've seen it at least on .35. The locking code itself was introduced
> in commit:
>
> 56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)
Which is present on .34-rc1. In this case adding stable at kernel.org to Cc would
be nice.
Thanks,
baruch
--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 6:42 ` Baruch Siach
@ 2010-10-22 7:08 ` Mika Westerberg
2010-10-22 8:01 ` Baruch Siach
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-22 7:08 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 22, 2010 at 08:42:05AM +0200, Baruch Siach wrote:
> Hi Mika,
>
> On Fri, Oct 22, 2010 at 09:38:45AM +0300, Mika Westerberg wrote:
> > On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:
> > [...]
> > >
> > > Do you have any idea when was this bug introduced? Does it affect already
> > > release kernels other than .36?
> >
> > I've seen it at least on .35. The locking code itself was introduced
> > in commit:
> >
> > 56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)
>
> Which is present on .34-rc1. In this case adding stable at kernel.org to Cc would
> be nice.
Ok, thanks.
I'm not familiar with the stable rules and it wasn't immediately
clear from the Documentation/stable_kernel_rules.txt: should I resend
this patch with added 'Cc:' or is it enough just to add Cc into followup
mail?
Thanks,
MW
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 7:08 ` Mika Westerberg
@ 2010-10-22 8:01 ` Baruch Siach
0 siblings, 0 replies; 8+ messages in thread
From: Baruch Siach @ 2010-10-22 8:01 UTC (permalink / raw)
To: linux-arm-kernel
Hi Mika,
On Fri, Oct 22, 2010 at 10:08:47AM +0300, Mika Westerberg wrote:
> On Fri, Oct 22, 2010 at 08:42:05AM +0200, Baruch Siach wrote:
> > Hi Mika,
> >
> > On Fri, Oct 22, 2010 at 09:38:45AM +0300, Mika Westerberg wrote:
> > > On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:
> > > [...]
> > > >
> > > > Do you have any idea when was this bug introduced? Does it affect already
> > > > release kernels other than .36?
> > >
> > > I've seen it at least on .35. The locking code itself was introduced
> > > in commit:
> > >
> > > 56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)
> >
> > Which is present on .34-rc1. In this case adding stable at kernel.org to Cc would
> > be nice.
>
> Ok, thanks.
>
> I'm not familiar with the stable rules and it wasn't immediately
> clear from the Documentation/stable_kernel_rules.txt: should I resend
> this patch with added 'Cc:' or is it enough just to add Cc into followup
> mail?
Patch with 'Cc: stable at kernel.org' line in its commit log is automatically
forwarded to the stable team once this patch hits Linus' tree. Since this
patch should go through Russell's tree, you can add this line when posting to
his patch tracker. I guess that adding some info in the commit log on the
history of this bug should help the stable team in the process.
baruch
--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-10-22 8:01 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-21 14:12 BUG: spinlock recursion on CPU#0 Alexander Stein
2010-10-21 16:44 ` Mika Westerberg
2010-10-21 17:09 ` [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte() Mika Westerberg
2010-10-22 6:28 ` Baruch Siach
2010-10-22 6:38 ` Mika Westerberg
2010-10-22 6:42 ` Baruch Siach
2010-10-22 7:08 ` Mika Westerberg
2010-10-22 8:01 ` Baruch Siach
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox