* BUG: spinlock recursion on CPU#0
@ 2010-10-21 14:12 Alexander Stein
2010-10-21 16:44 ` Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Alexander Stein @ 2010-10-21 14:12 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
I tried a demo app which results in a kernel BUG; the backtrace is as follows:
> BUG: spinlock recursion on CPU#0, demogui/507
> lock: c3e15590, .magic: dead4ead, .owner: demogui/507, .owner_cpu: 0
> [<c002a858>] (unwind_backtrace+0x0/0xec) from [<c01609ac>] (do_raw_spin_lock+0x48/0xac)
> [<c01609ac>] (do_raw_spin_lock+0x48/0xac) from [<c002bec4>] (adjust_pte+0x54/0x94)
> [<c002bec4>] (adjust_pte+0x54/0x94) from [<c002bfa4>] (make_coherent+0xa0/0xe4)
> [<c002bfa4>] (make_coherent+0xa0/0xe4) from [<c002c088>] (update_mmu_cache+0xa0/0xac)
> [<c002c088>] (update_mmu_cache+0xa0/0xac) from [<c007bfdc>] (__do_fault+0x334/0x414)
> [<c007bfdc>] (__do_fault+0x334/0x414) from [<c007d594>] (handle_mm_fault+0x108/0x2c8)
> [<c007d594>] (handle_mm_fault+0x108/0x2c8) from [<c002b658>] (__do_page_fault+0x6c/0xb8)
> [<c002b658>] (__do_page_fault+0x6c/0xb8) from [<c002b878>] (do_page_fault+0xb4/0x15c)
> [<c002b878>] (do_page_fault+0xb4/0x15c) from [<c0025264>] (do_DataAbort+0x34/0x94)
> [<c0025264>] (do_DataAbort+0x34/0x94) from [<c0025da0>] (ret_from_exception+0x0/0x10)
> Exception stack(0xc3e75fb0 to 0xc3e75ff8)
> 5fa0:                                     40fff000 000040a8 00000003 4089d3d0
> 5fc0: 0005e3a0 00000001 4089d3d0 0005c940 ffff0fc0 0005e480 ffff0fc0 be81bda8
> 5fe0: 0005c940 be81bbc0 404798c0 404760ec 20000010 ffffffff
The kernel is Linux 2.6.35.7 plus local patches, running on an
AT91SAM9263 (ARM926EJ-S). My test program is a Qt application I cannot share.
After some searching I found this patch:
http://permalink.gmane.org/gmane.linux.ports.arm.kernel/79676
I didn't try any of the demos in this patch, but I applied the changes and the
BUG didn't occur anymore.
Now I'm wondering what the status of this patch is and whether this is the
right approach to handle this problem.
Best regards
Alexander

* BUG: spinlock recursion on CPU#0
2010-10-21 14:12 BUG: spinlock recursion on CPU#0 Alexander Stein
@ 2010-10-21 16:44 ` Mika Westerberg
2010-10-21 17:09 ` [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte() Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-21 16:44 UTC (permalink / raw)
To: linux-arm-kernel

On Thu, Oct 21, 2010 at 04:12:28PM +0200, Alexander Stein wrote:

[...]

> After some searching I found this patch:
> http://permalink.gmane.org/gmane.linux.ports.arm.kernel/79676
> I didn't try any of the demos in this patch, but I applied the changes and the
> BUG didn't occur anymore.
> Now I'm wondering what the status of this patch is and whether this is the
> right approach to handle this problem.

I didn't receive any comments on the patch so it ended up only in my local
tree. The problem seems to reproduce with the latest mainline also so I guess
it is time to resend the patch.

Will do that in a minute.

Regards,
MW

* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-21 16:44 ` Mika Westerberg
@ 2010-10-21 17:09 ` Mika Westerberg
2010-10-22 6:28 ` Baruch Siach
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-21 17:09 UTC (permalink / raw)
To: linux-arm-kernel

When running following code in a machine which has VIVT caches and
USE_SPLIT_PTLOCKS is not defined:

	fd = open("/etc/passwd", O_RDONLY);
	addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
	addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);

	v = *((int *)addr);

we will hang in spinlock recursion in the page fault handler:

 BUG: spinlock recursion on CPU#0, mmap_test/717
 lock: c5e295d8, .magic: dead4ead, .owner: mmap_test/717, .owner_cpu: 0
 [<c0026604>] (unwind_backtrace+0x0/0xec)
 [<c014ee48>] (do_raw_spin_lock+0x40/0x140)
 [<c0027f68>] (update_mmu_cache+0x208/0x250)
 [<c0079db4>] (__do_fault+0x320/0x3ec)
 [<c007af7c>] (handle_mm_fault+0x2f0/0x6d8)
 [<c0027834>] (do_page_fault+0xdc/0x1cc)
 [<c00202d0>] (do_DataAbort+0x34/0x94)

Same thing can be achieved by running:

 # useradd dummy

This comes from the fact that when USE_SPLIT_PTLOCKS is not defined,
the only lock protecting the page tables is mm->page_table_lock
which is already locked before update_mmu_cache() is called.

Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
---
 arch/arm/mm/fault-armv.c |   28 ++++++++++++++++++++++++++--
 1 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 9b906de..56036ff 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -65,6 +65,30 @@ static int do_adjust_pte(struct vm_area_struct *vma, unsigned long address,
 	return ret;
 }
 
+#if USE_SPLIT_PTLOCKS
+/*
+ * If we are using split PTE locks, then we need to take the page
+ * lock here. Otherwise we are using shared mm->page_table_lock
+ * which is already locked, thus cannot take it.
+ */
+static inline void do_pte_lock(spinlock_t *ptl)
+{
+	/*
+	 * Use nested version here to indicate that we are already
+	 * holding one similar spinlock.
+	 */
+	spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+}
+
+static inline void do_pte_unlock(spinlock_t *ptl)
+{
+	spin_unlock(ptl);
+}
+#else /* !USE_SPLIT_PTLOCKS */
+static inline void do_pte_lock(spinlock_t *ptl) {}
+static inline void do_pte_unlock(spinlock_t *ptl) {}
+#endif /* USE_SPLIT_PTLOCKS */
+
 static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
 	unsigned long pfn)
 {
@@ -89,11 +113,11 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
 	 */
 	ptl = pte_lockptr(vma->vm_mm, pmd);
 	pte = pte_offset_map_nested(pmd, address);
-	spin_lock(ptl);
+	do_pte_lock(ptl);
 
 	ret = do_adjust_pte(vma, address, pfn, pte);
 
-	spin_unlock(ptl);
+	do_pte_unlock(ptl);
 	pte_unmap_nested(pte);
 
 	return ret;
-- 
1.5.6.5

* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-21 17:09 ` [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte() Mika Westerberg
@ 2010-10-22 6:28 ` Baruch Siach
2010-10-22 6:38 ` Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Baruch Siach @ 2010-10-22 6:28 UTC (permalink / raw)
To: linux-arm-kernel

Hi Mika,

On Thu, Oct 21, 2010 at 08:09:42PM +0300, Mika Westerberg wrote:
> When running following code in a machine which has VIVT caches and
> USE_SPLIT_PTLOCKS is not defined:
>
> 	fd = open("/etc/passwd", O_RDONLY);
> 	addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
> 	addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
>
> 	v = *((int *)addr);
>
> we will hang in spinlock recursion in the page fault handler:
>
>  BUG: spinlock recursion on CPU#0, mmap_test/717

[snip]

Do you have any idea when this bug was introduced? Does it affect already
released kernels other than .36?

baruch

> Same thing can be achieved by running:
>
>  # useradd dummy
>
> This comes from the fact that when USE_SPLIT_PTLOCKS is not defined,
> the only lock protecting the page tables is mm->page_table_lock
> which is already locked before update_mmu_cache() is called.
>
> Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
> ---
>  arch/arm/mm/fault-armv.c |   28 ++++++++++++++++++++++++++--
>  1 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
> index 9b906de..56036ff 100644
> --- a/arch/arm/mm/fault-armv.c
> +++ b/arch/arm/mm/fault-armv.c
> @@ -65,6 +65,30 @@ static int do_adjust_pte(struct vm_area_struct *vma, unsigned long address,
>  	return ret;
>  }
>
> +#if USE_SPLIT_PTLOCKS
> +/*
> + * If we are using split PTE locks, then we need to take the page
> + * lock here. Otherwise we are using shared mm->page_table_lock
> + * which is already locked, thus cannot take it.
> + */
> +static inline void do_pte_lock(spinlock_t *ptl)
> +{
> +	/*
> +	 * Use nested version here to indicate that we are already
> +	 * holding one similar spinlock.
> +	 */
> +	spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
> +}
> +
> +static inline void do_pte_unlock(spinlock_t *ptl)
> +{
> +	spin_unlock(ptl);
> +}
> +#else /* !USE_SPLIT_PTLOCKS */
> +static inline void do_pte_lock(spinlock_t *ptl) {}
> +static inline void do_pte_unlock(spinlock_t *ptl) {}
> +#endif /* USE_SPLIT_PTLOCKS */
> +
>  static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
>  	unsigned long pfn)
>  {
> @@ -89,11 +113,11 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
>  	 */
>  	ptl = pte_lockptr(vma->vm_mm, pmd);
>  	pte = pte_offset_map_nested(pmd, address);
> -	spin_lock(ptl);
> +	do_pte_lock(ptl);
>
>  	ret = do_adjust_pte(vma, address, pfn, pte);
>
> -	spin_unlock(ptl);
> +	do_pte_unlock(ptl);
>  	pte_unmap_nested(pte);
>
>  	return ret;
> --
> 1.5.6.5

-- 
                                                     ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -
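
For checking whether a given kernel is affected, the four-line snippet quoted from the patch description could be expanded into a standalone helper along these lines. This is a hedged sketch only: the function name map_twice_and_read() and its error handling are not part of the original posting, and on machines without VIVT caches, or on fixed kernels, it is expected simply to return.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper built around the snippet from the patch description.
 * Maps the first page of `path` twice with MAP_SHARED and reads one int
 * through the first mapping; that read is the access which faults.
 * Returns 0 on success and -1 on error. The file must not be empty.
 */
static int map_twice_and_read(const char *path, int *value)
{
	void *addr, *addr2;
	int ret = -1;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
	addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);

	if (addr != MAP_FAILED && addr2 != MAP_FAILED) {
		/* volatile keeps the compiler from dropping the read */
		*value = *(volatile int *)addr;
		ret = 0;
	} else {
		perror("mmap");
	}

	if (addr != MAP_FAILED)
		munmap(addr, 4096);
	if (addr2 != MAP_FAILED)
		munmap(addr2, 4096);
	close(fd);
	return ret;
}
```

Calling map_twice_and_read("/etc/passwd", &v) from a small main() corresponds to the sequence in the patch description. The second mapping is never read; it only has to exist, so that the fault on the first mapping has an aliasing mapping to adjust.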

* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 6:28 ` Baruch Siach
@ 2010-10-22 6:38 ` Mika Westerberg
2010-10-22 6:42 ` Baruch Siach
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-22 6:38 UTC (permalink / raw)
To: linux-arm-kernel

On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:

[...]
>
> Do you have any idea when this bug was introduced? Does it affect already
> released kernels other than .36?

I've seen it at least on .35. The locking code itself was introduced
in commit:

	56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)

Regards,
MW

* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 6:38 ` Mika Westerberg
@ 2010-10-22 6:42 ` Baruch Siach
2010-10-22 7:08 ` Mika Westerberg
0 siblings, 1 reply; 8+ messages in thread
From: Baruch Siach @ 2010-10-22 6:42 UTC (permalink / raw)
To: linux-arm-kernel

Hi Mika,

On Fri, Oct 22, 2010 at 09:38:45AM +0300, Mika Westerberg wrote:
> On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:
> [...]
> >
> > Do you have any idea when this bug was introduced? Does it affect already
> > released kernels other than .36?
>
> I've seen it at least on .35. The locking code itself was introduced
> in commit:
>
> 	56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)

Which is present on .34-rc1. In this case adding stable at kernel.org to Cc would
be nice.

Thanks,
baruch

-- 
                                                     ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -

* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 6:42 ` Baruch Siach
@ 2010-10-22 7:08 ` Mika Westerberg
2010-10-22 8:01 ` Baruch Siach
0 siblings, 1 reply; 8+ messages in thread
From: Mika Westerberg @ 2010-10-22 7:08 UTC (permalink / raw)
To: linux-arm-kernel

On Fri, Oct 22, 2010 at 08:42:05AM +0200, Baruch Siach wrote:
> Hi Mika,
>
> On Fri, Oct 22, 2010 at 09:38:45AM +0300, Mika Westerberg wrote:
> > On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:
> > [...]
> > >
> > > Do you have any idea when this bug was introduced? Does it affect already
> > > released kernels other than .36?
> >
> > I've seen it at least on .35. The locking code itself was introduced
> > in commit:
> >
> > 	56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)
>
> Which is present on .34-rc1. In this case adding stable at kernel.org to Cc would
> be nice.

Ok, thanks.

I'm not familiar with the stable rules and it wasn't immediately
clear from the Documentation/stable_kernel_rules.txt: should I resend
this patch with added 'Cc:' or is it enough just to add Cc into followup
mail?

Thanks,
MW

* [PATCH RESEND] ARM: fix spinlock recursion in adjust_pte()
2010-10-22 7:08 ` Mika Westerberg
@ 2010-10-22 8:01 ` Baruch Siach
0 siblings, 0 replies; 8+ messages in thread
From: Baruch Siach @ 2010-10-22 8:01 UTC (permalink / raw)
To: linux-arm-kernel

Hi Mika,

On Fri, Oct 22, 2010 at 10:08:47AM +0300, Mika Westerberg wrote:
> On Fri, Oct 22, 2010 at 08:42:05AM +0200, Baruch Siach wrote:
> > Hi Mika,
> >
> > On Fri, Oct 22, 2010 at 09:38:45AM +0300, Mika Westerberg wrote:
> > > On Fri, Oct 22, 2010 at 08:28:26AM +0200, Baruch Siach wrote:
> > > [...]
> > > >
> > > > Do you have any idea when this bug was introduced? Does it affect already
> > > > released kernels other than .36?
> > >
> > > I've seen it at least on .35. The locking code itself was introduced
> > > in commit:
> > >
> > > 	56dd47098abe (ARM: make_coherent: fix problems with highpte, part 1)
> >
> > Which is present on .34-rc1. In this case adding stable at kernel.org to Cc would
> > be nice.
>
> Ok, thanks.
>
> I'm not familiar with the stable rules and it wasn't immediately
> clear from the Documentation/stable_kernel_rules.txt: should I resend
> this patch with added 'Cc:' or is it enough just to add Cc into followup
> mail?

A patch with a 'Cc: stable at kernel.org' line in its commit log is automatically
forwarded to the stable team once the patch hits Linus' tree. Since this patch
should go through Russell's tree, you can add this line when posting to his
patch tracker.

I guess that adding some info in the commit log on the history of this bug
should help the stable team in the process.

baruch

-- 
                                                     ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -