linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: masterzorag <masterzorag@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org, Al Viro <viro@zeniv.linux.org.uk>,
	Arnd Bergmann <arnd@arndb.de>
Subject: Re: [PATCH] spufs raises two exceptions
Date: Wed, 07 Mar 2012 14:49:50 +1100	[thread overview]
Message-ID: <1331092190.3105.3.camel@pasglop> (raw)
In-Reply-To: <4F55D84B.7030306@gmail.com>

On Tue, 2012-03-06 at 10:26 +0100, masterzorag wrote:
> I'm running my test program, it uses all available spus to compute via 
> OpenCL
> kernel 3.2.5 on a ps3
> even on testing spu directly, it crashes

I think the patch is not 100% right yet. Looking at the code, we
have a real mess of who gets to clean what up here. This is an
attempt at sorting things by having the mutex and dentry dropped
in spufs_create() always. Can you give it a spin (untested):

Al, I'm not familiar with the vfs, can you take a quick look ?

Thanks !

Cheers,
Ben.


> 
> =====================================
> [ BUG: bad unlock balance detected! ]
> -------------------------------------
> test/1067 is trying to release lock (&sb->s_type->i_mutex_key) at:
> [<d0000000005828a8>] .do_spu_create+0x90/0xd8 [spufs]
> but there are no more locks to release!
> other info that might help us debug this:
> no locks held by test/1067.
> stack backtrace:
> Call Trace:
> [c00000000e9bfa30] [c0000000000110d0] .show_stack+0x6c/0x16c (unreliable)
> [c00000000e9bfae0] [c000000000081f90] .print_unlock_inbalance_bug+0xe8/0x110
> [c00000000e9bfb70] [c0000000000868cc] .lock_release+0xd8/0x200
> [c00000000e9bfc10] [c0000000003efb60] .__mutex_unlock_slowpath+0x11c/0x1d8
> [c00000000e9bfcb0] [d0000000005828a8] .do_spu_create+0x90/0xd8 [spufs]
> [c00000000e9bfd70] [c0000000000346ac] .sys_spu_create+0x164/0x1c0
> [c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:474!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=2 NUMA PS3
> Modules linked in: spufs dm_mod btusb bluetooth usb_storage ohci_hcd 
> snd_ps3 ehci_hcd snd_pcm snd_page_alloc snd_timer sg snd usbcore 
> usb_common ps3flash rtc_ps3 soundcore ps3_lpm ps3vram [last unloaded: 
> scsi_wait_scan]
> NIP: c000000000109f94 LR: c000000000109f84 CTR: c0000000000a029c
> REGS: c00000000e9bf930 TRAP: 0700 Not tainted (3.2.5)
> MSR: 8000000000028032 <EE,CE,IR,DR> CR: 22004822 XER: 00000000
> TASK = c0000000062f0ec0[1067] 'test' THREAD: c00000000e9bc000 CPU: 1
> GPR00: 0000000000000001 c00000000e9bfbb0 c0000000006812e8 c00000000543b798
> GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000002
> GPR08: 0000000000000000 0000000000000000 c000000000109f84 c0000000062f0ec0
> GPR12: 0000000082004824 c000000007ffe280 0000000000000004 00000000f7850688
> GPR16: 00000000f7830734 00000000f78517a4 00000000f7852008 00000000f78517a8
> GPR20: 00000000ff805dc0 000000000fd958a0 0000000000000000 000000000000000d
> GPR24: 000000000fd98240 c00000000e101e10 0000000040000010 c00000000616e080
> GPR28: c00000000543b738 c00000000543b798 c0000000006149e8 c00000000543b738
> NIP [c000000000109f94] .dput+0x48/0x214
> LR [c000000000109f84] .dput+0x38/0x214
> Call Trace:
> [c00000000e9bfbb0] [c000000000109f84] .dput+0x38/0x214 (unreliable)
> [c00000000e9bfc50] [c0000000000f1740] .fput+0x24c/0x288
> [c00000000e9bfd00] [c0000000000ed708] .filp_close+0xbc/0xe4
> [c00000000e9bfd90] [c0000000000ed800] .SyS_close+0xd0/0x128
> [c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
> Instruction dump:
> fb61ffd8 fb81ffe0 fba1ffe8 f821ff61 418201c8 3bbf0060 7fa3eb78 482e7f31
> 60000000 813f0058 7d200074 7800d182 <0b000000> 2b890001 409d0010 3809ffff
> ---[ end trace c337aad05d94532f ]---
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:474!
> Oops: Exception in kernel mode, sig: 5 [#2]
> SMP NR_CPUS=2 NUMA PS3
> Modules linked in: spufs dm_mod btusb bluetooth usb_storage ohci_hcd 
> snd_ps3 ehci_hcd snd_pcm snd_page_alloc snd_timer sg snd usbcore 
> usb_common ps3flash rtc_ps3 soundcore ps3_lpm ps3vram [last unloaded: 
> scsi_wait_scan]
> NIP: c000000000109f94 LR: c000000000109f84 CTR: c0000000000a029c
> REGS: c00000000e9bec20 TRAP: 0700 Tainted: G D (3.2.5)
> MSR: 8000000000028032 <EE,CE,IR,DR> CR: 22004822 XER: 00000000
> TASK = c0000000062f0ec0[1067] 'test' THREAD: c00000000e9bc000 CPU: 1
> GPR00: 0000000000000001 c00000000e9beea0 c0000000006812e8 c0000000054361c8
> GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000002
> GPR08: 0000000000000000 0000000000000000 c000000000109f84 c0000000062f0ec0
> GPR12: 0000000042004824 c000000007ffe280 0000000000000004 00000000f7850688
> GPR16: 00000000f7830734 00000000f78517a4 00000000f7852008 00000000f78517a8
> GPR20: 00000000ff805dc0 000000000fd958a0 0000000000000000 0000000000000001
> GPR24: 000000000fd98240 c00000000e9b2390 0000000000000008 c0000000062bd010
> GPR28: c000000005436168 c0000000054361c8 c0000000006149e8 c000000005436168
> NIP [c000000000109f94] .dput+0x48/0x214
> LR [c000000000109f84] .dput+0x38/0x214
> Call Trace:
> [c00000000e9beea0] [c000000000109f84] .dput+0x38/0x214 (unreliable)
> [c00000000e9bef40] [c0000000000f1740] .fput+0x24c/0x288
> [c00000000e9beff0] [c0000000000c93a8] .remove_vma+0x68/0xcc
> [c00000000e9bf080] [c0000000000c951c] .exit_mmap+0x110/0x14c
> [c00000000e9bf1a0] [c00000000004b4c8] .mmput+0x5c/0x13c
> [c00000000e9bf230] [d00000000058237c] .spu_forget+0x54/0x7c [spufs]
> [c00000000e9bf2c0] [d00000000057c294] .spufs_dir_close+0x8c/0xc8 [spufs]
> [c00000000e9bf370] [c0000000000f166c] .fput+0x178/0x288
> [c00000000e9bf420] [c0000000000ed708] .filp_close+0xbc/0xe4
> [c00000000e9bf4b0] [c000000000050294] .put_files_struct+0xf4/0x1b8
> [c00000000e9bf560] [c0000000000520bc] .do_exit+0x23c/0x6f4
> [c00000000e9bf660] [c00000000001922c] .die+0x274/0x2a4
> [c00000000e9bf700] [c000000000019640] ._exception+0x88/0x17c
> [c00000000e9bf8c0] [c000000000005314] program_check_common+0x114/0x180
> --- Exception: 700 at .dput+0x48/0x214
> LR = .dput+0x38/0x214
> [c00000000e9bfc50] [c0000000000f1740] .fput+0x24c/0x288
> [c00000000e9bfd00] [c0000000000ed708] .filp_close+0xbc/0xe4
> [c00000000e9bfd90] [c0000000000ed800] .SyS_close+0xd0/0x128
> [c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
> Instruction dump:
> fb61ffd8 fb81ffe0 fba1ffe8 f821ff61 418201c8 3bbf0060 7fa3eb78 482e7f31
> 60000000 813f0058 7d200074 7800d182 <0b000000> 2b890001 409d0010 3809ffff
> ---[ end trace c337aad05d945330 ]---
> Fixing recursive fault but reboot is needed!
> 
> First time, the mutex gets unlocked in spufs_create_context, then the 
> second time in do_spu_create.
> It seems that SPU main directory dentry has invalid d_count.
> 
> 
> This patch fixes all, OpenCL is running fine, testing spe runs without 
> issues.
> 
> --- arch/powerpc/platforms/cell/spufs/syscalls.c
> +++ arch/powerpc/platforms/cell/spufs/syscalls.c.new
> @@ -70,8 +70,8 @@
>       ret = PTR_ERR(dentry);
>       if (!IS_ERR(dentry)) {
>           ret = spufs_create(&path, dentry, flags, mode, neighbor);
> -        mutex_unlock(&path.dentry->d_inode->i_mutex);
> -        dput(dentry);
> +        if (ret < 0)
> +            dput(dentry);
>           path_put(&path);
>       }
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

  reply	other threads:[~2012-03-07  3:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-06  9:26 [PATCH] spufs raises two exceptions masterzorag
2012-03-07  3:49 ` Benjamin Herrenschmidt [this message]
2012-03-07  3:51   ` Benjamin Herrenschmidt
2012-03-07 12:48     ` Arnd Bergmann
2012-03-07 21:01       ` Benjamin Herrenschmidt
2012-03-07 21:23         ` Al Viro
2012-03-07 22:32           ` Arnd Bergmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1331092190.3105.3.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=arnd@arndb.de \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=masterzorag@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).