* [PATCH] spufs raises two exceptions
@ 2012-03-06 9:26 masterzorag
2012-03-07 3:49 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: masterzorag @ 2012-03-06 9:26 UTC (permalink / raw)
To: linuxppc-dev
I'm running my test program, it uses all available spus to compute via
OpenCL
kernel 3.2.5 on a ps3
even on testing spu directly, it crashes
=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
test/1067 is trying to release lock (&sb->s_type->i_mutex_key) at:
[<d0000000005828a8>] .do_spu_create+0x90/0xd8 [spufs]
but there are no more locks to release!
other info that might help us debug this:
no locks held by test/1067.
stack backtrace:
Call Trace:
[c00000000e9bfa30] [c0000000000110d0] .show_stack+0x6c/0x16c (unreliable)
[c00000000e9bfae0] [c000000000081f90] .print_unlock_inbalance_bug+0xe8/0x110
[c00000000e9bfb70] [c0000000000868cc] .lock_release+0xd8/0x200
[c00000000e9bfc10] [c0000000003efb60] .__mutex_unlock_slowpath+0x11c/0x1d8
[c00000000e9bfcb0] [d0000000005828a8] .do_spu_create+0x90/0xd8 [spufs]
[c00000000e9bfd70] [c0000000000346ac] .sys_spu_create+0x164/0x1c0
[c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
------------[ cut here ]------------
kernel BUG at fs/dcache.c:474!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=2 NUMA PS3
Modules linked in: spufs dm_mod btusb bluetooth usb_storage ohci_hcd
snd_ps3 ehci_hcd snd_pcm snd_page_alloc snd_timer sg snd usbcore
usb_common ps3flash rtc_ps3 soundcore ps3_lpm ps3vram [last unloaded:
scsi_wait_scan]
NIP: c000000000109f94 LR: c000000000109f84 CTR: c0000000000a029c
REGS: c00000000e9bf930 TRAP: 0700 Not tainted (3.2.5)
MSR: 8000000000028032 <EE,CE,IR,DR> CR: 22004822 XER: 00000000
TASK = c0000000062f0ec0[1067] 'test' THREAD: c00000000e9bc000 CPU: 1
GPR00: 0000000000000001 c00000000e9bfbb0 c0000000006812e8 c00000000543b798
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000002
GPR08: 0000000000000000 0000000000000000 c000000000109f84 c0000000062f0ec0
GPR12: 0000000082004824 c000000007ffe280 0000000000000004 00000000f7850688
GPR16: 00000000f7830734 00000000f78517a4 00000000f7852008 00000000f78517a8
GPR20: 00000000ff805dc0 000000000fd958a0 0000000000000000 000000000000000d
GPR24: 000000000fd98240 c00000000e101e10 0000000040000010 c00000000616e080
GPR28: c00000000543b738 c00000000543b798 c0000000006149e8 c00000000543b738
NIP [c000000000109f94] .dput+0x48/0x214
LR [c000000000109f84] .dput+0x38/0x214
Call Trace:
[c00000000e9bfbb0] [c000000000109f84] .dput+0x38/0x214 (unreliable)
[c00000000e9bfc50] [c0000000000f1740] .fput+0x24c/0x288
[c00000000e9bfd00] [c0000000000ed708] .filp_close+0xbc/0xe4
[c00000000e9bfd90] [c0000000000ed800] .SyS_close+0xd0/0x128
[c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
Instruction dump:
fb61ffd8 fb81ffe0 fba1ffe8 f821ff61 418201c8 3bbf0060 7fa3eb78 482e7f31
60000000 813f0058 7d200074 7800d182 <0b000000> 2b890001 409d0010 3809ffff
---[ end trace c337aad05d94532f ]---
------------[ cut here ]------------
kernel BUG at fs/dcache.c:474!
Oops: Exception in kernel mode, sig: 5 [#2]
SMP NR_CPUS=2 NUMA PS3
Modules linked in: spufs dm_mod btusb bluetooth usb_storage ohci_hcd
snd_ps3 ehci_hcd snd_pcm snd_page_alloc snd_timer sg snd usbcore
usb_common ps3flash rtc_ps3 soundcore ps3_lpm ps3vram [last unloaded:
scsi_wait_scan]
NIP: c000000000109f94 LR: c000000000109f84 CTR: c0000000000a029c
REGS: c00000000e9bec20 TRAP: 0700 Tainted: G D (3.2.5)
MSR: 8000000000028032 <EE,CE,IR,DR> CR: 22004822 XER: 00000000
TASK = c0000000062f0ec0[1067] 'test' THREAD: c00000000e9bc000 CPU: 1
GPR00: 0000000000000001 c00000000e9beea0 c0000000006812e8 c0000000054361c8
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000002
GPR08: 0000000000000000 0000000000000000 c000000000109f84 c0000000062f0ec0
GPR12: 0000000042004824 c000000007ffe280 0000000000000004 00000000f7850688
GPR16: 00000000f7830734 00000000f78517a4 00000000f7852008 00000000f78517a8
GPR20: 00000000ff805dc0 000000000fd958a0 0000000000000000 0000000000000001
GPR24: 000000000fd98240 c00000000e9b2390 0000000000000008 c0000000062bd010
GPR28: c000000005436168 c0000000054361c8 c0000000006149e8 c000000005436168
NIP [c000000000109f94] .dput+0x48/0x214
LR [c000000000109f84] .dput+0x38/0x214
Call Trace:
[c00000000e9beea0] [c000000000109f84] .dput+0x38/0x214 (unreliable)
[c00000000e9bef40] [c0000000000f1740] .fput+0x24c/0x288
[c00000000e9beff0] [c0000000000c93a8] .remove_vma+0x68/0xcc
[c00000000e9bf080] [c0000000000c951c] .exit_mmap+0x110/0x14c
[c00000000e9bf1a0] [c00000000004b4c8] .mmput+0x5c/0x13c
[c00000000e9bf230] [d00000000058237c] .spu_forget+0x54/0x7c [spufs]
[c00000000e9bf2c0] [d00000000057c294] .spufs_dir_close+0x8c/0xc8 [spufs]
[c00000000e9bf370] [c0000000000f166c] .fput+0x178/0x288
[c00000000e9bf420] [c0000000000ed708] .filp_close+0xbc/0xe4
[c00000000e9bf4b0] [c000000000050294] .put_files_struct+0xf4/0x1b8
[c00000000e9bf560] [c0000000000520bc] .do_exit+0x23c/0x6f4
[c00000000e9bf660] [c00000000001922c] .die+0x274/0x2a4
[c00000000e9bf700] [c000000000019640] ._exception+0x88/0x17c
[c00000000e9bf8c0] [c000000000005314] program_check_common+0x114/0x180
--- Exception: 700 at .dput+0x48/0x214
LR = .dput+0x38/0x214
[c00000000e9bfc50] [c0000000000f1740] .fput+0x24c/0x288
[c00000000e9bfd00] [c0000000000ed708] .filp_close+0xbc/0xe4
[c00000000e9bfd90] [c0000000000ed800] .SyS_close+0xd0/0x128
[c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
Instruction dump:
fb61ffd8 fb81ffe0 fba1ffe8 f821ff61 418201c8 3bbf0060 7fa3eb78 482e7f31
60000000 813f0058 7d200074 7800d182 <0b000000> 2b890001 409d0010 3809ffff
---[ end trace c337aad05d945330 ]---
Fixing recursive fault but reboot is needed!
First time, the mutex gets unlocked in spufs_create_context, then the
second time in do_spu_create.
It seems that SPU main directory dentry has invalid d_count.
This patch fixes all, OpenCL is running fine, testing spe runs without
issues.
--- arch/powerpc/platforms/cell/spufs/syscalls.c
+++ arch/powerpc/platforms/cell/spufs/syscalls.c.new
@@ -70,8 +70,8 @@
ret = PTR_ERR(dentry);
if (!IS_ERR(dentry)) {
ret = spufs_create(&path, dentry, flags, mode, neighbor);
- mutex_unlock(&path.dentry->d_inode->i_mutex);
- dput(dentry);
+ if (ret < 0)
+ dput(dentry);
path_put(&path);
}
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] spufs raises two exceptions
2012-03-06 9:26 [PATCH] spufs raises two exceptions masterzorag
@ 2012-03-07 3:49 ` Benjamin Herrenschmidt
2012-03-07 3:51 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-07 3:49 UTC (permalink / raw)
To: masterzorag; +Cc: linuxppc-dev, Al Viro, Arnd Bergmann
On Tue, 2012-03-06 at 10:26 +0100, masterzorag wrote:
> I'm running my test program, it uses all available spus to compute via
> OpenCL
> kernel 3.2.5 on a ps3
> even on testing spu directly, it crashes
I think the patch is not 100% right yet. Looking at the code, we
have a real mess of who gets to clean what up here. This is an
attempt at sorting things by having the mutex and dentry dropped
in spufs_create() always. Can you give it a spin (untested):
Al, I'm not familiar with the vfs, can you take a quick look ?
Thanks !
Cheers,
Ben.
>
> =====================================
> [ BUG: bad unlock balance detected! ]
> -------------------------------------
> test/1067 is trying to release lock (&sb->s_type->i_mutex_key) at:
> [<d0000000005828a8>] .do_spu_create+0x90/0xd8 [spufs]
> but there are no more locks to release!
> other info that might help us debug this:
> no locks held by test/1067.
> stack backtrace:
> Call Trace:
> [c00000000e9bfa30] [c0000000000110d0] .show_stack+0x6c/0x16c (unreliable)
> [c00000000e9bfae0] [c000000000081f90] .print_unlock_inbalance_bug+0xe8/0x110
> [c00000000e9bfb70] [c0000000000868cc] .lock_release+0xd8/0x200
> [c00000000e9bfc10] [c0000000003efb60] .__mutex_unlock_slowpath+0x11c/0x1d8
> [c00000000e9bfcb0] [d0000000005828a8] .do_spu_create+0x90/0xd8 [spufs]
> [c00000000e9bfd70] [c0000000000346ac] .sys_spu_create+0x164/0x1c0
> [c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:474!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=2 NUMA PS3
> Modules linked in: spufs dm_mod btusb bluetooth usb_storage ohci_hcd
> snd_ps3 ehci_hcd snd_pcm snd_page_alloc snd_timer sg snd usbcore
> usb_common ps3flash rtc_ps3 soundcore ps3_lpm ps3vram [last unloaded:
> scsi_wait_scan]
> NIP: c000000000109f94 LR: c000000000109f84 CTR: c0000000000a029c
> REGS: c00000000e9bf930 TRAP: 0700 Not tainted (3.2.5)
> MSR: 8000000000028032 <EE,CE,IR,DR> CR: 22004822 XER: 00000000
> TASK = c0000000062f0ec0[1067] 'test' THREAD: c00000000e9bc000 CPU: 1
> GPR00: 0000000000000001 c00000000e9bfbb0 c0000000006812e8 c00000000543b798
> GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000002
> GPR08: 0000000000000000 0000000000000000 c000000000109f84 c0000000062f0ec0
> GPR12: 0000000082004824 c000000007ffe280 0000000000000004 00000000f7850688
> GPR16: 00000000f7830734 00000000f78517a4 00000000f7852008 00000000f78517a8
> GPR20: 00000000ff805dc0 000000000fd958a0 0000000000000000 000000000000000d
> GPR24: 000000000fd98240 c00000000e101e10 0000000040000010 c00000000616e080
> GPR28: c00000000543b738 c00000000543b798 c0000000006149e8 c00000000543b738
> NIP [c000000000109f94] .dput+0x48/0x214
> LR [c000000000109f84] .dput+0x38/0x214
> Call Trace:
> [c00000000e9bfbb0] [c000000000109f84] .dput+0x38/0x214 (unreliable)
> [c00000000e9bfc50] [c0000000000f1740] .fput+0x24c/0x288
> [c00000000e9bfd00] [c0000000000ed708] .filp_close+0xbc/0xe4
> [c00000000e9bfd90] [c0000000000ed800] .SyS_close+0xd0/0x128
> [c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
> Instruction dump:
> fb61ffd8 fb81ffe0 fba1ffe8 f821ff61 418201c8 3bbf0060 7fa3eb78 482e7f31
> 60000000 813f0058 7d200074 7800d182 <0b000000> 2b890001 409d0010 3809ffff
> ---[ end trace c337aad05d94532f ]---
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:474!
> Oops: Exception in kernel mode, sig: 5 [#2]
> SMP NR_CPUS=2 NUMA PS3
> Modules linked in: spufs dm_mod btusb bluetooth usb_storage ohci_hcd
> snd_ps3 ehci_hcd snd_pcm snd_page_alloc snd_timer sg snd usbcore
> usb_common ps3flash rtc_ps3 soundcore ps3_lpm ps3vram [last unloaded:
> scsi_wait_scan]
> NIP: c000000000109f94 LR: c000000000109f84 CTR: c0000000000a029c
> REGS: c00000000e9bec20 TRAP: 0700 Tainted: G D (3.2.5)
> MSR: 8000000000028032 <EE,CE,IR,DR> CR: 22004822 XER: 00000000
> TASK = c0000000062f0ec0[1067] 'test' THREAD: c00000000e9bc000 CPU: 1
> GPR00: 0000000000000001 c00000000e9beea0 c0000000006812e8 c0000000054361c8
> GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000002
> GPR08: 0000000000000000 0000000000000000 c000000000109f84 c0000000062f0ec0
> GPR12: 0000000042004824 c000000007ffe280 0000000000000004 00000000f7850688
> GPR16: 00000000f7830734 00000000f78517a4 00000000f7852008 00000000f78517a8
> GPR20: 00000000ff805dc0 000000000fd958a0 0000000000000000 0000000000000001
> GPR24: 000000000fd98240 c00000000e9b2390 0000000000000008 c0000000062bd010
> GPR28: c000000005436168 c0000000054361c8 c0000000006149e8 c000000005436168
> NIP [c000000000109f94] .dput+0x48/0x214
> LR [c000000000109f84] .dput+0x38/0x214
> Call Trace:
> [c00000000e9beea0] [c000000000109f84] .dput+0x38/0x214 (unreliable)
> [c00000000e9bef40] [c0000000000f1740] .fput+0x24c/0x288
> [c00000000e9beff0] [c0000000000c93a8] .remove_vma+0x68/0xcc
> [c00000000e9bf080] [c0000000000c951c] .exit_mmap+0x110/0x14c
> [c00000000e9bf1a0] [c00000000004b4c8] .mmput+0x5c/0x13c
> [c00000000e9bf230] [d00000000058237c] .spu_forget+0x54/0x7c [spufs]
> [c00000000e9bf2c0] [d00000000057c294] .spufs_dir_close+0x8c/0xc8 [spufs]
> [c00000000e9bf370] [c0000000000f166c] .fput+0x178/0x288
> [c00000000e9bf420] [c0000000000ed708] .filp_close+0xbc/0xe4
> [c00000000e9bf4b0] [c000000000050294] .put_files_struct+0xf4/0x1b8
> [c00000000e9bf560] [c0000000000520bc] .do_exit+0x23c/0x6f4
> [c00000000e9bf660] [c00000000001922c] .die+0x274/0x2a4
> [c00000000e9bf700] [c000000000019640] ._exception+0x88/0x17c
> [c00000000e9bf8c0] [c000000000005314] program_check_common+0x114/0x180
> --- Exception: 700 at .dput+0x48/0x214
> LR = .dput+0x38/0x214
> [c00000000e9bfc50] [c0000000000f1740] .fput+0x24c/0x288
> [c00000000e9bfd00] [c0000000000ed708] .filp_close+0xbc/0xe4
> [c00000000e9bfd90] [c0000000000ed800] .SyS_close+0xd0/0x128
> [c00000000e9bfe30] [c0000000000097d8] syscall_exit+0x0/0x40
> Instruction dump:
> fb61ffd8 fb81ffe0 fba1ffe8 f821ff61 418201c8 3bbf0060 7fa3eb78 482e7f31
> 60000000 813f0058 7d200074 7800d182 <0b000000> 2b890001 409d0010 3809ffff
> ---[ end trace c337aad05d945330 ]---
> Fixing recursive fault but reboot is needed!
>
> First time, the mutex gets unlocked in spufs_create_context, then the
> second time in do_spu_create.
> It seems that SPU main directory dentry has invalid d_count.
>
>
> This patch fixes all, OpenCL is running fine, testing spe runs without
> issues.
>
> --- arch/powerpc/platforms/cell/spufs/syscalls.c
> +++ arch/powerpc/platforms/cell/spufs/syscalls.c.new
> @@ -70,8 +70,8 @@
> ret = PTR_ERR(dentry);
> if (!IS_ERR(dentry)) {
> ret = spufs_create(&path, dentry, flags, mode, neighbor);
> - mutex_unlock(&path.dentry->d_inode->i_mutex);
> - dput(dentry);
> + if (ret < 0)
> + dput(dentry);
> path_put(&path);
> }
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] spufs raises two exceptions
2012-03-07 3:49 ` Benjamin Herrenschmidt
@ 2012-03-07 3:51 ` Benjamin Herrenschmidt
2012-03-07 12:48 ` Arnd Bergmann
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-07 3:51 UTC (permalink / raw)
To: masterzorag; +Cc: linuxppc-dev, Al Viro, Arnd Bergmann
On Wed, 2012-03-07 at 14:49 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2012-03-06 at 10:26 +0100, masterzorag wrote:
> > I'm running my test program, it uses all available spus to compute via
> > OpenCL
> > kernel 3.2.5 on a ps3
> > even on testing spu directly, it crashes
>
> I think the patch is not 100% right yet. Looking at the code, we
> have a real mess of who gets to clean what up here. This is an
> attempt at sorting things by having the mutex and dentry dropped
> in spufs_create() always. Can you give it a spin (untested):
>
> Al, I'm not familiar with the vfs, can you take a quick look ?
Better with the actual patch :-)
powerpc/cell: Fix locking in spufs_create()
The error path in spufs_create() could cause double unlocks
among other horrors. Clean it up.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: masterzorag@gmail.com
diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index d4a094c..63b4e43 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -454,19 +454,16 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
struct spu_gang *gang;
struct spu_context *neighbor;
- ret = -EPERM;
if ((flags & SPU_CREATE_NOSCHED) &&
- !capable(CAP_SYS_NICE))
- goto out_unlock;
+ !capable(CAP_SYa_NICE))
+ return -EPERM;
- ret = -EINVAL;
if ((flags & (SPU_CREATE_NOSCHED | SPU_CREATE_ISOLATE))
== SPU_CREATE_ISOLATE)
- goto out_unlock;
+ return -EINVAL;
- ret = -ENODEV;
if ((flags & SPU_CREATE_ISOLATE) && !isolated_loader)
- goto out_unlock;
+ return -ENODEV;
gang = NULL;
neighbor = NULL;
@@ -512,10 +509,6 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
out_aff_unlock:
if (affinity)
mutex_unlock(&gang->aff_mutex);
-out_unlock:
- mutex_unlock(&inode->i_mutex);
-out:
- dput(dentry);
return ret;
}
@@ -585,11 +578,9 @@ static int spufs_create_gang(struct inode *inode,
struct dentry *dentry,
struct vfsmount *mnt, umode_t mode)
{
- int ret;
-
- ret = spufs_mkgang(inode, dentry, mode & S_IRWXUGO);
+ int ret = spufs_mkgang(inode, dentry, mode & S_IRWXUGO);
if (ret)
- goto out;
+ return ret;
/*
* get references for dget and mntget, will be released
@@ -600,10 +591,6 @@ static int spufs_create_gang(struct inode *inode,
int err = simple_rmdir(inode, dentry);
WARN_ON(err);
}
-
-out:
- mutex_unlock(&inode->i_mutex);
- dput(dentry);
return ret;
}
@@ -613,22 +600,21 @@ static struct file_system_type spufs_type;
long spufs_create(struct path *path, struct dentry *dentry,
unsigned int flags, umode_t mode, struct file *filp)
{
- int ret;
+ int ret = -EINVAL;
- ret = -EINVAL;
/* check if we are on spufs */
if (path->dentry->d_sb->s_type != &spufs_type)
- goto out;
+ goto fail;
/* don't accept undefined flags */
if (flags & (~SPU_CREATE_FLAG_ALL))
- goto out;
+ goto fail;
/* only threads can be underneath a gang */
if (path->dentry != path->dentry->d_sb->s_root) {
if ((flags & SPU_CREATE_GANG) ||
!SPUFS_I(path->dentry->d_inode)->i_gang)
- goto out;
+ goto fail;
}
mode &= ~current_umask();
@@ -640,12 +626,17 @@ long spufs_create(struct path *path, struct dentry *dentry,
ret = spufs_create_context(path->dentry->d_inode,
dentry, path->mnt, flags, mode,
filp);
- if (ret >= 0)
+ if (ret >= 0) {
+ /* We drop these before fsnotify */
+ mutex_unlock(&inode->i_mutex);
+ dput(dentry);
fsnotify_mkdir(path->dentry->d_inode, dentry);
- return ret;
-out:
- mutex_unlock(&path->dentry->d_inode->i_mutex);
+ return ret;
+ }
+ fail:
+ mutex_unlock(&inode->i_mutex);
+ dput(dentry);
return ret;
}
diff --git a/arch/powerpc/platforms/cell/spufs/syscalls.c b/arch/powerpc/platforms/cell/spufs/syscalls.c
index 8591bb6..1a65ef2 100644
--- a/arch/powerpc/platforms/cell/spufs/syscalls.c
+++ b/arch/powerpc/platforms/cell/spufs/syscalls.c
@@ -70,11 +70,8 @@ static long do_spu_create(const char __user *pathname, unsigned int flags,
ret = PTR_ERR(dentry);
if (!IS_ERR(dentry)) {
ret = spufs_create(&path, dentry, flags, mode, neighbor);
- mutex_unlock(&path.dentry->d_inode->i_mutex);
- dput(dentry);
path_put(&path);
}
-
return ret;
}
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] spufs raises two exceptions
2012-03-07 3:51 ` Benjamin Herrenschmidt
@ 2012-03-07 12:48 ` Arnd Bergmann
2012-03-07 21:01 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: Arnd Bergmann @ 2012-03-07 12:48 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, masterzorag, Al Viro
On Wednesday 07 March 2012, Benjamin Herrenschmidt wrote:
> On Wed, 2012-03-07 at 14:49 +1100, Benjamin Herrenschmidt wrote:
> > On Tue, 2012-03-06 at 10:26 +0100, masterzorag wrote:
> > > I'm running my test program, it uses all available spus to compute via
> > > OpenCL
> > > kernel 3.2.5 on a ps3
> > > even on testing spu directly, it crashes
> >
> > I think the patch is not 100% right yet. Looking at the code, we
> > have a real mess of who gets to clean what up here. This is an
> > attempt at sorting things by having the mutex and dentry dropped
> > in spufs_create() always. Can you give it a spin (untested):
> >
> > Al, I'm not familiar with the vfs, can you take a quick look ?
>
> Better with the actual patch :-)
>
> powerpc/cell: Fix locking in spufs_create()
>
> The error path in spufs_create() could cause double unlocks
> among other horrors. Clean it up.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Reported-by: masterzorag@gmail.com
>
> diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
> index d4a094c..63b4e43 100644
> --- a/arch/powerpc/platforms/cell/spufs/inode.c
> +++ b/arch/powerpc/platforms/cell/spufs/inode.c
> @@ -454,19 +454,16 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
> struct spu_gang *gang;
> struct spu_context *neighbor;
>
> - ret = -EPERM;
> if ((flags & SPU_CREATE_NOSCHED) &&
> - !capable(CAP_SYS_NICE))
> - goto out_unlock;
> + !capable(CAP_SYa_NICE))
^typo
> + return -EPERM;
>
> - ret = -EINVAL;
> if ((flags & (SPU_CREATE_NOSCHED | SPU_CREATE_ISOLATE))
> == SPU_CREATE_ISOLATE)
> - goto out_unlock;
> + return -EINVAL;
>
> - ret = -ENODEV;
> if ((flags & SPU_CREATE_ISOLATE) && !isolated_loader)
> - goto out_unlock;
> + return -ENODEV;
>
> gang = NULL;
> neighbor = NULL;
This mostly changes coding style, pointlessly.
> @@ -512,10 +509,6 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
> out_aff_unlock:
> if (affinity)
> mutex_unlock(&gang->aff_mutex);
> -out_unlock:
> - mutex_unlock(&inode->i_mutex);
> -out:
> - dput(dentry);
> return ret;
> }
The original intention of this was to always unlock in the error case. It
seems that Al changed this in 1ba10681 "switch do_spufs_create() to
user_path_create(), fix double-unlock" to never unlock early but always
unlock in do_spu_create, fixing a different bug, but it looks like
he forgot this one in the process.
The reason why we originally had the unlock in the leaf functions is to
avoid a problem with spu_forget(), which had to be called without
the i_mutex held to avoid deadlocks.
> @@ -600,10 +591,6 @@ static int spufs_create_gang(struct inode *inode,
> int err = simple_rmdir(inode, dentry);
> WARN_ON(err);
> }
> -
> -out:
> - mutex_unlock(&inode->i_mutex);
> - dput(dentry);
> return ret;
> }
Right, this obviously goes together with the one above,
> @@ -613,22 +600,21 @@ static struct file_system_type spufs_type;
> long spufs_create(struct path *path, struct dentry *dentry,
> unsigned int flags, umode_t mode, struct file *filp)
> {
> - int ret;
> + int ret = -EINVAL;
>
> - ret = -EINVAL;
> /* check if we are on spufs */
> if (path->dentry->d_sb->s_type != &spufs_type)
> - goto out;
> + goto fail;
>
> /* don't accept undefined flags */
> if (flags & (~SPU_CREATE_FLAG_ALL))
> - goto out;
> + goto fail;
>
> /* only threads can be underneath a gang */
> if (path->dentry != path->dentry->d_sb->s_root) {
> if ((flags & SPU_CREATE_GANG) ||
> !SPUFS_I(path->dentry->d_inode)->i_gang)
> - goto out;
> + goto fail;
> }
>
> mode &= ~current_umask();
These just change coding style, and not in a helpful way.
> @@ -640,12 +626,17 @@ long spufs_create(struct path *path, struct dentry *dentry,
> ret = spufs_create_context(path->dentry->d_inode,
> dentry, path->mnt, flags, mode,
> filp);
> - if (ret >= 0)
> + if (ret >= 0) {
> + /* We drop these before fsnotify */
> + mutex_unlock(&inode->i_mutex);
> + dput(dentry);
> fsnotify_mkdir(path->dentry->d_inode, dentry);
> - return ret;
>
> -out:
> - mutex_unlock(&path->dentry->d_inode->i_mutex);
> + return ret;
> + }
> + fail:
> + mutex_unlock(&inode->i_mutex);
> + dput(dentry);
> return ret;
> }
>
> diff --git a/arch/powerpc/platforms/cell/spufs/syscalls.c b/arch/powerpc/platforms/cell/spufs/syscalls.c
> index 8591bb6..1a65ef2 100644
> --- a/arch/powerpc/platforms/cell/spufs/syscalls.c
> +++ b/arch/powerpc/platforms/cell/spufs/syscalls.c
> @@ -70,11 +70,8 @@ static long do_spu_create(const char __user *pathname, unsigned int flags,
> ret = PTR_ERR(dentry);
> if (!IS_ERR(dentry)) {
> ret = spufs_create(&path, dentry, flags, mode, neighbor);
> - mutex_unlock(&path.dentry->d_inode->i_mutex);
> - dput(dentry);
> path_put(&path);
> }
> -
> return ret;
> }
This moves the unlock in front of the fsnotify_mkdir, where it was before Al's
change. This seems independent of the other change.
Arnd
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] spufs raises two exceptions
2012-03-07 12:48 ` Arnd Bergmann
@ 2012-03-07 21:01 ` Benjamin Herrenschmidt
2012-03-07 21:23 ` Al Viro
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-07 21:01 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: linuxppc-dev, masterzorag, Al Viro
> > diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
> > index d4a094c..63b4e43 100644
> > --- a/arch/powerpc/platforms/cell/spufs/inode.c
> > +++ b/arch/powerpc/platforms/cell/spufs/inode.c
> > @@ -454,19 +454,16 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
> > struct spu_gang *gang;
> > struct spu_context *neighbor;
> >
> > - ret = -EPERM;
> > if ((flags & SPU_CREATE_NOSCHED) &&
> > - !capable(CAP_SYS_NICE))
> > - goto out_unlock;
> > + !capable(CAP_SYa_NICE))
>
> ^typo
Odd, probably an emacs fart
> > + return -EPERM;
> >
> > - ret = -EINVAL;
> > if ((flags & (SPU_CREATE_NOSCHED | SPU_CREATE_ISOLATE))
> > == SPU_CREATE_ISOLATE)
> > - goto out_unlock;
> > + return -EINVAL;
> >
> > - ret = -ENODEV;
> > if ((flags & SPU_CREATE_ISOLATE) && !isolated_loader)
> > - goto out_unlock;
> > + return -ENODEV;
> >
> > gang = NULL;
> > neighbor = NULL;
>
> This mostly changes coding style, pointlessly.
How so ? it's no longer necessary to store ret and goto out since there
is no cleanup at all, which makes things smaller/simpler. I wouldn't
call that 'pointlessly'.
I tend to dislike the use of "goto xxxx" when the xxx: label does no
cleanup whatsoever.
> > @@ -512,10 +509,6 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
> > out_aff_unlock:
> > if (affinity)
> > mutex_unlock(&gang->aff_mutex);
> > -out_unlock:
> > - mutex_unlock(&inode->i_mutex);
> > -out:
> > - dput(dentry);
> > return ret;
> > }
>
> The original intention of this was to always unlock in the error case. It
> seems that Al changed this in 1ba10681 "switch do_spufs_create() to
> user_path_create(), fix double-unlock" to never unlock early but always
> unlock in do_spu_create, fixing a different bug, but it looks like
> he forgot this one in the process.
>
> The reason why we originally had the unlock in the leaf functions is to
> avoid a problem with spu_forget(), which had to be called without
> the i_mutex held to avoid deadlocks.
Ok. I assumed it was fnotify that was the problem...
> > @@ -600,10 +591,6 @@ static int spufs_create_gang(struct inode *inode,
> > int err = simple_rmdir(inode, dentry);
> > WARN_ON(err);
> > }
> > -
> > -out:
> > - mutex_unlock(&inode->i_mutex);
> > - dput(dentry);
> > return ret;
> > }
>
> Right, this obviously goes together with the one above,
Yes, whatever they do they should do the same thing.
> > @@ -613,22 +600,21 @@ static struct file_system_type spufs_type;
> > long spufs_create(struct path *path, struct dentry *dentry
> > unsigned int flags, umode_t mode, struct file *filp)
> > {
> > - int ret;
> > + int ret = -EINVAL;
> >
> > - ret = -EINVAL;
> > /* check if we are on spufs */
> > if (path->dentry->d_sb->s_type != &spufs_type)
> > - goto out;
> > + goto fail;
> >
> > /* don't accept undefined flags */
> > if (flags & (~SPU_CREATE_FLAG_ALL))
> > - goto out;
> > + goto fail;
> >
> > /* only threads can be underneath a gang */
> > if (path->dentry != path->dentry->d_sb->s_root) {
> > if ((flags & SPU_CREATE_GANG) ||
> > !SPUFS_I(path->dentry->d_inode)->i_gang)
> > - goto out;
> > + goto fail;
> > }
> >
> > mode &= ~current_umask();
>
> These just change coding style, and not in a helpful way.
Oh I just renamed the label, it's helpful because the label is
specifically only for the fail case, not the general exit path, hence
"fail" is a better naming than "out". It's about code clarity.
> > @@ -640,12 +626,17 @@ long spufs_create(struct path *path, struct dentry *dentry,
> > ret = spufs_create_context(path->dentry->d_inode,
> > dentry, path->mnt, flags, mode,
> > filp);
> > - if (ret >= 0)
> > + if (ret >= 0) {
> > + /* We drop these before fsnotify */
> > + mutex_unlock(&inode->i_mutex);
> > + dput(dentry);
> > fsnotify_mkdir(path->dentry->d_inode, dentry);
> > - return ret;
> >
> > -out:
> > - mutex_unlock(&path->dentry->d_inode->i_mutex);
> > + return ret;
> > + }
> > + fail:
> > + mutex_unlock(&inode->i_mutex);
> > + dput(dentry);
> > return ret;
> > }
> >
> > diff --git a/arch/powerpc/platforms/cell/spufs/syscalls.c b/arch/powerpc/platforms/cell/spufs/syscalls.c
> > index 8591bb6..1a65ef2 100644
> > --- a/arch/powerpc/platforms/cell/spufs/syscalls.c
> > +++ b/arch/powerpc/platforms/cell/spufs/syscalls.c
> > @@ -70,11 +70,8 @@ static long do_spu_create(const char __user *pathname, unsigned int flags,
> > ret = PTR_ERR(dentry);
> > if (!IS_ERR(dentry)) {
> > ret = spufs_create(&path, dentry, flags, mode, neighbor);
> > - mutex_unlock(&path.dentry->d_inode->i_mutex);
> > - dput(dentry);
> > path_put(&path);
> > }
> > -
> > return ret;
> > }
>
> This moves the unlock in front of the fsnotify_mkdir, where it was before Al's
> change. This seems independent of the other change.
No it's not, it all goes together. spufs_create_context() always
unlocked & dropped the dentry before returning, so I assumed the
lock had to be dropped before fsnotify.
Note that if the problem is that the lock has to be dropped before
spu_forget(), then we should indeed move it back into the leaf functions
and just remove all the unlock path from the top ones. It's a bit nasty
how we drop the mutex first, then do spu_forget, then drop the dentry
but we could go back to doing that.
What I want is consistent semantics. It's just silly to have 3 different
stacking levels which all 3 may or may not be responsible to dropping
the lock & dentry depending on circumstances.
In any case, what about this instead then:
diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index d4a094c..114ab14 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -646,6 +646,7 @@ long spufs_create(struct path *path, struct dentry *dentry,
out:
mutex_unlock(&path->dentry->d_inode->i_mutex);
+ dput(dentry);
return ret;
}
diff --git a/arch/powerpc/platforms/cell/spufs/syscalls.c b/arch/powerpc/platforms/cell/spufs/syscalls.c
index 8591bb6..5665dcc 100644
--- a/arch/powerpc/platforms/cell/spufs/syscalls.c
+++ b/arch/powerpc/platforms/cell/spufs/syscalls.c
@@ -70,8 +70,6 @@ static long do_spu_create(const char __user *pathname, unsigned int flags,
ret = PTR_ERR(dentry);
if (!IS_ERR(dentry)) {
ret = spufs_create(&path, dentry, flags, mode, neighbor);
- mutex_unlock(&path.dentry->d_inode->i_mutex);
- dput(dentry);
path_put(&path);
}
Cheers,
Ben.
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] spufs raises two exceptions
2012-03-07 21:01 ` Benjamin Herrenschmidt
@ 2012-03-07 21:23 ` Al Viro
2012-03-07 22:32 ` Arnd Bergmann
0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2012-03-07 21:23 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, masterzorag, Arnd Bergmann
> No it's not, it all goes together. spufs_create_context() always
> unlocked & dropped the dentry before returning, so I assumed the
> lock had to be dropped before fsnotify.
>
> Note that if the problem is that the lock has to be dropped before
> spu_forget(), then we should indeed move it back into the leaf functions
> and just remove all the unlock path from the top ones. It's a bit nasty
> how we drop the mutex first, then do spu_forget, then drop the dentry
> but we could go back to doing that.
>
> What I want is consistent semantics. It's just silly to have 3 different
> stacking levels which all 3 may or may not be responsible to dropping
> the lock & dentry depending on circumstances.
Why not leave unlock/dput to the caller? Details of deadlocks caused
by that approach, please...
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] spufs raises two exceptions
2012-03-07 21:23 ` Al Viro
@ 2012-03-07 22:32 ` Arnd Bergmann
0 siblings, 0 replies; 7+ messages in thread
From: Arnd Bergmann @ 2012-03-07 22:32 UTC (permalink / raw)
To: Al Viro; +Cc: linuxppc-dev, masterzorag
On Wednesday 07 March 2012, Al Viro wrote:
> > No it's not, it all goes together. spufs_create_context() always
> > unlocked & dropped the dentry before returning, so I assumed the
> > lock had to be dropped before fsnotify.
> >
> > Note that if the problem is that the lock has to be dropped before
> > spu_forget(), then we should indeed move it back into the leaf functions
> > and just remove all the unlock path from the top ones. It's a bit nasty
> > how we drop the mutex first, then do spu_forget, then drop the dentry
> > but we could go back to doing that.
> >
> > What I want is consistent semantics. It's just silly to have 3 different
> > stacking levels which all 3 may or may not be responsible to dropping
> > the lock & dentry depending on circumstances.
>
> Why not leave unlock/dput to the caller? Details of deadlocks caused
> by that approach, please...
If only I could remember that part exactly. I think I was remembering
0309f02d8 "spufs: fix deadlock in spu_create error path", which had a
double lock problem in this path. Please bear with me here because it's
been six years since I debugged that particular problem.
A lot of the nastyness of spu_forget() is about the problem of having
to give up a reference on current->mm that is held by an spu context,
while the mm contains a vma that maps this context and holds a refence
on the inode, in particular in case of calling exit().
However, I don't see now how either of these two issues requires that
we call spu_forget without the i_mutex held during a context creation
failure.
Arnd
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-03-07 22:38 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-06 9:26 [PATCH] spufs raises two exceptions masterzorag
2012-03-07 3:49 ` Benjamin Herrenschmidt
2012-03-07 3:51 ` Benjamin Herrenschmidt
2012-03-07 12:48 ` Arnd Bergmann
2012-03-07 21:01 ` Benjamin Herrenschmidt
2012-03-07 21:23 ` Al Viro
2012-03-07 22:32 ` Arnd Bergmann
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).