public inbox for linux-kernel@vger.kernel.org
* [RFC][PATCH] f2fs: Avoid print false deadlock messages.
@ 2013-05-15  6:58 majianpeng
  2013-05-15  7:21 ` Libo Chen
  2013-05-15  8:28 ` Peter Zijlstra
  0 siblings, 2 replies; 8+ messages in thread
From: majianpeng @ 2013-05-15  6:58 UTC (permalink / raw)
  To: Jaegeuk Kim, peterz, mingo; +Cc: linux-kernel, linux-f2fs

When mounting f2fs, the kernel prints the following messages:

[  105.533038] =============================================
[  105.533065] [ INFO: possible recursive locking detected ]
[  105.533088] 3.10.0-rc1+ #101 Not tainted
[  105.533105] ---------------------------------------------
[  105.533127] mount/5833 is trying to acquire lock:
[  105.533147]  (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[  105.533204]
[  105.533204] but task is already holding lock:
[  105.533228]  (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[  105.533278]
[  105.533278] other info that might help us debug this:
[  105.533305]  Possible unsafe locking scenario:
[  105.533305]
[  105.533329]        CPU0
[  105.533341]        ----
[  105.533353]   lock(&sbi->fs_lock[i]);
[  105.533373]   lock(&sbi->fs_lock[i]);
[  105.533394]
[  105.533394]  *** DEADLOCK ***
[  105.533394]
[  105.533419]  May be due to missing lock nesting notation
[  105.533419]
[  105.533447] 3 locks held by mount/5833:
[  105.533463]  #0:  (&type->s_umount_key#23/1){+.+.+.}, at: [<ffffffff81167b2f>] sget+0x2af/0x620
[  105.533519]  #1:  (&sbi->cp_mutex){+.+...}, at: [<ffffffffa0201745>] write_checkpoint+0x55/0xaf0 [f2fs]
[  105.533573]  #2:  (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[  105.533626]
[  105.533626] stack backtrace:
[  105.533647] CPU: 0 PID: 5833 Comm: mount Not tainted 3.10.0-rc1+ #101
[  105.533673] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015  11/09/2011
[  105.533714]  ffffffff82240690 ffff8800a6f759d0 ffffffff8168286a ffff8800a6f75ac0
[  105.533754]  ffffffff810a03e3 0000000000000002 ffff8800a1b7a3f0 0000000000000000
[  105.533794]  ffff8800a6f75b28 ffffffff812c376e ffff8800a1b7a3f0 0000000000000000
[  105.533833] Call Trace:
[  105.533848]  [<ffffffff8168286a>] dump_stack+0x19/0x1b
[  105.533872]  [<ffffffff810a03e3>] __lock_acquire+0x733/0x1ee0
[  105.533898]  [<ffffffff812c376e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  105.533926]  [<ffffffff8168c65c>] ? retint_restore_args+0xe/0xe
[  105.533951]  [<ffffffff810a2205>] lock_acquire+0x95/0x130
[  105.533977]  [<ffffffffa02017a6>] ? write_checkpoint+0xb6/0xaf0 [f2fs]
[  105.534011]  [<ffffffff816878da>] mutex_lock_nested+0x7a/0x3c0
[  105.534011]  [<ffffffffa02017a6>] ? write_checkpoint+0xb6/0xaf0 [f2fs]
[  105.534011]  [<ffffffffa02017a6>] ? write_checkpoint+0xb6/0xaf0 [f2fs]
[  105.534011]  [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[  105.534011]  [<ffffffff8109e35b>] ? mark_held_locks+0xbb/0x140
[  105.534011]  [<ffffffff81688907>] ?__mutex_unlock_slowpath+0xc7/0x170
[  105.534011]  [<ffffffffa020f43c>] recover_fsync_data+0x43c/0x4f0 [f2fs]
[  105.534011]  [<ffffffff812673bb>] ? security_d_instantiate+0x1b/0x30
[  105.534011]  [<ffffffffa0200795>] f2fs_fill_super+0x845/0x870 [f2fs]
[  105.534011]  [<ffffffff81168104>] mount_bdev+0x1b4/0x1f0
[  105.534011]  [<ffffffffa01fff50>] ? validate_superblock+0x170/0x170 [f2fs]
[  105.534011]  [<ffffffffa01fe745>] f2fs_mount+0x15/0x20 [f2fs]
[  105.534011]  [<ffffffff81168e03>] mount_fs+0x43/0x1b0
[  105.534011]  [<ffffffff8112d9d0>] ? __alloc_percpu+0x10/0x20
[  105.534011]  [<ffffffff81183be4>] vfs_kern_mount+0x74/0x120
[  105.534011]  [<ffffffff811862b9>] do_mount+0x259/0xa20
[  105.534011]  [<ffffffff8112890b>] ? strndup_user+0x5b/0x80
[  105.534011]  [<ffffffff81186b0e>] SyS_mount+0x8e/0xe0
[  105.534011]  [<ffffffff81694b02>] system_call_fastpath+0x16/0x1b

By adding some debug messages, I found that this problem is caused by
gcc's optimization of this code:
>	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>		mutex_init(&sbi->fs_lock[i]);
The definition of mutex_init is:
> #define mutex_init(mutex)					\
> do {								\
>	static struct lock_class_key __key;			\
>								\
>	__mutex_init((mutex), #mutex, &__key);			\
> } while (0)

Because of gcc's optimization, there is only one __key rather than
NR_GLOBAL_LOCKS of them, so all the mutexes end up in the same lock
class.

There is also a problem with the lock name: with the 'for ()' loop,
every lock gets the same name, '&sbi->fs_lock[i]'. If a problem with a
mutex operation occurs, we can't tell which instance is involved.

Although my patch works, I don't think it is the best solution, because
if NR_GLOBAL_LOCKS changes we may forget to update this code.

BTW, if anyone knows how to avoid the optimization, please tell me. Thanks!

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
---
 fs/f2fs/super.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8555f7d..ce08b96 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -520,7 +520,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
     struct buffer_head *raw_super_buf;
     struct inode *root;
     long err = -EINVAL;
-    int i;
 
     /* allocate memory for f2fs-specific super block info */
     sbi = kzalloc(sizeof(struct f2fs_sb_info), GFP_KERNEL);
@@ -578,8 +577,16 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
     mutex_init(&sbi->gc_mutex);
     mutex_init(&sbi->writepages);
     mutex_init(&sbi->cp_mutex);
-    for (i = 0; i < NR_GLOBAL_LOCKS; i++)
-        mutex_init(&sbi->fs_lock[i]);
+
+    mutex_init(&sbi->fs_lock[0]);
+    mutex_init(&sbi->fs_lock[1]);
+    mutex_init(&sbi->fs_lock[2]);
+    mutex_init(&sbi->fs_lock[3]);
+    mutex_init(&sbi->fs_lock[4]);
+    mutex_init(&sbi->fs_lock[5]);
+    mutex_init(&sbi->fs_lock[6]);
+    mutex_init(&sbi->fs_lock[7]);
+
     mutex_init(&sbi->node_write);
     sbi->por_doing = 0;
     spin_lock_init(&sbi->stat_lock);
-- 
1.8.3.rc1.44.gb387c77


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.
  2013-05-15  6:58 [RFC][PATCH] f2fs: Avoid print false deadlock messages majianpeng
@ 2013-05-15  7:21 ` Libo Chen
  2013-05-15  7:33   ` majianpeng
  2013-05-15  8:28 ` Peter Zijlstra
  1 sibling, 1 reply; 8+ messages in thread
From: Libo Chen @ 2013-05-15  7:21 UTC (permalink / raw)
  To: majianpeng; +Cc: Jaegeuk Kim, peterz, mingo, linux-kernel, linux-f2fs

On 2013/5/15 14:58, majianpeng wrote:
> By adding some debug messages, I found that this problem is caused by
> gcc's optimization of this code:
> >	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> >		mutex_init(&sbi->fs_lock[i]);
> The definition of mutex_init is:
> > #define mutex_init(mutex)					\
> > do {							\
> >	static struct lock_class_key __key;		\
> >							\
> >	__mutex_init((mutex), #mutex, &__key);		\
> > } while (0)
> Because of gcc's optimization, there is only one __key rather than
> NR_GLOBAL_LOCKS of them, so all the mutexes end up in the same lock
> class.
> 
> There is also a problem with the lock name: with the 'for ()' loop,
> every lock gets the same name, '&sbi->fs_lock[i]'. If a problem with a
> mutex operation occurs, we can't tell which instance is involved.
> 
> Although my patch works, I don't think it is the best solution, because
> if NR_GLOBAL_LOCKS changes we may forget to update this code.
> 
> BTW, if anyone knows how to avoid the optimization, please tell me. Thanks!
> 
> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
> ---
>  fs/f2fs/super.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 8555f7d..ce08b96 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -520,7 +520,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>      struct buffer_head *raw_super_buf;
>      struct inode *root;
>      long err = -EINVAL;
> -    int i;
>  
>      /* allocate memory for f2fs-specific super block info */
>      sbi = kzalloc(sizeof(struct f2fs_sb_info), GFP_KERNEL);
> @@ -578,8 +577,16 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>      mutex_init(&sbi->gc_mutex);
>      mutex_init(&sbi->writepages);
>      mutex_init(&sbi->cp_mutex);
> -    for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> -        mutex_init(&sbi->fs_lock[i]);
> +

 You can try barrier() or mb() to avoid the compiler optimization.



* Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.
  2013-05-15  7:21 ` Libo Chen
@ 2013-05-15  7:33   ` majianpeng
  0 siblings, 0 replies; 8+ messages in thread
From: majianpeng @ 2013-05-15  7:33 UTC (permalink / raw)
  To: Libo Chen; +Cc: Jaegeuk Kim, peterz, mingo, linux-kernel, linux-f2fs

On 05/15/2013 03:21 PM, Libo Chen wrote:
> On 2013/5/15 14:58, majianpeng wrote:
>> By adding some debug messages, I found that this problem is caused by
>> gcc's optimization of this code:
>>>	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>>>		mutex_init(&sbi->fs_lock[i]);
>> The definition of mutex_init is:
>>> #define mutex_init(mutex)					\
>>> do {							\
>>>	static struct lock_class_key __key;		\
>>>							\
>>>	__mutex_init((mutex), #mutex, &__key);		\
>>> } while (0)
>> Because of gcc's optimization, there is only one __key rather than
>> NR_GLOBAL_LOCKS of them, so all the mutexes end up in the same lock
>> class.
>>
>> There is also a problem with the lock name: with the 'for ()' loop,
>> every lock gets the same name, '&sbi->fs_lock[i]'. If a problem with a
>> mutex operation occurs, we can't tell which instance is involved.
>>
>> Although my patch works, I don't think it is the best solution, because
>> if NR_GLOBAL_LOCKS changes we may forget to update this code.
>>
>> BTW, if anyone knows how to avoid the optimization, please tell me. Thanks!
>>
>> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
>> ---
>>  fs/f2fs/super.c | 13 ++++++++++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>> index 8555f7d..ce08b96 100644
>> --- a/fs/f2fs/super.c
>> +++ b/fs/f2fs/super.c
>> @@ -520,7 +520,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>      struct buffer_head *raw_super_buf;
>>      struct inode *root;
>>      long err = -EINVAL;
>> -    int i;
>>  
>>      /* allocate memory for f2fs-specific super block info */
>>      sbi = kzalloc(sizeof(struct f2fs_sb_info), GFP_KERNEL);
>> @@ -578,8 +577,16 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>      mutex_init(&sbi->gc_mutex);
>>      mutex_init(&sbi->writepages);
>>      mutex_init(&sbi->cp_mutex);
>> -    for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>> -        mutex_init(&sbi->fs_lock[i]);
>> +
>  You can try barrier() or mb() to avoid the compiler optimization.
>
Hi,
    They don't work, but thanks very much!

Thanks!
Jianpeng Ma


* Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.
  2013-05-15  6:58 [RFC][PATCH] f2fs: Avoid print false deadlock messages majianpeng
  2013-05-15  7:21 ` Libo Chen
@ 2013-05-15  8:28 ` Peter Zijlstra
  2013-05-16  1:16   ` majianpeng
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2013-05-15  8:28 UTC (permalink / raw)
  To: majianpeng; +Cc: Jaegeuk Kim, mingo, linux-kernel, linux-f2fs

On Wed, May 15, 2013 at 02:58:53PM +0800, majianpeng wrote:
> When mounted the f2fs, kernel will print the following messages:
> 
> [  105.533038] =============================================
> [  105.533065] [ INFO: possible recursive locking detected ]
> [  105.533088] 3.10.0-rc1+ #101 Not tainted
> [  105.533105] ---------------------------------------------
> [  105.533127] mount/5833 is trying to acquire lock:
> [  105.533147]  (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
> [  105.533204]
> [  105.533204] but task is already holding lock:
> [  105.533228]  (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
> [  105.533278]
> [  105.533278] other info that might help us debug this:
> [  105.533305]  Possible unsafe locking scenario:
> [  105.533305]
> [  105.533329]        CPU0
> [  105.533341]        ----
> [  105.533353]   lock(&sbi->fs_lock[i]);
> [  105.533373]   lock(&sbi->fs_lock[i]);
> [  105.533394]
> [  105.533394]  *** DEADLOCK ***
> [  105.533394]

> By adding some debug messages, I found that this problem is caused by
> gcc's optimization of this code:
> >	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> >		mutex_init(&sbi->fs_lock[i]);
> The definition of mutex_init is:
> > #define mutex_init(mutex)					\
> > do {							\
> >	static struct lock_class_key __key;		\
> >							\
> >	__mutex_init((mutex), #mutex, &__key);		\
> > } while (0)
> 
> Because of gcc's optimization, there is only one __key rather than
> NR_GLOBAL_LOCKS of them.

It's not a gcc-specific optimization; any C compiler would do this. It's
also very much on purpose.

> There is also a problem with the lock name: with the 'for ()' loop,
> every lock gets the same name, '&sbi->fs_lock[i]'. If a problem with a
> mutex operation occurs, we can't tell which instance is involved.
> 
> Although my patch works, I don't think it is the best solution, because
> if NR_GLOBAL_LOCKS changes we may forget to update this code.
> 
> BTW, if anyone knows how to avoid the optimization, please tell me. Thanks!

There isn't. What you typically want to do is annotate the lock site.

In particular it looks like mutex_lock_all() is the offensive piece of
code (horrible function name though; the only redeeming thing being that
f2fs.h isn't likely to be included elsewhere).

One thing you can do here is modify it to look like:

static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
{
	int i;

	for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
		/*
		 * This is the only time we take multiple fs_lock[]
		 * instances; the order is immaterial since we
		 * always hold cp_mutex, which serializes multiple
		 * such operations.
		 */
		mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
	}
}

That tells the lock validator that it is ok to lock multiple instances
of the fs_lock[i] class because the lock order is guarded by cp_mutex.

While your patch also works, it has multiple downsides: it's easy to get
out of sync when you modify NR_GLOBAL_LOCKS, and it consumes more
static lockdep resources (lockdep has to allocate all its resources
from static arrays, since allocating memory also uses locks -- a recursive
problem).




* Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.
  2013-05-15  8:28 ` Peter Zijlstra
@ 2013-05-16  1:16   ` majianpeng
  2013-05-16  8:41     ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: majianpeng @ 2013-05-16  1:16 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Jaegeuk Kim, mingo, linux-kernel, linux-f2fs

On 05/15/2013 04:28 PM, Peter Zijlstra wrote:
> On Wed, May 15, 2013 at 02:58:53PM +0800, majianpeng wrote:
>> When mounted the f2fs, kernel will print the following messages:
>>
>> [  105.533038] =============================================
>> [  105.533065] [ INFO: possible recursive locking detected ]
>> [  105.533088] 3.10.0-rc1+ #101 Not tainted
>> [  105.533105] ---------------------------------------------
>> [  105.533127] mount/5833 is trying to acquire lock:
>> [  105.533147]  (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
>> [  105.533204]
>> [  105.533204] but task is already holding lock:
>> [  105.533228]  (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
>> [  105.533278]
>> [  105.533278] other info that might help us debug this:
>> [  105.533305]  Possible unsafe locking scenario:
>> [  105.533305]
>> [  105.533329]        CPU0
>> [  105.533341]        ----
>> [  105.533353]   lock(&sbi->fs_lock[i]);
>> [  105.533373]   lock(&sbi->fs_lock[i]);
>> [  105.533394]
>> [  105.533394]  *** DEADLOCK ***
>> [  105.533394]
>> By adding some debug messages, I found that this problem is caused by
>> gcc's optimization of this code:
>>>	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>>>		mutex_init(&sbi->fs_lock[i]);
>> The definition of mutex_init is:
>>> #define mutex_init(mutex)					\
>>> do {							\
>>>	static struct lock_class_key __key;		\
>>>							\
>>>	__mutex_init((mutex), #mutex, &__key);		\
>>> } while (0)
>> Because of gcc's optimization, there is only one __key rather than
>> NR_GLOBAL_LOCKS of them.
> It's not a gcc-specific optimization; any C compiler would do this. It's
> also very much on purpose.
>
>> There is also a problem with the lock name: with the 'for ()' loop,
>> every lock gets the same name, '&sbi->fs_lock[i]'. If a problem with a
>> mutex operation occurs, we can't tell which instance is involved.
>>
>> Although my patch works, I don't think it is the best solution, because
>> if NR_GLOBAL_LOCKS changes we may forget to update this code.
>>
>> BTW, if anyone knows how to avoid the optimization, please tell me. Thanks!
Thanks for your answer! Your patch looks good.
> There isn't. What you typically want to do is annotate the lock site.
> In particular it looks like mutex_lock_all() is the offensive piece of
> code (horrible function name though; the only redeeming thing being that
> f2fs.h isn't likely to be included elsewhere).
>
> One thing you can do here is modify it to look like:
>
> static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
> {
> 	int i;
>
> 	for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
> 		/*
> 		 * This is the only time we take multiple fs_lock[]
> 		 * instances; the order is immaterial since we
> 		 * always hold cp_mutex, which serializes multiple
> 		 * such operations.
> 		 */
> 		mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
> 	}
> }
>
> That tells the lock validator that it is ok to lock multiple instances
> of the fs_lock[i] class because the lock order is guarded by cp_mutex.
> While your patch also works, it has multiple downsides: it's easy to get
> out of sync when you modify NR_GLOBAL_LOCKS, and it consumes more
> static lockdep resources (lockdep has to allocate all its resources
> from static arrays, since allocating memory also uses locks -- a recursive
> problem).
>
Yes, but there is a problem if the fs_lock[] mutexes hit a deadlock: how do
we find which one, given that lock->name is the same for all of them?

Thanks!
Jianpeng Ma



* Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.
  2013-05-16  1:16   ` majianpeng
@ 2013-05-16  8:41     ` Peter Zijlstra
  2013-05-16 11:34       ` majianpeng
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2013-05-16  8:41 UTC (permalink / raw)
  To: majianpeng; +Cc: Jaegeuk Kim, mingo, linux-kernel, linux-f2fs

On Thu, May 16, 2013 at 09:16:45AM +0800, majianpeng wrote:

> > There isn't. What you typically want to do is annotate the lock site.
> > In particular it looks like mutex_lock_all() is the offensive piece of
> > code (horrible function name though; the only redeeming thing being that
> > f2fs.h isn't likely to be included elsewhere).
> >
> > One thing you can do here is modify it to look like:
> >
> > static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
> > {
> > 	int i;
> >
> > 	for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
> > 		/*
> > 		 * This is the only time we take multiple fs_lock[]
> > 		 * instances; the order is immaterial since we
> > 		 * always hold cp_mutex, which serializes multiple
> > 		 * such operations.
> > 		 */
> > 		mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
> > 	}
> > }
> >
> > That tells the lock validator that it is ok to lock multiple instances
> > of the fs_lock[i] class because the lock order is guarded by cp_mutex.
> > While your patch also works, it has multiple downsides: it's easy to get
> > out of sync when you modify NR_GLOBAL_LOCKS, and it consumes more
> > static lockdep resources (lockdep has to allocate all its resources
> > from static arrays, since allocating memory also uses locks -- a recursive
> > problem).
> >
> Yes, but there is a problem if the fs_lock[] mutexes hit a deadlock: how do
> we find which one, given that lock->name is the same for all of them?

The most useful part of the lockdep report is the call traces that tell you
where the locks were acquired; the names are secondary. That is, while they are
at times helpful in finding the right lock site, they're rarely _that_ important.

Remember, your code will very likely not have the exact number hardcoded either.
It'll be a variable. So having the number in the lockdep output will not help
you find the offending code any sooner.

Suppose there's another site that acquires two fs_lock[] locks; currently this
would generate another warning like the one this thread started with -- lockdep
doesn't look at lock instances but at classes, and since it cannot differentiate
between two locks of the same class, it reports the possible deadlock.

The way to find the offending code is to look at the "locks held" section of
the lockdep report along with the call traces.

Once you find the way in which the two locks nest, the specific numbers are
irrelevant. Your aim then is to prove your locking scheme is indeed deadlock
free and then tell lockdep about it.




* Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.
  2013-05-16  8:41     ` Peter Zijlstra
@ 2013-05-16 11:34       ` majianpeng
  2013-05-16 18:03         ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: majianpeng @ 2013-05-16 11:34 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Jaegeuk Kim, mingo, linux-kernel, linux-f2fs

On 05/16/2013 04:41 PM, Peter Zijlstra wrote:
> On Thu, May 16, 2013 at 09:16:45AM +0800, majianpeng wrote:
>
>>> There isn't. What you typically want to do is annotate the lock site.
>>> In particular it looks like mutex_lock_all() is the offensive piece of
>>> code (horrible function name though; the only redeeming thing being that
>>> f2fs.h isn't likely to be included elsewhere).
>>>
>>> One thing you can do here is modify it to look like:
>>>
>>> static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
>>> {
>>> 	int i;
>>>
>>> 	for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
>>> 		/*
>>> 		 * This is the only time we take multiple fs_lock[]
>>> 		 * instances; the order is immaterial since we
>>> 		 * always hold cp_mutex, which serializes multiple
>>> 		 * such operations.
>>> 		 */
>>> 		mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
>>> 	}
>>> }
>>>
>>> That tells the lock validator that it is ok to lock multiple instances
>>> of the fs_lock[i] class because the lock order is guarded by cp_mutex.
>>> While your patch also works, it has multiple downsides: it's easy to get
>>> out of sync when you modify NR_GLOBAL_LOCKS, and it consumes more
>>> static lockdep resources (lockdep has to allocate all its resources
>>> from static arrays, since allocating memory also uses locks -- a recursive
>>> problem).
>>>
>> Yes, but there is a problem if the fs_lock[] mutexes hit a deadlock: how do
>> we find which one, given that lock->name is the same for all of them?
> The most useful part of the lockdep report is the call traces that tell you
> where the locks were acquired; the names are secondary. That is, while they are
> at times helpful in finding the right lock site, they're rarely _that_ important.
>
> Remember, your code will very likely not have the exact number hardcoded either.
> It'll be a variable. So having the number in the lockdep output will not help
> you find the offending code any sooner.
>
> Suppose there's another site that acquires two fs_lock[] locks; currently this
> would generate another warning like the one this thread started with -- lockdep
> doesn't look at lock instances but at classes, and since it cannot differentiate
> between two locks of the same class, it reports the possible deadlock.
>
> The way to find the offending code is to look at the "locks held" section of
> the lockdep report along with the call traces.
>
> Once you find the way in which the two locks nest, the specific numbers are
> irrelevant. Your aim then is to prove your locking scheme is indeed deadlock
> free and then tell lockdep about it.
>
>
Thanks very much! I'll take some time to understand it.
Could you send a patch for this?

Thanks!
Jianpeng Ma



* Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.
  2013-05-16 11:34       ` majianpeng
@ 2013-05-16 18:03         ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2013-05-16 18:03 UTC (permalink / raw)
  To: majianpeng; +Cc: Jaegeuk Kim, mingo, linux-kernel, linux-f2fs

On Thu, May 16, 2013 at 07:34:15PM +0800, majianpeng wrote:

> Thanks very much! I'll take some time to understand it.
> Could you send a patch for this?

You mean the below or something else?

---
Subject: f2fs, lockdep: Annotate mutex_lock_all()

Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all()
acquires an array of locks (in global/local lock style).

Any such operation is always serialized using cp_mutex, therefore there is no
fs_lock[] lock-order issue; tell lockdep about this using the
mutex_lock_nest_lock() primitive.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 fs/f2fs/f2fs.h | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 20aab02..8454277 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -495,9 +495,17 @@ static inline void clear_ckpt_flags(struct f2fs_checkpoint *cp, unsigned int f)
 
 static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
 {
-	int i = 0;
-	for (; i < NR_GLOBAL_LOCKS; i++)
-		mutex_lock(&sbi->fs_lock[i]);
+	int i;
+
+	for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
+		/*
+		 * This is the only time we take multiple fs_lock[]
+		 * instances; the order is immaterial since we
+		 * always hold cp_mutex, which serializes multiple
+		 * such operations.
+		 */
+		mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
+	}
 }
 
 static inline void mutex_unlock_all(struct f2fs_sb_info *sbi)




Thread overview: 8+ messages
2013-05-15  6:58 [RFC][PATCH] f2fs: Avoid print false deadlock messages majianpeng
2013-05-15  7:21 ` Libo Chen
2013-05-15  7:33   ` majianpeng
2013-05-15  8:28 ` Peter Zijlstra
2013-05-16  1:16   ` majianpeng
2013-05-16  8:41     ` Peter Zijlstra
2013-05-16 11:34       ` majianpeng
2013-05-16 18:03         ` Peter Zijlstra
