* [Cluster-devel] A couple of DLM patches
@ 2009-10-13 14:56 Steven Whitehouse
  2009-10-13 14:56 ` [Cluster-devel] [PATCH 1/2] dlm: Send lockspace name with uevents Steven Whitehouse
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Whitehouse @ 2009-10-13 14:56 UTC (permalink / raw)
To: cluster-devel.redhat.com

Hi,

The first of these patches adds the lockspace name to all dlm uevents
so that in future it will be possible for dlm_controld to do something
similar to gfs_controld in terms of parsing the messages. I couldn't
see any other information that might be useful to know, other than the
lockspace name.

The second patch is designed to fix the annoying lockdep warnings which
we see when mounting gfs2 filesystems due to the rwsem in the dlm.

Steve.
* [Cluster-devel] [PATCH 1/2] dlm: Send lockspace name with uevents
  2009-10-13 14:56 [Cluster-devel] A couple of DLM patches Steven Whitehouse
@ 2009-10-13 14:56 ` Steven Whitehouse
  2009-10-13 14:53   ` [Cluster-devel] " David Teigland
  2009-10-13 14:56   ` [Cluster-devel] [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy Steven Whitehouse
  0 siblings, 2 replies; 14+ messages in thread
From: Steven Whitehouse @ 2009-10-13 14:56 UTC (permalink / raw)
To: cluster-devel.redhat.com

Although it is possible to get this information from the path,
it's much easier to provide the lockspace as a separate env
variable.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/dlm/lockspace.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index d489fcc..8dde538 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -191,6 +191,18 @@ static int do_uevent(struct dlm_ls *ls, int in)
 	return error;
 }

+static int dlm_uevent(struct kset *kset, struct kobject *kobj,
+		      struct kobj_uevent_env *env)
+{
+	struct dlm_ls *ls = container_of(kobj, struct dlm_ls, ls_kobj);
+
+	add_uevent_var(env, "LOCKSPACE=%s", ls->ls_name);
+	return 0;
+}
+
+static struct kset_uevent_ops dlm_uevent_ops = {
+	.uevent = dlm_uevent,
+};

 int __init dlm_lockspace_init(void)
 {
@@ -199,7 +211,7 @@ int __init dlm_lockspace_init(void)
 	INIT_LIST_HEAD(&lslist);
 	spin_lock_init(&lslist_lock);

-	dlm_kset = kset_create_and_add("dlm", NULL, kernel_kobj);
+	dlm_kset = kset_create_and_add("dlm", &dlm_uevent_ops, kernel_kobj);
 	if (!dlm_kset) {
 		printk(KERN_WARNING "%s: can not create kset\n", __func__);
 		return -ENOMEM;
--
1.6.2.5
* [Cluster-devel] Re: [PATCH 1/2] dlm: Send lockspace name with uevents
  2009-10-13 14:56 ` [Cluster-devel] [PATCH 1/2] dlm: Send lockspace name with uevents Steven Whitehouse
@ 2009-10-13 14:53   ` David Teigland
  2009-10-13 15:23     ` David Teigland
  2009-10-13 14:56   ` [Cluster-devel] [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy Steven Whitehouse
  1 sibling, 1 reply; 14+ messages in thread
From: David Teigland @ 2009-10-13 14:53 UTC (permalink / raw)
To: cluster-devel.redhat.com

On Tue, Oct 13, 2009 at 03:56:15PM +0100, Steven Whitehouse wrote:
> Although it is possible to get this information from the path,
> it's much easier to provide the lockspace as a separate env
> variable.

I don't mind this, but it's more or less a style issue. I'm going to try to
keep upstream and rhel as close as possible for a while to minimize the
backporting work in the early stages when it's usually heaviest. That means
putting off things like this until later.

> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
> ---
>  fs/dlm/lockspace.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
>
> diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
> index d489fcc..8dde538 100644
> --- a/fs/dlm/lockspace.c
> +++ b/fs/dlm/lockspace.c
> @@ -191,6 +191,18 @@ static int do_uevent(struct dlm_ls *ls, int in)
>  	return error;
>  }
>
> +static int dlm_uevent(struct kset *kset, struct kobject *kobj,
> +		      struct kobj_uevent_env *env)
> +{
> +	struct dlm_ls *ls = container_of(kobj, struct dlm_ls, ls_kobj);
> +
> +	add_uevent_var(env, "LOCKSPACE=%s", ls->ls_name);
> +	return 0;
> +}
> +
> +static struct kset_uevent_ops dlm_uevent_ops = {
> +	.uevent = dlm_uevent,
> +};
>
>  int __init dlm_lockspace_init(void)
>  {
> @@ -199,7 +211,7 @@ int __init dlm_lockspace_init(void)
>  	INIT_LIST_HEAD(&lslist);
>  	spin_lock_init(&lslist_lock);
>
> -	dlm_kset = kset_create_and_add("dlm", NULL, kernel_kobj);
> +	dlm_kset = kset_create_and_add("dlm", &dlm_uevent_ops, kernel_kobj);
>  	if (!dlm_kset) {
>  		printk(KERN_WARNING "%s: can not create kset\n", __func__);
>  		return -ENOMEM;
> --
> 1.6.2.5
* [Cluster-devel] Re: [PATCH 1/2] dlm: Send lockspace name with uevents
  2009-10-13 14:53 ` [Cluster-devel] " David Teigland
@ 2009-10-13 15:23   ` David Teigland
  2009-10-13 15:43     ` Steven Whitehouse
  0 siblings, 1 reply; 14+ messages in thread
From: David Teigland @ 2009-10-13 15:23 UTC (permalink / raw)
To: cluster-devel.redhat.com

On Tue, Oct 13, 2009 at 09:53:37AM -0500, David Teigland wrote:
> On Tue, Oct 13, 2009 at 03:56:15PM +0100, Steven Whitehouse wrote:
> > Although it is possible to get this information from the path,
> > it's much easier to provide the lockspace as a separate env
> > variable.
>
> I don't mind this, but it's more or less a style issue. I'm going to try to
> keep upstream and rhel as close as possible for a while to minimize the
> backporting work in the early stages when it's usually heaviest. That means
> putting off things like this until later.

Also, libudev may make this less relevant:
http://www.kernel.org/pub/linux/utils/kernel/hotplug/libudev/

Dave
* [Cluster-devel] Re: [PATCH 1/2] dlm: Send lockspace name with uevents
  2009-10-13 15:23 ` David Teigland
@ 2009-10-13 15:43   ` Steven Whitehouse
  0 siblings, 0 replies; 14+ messages in thread
From: Steven Whitehouse @ 2009-10-13 15:43 UTC (permalink / raw)
To: cluster-devel.redhat.com

Hi,

On Tue, 2009-10-13 at 10:23 -0500, David Teigland wrote:
> On Tue, Oct 13, 2009 at 09:53:37AM -0500, David Teigland wrote:
> > On Tue, Oct 13, 2009 at 03:56:15PM +0100, Steven Whitehouse wrote:
> > > Although it is possible to get this information from the path,
> > > it's much easier to provide the lockspace as a separate env
> > > variable.
> >
> > I don't mind this, but it's more or less a style issue. I'm going to try to
> > keep upstream and rhel as close as possible for a while to minimize the
> > backporting work in the early stages when it's usually heaviest. That means
> > putting off things like this until later.
>
> Also, libudev may make this less relevant:
> http://www.kernel.org/pub/linux/utils/kernel/hotplug/libudev/
>
> Dave
>
I don't think so. The udev code is written on the basis that it looks
at the individual env variables, so I think it's likely to be more
useful in the future,

Steve.
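[For reference, a minimal sketch of how a userspace daemon could consume
the proposed variable through libudev, in the spirit of the discussion
above. It assumes dlm uevents arrive with SUBSYSTEM=dlm (from the kset
created in patch 1/2); the program structure and error handling are
illustrative only and not taken from dlm_controld.]

/* Listen for dlm uevents and print the LOCKSPACE variable. */
#include <poll.h>
#include <stdio.h>
#include <libudev.h>

int main(void)
{
	struct udev *udev = udev_new();
	struct udev_monitor *mon;
	struct pollfd pfd;

	if (!udev)
		return 1;

	/* "kernel" listens on the raw netlink socket, before udev rules run */
	mon = udev_monitor_new_from_netlink(udev, "kernel");
	udev_monitor_filter_add_match_subsystem_devtype(mon, "dlm", NULL);
	udev_monitor_enable_receiving(mon);

	pfd.fd = udev_monitor_get_fd(mon);
	pfd.events = POLLIN;

	for (;;) {
		struct udev_device *dev;

		if (poll(&pfd, 1, -1) <= 0)
			continue;
		dev = udev_monitor_receive_device(mon);
		if (!dev)
			continue;
		/* No path parsing needed: the name is in the environment */
		printf("action=%s lockspace=%s\n",
		       udev_device_get_action(dev),
		       udev_device_get_property_value(dev, "LOCKSPACE"));
		udev_device_unref(dev);
	}
	return 0;
}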
* [Cluster-devel] [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
  2009-10-13 14:56 ` [Cluster-devel] [PATCH 1/2] dlm: Send lockspace name with uevents Steven Whitehouse
  2009-10-13 14:53   ` [Cluster-devel] " David Teigland
@ 2009-10-13 14:56 ` Steven Whitehouse
  2009-11-12 13:27   ` [Cluster-devel] " Steven Whitehouse
       [not found]   ` <20091112142211.GA468@elte.hu>
  1 sibling, 2 replies; 14+ messages in thread
From: Steven Whitehouse @ 2009-10-13 14:56 UTC (permalink / raw)
To: cluster-devel.redhat.com

I looked at possibly changing this to use completions, but
it seems that the usage here is not easily adapted to that.
This patch adds suitable annotation to the write side of
the ls_in_recovery semaphore so that we don't get nasty
messages from lockdep when mounting a gfs2 filesystem.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
 fs/dlm/lockspace.c    |    2 +-
 fs/dlm/member.c       |    2 +-
 fs/dlm/recoverd.c     |    2 +-
 include/linux/rwsem.h |    4 ++++
 kernel/rwsem.c        |   16 ++++++++++++++++
 5 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 8dde538..fa0cc22 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -551,7 +551,7 @@ static int new_lockspace(const char *name, int namelen, void **lockspace,
 	INIT_LIST_HEAD(&ls->ls_root_list);
 	init_rwsem(&ls->ls_root_sem);

-	down_write(&ls->ls_in_recovery);
+	down_write_non_owner(&ls->ls_in_recovery);

 	spin_lock(&lslist_lock);
 	ls->ls_create_count = 1;
diff --git a/fs/dlm/member.c b/fs/dlm/member.c
index b128775..99bd086 100644
--- a/fs/dlm/member.c
+++ b/fs/dlm/member.c
@@ -318,7 +318,7 @@ int dlm_ls_stop(struct dlm_ls *ls)
 	 */

 	if (new)
-		down_write(&ls->ls_in_recovery);
+		down_write_non_owner(&ls->ls_in_recovery);

 	/*
 	 * The recoverd suspend/resume makes sure that dlm_recoverd (if
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index fd677c8..376479a 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -40,7 +40,7 @@ static int enable_locking(struct dlm_ls *ls, uint64_t seq)
 	if (ls->ls_recover_seq == seq) {
 		set_bit(LSFL_RUNNING, &ls->ls_flags);
 		/* unblocks processes waiting to enter the dlm */
-		up_write(&ls->ls_in_recovery);
+		up_write_non_owner(&ls->ls_in_recovery);
 		error = 0;
 	}
 	spin_unlock(&ls->ls_recover_lock);
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index efd348f..34643df 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -80,12 +80,16 @@ extern void down_write_nested(struct rw_semaphore *sem, int subclass);
  *   proper abstraction for this case is completions. ]
  */
 extern void down_read_non_owner(struct rw_semaphore *sem);
+extern void down_write_non_owner(struct rw_semaphore *sem);
 extern void up_read_non_owner(struct rw_semaphore *sem);
+extern void up_write_non_owner(struct rw_semaphore *sem);
 #else
 # define down_read_nested(sem, subclass)	down_read(sem)
 # define down_write_nested(sem, subclass)	down_write(sem)
 # define down_read_non_owner(sem)		down_read(sem)
+# define down_write_non_owner(sem)		down_write(sem)
 # define up_read_non_owner(sem)			up_read(sem)
+# define up_write_non_owner(sem)		up_write(sem)
 #endif

 #endif /* _LINUX_RWSEM_H */
diff --git a/kernel/rwsem.c b/kernel/rwsem.c
index cae050b..2c57eef 100644
--- a/kernel/rwsem.c
+++ b/kernel/rwsem.c
@@ -126,6 +126,15 @@ void down_read_non_owner(struct rw_semaphore *sem)

 EXPORT_SYMBOL(down_read_non_owner);

+void down_write_non_owner(struct rw_semaphore *sem)
+{
+	might_sleep();
+
+	__down_write(sem);
+}
+
+EXPORT_SYMBOL(down_write_non_owner);
+
 void down_write_nested(struct rw_semaphore *sem, int subclass)
 {
 	might_sleep();
@@ -143,6 +152,13 @@ void up_read_non_owner(struct rw_semaphore *sem)

 EXPORT_SYMBOL(up_read_non_owner);

+void up_write_non_owner(struct rw_semaphore *sem)
+{
+	__up_write(sem);
+}
+
+EXPORT_SYMBOL(up_write_non_owner);
+
 #endif
--
1.6.2.5
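[As an aside, the intended usage pattern for these helpers - acquire on
one thread, release on another - can be sketched as follows. The wrapper
names here are hypothetical; the real call sites are exactly the three
changed in the diff above.]

/* Thread A - e.g. mount via dlm_new_lockspace(), or dlm_ls_stop() -
 * closes the gate, knowing a *different* task will open it again: */
static void block_lockspace(struct dlm_ls *ls)
{
	/* skip lockdep ownership tracking for this acquisition */
	down_write_non_owner(&ls->ls_in_recovery);
}

/* Thread B - the dlm_recoverd kthread, in enable_locking() - opens it: */
static void unblock_lockspace(struct dlm_ls *ls)
{
	/* matching release from a different task */
	up_write_non_owner(&ls->ls_in_recovery);
}

/* Normal lock requests take the read side for the duration of their
 * work, so they queue behind any recovery in progress: */
static void lock_request(struct dlm_ls *ls)
{
	down_read(&ls->ls_in_recovery);
	/* ... process the request ... */
	up_read(&ls->ls_in_recovery);
}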
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
  2009-10-13 14:56 ` [Cluster-devel] [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy Steven Whitehouse
@ 2009-11-12 13:27   ` Steven Whitehouse
       [not found]   ` <20091112142211.GA468@elte.hu>
  1 sibling, 0 replies; 14+ messages in thread
From: Steven Whitehouse @ 2009-11-12 13:27 UTC (permalink / raw)
To: cluster-devel.redhat.com

Hi,

Is there an issue with this patch? It doesn't seem to have appeared in
your dlm git tree yet,

Steve.

On Tue, 2009-10-13 at 15:56 +0100, Steven Whitehouse wrote:
> I looked at possibly changing this to use completions, but
> it seems that the usage here is not easily adapted to that.
> This patch adds suitable annotation to the write side of
> the ls_in_recovery semaphore so that we don't get nasty
> messages from lockdep when mounting a gfs2 filesystem.
>
> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> ---
>  fs/dlm/lockspace.c    |    2 +-
>  fs/dlm/member.c       |    2 +-
>  fs/dlm/recoverd.c     |    2 +-
>  include/linux/rwsem.h |    4 ++++
>  kernel/rwsem.c        |   16 ++++++++++++++++
>  5 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
> index 8dde538..fa0cc22 100644
> --- a/fs/dlm/lockspace.c
> +++ b/fs/dlm/lockspace.c
> @@ -551,7 +551,7 @@ static int new_lockspace(const char *name, int namelen, void **lockspace,
>  	INIT_LIST_HEAD(&ls->ls_root_list);
>  	init_rwsem(&ls->ls_root_sem);
>
> -	down_write(&ls->ls_in_recovery);
> +	down_write_non_owner(&ls->ls_in_recovery);
>
>  	spin_lock(&lslist_lock);
>  	ls->ls_create_count = 1;
> diff --git a/fs/dlm/member.c b/fs/dlm/member.c
> index b128775..99bd086 100644
> --- a/fs/dlm/member.c
> +++ b/fs/dlm/member.c
> @@ -318,7 +318,7 @@ int dlm_ls_stop(struct dlm_ls *ls)
>  	 */
>
>  	if (new)
> -		down_write(&ls->ls_in_recovery);
> +		down_write_non_owner(&ls->ls_in_recovery);
>
>  	/*
>  	 * The recoverd suspend/resume makes sure that dlm_recoverd (if
> diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
> index fd677c8..376479a 100644
> --- a/fs/dlm/recoverd.c
> +++ b/fs/dlm/recoverd.c
> @@ -40,7 +40,7 @@ static int enable_locking(struct dlm_ls *ls, uint64_t seq)
>  	if (ls->ls_recover_seq == seq) {
>  		set_bit(LSFL_RUNNING, &ls->ls_flags);
>  		/* unblocks processes waiting to enter the dlm */
> -		up_write(&ls->ls_in_recovery);
> +		up_write_non_owner(&ls->ls_in_recovery);
>  		error = 0;
>  	}
>  	spin_unlock(&ls->ls_recover_lock);
> diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
> index efd348f..34643df 100644
> --- a/include/linux/rwsem.h
> +++ b/include/linux/rwsem.h
> @@ -80,12 +80,16 @@ extern void down_write_nested(struct rw_semaphore *sem, int subclass);
>   *   proper abstraction for this case is completions. ]
>   */
>  extern void down_read_non_owner(struct rw_semaphore *sem);
> +extern void down_write_non_owner(struct rw_semaphore *sem);
>  extern void up_read_non_owner(struct rw_semaphore *sem);
> +extern void up_write_non_owner(struct rw_semaphore *sem);
>  #else
>  # define down_read_nested(sem, subclass)	down_read(sem)
>  # define down_write_nested(sem, subclass)	down_write(sem)
>  # define down_read_non_owner(sem)		down_read(sem)
> +# define down_write_non_owner(sem)		down_write(sem)
>  # define up_read_non_owner(sem)			up_read(sem)
> +# define up_write_non_owner(sem)		up_write(sem)
>  #endif
>
>  #endif /* _LINUX_RWSEM_H */
> diff --git a/kernel/rwsem.c b/kernel/rwsem.c
> index cae050b..2c57eef 100644
> --- a/kernel/rwsem.c
> +++ b/kernel/rwsem.c
> @@ -126,6 +126,15 @@ void down_read_non_owner(struct rw_semaphore *sem)
>
>  EXPORT_SYMBOL(down_read_non_owner);
>
> +void down_write_non_owner(struct rw_semaphore *sem)
> +{
> +	might_sleep();
> +
> +	__down_write(sem);
> +}
> +
> +EXPORT_SYMBOL(down_write_non_owner);
> +
>  void down_write_nested(struct rw_semaphore *sem, int subclass)
>  {
>  	might_sleep();
> @@ -143,6 +152,13 @@ void up_read_non_owner(struct rw_semaphore *sem)
>
>  EXPORT_SYMBOL(up_read_non_owner);
>
> +void up_write_non_owner(struct rw_semaphore *sem)
> +{
> +	__up_write(sem);
> +}
> +
> +EXPORT_SYMBOL(up_write_non_owner);
> +
>  #endif
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
       [not found] ` <20091112142211.GA468@elte.hu>
@ 2009-11-12 14:29   ` Steven Whitehouse
  2009-11-12 17:14     ` David Teigland
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Whitehouse @ 2009-11-12 14:29 UTC (permalink / raw)
To: cluster-devel.redhat.com

Hi,

On Thu, 2009-11-12 at 15:22 +0100, Ingo Molnar wrote:
> * Steven Whitehouse <swhiteho@redhat.com> wrote:
>
> > I looked at possibly changing this to use completions, but
> > it seems that the usage here is not easily adapted to that.
> > This patch adds suitable annotation to the write side of
> > the ls_in_recovery semaphore so that we don't get nasty
> > messages from lockdep when mounting a gfs2 filesystem.
>
> What do those 'nasty messages' say? If they expose some bug and this
> patch works around that bug by hiding it then NAK ...
>
> 	Ingo
>
The nasty messages are moaning that the lock is being taken in one
thread and unlocked in another. I couldn't see any bugs in the code when
I looked at it. Below are the messages that I get - to reproduce, just
mount a GFS2 filesystem with the dlm lock manager. It happens on every
mount,

Steve.

Nov 12 15:10:01 chywoon kernel: =============================================
Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking detected ]
Nov 12 15:10:01 chywoon kernel: 2.6.32-rc6 #46
Nov 12 15:10:01 chywoon kernel: ---------------------------------------------
Nov 12 15:10:01 chywoon kernel: mount.gfs2/2996 is trying to acquire lock:
Nov 12 15:10:01 chywoon kernel: (&ls->ls_in_recovery){+++++.}, at: [<ffffffffa01cf3b5>] dlm_lock+0x55/0x1b0 [dlm]
Nov 12 15:10:01 chywoon kernel:
Nov 12 15:10:01 chywoon kernel: but task is already holding lock:
Nov 12 15:10:01 chywoon kernel: (&ls->ls_in_recovery){+++++.}, at: [<ffffffffa01d1e1c>] dlm_new_lockspace+0x96c/0xb80 [dlm]
Nov 12 15:10:01 chywoon kernel:
Nov 12 15:10:01 chywoon kernel: other info that might help us debug this:
Nov 12 15:10:01 chywoon kernel: 2 locks held by mount.gfs2/2996:
Nov 12 15:10:01 chywoon kernel: #0:  (&type->s_umount_key#37/1){+.+.+.}, at: [<ffffffff81168698>] sget+0x258/0x4c0
Nov 12 15:10:01 chywoon kernel: #1:  (&ls->ls_in_recovery){+++++.}, at: [<ffffffffa01d1e1c>] dlm_new_lockspace+0x96c/0xb80 [dlm]
Nov 12 15:10:01 chywoon kernel:
Nov 12 15:10:01 chywoon kernel: stack backtrace:
Nov 12 15:10:01 chywoon kernel: Pid: 2996, comm: mount.gfs2 Not tainted 2.6.32-rc6 #46
Nov 12 15:10:01 chywoon kernel: Call Trace:
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c53e4>] validate_chain+0xdd4/0x1170
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c2a85>] ? mark_lock+0x2e5/0x670
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c5cb2>] __lock_acquire+0x532/0xbc0
Nov 12 15:10:01 chywoon kernel: [<ffffffff81043e2c>] ? native_sched_clock+0x2c/0x80
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c6404>] lock_acquire+0xc4/0x1b0
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01cf3b5>] ? dlm_lock+0x55/0x1b0 [dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffff81627317>] down_read+0x47/0x80
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01cf3b5>] ? dlm_lock+0x55/0x1b0 [dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01d0af6>] ? dlm_find_lockspace_local+0x46/0x60 [dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01cf3b5>] dlm_lock+0x55/0x1b0 [dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffff81043df3>] ? sched_clock+0x13/0x20
Nov 12 15:10:01 chywoon kernel: [<ffffffff810b4e1d>] ? sched_clock_cpu+0xed/0x140
Nov 12 15:10:01 chywoon kernel: [<ffffffff810bf6fe>] ? put_lock_stats+0xe/0x30
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c10e1>] ? lock_release_holdtime+0xb1/0x160
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0224a75>] gdlm_lock+0xe5/0x120 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0224b70>] ? gdlm_ast+0x0/0xd0 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0224ab0>] ? gdlm_bast+0x0/0x50 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa02081d3>] do_xmote+0xf3/0x2c0 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0208a21>] run_queue+0x1a1/0x280 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0208cb4>] gfs2_glock_nq+0x134/0x3b0 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa02099a9>] gfs2_glock_nq_num+0x69/0x90 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0215a25>] init_locking+0x45/0x190 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0216440>] gfs2_get_sb+0x8d0/0xc10 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa02099a1>] ? gfs2_glock_nq_num+0x61/0x90 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffff8115f9eb>] ? __alloc_percpu+0xb/0x10
Nov 12 15:10:01 chywoon kernel: [<ffffffff81167838>] vfs_kern_mount+0x58/0xf0
Nov 12 15:10:01 chywoon kernel: [<ffffffff8116793d>] do_kern_mount+0x4d/0x130
Nov 12 15:10:01 chywoon kernel: [<ffffffff81629062>] ? lock_kernel+0x42/0x109
Nov 12 15:10:01 chywoon kernel: [<ffffffff81186854>] do_mount+0x2c4/0x820
Nov 12 15:10:01 chywoon kernel: [<ffffffff81186e3b>] sys_mount+0x8b/0xe0
Nov 12 15:10:01 chywoon kernel: [<ffffffff8103d0c2>] system_call_fastpath+0x16/0x1b
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
  2009-11-12 14:29 ` Steven Whitehouse
@ 2009-11-12 17:14   ` David Teigland
  2009-11-12 17:24     ` Steven Whitehouse
       [not found]     ` <1258044339.4039.685.camel@laptop>
  0 siblings, 2 replies; 14+ messages in thread
From: David Teigland @ 2009-11-12 17:14 UTC (permalink / raw)
To: cluster-devel.redhat.com

On Thu, Nov 12, 2009 at 02:29:18PM +0000, Steven Whitehouse wrote:
> Hi,
>
> On Thu, 2009-11-12 at 15:22 +0100, Ingo Molnar wrote:
> > * Steven Whitehouse <swhiteho@redhat.com> wrote:
> >
> > > I looked at possibly changing this to use completions, but
> > > it seems that the usage here is not easily adapted to that.
> > > This patch adds suitable annotation to the write side of
> > > the ls_in_recovery semaphore so that we don't get nasty
> > > messages from lockdep when mounting a gfs2 filesystem.
> >
> > What do those 'nasty messages' say? If they expose some bug and this
> > patch works around that bug by hiding it then NAK ...
> >
> > 	Ingo
> >
> The nasty messages are moaning that the lock is being taken in one
> thread and unlocked in another. I couldn't see any bugs in the code when
> I looked at it. Below are the messages that I get - to reproduce, just
> mount a GFS2 filesystem with the dlm lock manager. It happens on every
> mount,
>
> Steve.
>
> Nov 12 15:10:01 chywoon kernel: =============================================
> Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
> detected ]

That recursive locking trace is something different. up_write_non_owner()
addresses this trace, which, as you say, is from doing the down and up from
different threads (which is the intention):

Nov 11 16:50:08 bull-02 kernel: GFS2: fsid=bull:foo.1: Joined cluster. Now mounting FS...
Nov 11 16:50:09 bull-02 kernel:
Nov 11 16:50:09 bull-02 kernel: =====================================
Nov 11 16:50:09 bull-02 kernel: [ BUG: bad unlock balance detected! ]
Nov 11 16:50:09 bull-02 kernel: -------------------------------------
Nov 11 16:50:09 bull-02 kernel: dlm_recoverd/8958 is trying to release lock (&ls->ls_in_recovery) at:
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: but there are no more locks to release!
Nov 11 16:50:09 bull-02 kernel:
Nov 11 16:50:09 bull-02 kernel: other info that might help us debug this:
Nov 11 16:50:09 bull-02 kernel: 3 locks held by dlm_recoverd/8958:
Nov 11 16:50:09 bull-02 kernel: #0:  (&ls->ls_recoverd_active){......}, at: [<ffffffffa02e0d74>] dlm_recoverd+0xf0/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: #1:  (&ls->ls_recv_active){......}, at: [<ffffffffa02e0f81>] dlm_recoverd+0x2fd/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: #2:  (&ls->ls_recover_lock){......}, at: [<ffffffffa02e0f89>] dlm_recoverd+0x305/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel:
Nov 11 16:50:09 bull-02 kernel: stack backtrace:
Nov 11 16:50:09 bull-02 kernel: Pid: 8958, comm: dlm_recoverd Not tainted 2.6.32-rc5 #2
Nov 11 16:50:09 bull-02 kernel: Call Trace:
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] ? dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8106949b>] print_unlock_inbalance_bug+0xd6/0xe0
Nov 11 16:50:09 bull-02 kernel: [<ffffffff81069563>] lock_release_non_nested+0xbe/0x259
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] ? dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] ? dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8106b8ae>] lock_release+0x14a/0x16c
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105fb71>] up_write+0x1e/0x2d
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0c84>] ? dlm_recoverd+0x0/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105cad0>] kthread+0x7d/0x85
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8100ca1a>] child_rip+0xa/0x20
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105ca32>] ? kthreadd+0xc2/0xe3
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105ca53>] ? kthread+0x0/0x85
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8100ca10>] ? child_rip+0x0/0x20
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
  2009-11-12 17:14 ` David Teigland
@ 2009-11-12 17:24   ` Steven Whitehouse
  2009-11-12 18:34     ` David Teigland
       [not found]     ` <1258044339.4039.685.camel@laptop>
  1 sibling, 1 reply; 14+ messages in thread
From: Steven Whitehouse @ 2009-11-12 17:24 UTC (permalink / raw)
To: cluster-devel.redhat.com

Hi,

On Thu, 2009-11-12 at 11:14 -0600, David Teigland wrote:
> On Thu, Nov 12, 2009 at 02:29:18PM +0000, Steven Whitehouse wrote:
> > Hi,
> >
> > On Thu, 2009-11-12 at 15:22 +0100, Ingo Molnar wrote:
> > > * Steven Whitehouse <swhiteho@redhat.com> wrote:
> > >
> > > > I looked at possibly changing this to use completions, but
> > > > it seems that the usage here is not easily adapted to that.
> > > > This patch adds suitable annotation to the write side of
> > > > the ls_in_recovery semaphore so that we don't get nasty
> > > > messages from lockdep when mounting a gfs2 filesystem.
> > >
> > > What do those 'nasty messages' say? If they expose some bug and this
> > > patch works around that bug by hiding it then NAK ...
> > >
> > > 	Ingo
> > >
> > The nasty messages are moaning that the lock is being taken in one
> > thread and unlocked in another. I couldn't see any bugs in the code when
> > I looked at it. Below are the messages that I get - to reproduce, just
> > mount a GFS2 filesystem with the dlm lock manager. It happens on every
> > mount,
> >
> > Steve.
> >
> > Nov 12 15:10:01 chywoon kernel: =============================================
> > Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
> > detected ]
>
> That recursive locking trace is something different. up_write_non_owner()
> addresses this trace, which, as you say, is from doing the down and up from
> different threads (which is the intention):
>
I don't think it is different; the traces differ due to the order in
which dlm_recoverd and mount.gfs2 run,

Steve.
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
  2009-11-12 17:24 ` Steven Whitehouse
@ 2009-11-12 18:34   ` David Teigland
  2009-11-13 10:21     ` Steven Whitehouse
  0 siblings, 1 reply; 14+ messages in thread
From: David Teigland @ 2009-11-12 18:34 UTC (permalink / raw)
To: cluster-devel.redhat.com

On Thu, Nov 12, 2009 at 05:24:12PM +0000, Steven Whitehouse wrote:
> > > Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
> > > detected ]
> >
> > That recursive locking trace is something different. up_write_non_owner()
> > addresses this trace, which, as you say, is from doing the down and up from
> > different threads (which is the intention):
> >
> I don't think it is different; the traces differ due to the order in
> which dlm_recoverd and mount.gfs2 run,

I explained the "recursive locking" warning back in Sep:

> I've not looked at how to remove this "recursive" message. What
> happens is that mount calls dlm_new_lockspace() which returns with
> in_recovery locked. mount then makes a lock request which blocks on
> in_recovery (as expected) until the dlm_recoverd thread completes
> recovery and releases the in_recovery lock (triggering the unlock
> balance) to allow locking activity.

It doesn't appear to me that up_write_non_owner() would suppress that.

Dave
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
  2009-11-12 18:34 ` David Teigland
@ 2009-11-13 10:21   ` Steven Whitehouse
  0 siblings, 0 replies; 14+ messages in thread
From: Steven Whitehouse @ 2009-11-13 10:21 UTC (permalink / raw)
To: cluster-devel.redhat.com

Hi,

On Thu, 2009-11-12 at 12:34 -0600, David Teigland wrote:
> On Thu, Nov 12, 2009 at 05:24:12PM +0000, Steven Whitehouse wrote:
> > > > Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
> > > > detected ]
> > >
> > > That recursive locking trace is something different. up_write_non_owner()
> > > addresses this trace, which, as you say, is from doing the down and up from
> > > different threads (which is the intention):
> > >
> > I don't think it is different; the traces differ due to the order in
> > which dlm_recoverd and mount.gfs2 run,
>
> I explained the "recursive locking" warning back in Sep:
>
> > I've not looked at how to remove this "recursive" message. What
> > happens is that mount calls dlm_new_lockspace() which returns with
> > in_recovery locked. mount then makes a lock request which blocks on
> > in_recovery (as expected) until the dlm_recoverd thread completes
> > recovery and releases the in_recovery lock (triggering the unlock
> > balance) to allow locking activity.
>
> It doesn't appear to me that up_write_non_owner() would suppress that.
>
> Dave
>
It is simply the order in which the threads run that determines which
message you get at mount time. There are two possible scenarios:

Scenario 1:

1. mount.gfs2 calls (via the mount syscall and gfs2) dlm_new_lockspace(),
   which takes the ls_in_recovery rwsem with a down_write().

2. mount.gfs2 goes on to try and take out a lock on the filesystem, and
   calls dlm_lock(), which tries to do a down_read() on the rwsem. Since
   this is from the same thread as the down_write(), you get the
   recursive locking message reported in the dmesg which I attached to
   my earlier email.

In the second scenario, dlm_recoverd runs between steps 1 and 2 above.
This results in the trace which you reported, since ls_in_recovery has
then been unlocked from a different thread, which creates the unlock
balance trace which you posted.

In both cases the cause is the same; it's just the running order of the
threads which results in it being reported in a different way. The patch
should fix both of these reports, since it annotates both the down and
the up of the write side of the rwsem,

Steve.
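[Scenario 1 can be illustrated with a tiny userspace program - a sketch
assuming plain POSIX rwlock semantics. Note that in the dlm the blocked
down_read() is eventually released by dlm_recoverd from another task, so
the kernel report is a false positive rather than a real deadlock.]

#include <pthread.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
	pthread_rwlock_t gate = PTHREAD_RWLOCK_INITIALIZER;
	int err;

	/* step 1: the mount path leaves the gate write-locked
	 * (the analogue of dlm_new_lockspace()) */
	pthread_rwlock_wrlock(&gate);

	/* step 2: the same thread now wants the read side (the analogue
	 * of dlm_lock()); a blocking rdlock would hang here under pure
	 * owner-release semantics, which is what lockdep's "possible
	 * recursive locking" report warns about */
	err = pthread_rwlock_tryrdlock(&gate);
	printf("tryrdlock from the writing thread: %s\n",
	       (err == EBUSY || err == EDEADLK) ?
	       "refused (would block)" : "acquired!?");

	pthread_rwlock_unlock(&gate);
	return 0;
}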
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
       [not found] ` <1258044339.4039.685.camel@laptop>
@ 2009-11-12 17:27   ` Steven Whitehouse
  2009-11-12 18:21     ` David Teigland
  1 sibling, 0 replies; 14+ messages in thread
From: Steven Whitehouse @ 2009-11-12 17:27 UTC (permalink / raw)
To: cluster-devel.redhat.com

Hi,

On Thu, 2009-11-12 at 17:45 +0100, Peter Zijlstra wrote:
> On Thu, 2009-11-12 at 11:14 -0600, David Teigland wrote:
> > up_write_non_owner()
> > addresses this trace, which, as you say, is from doing the down and up from
> > different threads (which is the intention):
>
> That's really something I cannot advise doing. Aside from losing
> lock-dependency validation (not a good thing), asymmetric locking like
> that is generally very hard to analyze since it's not clear who 'owns'
> what data when.
>
> There are a few places in the kernel that use the non_owner things, and
> we should generally strive to remove them, not add more.
>
> Please consider solving your problem without adding things like this.
>
The code that does this already exists - it is not being added by the
patch. It's just that in recent kernels lockdep has started noticing
the problem. I did seriously consider changing the locking rather than
just silencing the messages, but it looks rather complicated and not
easily replaced with other primitives. Any suggestions as to a better
solution are welcome,

Steve.
* [Cluster-devel] Re: [PATCH 2/2] dlm: Add down/up_write_non_owner to keep lockdep happy
       [not found] ` <1258044339.4039.685.camel@laptop>
  2009-11-12 17:27   ` Steven Whitehouse
@ 2009-11-12 18:21   ` David Teigland
  1 sibling, 0 replies; 14+ messages in thread
From: David Teigland @ 2009-11-12 18:21 UTC (permalink / raw)
To: cluster-devel.redhat.com

On Thu, Nov 12, 2009 at 05:45:39PM +0100, Peter Zijlstra wrote:
> On Thu, 2009-11-12 at 11:14 -0600, David Teigland wrote:
> > up_write_non_owner()
> > addresses this trace, which, as you say, is from doing the down and up from
> > different threads (which is the intention):
>
> That's really something I cannot advise doing. Aside from losing
> lock-dependency validation (not a good thing), asymmetric locking like
> that is generally very hard to analyze since it's not clear who 'owns'
> what data when.
>
> There are a few places in the kernel that use the non_owner things, and
> we should generally strive to remove them, not add more.
>
> Please consider solving your problem without adding things like this.

It's an unusual use case; this lock is not protecting data in a direct
sense, but is controlling who can run in the dlm.

During normal activity, many threads are running through the dlm (any
process accessing the fs, dlm kernel threads, gfs kernel threads); they
all take the read lock when they enter and release it when they exit.

When dlm recovery needs to happen, this lock is taken in write mode to
wait for all the normal, active threads to exit the dlm, and then block
any further access to the dlm until recovery is complete. Recovery is
initiated by a userland process which takes the write lock. Recovery is
then performed by the dlm_recoverd thread, which releases the write lock
when it's done, and normal activity resumes. The release from
dlm_recoverd would use up_write_non_owner() to avoid the warning.

I'm sure there are other ways to do this, but I don't know when I'll
have the time to look into them. In the meantime, the rw_semaphore is
simple and effective. (I don't mind the warning myself, but others seem
to be annoyed by it.)

Dave
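[A rough userspace analogue of the gate Dave describes, for illustration
only. Note this sketch keeps the write lock and unlock in the same
thread, because POSIX, like lockdep, assumes owner-release; the dlm's
usage differs precisely in splitting them across tasks.]

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t in_recovery = PTHREAD_RWLOCK_INITIALIZER;

/* Worker threads hold the read side while "in the dlm" */
static void *worker(void *arg)
{
	for (int i = 0; i < 3; i++) {
		pthread_rwlock_rdlock(&in_recovery);   /* enter */
		printf("worker %ld: doing lock work\n", (long)arg);
		pthread_rwlock_unlock(&in_recovery);   /* exit */
		usleep(1000);
	}
	return NULL;
}

int main(void)
{
	pthread_t t[2];

	for (long i = 0; i < 2; i++)
		pthread_create(&t[i], NULL, worker, (void *)i);

	/* Recovery: drain all readers and block new entries */
	pthread_rwlock_wrlock(&in_recovery);
	printf("recovery: workers drained, recovering\n");
	pthread_rwlock_unlock(&in_recovery);   /* reopen the gate */

	for (int i = 0; i < 2; i++)
		pthread_join(t[i], NULL);
	return 0;
}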