public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [Patch 0/2] sysfs: fix s_active lockdep warning
@ 2010-01-29  7:01 Amerigo Wang
  2010-01-29  7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Amerigo Wang @ 2010-01-29  7:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Greg Kroah-Hartman, Eric W. Biederman, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
	Amerigo Wang, akpm


Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
As reported by several people, it is something like:

[ 6967.926563] ACPI: Preparing to enter system sleep state S3
[ 6967.956156] Disabling non-boot CPUs ...
[ 6967.970401]
[ 6967.970408] =============================================
[ 6967.970419] [ INFO: possible recursive locking detected ]
[ 6967.970431] 2.6.33-rc2-git6 #27
[ 6967.970439] ---------------------------------------------
[ 6967.970450] pm-suspend/22147 is trying to acquire lock:
[ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>]
sysfs_hash_and_remove+0x3d/0x4f
[ 6967.970493]
[ 6967.970497] but task is already holding lock:
[ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>]
sysfs_get_active_two+0x16/0x36
[...]

Eric already provides a patch for this[1], but it still can't fix the
problem. I add the missing part of Eric's patch and send these two patches
together, hopefully we can fix the warning completely.

1. http://lkml.org/lkml/2010/1/10/282


Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Miles Lane <miles.lane@gmail.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [Patch 1/2] sysfs: add support for lockdep subclasses to s_active
  2010-01-29  7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang
@ 2010-01-29  7:02 ` Amerigo Wang
  2010-01-29  7:02 ` [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active Amerigo Wang
  2010-01-29  7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman
  2 siblings, 0 replies; 39+ messages in thread
From: Amerigo Wang @ 2010-01-29  7:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Greg Kroah-Hartman, Eric W. Biederman,
	Benjamin Herrenschmidt, Heiko Carstens, Miles Lane, Larry Finger,
	Amerigo Wang, akpm

From: Eric W. Biederman <ebiederm@xmission.com>
Date: Fri, 29 Jan 2010 14:03:39 +0800
Subject: [PATCH 1/2] sysfs: add support for lockdep subclasses to s_active

We have apparently valid cases where the code for a sysfs attribute
removes other sysfs attributes.  Without support for subclasses
lockdep flags a possible recursive lock problem as it figures
the first sysfs attribute could be attempting to remove itself.

By adding support for sysfs subclasses we can teach lockdep to
distinguish between different types of sysfs attributes and not
get confused.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/sysfs/dir.c        |   14 ++++++++++++--
 include/linux/sysfs.h |    7 +++++++
 kernel/power/power.h  |   15 ++++++++-------
 3 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 699f371..e0cd4a0 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -95,9 +95,14 @@ static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
  */
 static struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
 {
+	int subclass;
 	if (unlikely(!sd))
 		return NULL;
 
+	subclass = SYSFS_ATTR_NORMAL;
+	if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
+		subclass = sd->s_attr.attr->subclass;
+
 	while (1) {
 		int v, t;
 
@@ -107,7 +112,7 @@ static struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
 
 		t = atomic_cmpxchg(&sd->s_active, v, v + 1);
 		if (likely(t == v)) {
-			rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
+			rwsem_acquire_read(&sd->dep_map, subclass, 1, _RET_IP_);
 			return sd;
 		}
 		if (t < 0)
@@ -192,12 +197,17 @@ void sysfs_put_active_two(struct sysfs_dirent *sd)
 static void sysfs_deactivate(struct sysfs_dirent *sd)
 {
 	DECLARE_COMPLETION_ONSTACK(wait);
+	int subclass;
 	int v;
 
 	BUG_ON(sd->s_sibling || !(sd->s_flags & SYSFS_FLAG_REMOVED));
 	sd->s_sibling = (void *)&wait;
 
-	rwsem_acquire(&sd->dep_map, 0, 0, _RET_IP_);
+	subclass = SYSFS_ATTR_NORMAL;
+	if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
+		subclass = sd->s_attr.attr->subclass;
+
+	rwsem_acquire(&sd->dep_map, subclass, 0, _RET_IP_);
 	/* atomic_add_return() is a mb(), put_active() will always see
 	 * the updated sd->s_sibling.
 	 */
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index cfa8308..2f50fec 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -20,6 +20,12 @@
 struct kobject;
 struct module;
 
+enum sysfs_attr_lock_class
+{
+	SYSFS_ATTR_NORMAL,
+	SYSFS_ATTR_PM_CONTROL,
+};
+
 /* FIXME
  * The *owner field is no longer used.
  * x86 tree has been cleaned up. The owner
@@ -29,6 +35,7 @@ struct attribute {
 	const char		*name;
 	struct module		*owner;
 	mode_t			mode;
+	enum sysfs_attr_lock_class	subclass;
 };
 
 struct attribute_group {
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 46c5a26..0459f27 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -54,13 +54,14 @@ extern int hibernation_platform_enter(void);
 extern int pfn_is_nosave(unsigned long);
 
 #define power_attr(_name) \
-static struct kobj_attribute _name##_attr = {	\
-	.attr	= {				\
-		.name = __stringify(_name),	\
-		.mode = 0644,			\
-	},					\
-	.show	= _name##_show,			\
-	.store	= _name##_store,		\
+static struct kobj_attribute _name##_attr = {		\
+	.attr	= {					\
+		.name = __stringify(_name),		\
+		.mode = 0644,				\
+		.subclass = SYSFS_ATTR_PM_CONTROL,	\
+	},						\
+	.show	= _name##_show,				\
+	.store	= _name##_store,			\
 }
 
 /* Preferred image size in bytes (default 500 MB) */
-- 
1.5.5.6


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active
  2010-01-29  7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang
  2010-01-29  7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang
@ 2010-01-29  7:02 ` Amerigo Wang
  2010-01-29  7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman
  2 siblings, 0 replies; 39+ messages in thread
From: Amerigo Wang @ 2010-01-29  7:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Greg Kroah-Hartman, Eric W. Biederman, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
	Amerigo Wang, akpm

From: Amerigo Wang <amwang@redhat.com>
Date: Fri, 29 Jan 2010 14:18:58 +0800
Subject: [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active

Patch 1/2 missed the initialization part for subclass, which I think
is the reason why it still can't stop the lockdep warning. This depends
on patch 1/2.

Compile test only.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/sysfs/dir.c   |    1 -
 fs/sysfs/file.c  |    2 ++
 fs/sysfs/sysfs.h |    6 +++---
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index e0cd4a0..6b63198 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -364,7 +364,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_active, 0);
-	sysfs_dirent_init_lockdep(sd);
 
 	sd->s_name = name;
 	sd->s_mode = mode;
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index dc30d9e..abf30d7 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -509,6 +509,8 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 	if (!sd)
 		return -ENOMEM;
 	sd->s_attr.attr = (void *)attr;
+	if ((type & SYSFS_TYPE_MASK) == SYSFS_KOBJ_ATTR)
+		sysfs_dirent_init_lockdep(sd, sd->s_attr.attr->subclass);
 
 	sysfs_addrm_start(&acxt, dir_sd);
 	rc = sysfs_add_one(&acxt, sd);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index cdd9377..7e78f2e 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -89,14 +89,14 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 }
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-#define sysfs_dirent_init_lockdep(sd)				\
+#define sysfs_dirent_init_lockdep(sd, c)			\
 do {								\
 	static struct lock_class_key __key;			\
 								\
-	lockdep_init_map(&sd->dep_map, "s_active", &__key, 0);	\
+	lockdep_init_map(&sd->dep_map, "s_active", &__key, c);	\
 } while(0)
 #else
-#define sysfs_dirent_init_lockdep(sd) do {} while(0)
+#define sysfs_dirent_init_lockdep(sd, c) do {} while(0)
 #endif
 
 /*
-- 
1.5.5.6


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29  7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang
  2010-01-29  7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang
  2010-01-29  7:02 ` [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active Amerigo Wang
@ 2010-01-29  7:21 ` Eric W. Biederman
  2010-01-29  8:38   ` Cong Wang
  2010-01-29 18:02   ` Peter Zijlstra
  2 siblings, 2 replies; 39+ messages in thread
From: Eric W. Biederman @ 2010-01-29  7:21 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

Amerigo Wang <amwang@redhat.com> writes:

> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
> As reported by several people, it is something like:
>
> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
> [ 6967.956156] Disabling non-boot CPUs ...
> [ 6967.970401]
> [ 6967.970408] =============================================
> [ 6967.970419] [ INFO: possible recursive locking detected ]
> [ 6967.970431] 2.6.33-rc2-git6 #27
> [ 6967.970439] ---------------------------------------------
> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
> [ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>]
> sysfs_hash_and_remove+0x3d/0x4f
> [ 6967.970493]
> [ 6967.970497] but task is already holding lock:
> [ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>]
> sysfs_get_active_two+0x16/0x36
> [...]
>
> Eric already provides a patch for this[1], but it still can't fix the
> problem. I add the missing part of Eric's patch and send these two patches
> together, hopefully we can fix the warning completely.
>
> 1. http://lkml.org/lkml/2010/1/10/282
>
>
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
> Reported-by: Miles Lane <miles.lane@gmail.com>
> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
> Signed-off-by: WANG Cong <amwang@redhat.com>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Greg Kroah-Hartman <gregkh@suse.de>

Thanks for following up on this.

I suspect we may want to create a separate class for each sysfs file
instead of playing whack-a-mole and creating a subclass each time we
have problems.

I don't see why the rules for one sysfs file should be the same as for
any other sysfs file.

Eric


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29  7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman
@ 2010-01-29  8:38   ` Cong Wang
  2010-01-29 13:44     ` Eric W. Biederman
  2010-01-29 18:02   ` Peter Zijlstra
  1 sibling, 1 reply; 39+ messages in thread
From: Cong Wang @ 2010-01-29  8:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

Eric W. Biederman wrote:
> Amerigo Wang <amwang@redhat.com> writes:
> 
>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
>> As reported by several people, it is something like:
>>
>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
>> [ 6967.956156] Disabling non-boot CPUs ...
>> [ 6967.970401]
>> [ 6967.970408] =============================================
>> [ 6967.970419] [ INFO: possible recursive locking detected ]
>> [ 6967.970431] 2.6.33-rc2-git6 #27
>> [ 6967.970439] ---------------------------------------------
>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
>> [ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>]
>> sysfs_hash_and_remove+0x3d/0x4f
>> [ 6967.970493]
>> [ 6967.970497] but task is already holding lock:
>> [ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>]
>> sysfs_get_active_two+0x16/0x36
>> [...]
>>
>> Eric already provides a patch for this[1], but it still can't fix the
>> problem. I add the missing part of Eric's patch and send these two patches
>> together, hopefully we can fix the warning completely.
>>
>> 1. http://lkml.org/lkml/2010/1/10/282
>>
>>
>> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
>> Reported-by: Miles Lane <miles.lane@gmail.com>
>> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
>> Signed-off-by: WANG Cong <amwang@redhat.com>
>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: Greg Kroah-Hartman <gregkh@suse.de>
> 
> Thanks for following up on this.
> 
> I suspect we may want to create a separate class for each sysfs file
> instead of playing whack-a-mole and creating a subclass each time we
> have problems.
> 
> I don't see why the rules for one sysfs file should be the same as for
> any other sysfs file.
> 

I am confused, we don't know who created sysfs files unless we
separate them by subclasses, the way of your patch is very straight
ward.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29  8:38   ` Cong Wang
@ 2010-01-29 13:44     ` Eric W. Biederman
  2010-01-29 14:22       ` Greg KH
  0 siblings, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2010-01-29 13:44 UTC (permalink / raw)
  To: Cong Wang
  Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

Cong Wang <amwang@redhat.com> writes:

> Eric W. Biederman wrote:
>> Amerigo Wang <amwang@redhat.com> writes:
>>
>>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
>>> As reported by several people, it is something like:
>>>
>>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
>>> [ 6967.956156] Disabling non-boot CPUs ...
>>> [ 6967.970401]
>>> [ 6967.970408] =============================================
>>> [ 6967.970419] [ INFO: possible recursive locking detected ]
>>> [ 6967.970431] 2.6.33-rc2-git6 #27
>>> [ 6967.970439] ---------------------------------------------
>>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
>>> [ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>]
>>> sysfs_hash_and_remove+0x3d/0x4f
>>> [ 6967.970493]
>>> [ 6967.970497] but task is already holding lock:
>>> [ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>]
>>> sysfs_get_active_two+0x16/0x36
>>> [...]
>>>
>>> Eric already provides a patch for this[1], but it still can't fix the
>>> problem. I add the missing part of Eric's patch and send these two patches
>>> together, hopefully we can fix the warning completely.
>>>
>>> 1. http://lkml.org/lkml/2010/1/10/282
>>>
>>>
>>> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
>>> Reported-by: Miles Lane <miles.lane@gmail.com>
>>> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
>>> Signed-off-by: WANG Cong <amwang@redhat.com>
>>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>>> Cc: Tejun Heo <tj@kernel.org>
>>> Cc: Greg Kroah-Hartman <gregkh@suse.de>
>>
>> Thanks for following up on this.
>>
>> I suspect we may want to create a separate class for each sysfs file
>> instead of playing whack-a-mole and creating a subclass each time we
>> have problems.
>>
>> I don't see why the rules for one sysfs file should be the same as for
>> any other sysfs file.
>>
>
> I am confused, we don't know who created sysfs files unless we
> separate them by subclasses, the way of your patch is very straight
> ward.

The assumption is that all entities in a class are used very similarly.
What I was suggesting is that it may make sense, and be simpler to have
a separate __key value and thus place each sysfs file in it's class.

Doing that is a lot more for lockdep to track, but it would not produce
the confusing false positives that we see now.  From the reports I have
seen we may have more that 16 subclasses in sysfs, and it will likely
take us a while to find them all.

Eric


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 13:44     ` Eric W. Biederman
@ 2010-01-29 14:22       ` Greg KH
  2010-01-29 17:57         ` Peter Zijlstra
  2010-01-29 20:25         ` Eric W. Biederman
  0 siblings, 2 replies; 39+ messages in thread
From: Greg KH @ 2010-01-29 14:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, akpm

On Fri, Jan 29, 2010 at 05:44:07AM -0800, Eric W. Biederman wrote:
> Cong Wang <amwang@redhat.com> writes:
> 
> > Eric W. Biederman wrote:
> >> Amerigo Wang <amwang@redhat.com> writes:
> >>
> >>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
> >>> As reported by several people, it is something like:
> >>>
> >>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
> >>> [ 6967.956156] Disabling non-boot CPUs ...
> >>> [ 6967.970401]
> >>> [ 6967.970408] =============================================
> >>> [ 6967.970419] [ INFO: possible recursive locking detected ]
> >>> [ 6967.970431] 2.6.33-rc2-git6 #27
> >>> [ 6967.970439] ---------------------------------------------
> >>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
> >>> [ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>]
> >>> sysfs_hash_and_remove+0x3d/0x4f
> >>> [ 6967.970493]
> >>> [ 6967.970497] but task is already holding lock:
> >>> [ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>]
> >>> sysfs_get_active_two+0x16/0x36
> >>> [...]
> >>>
> >>> Eric already provides a patch for this[1], but it still can't fix the
> >>> problem. I add the missing part of Eric's patch and send these two patches
> >>> together, hopefully we can fix the warning completely.
> >>>
> >>> 1. http://lkml.org/lkml/2010/1/10/282
> >>>
> >>>
> >>> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >>> Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
> >>> Reported-by: Miles Lane <miles.lane@gmail.com>
> >>> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
> >>> Signed-off-by: WANG Cong <amwang@redhat.com>
> >>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> >>> Cc: Tejun Heo <tj@kernel.org>
> >>> Cc: Greg Kroah-Hartman <gregkh@suse.de>
> >>
> >> Thanks for following up on this.
> >>
> >> I suspect we may want to create a separate class for each sysfs file
> >> instead of playing whack-a-mole and creating a subclass each time we
> >> have problems.
> >>
> >> I don't see why the rules for one sysfs file should be the same as for
> >> any other sysfs file.
> >>
> >
> > I am confused, we don't know who created sysfs files unless we
> > separate them by subclasses, the way of your patch is very straight
> > ward.
> 
> The assumption is that all entities in a class are used very similarly.
> What I was suggesting is that it may make sense, and be simpler to have
> a separate __key value and thus place each sysfs file in it's class.
> 
> Doing that is a lot more for lockdep to track, but it would not produce
> the confusing false positives that we see now.  From the reports I have
> seen we may have more that 16 subclasses in sysfs, and it will likely
> take us a while to find them all.

Heh, this whole mess is the very reason we didn't add lockdep support to
the driver core.  Nested devices that all look alike from the driver
core, are really different objects and the locking lifetimes are
separate, but lockdep can't see that.

I suggest we just remove the original patch, as it seems to be causing
way too many problems.

Any objections to that?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 14:22       ` Greg KH
@ 2010-01-29 17:57         ` Peter Zijlstra
  2010-01-29 18:10           ` Greg KH
  2010-01-29 20:25         ` Eric W. Biederman
  1 sibling, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2010-01-29 17:57 UTC (permalink / raw)
  To: Greg KH
  Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote:
> 
> Heh, this whole mess is the very reason we didn't add lockdep support to
> the driver core.  Nested devices that all look alike from the driver
> core, are really different objects and the locking lifetimes are
> separate, but lockdep can't see that. 

And here I through Alan Stern had a handle on making the driver core
play nice.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29  7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman
  2010-01-29  8:38   ` Cong Wang
@ 2010-01-29 18:02   ` Peter Zijlstra
  1 sibling, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2010-01-29 18:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Amerigo Wang, linux-kernel, Tejun Heo, Greg Kroah-Hartman,
	Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
	akpm

On Thu, 2010-01-28 at 23:21 -0800, Eric W. Biederman wrote:
> 
> I suspect we may want to create a separate class for each sysfs file
> instead of playing whack-a-mole and creating a subclass each time we
> have problems.

Well, you can make a class every time you have a problem, I don't think
you can make a class for each file since classes have to have static
storage, also subclasses are limited to 8.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 17:57         ` Peter Zijlstra
@ 2010-01-29 18:10           ` Greg KH
  2010-01-29 18:14             ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Greg KH @ 2010-01-29 18:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote:
> On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote:
> > 
> > Heh, this whole mess is the very reason we didn't add lockdep support to
> > the driver core.  Nested devices that all look alike from the driver
> > core, are really different objects and the locking lifetimes are
> > separate, but lockdep can't see that. 
> 
> And here I through Alan Stern had a handle on making the driver core
> play nice.

It's not the driver core that is the issue here, it's that lockdep can't
handle the tree structure of devices that is represented in the kernel.

I don't think it is a driver core problem, but rather, a lockdep issue.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 18:10           ` Greg KH
@ 2010-01-29 18:14             ` Peter Zijlstra
  2010-01-29 18:21               ` Greg KH
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2010-01-29 18:14 UTC (permalink / raw)
  To: Greg KH
  Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote:
> On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote:
> > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote:
> > > 
> > > Heh, this whole mess is the very reason we didn't add lockdep support to
> > > the driver core.  Nested devices that all look alike from the driver
> > > core, are really different objects and the locking lifetimes are
> > > separate, but lockdep can't see that. 
> > 
> > And here I through Alan Stern had a handle on making the driver core
> > play nice.
> 
> It's not the driver core that is the issue here, it's that lockdep can't
> handle the tree structure of devices that is represented in the kernel.
> 
> I don't think it is a driver core problem, but rather, a lockdep issue.

Right, we've been over that and I think I added enough lockdep
annotations to make it work for the device tree. At least, Alan and I
seemed to agree on that last time we talked about it.




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 18:14             ` Peter Zijlstra
@ 2010-01-29 18:21               ` Greg KH
  2010-01-29 20:10                 ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Greg KH @ 2010-01-29 18:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

On Fri, Jan 29, 2010 at 07:14:20PM +0100, Peter Zijlstra wrote:
> On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote:
> > On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote:
> > > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote:
> > > > 
> > > > Heh, this whole mess is the very reason we didn't add lockdep support to
> > > > the driver core.  Nested devices that all look alike from the driver
> > > > core, are really different objects and the locking lifetimes are
> > > > separate, but lockdep can't see that. 
> > > 
> > > And here I through Alan Stern had a handle on making the driver core
> > > play nice.
> > 
> > It's not the driver core that is the issue here, it's that lockdep can't
> > handle the tree structure of devices that is represented in the kernel.
> > 
> > I don't think it is a driver core problem, but rather, a lockdep issue.
> 
> Right, we've been over that and I think I added enough lockdep
> annotations to make it work for the device tree. At least, Alan and I
> seemed to agree on that last time we talked about it.

Ah, I didn't realize that, very nice.

If so, then this sysfs lock stuff should be able to use those
annotations and we shouldn't have this issue, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 18:21               ` Greg KH
@ 2010-01-29 20:10                 ` Peter Zijlstra
  2010-01-29 20:30                   ` Eric W. Biederman
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2010-01-29 20:10 UTC (permalink / raw)
  To: Greg KH
  Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

On Fri, 2010-01-29 at 10:21 -0800, Greg KH wrote:
> On Fri, Jan 29, 2010 at 07:14:20PM +0100, Peter Zijlstra wrote:
> > On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote:
> > > On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote:
> > > > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote:
> > > > > 
> > > > > Heh, this whole mess is the very reason we didn't add lockdep support to
> > > > > the driver core.  Nested devices that all look alike from the driver
> > > > > core, are really different objects and the locking lifetimes are
> > > > > separate, but lockdep can't see that. 
> > > > 
> > > > And here I through Alan Stern had a handle on making the driver core
> > > > play nice.
> > > 
> > > It's not the driver core that is the issue here, it's that lockdep can't
> > > handle the tree structure of devices that is represented in the kernel.
> > > 
> > > I don't think it is a driver core problem, but rather, a lockdep issue.
> > 
> > Right, we've been over that and I think I added enough lockdep
> > annotations to make it work for the device tree. At least, Alan and I
> > seemed to agree on that last time we talked about it.
> 
> Ah, I didn't realize that, very nice.
> 
> If so, then this sysfs lock stuff should be able to use those
> annotations and we shouldn't have this issue, right?

I really wouldn't know, I've not yet looked at sysfs to see what the
particular issue is. But possibly, if you say the problem space is
similar.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 14:22       ` Greg KH
  2010-01-29 17:57         ` Peter Zijlstra
@ 2010-01-29 20:25         ` Eric W. Biederman
  2010-01-30  5:30           ` Greg KH
  1 sibling, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2010-01-29 20:25 UTC (permalink / raw)
  To: Greg KH
  Cc: Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, akpm

Greg KH <gregkh@suse.de> writes:

> Heh, this whole mess is the very reason we didn't add lockdep support to
> the driver core.  Nested devices that all look alike from the driver
> core, are really different objects and the locking lifetimes are
> separate, but lockdep can't see that.
>
> I suggest we just remove the original patch, as it seems to be causing
> way too many problems.
>
> Any objections to that?

I think the hit rate for real problems has been about 25-50%.  Of the
false positives a lot of those have been, code that is at least
questionable.

Furthermore there are problems we can find this way that we won't know
about any other way.  Unfortunately I haven't had much time to do
anything kernel related lately, or I would have done more with this.
My comment was about simply about finding a good way to increase the
signal to noise ration so investigations can reasonably start with the
presumption that code lockdep is complaining about real problems.

The deadlocks that we can hit in sysfs are very nasty to find, they
have persisted for years, and they pop back up after they are fixed.
So far the pain from lockdep annotations seems a lot lower.

Right now annotating with subclasses as Amerigo is attempting will work,
and remove the false positives.  I was simply hoping to find a faster
way to get there.

So yes, I do object to removing the original patch.  Let's put in the
work to find a good path to remove the handful of cases that cause
false positives.

It's a shame we aren't getting stack overflow errors on the same paths
that are removing sysfs attributes from the callback handlers of sysfs
attributes, or we could just rule out that questionable practice and
call all of the lockdep warnings valid.  Unfortunately that would just
be the tail wagging the horse.

Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 20:10                 ` Peter Zijlstra
@ 2010-01-29 20:30                   ` Eric W. Biederman
  2010-02-04 11:38                     ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2010-01-29 20:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Greg KH, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm

Peter Zijlstra <peterz@infradead.org> writes:

> On Fri, 2010-01-29 at 10:21 -0800, Greg KH wrote:
>> On Fri, Jan 29, 2010 at 07:14:20PM +0100, Peter Zijlstra wrote:
>> > On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote:
>> > > On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote:
>> > > > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote:
>> > > > > 
>> > > > > Heh, this whole mess is the very reason we didn't add lockdep support to
>> > > > > the driver core.  Nested devices that all look alike from the driver
>> > > > > core, are really different objects and the locking lifetimes are
>> > > > > separate, but lockdep can't see that. 
>> > > > 
>> > > > And here I through Alan Stern had a handle on making the driver core
>> > > > play nice.
>> > > 
>> > > It's not the driver core that is the issue here, it's that lockdep can't
>> > > handle the tree structure of devices that is represented in the kernel.
>> > > 
>> > > I don't think it is a driver core problem, but rather, a lockdep issue.
>> > 
>> > Right, we've been over that and I think I added enough lockdep
>> > annotations to make it work for the device tree. At least, Alan and I
>> > seemed to agree on that last time we talked about it.
>> 
>> Ah, I didn't realize that, very nice.
>> 
>> If so, then this sysfs lock stuff should be able to use those
>> annotations and we shouldn't have this issue, right?
>
> I really wouldn't know, I've not yet looked at sysfs to see what the
> particular issue is. But possibly, if you say the problem space is
> similar.

We get false positives when the code of a sysfs attribute
synchronously removes other sysfs attributes.  In general that is not
safe due to hotplug etc, but there are specific instances of static
sysfs entries like the pm_core where it appears to be safe.

I am not familiar with the device core lockdep issues.  Are they similar?

Eric



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 20:25         ` Eric W. Biederman
@ 2010-01-30  5:30           ` Greg KH
  0 siblings, 0 replies; 39+ messages in thread
From: Greg KH @ 2010-01-30  5:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, akpm

On Fri, Jan 29, 2010 at 12:25:22PM -0800, Eric W. Biederman wrote:
> Greg KH <gregkh@suse.de> writes:
> 
> > Heh, this whole mess is the very reason we didn't add lockdep support to
> > the driver core.  Nested devices that all look alike from the driver
> > core, are really different objects and the locking lifetimes are
> > separate, but lockdep can't see that.
> >
> > I suggest we just remove the original patch, as it seems to be causing
> > way too many problems.
> >
> > Any objections to that?
> 
> I think the hit rate for real problems has been about 25-50%.  Of the
> false positives a lot of those have been, code that is at least
> questionable.
> 
> Furthermore there are problems we can find this way that we won't know
> about any other way.  Unfortunately I haven't had much time to do
> anything kernel related lately, or I would have done more with this.
> My comment was about simply about finding a good way to increase the
> signal to noise ration so investigations can reasonably start with the
> presumption that code lockdep is complaining about real problems.
> 
> The deadlocks that we can hit in sysfs are very nasty to find, they
> have persisted for years, and they pop back up after they are fixed.
> So far the pain from lockdep annotations seems a lot lower.
> 
> Right now annotating with subclasses as Amerigo is attempting will work,
> and remove the false positives.  I was simply hoping to find a faster
> way to get there.
> 
> So yes, I do object to removing the original patch.  Let's put in the
> work to find a good path to remove the handful of cases that cause
> false positives.

Ok, that sounds good to me.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-01-29 20:30                   ` Eric W. Biederman
@ 2010-02-04 11:38                     ` Peter Zijlstra
  2010-02-04 16:35                       ` Alan Stern
                                         ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Peter Zijlstra @ 2010-02-04 11:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm,
	Alan Stern

On Fri, 2010-01-29 at 12:30 -0800, Eric W. Biederman wrote:

> We get false positives when the code of a sysfs attribute
> synchronously removes other sysfs attributes.  In general that is not
> safe due to hotplug etc, but there are specific instances of static
> sysfs entries like the pm_core where it appears to be safe.
> 
> I am not familiar with the device core lockdep issues.  Are they similar?

The device tree had the problem that we could basically hold a device
lock and an unspecified number of parent locks (iirc this was due to
device probing, where we hold the bus lock while probing/adding child
device, recursively). 

If we place each dev->lock into the same class (which would naively
happen), then this would lead to recursive lock warnings. The proposed
solution for this is to create MAX_LOCK_DEPTH classes and assign them to
the dev->lock depending on the depth in the device tree (Alan said that
MAX_LOCK_DEPTH is sufficient for all practical cases).

static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];

device_add() or thereabouts would have something like:

#ifdef CONFIG_PROVE_LOCKING
	BUG_ON(dev->depth >= MAX_LOCK_DEPTH);
	lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]);
#endif


Then there was a problem were we could lock all child devices while
holding the parent device lock (forgot why though), this would, on
taking the second child dev->lock, again lead to recursive lock
warnings. 

We have an annotation for that: lock_nest_lock (currently only
spin_lock_nest_lock exists, but mutex_lock_nest_lock is easily created),
and this would allow you to do things like:

	mutex_lock(&parent->lock);
	for_each_device_child(child, parent) {
		mutex_lock_nest_lock(&child->lock, &parent->lock);
		...
	}


I hope this helps in figuring out the sysfs case..


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 11:38                     ` Peter Zijlstra
@ 2010-02-04 16:35                       ` Alan Stern
  2010-02-04 16:41                         ` Peter Zijlstra
                                           ` (2 more replies)
  2010-02-05  3:43                       ` Cong Wang
  2010-02-05  8:55                       ` Eric W. Biederman
  2 siblings, 3 replies; 39+ messages in thread
From: Alan Stern @ 2010-02-04 16:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

Greg:

You have accepted Thomas's patch "drivers/base: Convert dev->sem to
mutex".  It generates lockdep violations galore during device probing
and removal!  Luckily lockdep is smart enough only to print the first
occurrence.  Here's what I get early on during bootup:

[    0.149911] ACPI: EC: Look up EC in DSDT
[    0.170665] ACPI: Executed 1 blocks of module-level executable AML code
[    0.198111] ACPI: Interpreter enabled
[    0.198267] ACPI: (supports S0 S3 S4 S5)
[    0.198802] ACPI: Using IOAPIC for interrupt routing
[    0.266493] 
[    0.266496] =============================================
[    0.266775] [ INFO: possible recursive locking detected ]
[    0.266917] 2.6.33-rc6 #1
[    0.267051] ---------------------------------------------
[    0.267192] swapper/1 is trying to acquire lock:
[    0.267332]  (&dev->mutex){+.+...}, at: [<c11496be>] __driver_attach+0x38/0x63
[    0.267683] 
[    0.267685] but task is already holding lock:
[    0.267953]  (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63
[    0.268000] 
[    0.268000] other info that might help us debug this:
[    0.268000] 1 lock held by swapper/1:
[    0.268000]  #0:  (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63
[    0.268000] 
[    0.268000] stack backtrace:
[    0.268000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #1
[    0.268000] Call Trace:
[    0.268000]  [<c11c819e>] ? printk+0xf/0x11
[    0.268000]  [<c1041c9b>] __lock_acquire+0x804/0xb47
[    0.268000]  [<c10b2026>] ? sysfs_addrm_finish+0x19/0xe2
[    0.268000]  [<c1042020>] lock_acquire+0x42/0x59
[    0.268000]  [<c11496be>] ? __driver_attach+0x38/0x63
[    0.268000]  [<c11c90c6>] __mutex_lock_common+0x39/0x38f
[    0.268000]  [<c11496be>] ? __driver_attach+0x38/0x63
[    0.268000]  [<c11c94ab>] mutex_lock_nested+0x2b/0x33
[    0.268000]  [<c11496be>] ? __driver_attach+0x38/0x63
[    0.268000]  [<c11496be>] __driver_attach+0x38/0x63
[    0.268000]  [<c1148e0a>] bus_for_each_dev+0x3d/0x67
[    0.268000]  [<c11494cf>] driver_attach+0x14/0x16
[    0.268000]  [<c1149686>] ? __driver_attach+0x0/0x63
[    0.268000]  [<c11491c1>] bus_add_driver+0x92/0x1c5
[    0.268000]  [<c114990f>] driver_register+0x79/0xe0
[    0.268000]  [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c
[    0.268000]  [<c131999f>] acpi_power_init+0x3f/0x5e
[    0.268000]  [<c1319422>] acpi_init+0x28e/0x2c8
[    0.268000]  [<c1319194>] ? acpi_init+0x0/0x2c8
[    0.268000]  [<c1001139>] do_one_initcall+0x4c/0x136
[    0.268000]  [<c130130b>] kernel_init+0x11c/0x16d
[    0.268000]  [<c13011ef>] ? kernel_init+0x0/0x16d
[    0.268000]  [<c1002cba>] kernel_thread_helper+0x6/0x10
[    0.268485] ACPI: Power Resource [GFAN] (on)


On Thu, 4 Feb 2010, Peter Zijlstra wrote:

> The device tree had the problem that we could basically hold a device
> lock and an unspecified number of parent locks (iirc this was due to
> device probing, where we hold the bus lock while probing/adding child
> device, recursively). 
> 
> If we place each dev->lock into the same class (which would naively
> happen), then this would lead to recursive lock warnings. The proposed
> solution for this is to create MAX_LOCK_DEPTH classes and assign them to
> the dev->lock depending on the depth in the device tree (Alan said that
> MAX_LOCK_DEPTH is sufficient for all practical cases).
> 
> static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
> 
> device_add() or thereabouts would have something like:
> 
> #ifdef CONFIG_PROVE_LOCKING
> 	BUG_ON(dev->depth >= MAX_LOCK_DEPTH);
> 	lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]);
> #endif

Unfortunately this doesn't really work.  Here is a patch implementing
the scheme:

Index: usb-2.6/drivers/base/core.c
===================================================================
--- usb-2.6.orig/drivers/base/core.c
+++ usb-2.6/drivers/base/core.c
@@ -22,6 +22,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/async.h>
+#include <linux/sched.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -671,6 +672,26 @@ static void setup_parent(struct device *
 		dev->kobj.parent = kobj;
 }
 
+#ifdef CONFIG_PROVE_LOCKING
+static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
+
+static void setup_mutex_depth(struct device *dev, struct device *parent)
+{
+	int depth = 0;
+
+	/* Dynamically determine the device's depth in the device tree */
+	while (parent) {
+		++depth;
+		parent = parent->parent;
+	}
+	BUG_ON(depth > MAX_LOCK_DEPTH);
+	lockdep_set_class(&dev->mutex, &dev_tree_classes[depth]);
+}
+#else
+static inline void setup_mutex_depth(struct device *dev,
+		struct device *parent) {}
+#endif
+
 static int device_add_class_symlinks(struct device *dev)
 {
 	int error;
@@ -912,6 +933,7 @@ int device_add(struct device *dev)
 
 	parent = get_device(dev->parent);
 	setup_parent(dev, parent);
+	setup_mutex_depth(dev, parent);
 
 	/* use parent numa_node */
 	if (parent)


This doesn't address the fact that we really have multiple device trees
(for example, class devices are handled separately from normal
devices).  With the above patch installed, I still get lockdep
violations farther on during boot:

[    0.272332] pci_bus 0000:00: on NUMA node 0
[    0.272355] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[    0.273503] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT]
[    0.279205] 
[    0.279208] =============================================
[    0.279485] [ INFO: possible recursive locking detected ]
[    0.279628] 2.6.33-rc6 #2
[    0.279763] ---------------------------------------------
[    0.279905] swapper/1 is trying to acquire lock:
[    0.280000]  (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c1149776>] device_attach+0x14/0x6e
[    0.280000] 
[    0.280000] but task is already holding lock:
[    0.280000]  (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63
[    0.280000] 
[    0.280000] other info that might help us debug this:
[    0.280000] 2 locks held by swapper/1:
[    0.280000]  #0:  (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63
[    0.280000]  #1:  (&dev_tree_classes[depth]#3){+.+.+.}, at: [<c11496ea>] __driver_attach+0x38/0x63
[    0.280000] 
[    0.280000] stack backtrace:
[    0.280000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #2
[    0.280000] Call Trace:
[    0.280000]  [<c11c81ce>] ? printk+0xf/0x11
[    0.280000]  [<c1041c9b>] __lock_acquire+0x804/0xb47
[    0.280000]  [<c101a73d>] ? spin_unlock_irqrestore+0x8/0xa
[    0.280000]  [<c101a891>] ? __wake_up+0x32/0x3b
[    0.280000]  [<c1042020>] lock_acquire+0x42/0x59
[    0.280000]  [<c1149776>] ? device_attach+0x14/0x6e
[    0.280000]  [<c11c90f6>] __mutex_lock_common+0x39/0x38f
[    0.280000]  [<c1149776>] ? device_attach+0x14/0x6e
[    0.280000]  [<c1040e2e>] ? trace_hardirqs_on+0xb/0xd
[    0.280000]  [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a
[    0.280000]  [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a
[    0.280000]  [<c11c94db>] mutex_lock_nested+0x2b/0x33
[    0.280000]  [<c1149776>] ? device_attach+0x14/0x6e
[    0.280000]  [<c1149776>] device_attach+0x14/0x6e
[    0.280000]  [<c1148aa1>] bus_probe_device+0x1b/0x30
[    0.280000]  [<c1147b6c>] device_add+0x310/0x458
[    0.280000]  [<c10f96ac>] pci_bus_add_device+0xf/0x30
[    0.280000]  [<c10f96f0>] pci_bus_add_devices+0x23/0xdd
[    0.280000]  [<c11c011b>] ? acpi_pci_root_add+0x1cf/0x1ff
[    0.280000]  [<c11088a3>] acpi_pci_root_start+0x11/0x15
[    0.280000]  [<c1106370>] acpi_start_single_object+0x1e/0x3f
[    0.280000]  [<c11064a9>] acpi_device_probe+0x78/0xf4
[    0.280000]  [<c1149632>] driver_probe_device+0x87/0x107
[    0.280000]  [<c11496f9>] __driver_attach+0x47/0x63
[    0.280000]  [<c1148e36>] bus_for_each_dev+0x3d/0x67
[    0.280000]  [<c11494fb>] driver_attach+0x14/0x16
[    0.280000]  [<c11496b2>] ? __driver_attach+0x0/0x63
[    0.280000]  [<c11491ed>] bus_add_driver+0x92/0x1c5
[    0.280000]  [<c1319798>] ? acpi_pci_root_init+0x0/0x25
[    0.280000]  [<c114993b>] driver_register+0x79/0xe0
[    0.280000]  [<c1319798>] ? acpi_pci_root_init+0x0/0x25
[    0.280000]  [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c
[    0.280000]  [<c13197ae>] acpi_pci_root_init+0x16/0x25
[    0.280000]  [<c1001139>] do_one_initcall+0x4c/0x136
[    0.280000]  [<c130130b>] kernel_init+0x11c/0x16d
[    0.280000]  [<c13011ef>] ? kernel_init+0x0/0x16d
[    0.280000]  [<c1002cba>] kernel_thread_helper+0x6/0x10
[    0.328206] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *10 11 12 14 15)
[    0.329223] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 10 11 12 14 15)


> Then there was a problem were we could lock all child devices while
> holding the parent device lock (forgot why though), this would, on
> taking the second child dev->lock, again lead to recursive lock
> warnings. 

AFAIK, the code that used to do this is no longer present.  There may 
be other places where it is still done, but I'm not aware of any.

However in view of the other difficulties, it still doesn't seem
possible to make device mutexes work with lockdep.  I suggest removing 
Thomas's patch.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 16:35                       ` Alan Stern
@ 2010-02-04 16:41                         ` Peter Zijlstra
  2010-02-04 18:37                           ` Alan Stern
  2010-02-04 16:46                         ` Peter Zijlstra
  2010-02-04 16:46                         ` Greg KH
  2 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2010-02-04 16:41 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote:
> Greg:
> 
> You have accepted Thomas's patch "drivers/base: Convert dev->sem to
> mutex".  It generates lockdep violations galore during device probing
> and removal!  Luckily lockdep is smart enough only to print the first
> occurrence.  Here's what I get early on during bootup:

<snip lockdep splat>

> On Thu, 4 Feb 2010, Peter Zijlstra wrote:
> 
> > The device tree had the problem that we could basically hold a device
> > lock and an unspecified number of parent locks (iirc this was due to
> > device probing, where we hold the bus lock while probing/adding child
> > device, recursively). 
> > 
> > If we place each dev->lock into the same class (which would naively
> > happen), then this would lead to recursive lock warnings. The proposed
> > solution for this is to create MAX_LOCK_DEPTH classes and assign them to
> > the dev->lock depending on the depth in the device tree (Alan said that
> > MAX_LOCK_DEPTH is sufficient for all practical cases).
> > 
> > static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
> > 
> > device_add() or thereabouts would have something like:
> > 
> > #ifdef CONFIG_PROVE_LOCKING
> > 	BUG_ON(dev->depth >= MAX_LOCK_DEPTH);
> > 	lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]);
> > #endif
> 
> Unfortunately this doesn't really work.  Here is a patch implementing
> the scheme:
> 
> Index: usb-2.6/drivers/base/core.c
> ===================================================================
> --- usb-2.6.orig/drivers/base/core.c
> +++ usb-2.6/drivers/base/core.c
> @@ -22,6 +22,7 @@
>  #include <linux/kallsyms.h>
>  #include <linux/mutex.h>
>  #include <linux/async.h>
> +#include <linux/sched.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -671,6 +672,26 @@ static void setup_parent(struct device *
>  		dev->kobj.parent = kobj;
>  }
>  
> +#ifdef CONFIG_PROVE_LOCKING
> +static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
> +
> +static void setup_mutex_depth(struct device *dev, struct device *parent)
> +{
> +	int depth = 0;
> +
> +	/* Dynamically determine the device's depth in the device tree */
> +	while (parent) {
> +		++depth;
> +		parent = parent->parent;
> +	}
> +	BUG_ON(depth > MAX_LOCK_DEPTH);
> +	lockdep_set_class(&dev->mutex, &dev_tree_classes[depth]);
> +}
> +#else
> +static inline void setup_mutex_depth(struct device *dev,
> +		struct device *parent) {}
> +#endif
> +
>  static int device_add_class_symlinks(struct device *dev)
>  {
>  	int error;
> @@ -912,6 +933,7 @@ int device_add(struct device *dev)
>  
>  	parent = get_device(dev->parent);
>  	setup_parent(dev, parent);
> +	setup_mutex_depth(dev, parent);
>  
>  	/* use parent numa_node */
>  	if (parent)
> 
> 
> This doesn't address the fact that we really have multiple device trees
> (for example, class devices are handled separately from normal
> devices).  With the above patch installed, I still get lockdep
> violations farther on during boot:

<snip lockdep splat>

Hmm, so you have multiple interacting trees? I had understood you only
had a single device tree. So how many trees are there, is that fixed?
Does the device know what tree it is going to end up in?

If yes, then you can extend the setup_mutex_depth() function to pick a
different class stack for each tree.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 16:35                       ` Alan Stern
  2010-02-04 16:41                         ` Peter Zijlstra
@ 2010-02-04 16:46                         ` Peter Zijlstra
  2010-02-04 18:40                           ` Alan Stern
  2010-02-04 16:46                         ` Greg KH
  2 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2010-02-04 16:46 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote:
> +#include <linux/sched.h>

#include <linux/lockdep.h>

should be sufficient I think.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 16:35                       ` Alan Stern
  2010-02-04 16:41                         ` Peter Zijlstra
  2010-02-04 16:46                         ` Peter Zijlstra
@ 2010-02-04 16:46                         ` Greg KH
  2010-02-04 16:59                           ` Thomas Gleixner
  2 siblings, 1 reply; 39+ messages in thread
From: Greg KH @ 2010-02-04 16:46 UTC (permalink / raw)
  To: Alan Stern
  Cc: Peter Zijlstra, Eric W. Biederman, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, Feb 04, 2010 at 11:35:28AM -0500, Alan Stern wrote:
> Greg:
> 
> You have accepted Thomas's patch "drivers/base: Convert dev->sem to
> mutex".  It generates lockdep violations galore during device probing
> and removal!  Luckily lockdep is smart enough only to print the first
> occurrence.  Here's what I get early on during bootup:

Ugh, I forgot this is why we haven't done this yet.

Thomas, you didn't see these warnings?

> [    0.149911] ACPI: EC: Look up EC in DSDT
> [    0.170665] ACPI: Executed 1 blocks of module-level executable AML code
> [    0.198111] ACPI: Interpreter enabled
> [    0.198267] ACPI: (supports S0 S3 S4 S5)
> [    0.198802] ACPI: Using IOAPIC for interrupt routing
> [    0.266493] 
> [    0.266496] =============================================
> [    0.266775] [ INFO: possible recursive locking detected ]
> [    0.266917] 2.6.33-rc6 #1
> [    0.267051] ---------------------------------------------
> [    0.267192] swapper/1 is trying to acquire lock:
> [    0.267332]  (&dev->mutex){+.+...}, at: [<c11496be>] __driver_attach+0x38/0x63
> [    0.267683] 
> [    0.267685] but task is already holding lock:
> [    0.267953]  (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63
> [    0.268000] 
> [    0.268000] other info that might help us debug this:
> [    0.268000] 1 lock held by swapper/1:
> [    0.268000]  #0:  (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63
> [    0.268000] 
> [    0.268000] stack backtrace:
> [    0.268000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #1
> [    0.268000] Call Trace:
> [    0.268000]  [<c11c819e>] ? printk+0xf/0x11
> [    0.268000]  [<c1041c9b>] __lock_acquire+0x804/0xb47
> [    0.268000]  [<c10b2026>] ? sysfs_addrm_finish+0x19/0xe2
> [    0.268000]  [<c1042020>] lock_acquire+0x42/0x59
> [    0.268000]  [<c11496be>] ? __driver_attach+0x38/0x63
> [    0.268000]  [<c11c90c6>] __mutex_lock_common+0x39/0x38f
> [    0.268000]  [<c11496be>] ? __driver_attach+0x38/0x63
> [    0.268000]  [<c11c94ab>] mutex_lock_nested+0x2b/0x33
> [    0.268000]  [<c11496be>] ? __driver_attach+0x38/0x63
> [    0.268000]  [<c11496be>] __driver_attach+0x38/0x63
> [    0.268000]  [<c1148e0a>] bus_for_each_dev+0x3d/0x67
> [    0.268000]  [<c11494cf>] driver_attach+0x14/0x16
> [    0.268000]  [<c1149686>] ? __driver_attach+0x0/0x63
> [    0.268000]  [<c11491c1>] bus_add_driver+0x92/0x1c5
> [    0.268000]  [<c114990f>] driver_register+0x79/0xe0
> [    0.268000]  [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c
> [    0.268000]  [<c131999f>] acpi_power_init+0x3f/0x5e
> [    0.268000]  [<c1319422>] acpi_init+0x28e/0x2c8
> [    0.268000]  [<c1319194>] ? acpi_init+0x0/0x2c8
> [    0.268000]  [<c1001139>] do_one_initcall+0x4c/0x136
> [    0.268000]  [<c130130b>] kernel_init+0x11c/0x16d
> [    0.268000]  [<c13011ef>] ? kernel_init+0x0/0x16d
> [    0.268000]  [<c1002cba>] kernel_thread_helper+0x6/0x10
> [    0.268485] ACPI: Power Resource [GFAN] (on)
> 
> 
> On Thu, 4 Feb 2010, Peter Zijlstra wrote:
> 
> > The device tree had the problem that we could basically hold a device
> > lock and an unspecified number of parent locks (iirc this was due to
> > device probing, where we hold the bus lock while probing/adding child
> > device, recursively). 
> > 
> > If we place each dev->lock into the same class (which would naively
> > happen), then this would lead to recursive lock warnings. The proposed
> > solution for this is to create MAX_LOCK_DEPTH classes and assign them to
> > the dev->lock depending on the depth in the device tree (Alan said that
> > MAX_LOCK_DEPTH is sufficient for all practical cases).
> > 
> > static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
> > 
> > device_add() or thereabouts would have something like:
> > 
> > #ifdef CONFIG_PROVE_LOCKING
> > 	BUG_ON(dev->depth >= MAX_LOCK_DEPTH);
> > 	lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]);
> > #endif
> 
> Unfortunately this doesn't really work.  Here is a patch implementing
> the scheme:
> 
> Index: usb-2.6/drivers/base/core.c
> ===================================================================
> --- usb-2.6.orig/drivers/base/core.c
> +++ usb-2.6/drivers/base/core.c
> @@ -22,6 +22,7 @@
>  #include <linux/kallsyms.h>
>  #include <linux/mutex.h>
>  #include <linux/async.h>
> +#include <linux/sched.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -671,6 +672,26 @@ static void setup_parent(struct device *
>  		dev->kobj.parent = kobj;
>  }
>  
> +#ifdef CONFIG_PROVE_LOCKING
> +static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
> +
> +static void setup_mutex_depth(struct device *dev, struct device *parent)
> +{
> +	int depth = 0;
> +
> +	/* Dynamically determine the device's depth in the device tree */
> +	while (parent) {
> +		++depth;
> +		parent = parent->parent;
> +	}
> +	BUG_ON(depth > MAX_LOCK_DEPTH);
> +	lockdep_set_class(&dev->mutex, &dev_tree_classes[depth]);
> +}
> +#else
> +static inline void setup_mutex_depth(struct device *dev,
> +		struct device *parent) {}
> +#endif
> +
>  static int device_add_class_symlinks(struct device *dev)
>  {
>  	int error;
> @@ -912,6 +933,7 @@ int device_add(struct device *dev)
>  
>  	parent = get_device(dev->parent);
>  	setup_parent(dev, parent);
> +	setup_mutex_depth(dev, parent);
>  
>  	/* use parent numa_node */
>  	if (parent)
> 
> 
> This doesn't address the fact that we really have multiple device trees
> (for example, class devices are handled separately from normal
> devices).  With the above patch installed, I still get lockdep
> violations farther on during boot:
> 
> [    0.272332] pci_bus 0000:00: on NUMA node 0
> [    0.272355] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> [    0.273503] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT]
> [    0.279205] 
> [    0.279208] =============================================
> [    0.279485] [ INFO: possible recursive locking detected ]
> [    0.279628] 2.6.33-rc6 #2
> [    0.279763] ---------------------------------------------
> [    0.279905] swapper/1 is trying to acquire lock:
> [    0.280000]  (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c1149776>] device_attach+0x14/0x6e
> [    0.280000] 
> [    0.280000] but task is already holding lock:
> [    0.280000]  (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63
> [    0.280000] 
> [    0.280000] other info that might help us debug this:
> [    0.280000] 2 locks held by swapper/1:
> [    0.280000]  #0:  (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63
> [    0.280000]  #1:  (&dev_tree_classes[depth]#3){+.+.+.}, at: [<c11496ea>] __driver_attach+0x38/0x63
> [    0.280000] 
> [    0.280000] stack backtrace:
> [    0.280000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #2
> [    0.280000] Call Trace:
> [    0.280000]  [<c11c81ce>] ? printk+0xf/0x11
> [    0.280000]  [<c1041c9b>] __lock_acquire+0x804/0xb47
> [    0.280000]  [<c101a73d>] ? spin_unlock_irqrestore+0x8/0xa
> [    0.280000]  [<c101a891>] ? __wake_up+0x32/0x3b
> [    0.280000]  [<c1042020>] lock_acquire+0x42/0x59
> [    0.280000]  [<c1149776>] ? device_attach+0x14/0x6e
> [    0.280000]  [<c11c90f6>] __mutex_lock_common+0x39/0x38f
> [    0.280000]  [<c1149776>] ? device_attach+0x14/0x6e
> [    0.280000]  [<c1040e2e>] ? trace_hardirqs_on+0xb/0xd
> [    0.280000]  [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a
> [    0.280000]  [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a
> [    0.280000]  [<c11c94db>] mutex_lock_nested+0x2b/0x33
> [    0.280000]  [<c1149776>] ? device_attach+0x14/0x6e
> [    0.280000]  [<c1149776>] device_attach+0x14/0x6e
> [    0.280000]  [<c1148aa1>] bus_probe_device+0x1b/0x30
> [    0.280000]  [<c1147b6c>] device_add+0x310/0x458
> [    0.280000]  [<c10f96ac>] pci_bus_add_device+0xf/0x30
> [    0.280000]  [<c10f96f0>] pci_bus_add_devices+0x23/0xdd
> [    0.280000]  [<c11c011b>] ? acpi_pci_root_add+0x1cf/0x1ff
> [    0.280000]  [<c11088a3>] acpi_pci_root_start+0x11/0x15
> [    0.280000]  [<c1106370>] acpi_start_single_object+0x1e/0x3f
> [    0.280000]  [<c11064a9>] acpi_device_probe+0x78/0xf4
> [    0.280000]  [<c1149632>] driver_probe_device+0x87/0x107
> [    0.280000]  [<c11496f9>] __driver_attach+0x47/0x63
> [    0.280000]  [<c1148e36>] bus_for_each_dev+0x3d/0x67
> [    0.280000]  [<c11494fb>] driver_attach+0x14/0x16
> [    0.280000]  [<c11496b2>] ? __driver_attach+0x0/0x63
> [    0.280000]  [<c11491ed>] bus_add_driver+0x92/0x1c5
> [    0.280000]  [<c1319798>] ? acpi_pci_root_init+0x0/0x25
> [    0.280000]  [<c114993b>] driver_register+0x79/0xe0
> [    0.280000]  [<c1319798>] ? acpi_pci_root_init+0x0/0x25
> [    0.280000]  [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c
> [    0.280000]  [<c13197ae>] acpi_pci_root_init+0x16/0x25
> [    0.280000]  [<c1001139>] do_one_initcall+0x4c/0x136
> [    0.280000]  [<c130130b>] kernel_init+0x11c/0x16d
> [    0.280000]  [<c13011ef>] ? kernel_init+0x0/0x16d
> [    0.280000]  [<c1002cba>] kernel_thread_helper+0x6/0x10
> [    0.328206] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *10 11 12 14 15)
> [    0.329223] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 10 11 12 14 15)
> 
> 
> > Then there was a problem were we could lock all child devices while
> > holding the parent device lock (forgot why though), this would, on
> > taking the second child dev->lock, again lead to recursive lock
> > warnings. 
> 
> AFAIK, the code that used to do this is no longer present.  There may 
> be other places where it is still done, but I'm not aware of any.
> 
> However in view of the other difficulties, it still doesn't seem
> possible to make device mutexes work with lockdep.  I suggest removing 
> Thomas's patch.

Unless Thomas or anyone else thinks of something that can solve these
problems, I will do so.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 16:46                         ` Greg KH
@ 2010-02-04 16:59                           ` Thomas Gleixner
  2010-02-26 19:36                             ` Alan Stern
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Gleixner @ 2010-02-04 16:59 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Stern, Peter Zijlstra, Eric W. Biederman, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, 4 Feb 2010, Greg KH wrote:
> > However in view of the other difficulties, it still doesn't seem
> > possible to make device mutexes work with lockdep.  I suggest removing 
> > Thomas's patch.
> 
> Unless Thomas or anyone else thinks of something that can solve these
> problems, I will do so.

Hmm, I have not seen those lockdep splats on -rt where we have the
mutex conversion for quite some time already. Need to look at it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 16:41                         ` Peter Zijlstra
@ 2010-02-04 18:37                           ` Alan Stern
  2010-02-05 10:18                             ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Alan Stern @ 2010-02-04 18:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, 4 Feb 2010, Peter Zijlstra wrote:

> > This doesn't address the fact that we really have multiple device trees
> > (for example, class devices are handled separately from normal
> > devices).  With the above patch installed, I still get lockdep
> > violations farther on during boot:
> 
> <snip lockdep splat>
> 
> Hmm, so you have multiple interacting trees? I had understood you only
> had a single device tree.

The real situation is kind of complicated, and I'm not familiar with
all the details.  But it's certainly true that a driver will want to
work with (and lock!) multiple struct device's that don't have a
parent-child relation in the tree.  The simplest example is regular
devices together with class devices, and another might be PCI devices
together with their "shadow" ACPI devices.

>  So how many trees are there, is that fixed?
> Does the device know what tree it is going to end up in?

The driver generally knows, but AFAIK that information is not passed 
back to the driver core.  At least, not directly -- you might say that 
it could be deduced from the parent pointer, assuming the core already 
knows all about the parent.

> If yes, then you can extend the setup_mutex_depth() function to pick a
> different class stack for each tree.

Maybe this could be done.  But for now perhaps a compromise is in
order.  We could make the switch from semaphores to mutexes while
avoiding lockdep issues by assigning the device mutexes to a
"don't-verify" class.  Is there such a thing, or could it be added?

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 16:46                         ` Peter Zijlstra
@ 2010-02-04 18:40                           ` Alan Stern
  2010-02-05  3:09                             ` Cong Wang
  0 siblings, 1 reply; 39+ messages in thread
From: Alan Stern @ 2010-02-04 18:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, 4 Feb 2010, Peter Zijlstra wrote:

> On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote:
> > +#include <linux/sched.h>
> 
> #include <linux/lockdep.h>
> 
> should be sufficient I think.

No, it's not.  It leaves MAX_LOCK_DEPTH undeclared.  Beats me why that 
symbol ended up in sched.h...

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 18:40                           ` Alan Stern
@ 2010-02-05  3:09                             ` Cong Wang
  2010-02-05  4:06                               ` Alan Stern
  0 siblings, 1 reply; 39+ messages in thread
From: Cong Wang @ 2010-02-05  3:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Peter Zijlstra, Eric W. Biederman, Greg KH, Thomas Gleixner,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

Alan Stern wrote:
> On Thu, 4 Feb 2010, Peter Zijlstra wrote:
> 
>> On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote:
>>> +#include <linux/sched.h>
>> #include <linux/lockdep.h>
>>
>> should be sufficient I think.
> 
> No, it's not.  It leaves MAX_LOCK_DEPTH undeclared.  Beats me why that 
> symbol ended up in sched.h...
> 

because task_struct->held_locks uses it, but it seems better to move
it into lockdep.h...

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 11:38                     ` Peter Zijlstra
  2010-02-04 16:35                       ` Alan Stern
@ 2010-02-05  3:43                       ` Cong Wang
  2010-02-05  8:55                       ` Eric W. Biederman
  2 siblings, 0 replies; 39+ messages in thread
From: Cong Wang @ 2010-02-05  3:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Greg KH, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm,
	Alan Stern

Peter Zijlstra wrote:
> On Fri, 2010-01-29 at 12:30 -0800, Eric W. Biederman wrote:
> 
>> We get false positives when the code of a sysfs attribute
>> synchronously removes other sysfs attributes.  In general that is not
>> safe due to hotplug etc, but there are specific instances of static
>> sysfs entries like the pm_core where it appears to be safe.
>>
>> I am not familiar with the device core lockdep issues.  Are they similar?
> 
> The device tree had the problem that we could basically hold a device
> lock and an unspecified number of parent locks (iirc this was due to
> device probing, where we hold the bus lock while probing/adding child
> device, recursively). 
> 
> If we place each dev->lock into the same class (which would naively
> happen), then this would lead to recursive lock warnings. The proposed
> solution for this is to create MAX_LOCK_DEPTH classes and assign them to
> the dev->lock depending on the depth in the device tree (Alan said that
> MAX_LOCK_DEPTH is sufficient for all practical cases).
> 
> static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
> 
> device_add() or thereabouts would have something like:
> 
> #ifdef CONFIG_PROVE_LOCKING
> 	BUG_ON(dev->depth >= MAX_LOCK_DEPTH);
> 	lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]);
> #endif
> 
> 

Nice explanation!

I see, we should set the class of the lock when after we holding it...
I will update my patch.

Thank you!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-05  3:09                             ` Cong Wang
@ 2010-02-05  4:06                               ` Alan Stern
  0 siblings, 0 replies; 39+ messages in thread
From: Alan Stern @ 2010-02-05  4:06 UTC (permalink / raw)
  To: Cong Wang
  Cc: Peter Zijlstra, Eric W. Biederman, Greg KH, Thomas Gleixner,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Fri, 5 Feb 2010, Cong Wang wrote:

> Alan Stern wrote:
> > On Thu, 4 Feb 2010, Peter Zijlstra wrote:
> > 
> >> On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote:
> >>> +#include <linux/sched.h>
> >> #include <linux/lockdep.h>
> >>
> >> should be sufficient I think.
> > 
> > No, it's not.  It leaves MAX_LOCK_DEPTH undeclared.  Beats me why that 
> > symbol ended up in sched.h...
> > 
> 
> because task_struct->held_locks uses it, but it seems better to move
> it into lockdep.h...

Be my guest!  :-)

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 11:38                     ` Peter Zijlstra
  2010-02-04 16:35                       ` Alan Stern
  2010-02-05  3:43                       ` Cong Wang
@ 2010-02-05  8:55                       ` Eric W. Biederman
  2 siblings, 0 replies; 39+ messages in thread
From: Eric W. Biederman @ 2010-02-05  8:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Greg KH, Cong Wang, linux-kernel, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm,
	Alan Stern

Peter Zijlstra <peterz@infradead.org> writes:

> On Fri, 2010-01-29 at 12:30 -0800, Eric W. Biederman wrote:
>
>> We get false positives when the code of a sysfs attribute
>> synchronously removes other sysfs attributes.  In general that is not
>> safe due to hotplug etc, but there are specific instances of static
>> sysfs entries like the pm_core where it appears to be safe.
>> 
>> I am not familiar with the device core lockdep issues.  Are they similar?
>
> The device tree had the problem that we could basically hold a device
> lock and an unspecified number of parent locks (iirc this was due to
> device probing, where we hold the bus lock while probing/adding child
> device, recursively). 
>
> If we place each dev->lock into the same class (which would naively
> happen), then this would lead to recursive lock warnings. The proposed
> solution for this is to create MAX_LOCK_DEPTH classes and assign them to
> the dev->lock depending on the depth in the device tree (Alan said that
> MAX_LOCK_DEPTH is sufficient for all practical cases).
>
> static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH];
>
> device_add() or thereabouts would have something like:
>
> #ifdef CONFIG_PROVE_LOCKING
> 	BUG_ON(dev->depth >= MAX_LOCK_DEPTH);
> 	lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]);
> #endif
>
>
> Then there was a problem were we could lock all child devices while
> holding the parent device lock (forgot why though), this would, on
> taking the second child dev->lock, again lead to recursive lock
> warnings. 
>
> We have an annotation for that: lock_nest_lock (currently only
> spin_lock_nest_lock exists, but mutex_lock_nest_lock is easily created),
> and this would allow you to do things like:
>
> 	mutex_lock(&parent->lock);
> 	for_each_device_child(child, parent) {
> 		mutex_lock_nest_lock(&child->lock, &parent->lock);
> 		...
> 	}
>
>
> I hope this helps in figuring out the sysfs case..

Interesting.  The sysfs attributes in general don't have this problem
because the common case is effectively a reader/writer lock where we
can have all of the readers we want.

In the sysfs case the classic problem is when an attribute grabs a
lock and that lock is also held while the attribute is being deleted.
I first found this with the rtnl_lock but there are other cases
as well.

The particularly tricky case to handle in lockdep is when that
other lock also the s_active lock from another part of the tree.  There
is an entire Alice style rabbit whole there, I am trying to figure out.

Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 18:37                           ` Alan Stern
@ 2010-02-05 10:18                             ` Peter Zijlstra
  2010-02-05 15:30                               ` Alan Stern
  2010-02-08 15:38                               ` Alan Stern
  0 siblings, 2 replies; 39+ messages in thread
From: Peter Zijlstra @ 2010-02-05 10:18 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, 2010-02-04 at 13:37 -0500, Alan Stern wrote:
> On Thu, 4 Feb 2010, Peter Zijlstra wrote:
> 
> > > This doesn't address the fact that we really have multiple device trees
> > > (for example, class devices are handled separately from normal
> > > devices).  With the above patch installed, I still get lockdep
> > > violations farther on during boot:
> > 
> > <snip lockdep splat>
> > 
> > Hmm, so you have multiple interacting trees? I had understood you only
> > had a single device tree.
> 
> The real situation is kind of complicated, and I'm not familiar with
> all the details.  But it's certainly true that a driver will want to
> work with (and lock!) multiple struct device's that don't have a
> parent-child relation in the tree.  The simplest example is regular
> devices together with class devices, and another might be PCI devices
> together with their "shadow" ACPI devices.
> 
> >  So how many trees are there, is that fixed?
> > Does the device know what tree it is going to end up in?
> 
> The driver generally knows, but AFAIK that information is not passed 
> back to the driver core.  At least, not directly -- you might say that 
> it could be deduced from the parent pointer, assuming the core already 
> knows all about the parent.
> 
> > If yes, then you can extend the setup_mutex_depth() function to pick a
> > different class stack for each tree.
> 
> Maybe this could be done. 

Right, so this device stuff is much more complicated than I was led to
believe ;-)

So the device core doesn't know, so how are you guys making sure there
really are no deadlocks hidden in there somewhere?

>  But for now perhaps a compromise is in
> order.  We could make the switch from semaphores to mutexes while
> avoiding lockdep issues by assigning the device mutexes to a
> "don't-verify" class.  Is there such a thing, or could it be added?

Something like the below might work, but it should go along with a
checkpatch.pl mod to ensure we don't grow any new users (just don't feel
like brushing up my perl fu enough to actually make sense of that
script)

---
 include/linux/lockdep.h |    2 ++
 kernel/lockdep.c        |    5 +++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 9ccf0e2..4e30ab4 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -40,6 +40,8 @@ struct lock_class_key {
 	struct lockdep_subclass_key	subkeys[MAX_LOCKDEP_SUBCLASSES];
 };
 
+extern struct lock_class_key __lockdep_no_validate__;
+
 #define LOCKSTAT_POINTS		4
 
 /*
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index c62ec14..af65a34 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -2716,6 +2716,8 @@ void lockdep_init_map(struct lockdep_map *lock, const char *name,
 }
 EXPORT_SYMBOL_GPL(lockdep_init_map);
 
+struct lock_class_key __lockdep_no_validate__;
+
 /*
  * This gets called for every mutex_lock*()/spin_lock*() operation.
  * We maintain the dependency maps and validate the locking attempt:
@@ -2750,6 +2752,9 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 		return 0;
 	}
 
+	if (lock->key == &__lockdep_no_validate__)
+		check = 1;
+
 	if (!subclass)
 		class = lock->class_cache;
 	/*



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-05 10:18                             ` Peter Zijlstra
@ 2010-02-05 15:30                               ` Alan Stern
  2010-02-05 15:41                                 ` Peter Zijlstra
  2010-02-08  3:06                                 ` Cong Wang
  2010-02-08 15:38                               ` Alan Stern
  1 sibling, 2 replies; 39+ messages in thread
From: Alan Stern @ 2010-02-05 15:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Fri, 5 Feb 2010, Peter Zijlstra wrote:

> Right, so this device stuff is much more complicated than I was led to
> believe ;-)

Haven't I told you all along that tree-structured locking is
complicated?  :-)

> So the device core doesn't know, so how are you guys making sure there
> really are no deadlocks hidden in there somewhere?

In the code I've seen, deadlocks are avoided by always taking the locks
in the same order.  But who knows?  Maybe there _are_ some hidden
deadlocks lurking.  For now we can't rely on lockdep to find them,
though, because it gets sidetracked by all the false positives.

> >  But for now perhaps a compromise is in
> > order.  We could make the switch from semaphores to mutexes while
> > avoiding lockdep issues by assigning the device mutexes to a
> > "don't-verify" class.  Is there such a thing, or could it be added?
> 
> Something like the below might work, but it should go along with a
> checkpatch.pl mod to ensure we don't grow any new users (just don't feel
> like brushing up my perl fu enough to actually make sense of that
> script)

I'll try it out in the next few days, and if it looks good maybe the 
checkpatch maintainers can lend some assistance before it gets 
submitted.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-05 15:30                               ` Alan Stern
@ 2010-02-05 15:41                                 ` Peter Zijlstra
  2010-02-07  9:22                                   ` Dave Young
  2010-02-08  3:06                                 ` Cong Wang
  1 sibling, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2010-02-05 15:41 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote:
> On Fri, 5 Feb 2010, Peter Zijlstra wrote:
> 
> > Right, so this device stuff is much more complicated than I was led to
> > believe ;-)
> 
> Haven't I told you all along that tree-structured locking is
> complicated?  :-)

Well, regular tree's aren't all that complicated, but multiple
inter-locking trees is a whole different story indeed.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-05 15:41                                 ` Peter Zijlstra
@ 2010-02-07  9:22                                   ` Dave Young
  2010-02-08  3:08                                     ` Cong Wang
  0 siblings, 1 reply; 39+ messages in thread
From: Dave Young @ 2010-02-07  9:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alan Stern, Eric W. Biederman, Greg KH, Thomas Gleixner,
	Cong Wang, Kernel development list, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
	Andrew Morton

On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote:
> On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote:
> > On Fri, 5 Feb 2010, Peter Zijlstra wrote:
> > 
> > > Right, so this device stuff is much more complicated than I was led to
> > > believe ;-)
> > 
> > Haven't I told you all along that tree-structured locking is
> > complicated?  :-)
> 
> Well, regular tree's aren't all that complicated, but multiple
> inter-locking trees is a whole different story indeed.
> 

I ever tried converting device semaphore to mutex, but failed with same issue.

At least now there's no lockdep solution for it, so I recommend revert
the mutex converting patch.

following lockdep warning with rc6-mm1:

[    0.397123] 
[    0.397124] =============================================
[    0.397359] [ INFO: possible recursive locking detected ]
[    0.397480] 2.6.33-rc6-mm1 #1
[    0.397596] ---------------------------------------------
[    0.397717] swapper/1 is trying to acquire lock:
[    0.397836]  (&dev->mutex){+.+...}, at: [<c12662e4>] __driver_attach+0x38/0x63
[    0.398162] 
[    0.398162] but task is already holding lock:
[    0.398393]  (&dev->mutex){+.+...}, at: [<c12662d8>] __driver_attach+0x2c/0x63
[    0.399999] 
[    0.399999] other info that might help us debug this:
[    0.399999] 1 lock held by swapper/1:
[    0.399999]  #0:  (&dev->mutex){+.+...}, at: [<c12662d8>] __driver_attach+0x2c/0x63
[    0.399999] 
[    0.399999] stack backtrace:
[    0.399999] Pid: 1, comm: swapper Not tainted 2.6.33-rc6-mm1 #1
[    0.399999] Call Trace:
[    0.399999]  [<c13d5851>] ? printk+0xf/0x11
[    0.399999]  [<c105460c>] __lock_acquire+0x880/0xc0d
[    0.399999]  [<c13d66e4>] ? __mutex_unlock_slowpath+0xf0/0xff
[    0.399999]  [<c1054a34>] lock_acquire+0x9b/0xb8
[    0.399999]  [<c12662e4>] ? __driver_attach+0x38/0x63
[    0.399999]  [<c13d685c>] __mutex_lock_common+0x35/0x2f2
[    0.399999]  [<c12662e4>] ? __driver_attach+0x38/0x63
[    0.399999]  [<c13d6bb7>] mutex_lock_nested+0x30/0x38
[    0.399999]  [<c12662e4>] ? __driver_attach+0x38/0x63
[    0.399999]  [<c12662e4>] __driver_attach+0x38/0x63
[    0.399999]  [<c1265a43>] bus_for_each_dev+0x3d/0x67
[    0.399999]  [<c1265f8f>] driver_attach+0x14/0x16
[    0.399999]  [<c12662ac>] ? __driver_attach+0x0/0x63
[    0.399999]  [<c126541c>] bus_add_driver+0xc4/0x20c
[    0.399999]  [<c1266553>] driver_register+0x8b/0xeb
[    0.399999]  [<c11d8e3a>] acpi_bus_register_driver+0x3a/0x3c
[    0.399999]  [<c165dc9a>] acpi_ec_init+0x2b/0x4a
[    0.399999]  [<c165daf8>] acpi_init+0x289/0x2c8
[    0.399999]  [<c165d86f>] ? acpi_init+0x0/0x2c8
[    0.399999]  [<c1001139>] do_one_initcall+0x4c/0x13f
[    0.399999]  [<c163d315>] kernel_init+0x123/0x174
[    0.399999]  [<c163d1f2>] ? kernel_init+0x0/0x174
[    0.399999]  [<c10035ba>] kernel_thread_helper+0x6/0x10
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-05 15:30                               ` Alan Stern
  2010-02-05 15:41                                 ` Peter Zijlstra
@ 2010-02-08  3:06                                 ` Cong Wang
  1 sibling, 0 replies; 39+ messages in thread
From: Cong Wang @ 2010-02-08  3:06 UTC (permalink / raw)
  To: Alan Stern
  Cc: Peter Zijlstra, Eric W. Biederman, Greg KH, Thomas Gleixner,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

Alan Stern wrote:
> On Fri, 5 Feb 2010, Peter Zijlstra wrote:
> 
>> Right, so this device stuff is much more complicated than I was led to
>> believe ;-)
> 
> Haven't I told you all along that tree-structured locking is
> complicated?  :-)
> 
>> So the device core doesn't know, so how are you guys making sure there
>> really are no deadlocks hidden in there somewhere?
> 
> In the code I've seen, deadlocks are avoided by always taking the locks
> in the same order.  But who knows?  Maybe there _are_ some hidden
> deadlocks lurking.  For now we can't rely on lockdep to find them,
> though, because it gets sidetracked by all the false positives.
> 

This is almost the same with the sysfs case...

Thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-07  9:22                                   ` Dave Young
@ 2010-02-08  3:08                                     ` Cong Wang
  2010-02-08  3:14                                       ` Dave Young
  0 siblings, 1 reply; 39+ messages in thread
From: Cong Wang @ 2010-02-08  3:08 UTC (permalink / raw)
  To: Dave Young
  Cc: Peter Zijlstra, Alan Stern, Eric W. Biederman, Greg KH,
	Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
	Andrew Morton

Dave Young wrote:
> On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote:
>> On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote:
>>> On Fri, 5 Feb 2010, Peter Zijlstra wrote:
>>>
>>>> Right, so this device stuff is much more complicated than I was led to
>>>> believe ;-)
>>> Haven't I told you all along that tree-structured locking is
>>> complicated?  :-)
>> Well, regular tree's aren't all that complicated, but multiple
>> inter-locking trees is a whole different story indeed.
>>
> 
> I ever tried converting device semaphore to mutex, but failed with same issue.
> 
> At least now there's no lockdep solution for it, so I recommend revert
> the mutex converting patch.
> 
> following lockdep warning with rc6-mm1:
> 
> [    0.397123] 
> [    0.397124] =============================================
> [    0.397359] [ INFO: possible recursive locking detected ]
> [    0.397480] 2.6.33-rc6-mm1 #1
> [    0.397596] ---------------------------------------------
> [    0.397717] swapper/1 is trying to acquire lock:
> [    0.397836]  (&dev->mutex){+.+...}, at: [<c12662e4>] __driver_attach+0x38/0x63
> [    0.398162] 
> [    0.398162] but task is already holding lock:
> [    0.398393]  (&dev->mutex){+.+...}, at: [<c12662d8>] __driver_attach+0x2c/0x63
> [    0.399999] 

Alan already provided a patch for this issue earlier in this thread.

Thanks.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-08  3:08                                     ` Cong Wang
@ 2010-02-08  3:14                                       ` Dave Young
  2010-02-08  3:30                                         ` Cong Wang
  0 siblings, 1 reply; 39+ messages in thread
From: Dave Young @ 2010-02-08  3:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: Peter Zijlstra, Alan Stern, Eric W. Biederman, Greg KH,
	Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
	Andrew Morton

On Mon, Feb 8, 2010 at 11:08 AM, Cong Wang <amwang@redhat.com> wrote:
> Dave Young wrote:
>>
>> On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote:
>>>
>>> On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote:
>>>>
>>>> On Fri, 5 Feb 2010, Peter Zijlstra wrote:
>>>>
>>>>> Right, so this device stuff is much more complicated than I was led to
>>>>> believe ;-)
>>>>
>>>> Haven't I told you all along that tree-structured locking is
>>>> complicated?  :-)
>>>
>>> Well, regular tree's aren't all that complicated, but multiple
>>> inter-locking trees is a whole different story indeed.
>>>
>>
>> I ever tried converting device semaphore to mutex, but failed with same
>> issue.
>>
>> At least now there's no lockdep solution for it, so I recommend revert
>> the mutex converting patch.
>>
>> following lockdep warning with rc6-mm1:
>>
>> [    0.397123] [    0.397124]
>> =============================================
>> [    0.397359] [ INFO: possible recursive locking detected ]
>> [    0.397480] 2.6.33-rc6-mm1 #1
>> [    0.397596] ---------------------------------------------
>> [    0.397717] swapper/1 is trying to acquire lock:
>> [    0.397836]  (&dev->mutex){+.+...}, at: [<c12662e4>]
>> __driver_attach+0x38/0x63
>> [    0.398162] [    0.398162] but task is already holding lock:
>> [    0.398393]  (&dev->mutex){+.+...}, at: [<c12662d8>]
>> __driver_attach+0x2c/0x63
>> [    0.399999]
>
> Alan already provided a patch for this issue earlier in this thread.

Yes, but device locks can not be classified with regular tree style.
Please read the whole thread.

>
> Thanks.
>
>



-- 
Regards
dave

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-08  3:14                                       ` Dave Young
@ 2010-02-08  3:30                                         ` Cong Wang
  0 siblings, 0 replies; 39+ messages in thread
From: Cong Wang @ 2010-02-08  3:30 UTC (permalink / raw)
  To: Dave Young
  Cc: Peter Zijlstra, Alan Stern, Eric W. Biederman, Greg KH,
	Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane,
	Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
	Andrew Morton

Dave Young wrote:
> On Mon, Feb 8, 2010 at 11:08 AM, Cong Wang <amwang@redhat.com> wrote:
>> Dave Young wrote:
>>> On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote:
>>>> On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote:
>>>>> On Fri, 5 Feb 2010, Peter Zijlstra wrote:
>>>>>
>>>>>> Right, so this device stuff is much more complicated than I was led to
>>>>>> believe ;-)
>>>>> Haven't I told you all along that tree-structured locking is
>>>>> complicated?  :-)
>>>> Well, regular tree's aren't all that complicated, but multiple
>>>> inter-locking trees is a whole different story indeed.
>>>>
>>> I ever tried converting device semaphore to mutex, but failed with same
>>> issue.
>>>
>>> At least now there's no lockdep solution for it, so I recommend revert
>>> the mutex converting patch.
>>>
>>> following lockdep warning with rc6-mm1:
>>>
>>> [    0.397123] [    0.397124]
>>> =============================================
>>> [    0.397359] [ INFO: possible recursive locking detected ]
>>> [    0.397480] 2.6.33-rc6-mm1 #1
>>> [    0.397596] ---------------------------------------------
>>> [    0.397717] swapper/1 is trying to acquire lock:
>>> [    0.397836]  (&dev->mutex){+.+...}, at: [<c12662e4>]
>>> __driver_attach+0x38/0x63
>>> [    0.398162] [    0.398162] but task is already holding lock:
>>> [    0.398393]  (&dev->mutex){+.+...}, at: [<c12662d8>]
>>> __driver_attach+0x2c/0x63
>>> [    0.399999]
>> Alan already provided a patch for this issue earlier in this thread.
> 
> Yes, but device locks can not be classified with regular tree style.

True, Alan mentioned the device trees could be more than one,
which is the difference with the sysfs, I think, where we only
have one tree.

> Please read the whole thread.

Surely I did.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-05 10:18                             ` Peter Zijlstra
  2010-02-05 15:30                               ` Alan Stern
@ 2010-02-08 15:38                               ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2010-02-08 15:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Fri, 5 Feb 2010, Peter Zijlstra wrote:

> >  But for now perhaps a compromise is in
> > order.  We could make the switch from semaphores to mutexes while
> > avoiding lockdep issues by assigning the device mutexes to a
> > "don't-verify" class.  Is there such a thing, or could it be added?
> 
> Something like the below might work, but it should go along with a
> checkpatch.pl mod to ensure we don't grow any new users (just don't feel
> like brushing up my perl fu enough to actually make sense of that
> script)

I tried this out and it does work; no more lockdep warnings during 
bootup.  The complete patch is below.

But before we apply this, it might be worthwhile to spend a little time 
figuring out how many independent device trees there really are and 
whether they can be handled with separate depth-associated lockdep 
classes.  It's not at all obvious that they can.

Alan Stern


Index: usb-2.6/include/linux/lockdep.h
===================================================================
--- usb-2.6.orig/include/linux/lockdep.h
+++ usb-2.6/include/linux/lockdep.h
@@ -40,6 +40,8 @@ struct lock_class_key {
 	struct lockdep_subclass_key	subkeys[MAX_LOCKDEP_SUBCLASSES];
 };
 
+extern struct lock_class_key __lockdep_no_validate__;
+
 #define LOCKSTAT_POINTS		4
 
 /*
Index: usb-2.6/kernel/lockdep.c
===================================================================
--- usb-2.6.orig/kernel/lockdep.c
+++ usb-2.6/kernel/lockdep.c
@@ -2716,6 +2716,8 @@ void lockdep_init_map(struct lockdep_map
 }
 EXPORT_SYMBOL_GPL(lockdep_init_map);
 
+struct lock_class_key __lockdep_no_validate__;
+
 /*
  * This gets called for every mutex_lock*()/spin_lock*() operation.
  * We maintain the dependency maps and validate the locking attempt:
@@ -2750,6 +2752,9 @@ static int __lock_acquire(struct lockdep
 		return 0;
 	}
 
+	if (lock->key == &__lockdep_no_validate__)
+		check = 1;
+
 	if (!subclass)
 		class = lock->class_cache;
 	/*
Index: usb-2.6/drivers/base/core.c
===================================================================
--- usb-2.6.orig/drivers/base/core.c
+++ usb-2.6/drivers/base/core.c
@@ -559,6 +559,7 @@ void device_initialize(struct device *de
 	kobject_init(&dev->kobj, &device_ktype);
 	INIT_LIST_HEAD(&dev->dma_pools);
 	mutex_init(&dev->mutex);
+	lockdep_set_class(&dev->mutex, &__lockdep_no_validate__);
 	spin_lock_init(&dev->devres_lock);
 	INIT_LIST_HEAD(&dev->devres_head);
 	device_init_wakeup(dev, 0);


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-04 16:59                           ` Thomas Gleixner
@ 2010-02-26 19:36                             ` Alan Stern
  2010-02-26 20:54                               ` Thomas Gleixner
  0 siblings, 1 reply; 39+ messages in thread
From: Alan Stern @ 2010-02-26 19:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Greg KH, Peter Zijlstra, Eric W. Biederman, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Thu, 4 Feb 2010, Thomas Gleixner wrote:

> On Thu, 4 Feb 2010, Greg KH wrote:
> > > However in view of the other difficulties, it still doesn't seem
> > > possible to make device mutexes work with lockdep.  I suggest removing 
> > > Thomas's patch.
> > 
> > Unless Thomas or anyone else thinks of something that can solve these
> > problems, I will do so.
> 
> Hmm, I have not seen those lockdep splats on -rt where we have the
> mutex conversion for quite some time already. Need to look at it.

Thomas, did you ever figure out what was going on?  It would be nice to 
know why the -rt kernels don't have any lockdep problems associated 
with the device mutexes.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Patch 0/2] sysfs: fix s_active lockdep warning
  2010-02-26 19:36                             ` Alan Stern
@ 2010-02-26 20:54                               ` Thomas Gleixner
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Gleixner @ 2010-02-26 20:54 UTC (permalink / raw)
  To: Alan Stern
  Cc: Greg KH, Peter Zijlstra, Eric W. Biederman, Cong Wang,
	Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens,
	Benjamin Herrenschmidt, Larry Finger, Andrew Morton

On Fri, 26 Feb 2010, Alan Stern wrote:
> On Thu, 4 Feb 2010, Thomas Gleixner wrote:
> 
> > On Thu, 4 Feb 2010, Greg KH wrote:
> > > > However in view of the other difficulties, it still doesn't seem
> > > > possible to make device mutexes work with lockdep.  I suggest removing 
> > > > Thomas's patch.
> > > 
> > > Unless Thomas or anyone else thinks of something that can solve these
> > > problems, I will do so.
> > 
> > Hmm, I have not seen those lockdep splats on -rt where we have the
> > mutex conversion for quite some time already. Need to look at it.
> 
> Thomas, did you ever figure out what was going on?  It would be nice to 
> know why the -rt kernels don't have any lockdep problems associated 
> with the device mutexes.

Yes, it just did not happen on the systems which I tested with lockdep
enabled, but I found a box where it triggered.

Sorry should have tested better. :(

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2010-02-26 20:55 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-29  7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang
2010-01-29  7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang
2010-01-29  7:02 ` [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active Amerigo Wang
2010-01-29  7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman
2010-01-29  8:38   ` Cong Wang
2010-01-29 13:44     ` Eric W. Biederman
2010-01-29 14:22       ` Greg KH
2010-01-29 17:57         ` Peter Zijlstra
2010-01-29 18:10           ` Greg KH
2010-01-29 18:14             ` Peter Zijlstra
2010-01-29 18:21               ` Greg KH
2010-01-29 20:10                 ` Peter Zijlstra
2010-01-29 20:30                   ` Eric W. Biederman
2010-02-04 11:38                     ` Peter Zijlstra
2010-02-04 16:35                       ` Alan Stern
2010-02-04 16:41                         ` Peter Zijlstra
2010-02-04 18:37                           ` Alan Stern
2010-02-05 10:18                             ` Peter Zijlstra
2010-02-05 15:30                               ` Alan Stern
2010-02-05 15:41                                 ` Peter Zijlstra
2010-02-07  9:22                                   ` Dave Young
2010-02-08  3:08                                     ` Cong Wang
2010-02-08  3:14                                       ` Dave Young
2010-02-08  3:30                                         ` Cong Wang
2010-02-08  3:06                                 ` Cong Wang
2010-02-08 15:38                               ` Alan Stern
2010-02-04 16:46                         ` Peter Zijlstra
2010-02-04 18:40                           ` Alan Stern
2010-02-05  3:09                             ` Cong Wang
2010-02-05  4:06                               ` Alan Stern
2010-02-04 16:46                         ` Greg KH
2010-02-04 16:59                           ` Thomas Gleixner
2010-02-26 19:36                             ` Alan Stern
2010-02-26 20:54                               ` Thomas Gleixner
2010-02-05  3:43                       ` Cong Wang
2010-02-05  8:55                       ` Eric W. Biederman
2010-01-29 20:25         ` Eric W. Biederman
2010-01-30  5:30           ` Greg KH
2010-01-29 18:02   ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox