* [Patch 0/2] sysfs: fix s_active lockdep warning
@ 2010-01-29 7:01 Amerigo Wang
2010-01-29 7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang
` (2 more replies)
0 siblings, 3 replies; 39+ messages in thread
From: Amerigo Wang @ 2010-01-29 7:01 UTC (permalink / raw)
To: linux-kernel
Cc: Tejun Heo, Greg Kroah-Hartman, Eric W. Biederman, Miles Lane,
Heiko Carstens, Benjamin Herrenschmidt, Larry Finger,
Amerigo Wang, akpm
Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
As reported by several people, it is something like:
[ 6967.926563] ACPI: Preparing to enter system sleep state S3
[ 6967.956156] Disabling non-boot CPUs ...
[ 6967.970401]
[ 6967.970408] =============================================
[ 6967.970419] [ INFO: possible recursive locking detected ]
[ 6967.970431] 2.6.33-rc2-git6 #27
[ 6967.970439] ---------------------------------------------
[ 6967.970450] pm-suspend/22147 is trying to acquire lock:
[ 6967.970460] (s_active){++++.+}, at: [<c10d2941>]
sysfs_hash_and_remove+0x3d/0x4f
[ 6967.970493]
[ 6967.970497] but task is already holding lock:
[ 6967.970506] (s_active){++++.+}, at: [<c10d4110>]
sysfs_get_active_two+0x16/0x36
[...]
Eric already provides a patch for this[1], but it still can't fix the
problem. I add the missing part of Eric's patch and send these two patches
together, hopefully we can fix the warning completely.
1. http://lkml.org/lkml/2010/1/10/282
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Miles Lane <miles.lane@gmail.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
^ permalink raw reply [flat|nested] 39+ messages in thread* [Patch 1/2] sysfs: add support for lockdep subclasses to s_active 2010-01-29 7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang @ 2010-01-29 7:02 ` Amerigo Wang 2010-01-29 7:02 ` [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active Amerigo Wang 2010-01-29 7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman 2 siblings, 0 replies; 39+ messages in thread From: Amerigo Wang @ 2010-01-29 7:02 UTC (permalink / raw) To: linux-kernel Cc: Tejun Heo, Greg Kroah-Hartman, Eric W. Biederman, Benjamin Herrenschmidt, Heiko Carstens, Miles Lane, Larry Finger, Amerigo Wang, akpm From: Eric W. Biederman <ebiederm@xmission.com> Date: Fri, 29 Jan 2010 14:03:39 +0800 Subject: [PATCH 1/2] sysfs: add support for lockdep subclasses to s_active We have apparently valid cases where the code for a sysfs attribute removes other sysfs attributes. Without support for subclasses lockdep flags a possible recursive lock problem as it figures the first sysfs attribute could be attempting to remove itself. By adding support for sysfs subclasses we can teach lockdep to distinguish between different types of sysfs attributes and not get confused. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> --- fs/sysfs/dir.c | 14 ++++++++++++-- include/linux/sysfs.h | 7 +++++++ kernel/power/power.h | 15 ++++++++------- 3 files changed, 27 insertions(+), 9 deletions(-) diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c index 699f371..e0cd4a0 100644 --- a/fs/sysfs/dir.c +++ b/fs/sysfs/dir.c @@ -95,9 +95,14 @@ static void sysfs_unlink_sibling(struct sysfs_dirent *sd) */ static struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd) { + int subclass; if (unlikely(!sd)) return NULL; + subclass = SYSFS_ATTR_NORMAL; + if (sysfs_type(sd) == SYSFS_KOBJ_ATTR) + subclass = sd->s_attr.attr->subclass; + while (1) { int v, t; @@ -107,7 +112,7 @@ static struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd) t = atomic_cmpxchg(&sd->s_active, v, v + 1); if (likely(t == v)) { - rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_); + rwsem_acquire_read(&sd->dep_map, subclass, 1, _RET_IP_); return sd; } if (t < 0) @@ -192,12 +197,17 @@ void sysfs_put_active_two(struct sysfs_dirent *sd) static void sysfs_deactivate(struct sysfs_dirent *sd) { DECLARE_COMPLETION_ONSTACK(wait); + int subclass; int v; BUG_ON(sd->s_sibling || !(sd->s_flags & SYSFS_FLAG_REMOVED)); sd->s_sibling = (void *)&wait; - rwsem_acquire(&sd->dep_map, 0, 0, _RET_IP_); + subclass = SYSFS_ATTR_NORMAL; + if (sysfs_type(sd) == SYSFS_KOBJ_ATTR) + subclass = sd->s_attr.attr->subclass; + + rwsem_acquire(&sd->dep_map, subclass, 0, _RET_IP_); /* atomic_add_return() is a mb(), put_active() will always see * the updated sd->s_sibling. */ diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h index cfa8308..2f50fec 100644 --- a/include/linux/sysfs.h +++ b/include/linux/sysfs.h @@ -20,6 +20,12 @@ struct kobject; struct module; +enum sysfs_attr_lock_class +{ + SYSFS_ATTR_NORMAL, + SYSFS_ATTR_PM_CONTROL, +}; + /* FIXME * The *owner field is no longer used. * x86 tree has been cleaned up. The owner @@ -29,6 +35,7 @@ struct attribute { const char *name; struct module *owner; mode_t mode; + enum sysfs_attr_lock_class subclass; }; struct attribute_group { diff --git a/kernel/power/power.h b/kernel/power/power.h index 46c5a26..0459f27 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -54,13 +54,14 @@ extern int hibernation_platform_enter(void); extern int pfn_is_nosave(unsigned long); #define power_attr(_name) \ -static struct kobj_attribute _name##_attr = { \ - .attr = { \ - .name = __stringify(_name), \ - .mode = 0644, \ - }, \ - .show = _name##_show, \ - .store = _name##_store, \ +static struct kobj_attribute _name##_attr = { \ + .attr = { \ + .name = __stringify(_name), \ + .mode = 0644, \ + .subclass = SYSFS_ATTR_PM_CONTROL, \ + }, \ + .show = _name##_show, \ + .store = _name##_store, \ } /* Preferred image size in bytes (default 500 MB) */ -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 39+ messages in thread
* [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active 2010-01-29 7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang 2010-01-29 7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang @ 2010-01-29 7:02 ` Amerigo Wang 2010-01-29 7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman 2 siblings, 0 replies; 39+ messages in thread From: Amerigo Wang @ 2010-01-29 7:02 UTC (permalink / raw) To: linux-kernel Cc: Tejun Heo, Greg Kroah-Hartman, Eric W. Biederman, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Amerigo Wang, akpm From: Amerigo Wang <amwang@redhat.com> Date: Fri, 29 Jan 2010 14:18:58 +0800 Subject: [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active Patch 1/2 missed the initialization part for subclass, which I think is the reason why it still can't stop the lockdep warning. This depends on patch 1/2. Compile test only. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> --- fs/sysfs/dir.c | 1 - fs/sysfs/file.c | 2 ++ fs/sysfs/sysfs.h | 6 +++--- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c index e0cd4a0..6b63198 100644 --- a/fs/sysfs/dir.c +++ b/fs/sysfs/dir.c @@ -364,7 +364,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type) atomic_set(&sd->s_count, 1); atomic_set(&sd->s_active, 0); - sysfs_dirent_init_lockdep(sd); sd->s_name = name; sd->s_mode = mode; diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c index dc30d9e..abf30d7 100644 --- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -509,6 +509,8 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd, if (!sd) return -ENOMEM; sd->s_attr.attr = (void *)attr; + if ((type & SYSFS_TYPE_MASK) == SYSFS_KOBJ_ATTR) + sysfs_dirent_init_lockdep(sd, sd->s_attr.attr->subclass); sysfs_addrm_start(&acxt, dir_sd); rc = sysfs_add_one(&acxt, sd); diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h index cdd9377..7e78f2e 100644 --- a/fs/sysfs/sysfs.h +++ b/fs/sysfs/sysfs.h @@ -89,14 +89,14 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd) } #ifdef CONFIG_DEBUG_LOCK_ALLOC -#define sysfs_dirent_init_lockdep(sd) \ +#define sysfs_dirent_init_lockdep(sd, c) \ do { \ static struct lock_class_key __key; \ \ - lockdep_init_map(&sd->dep_map, "s_active", &__key, 0); \ + lockdep_init_map(&sd->dep_map, "s_active", &__key, c); \ } while(0) #else -#define sysfs_dirent_init_lockdep(sd) do {} while(0) +#define sysfs_dirent_init_lockdep(sd, c) do {} while(0) #endif /* -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang 2010-01-29 7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang 2010-01-29 7:02 ` [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active Amerigo Wang @ 2010-01-29 7:21 ` Eric W. Biederman 2010-01-29 8:38 ` Cong Wang 2010-01-29 18:02 ` Peter Zijlstra 2 siblings, 2 replies; 39+ messages in thread From: Eric W. Biederman @ 2010-01-29 7:21 UTC (permalink / raw) To: Amerigo Wang Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm Amerigo Wang <amwang@redhat.com> writes: > Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug. > As reported by several people, it is something like: > > [ 6967.926563] ACPI: Preparing to enter system sleep state S3 > [ 6967.956156] Disabling non-boot CPUs ... > [ 6967.970401] > [ 6967.970408] ============================================= > [ 6967.970419] [ INFO: possible recursive locking detected ] > [ 6967.970431] 2.6.33-rc2-git6 #27 > [ 6967.970439] --------------------------------------------- > [ 6967.970450] pm-suspend/22147 is trying to acquire lock: > [ 6967.970460] (s_active){++++.+}, at: [<c10d2941>] > sysfs_hash_and_remove+0x3d/0x4f > [ 6967.970493] > [ 6967.970497] but task is already holding lock: > [ 6967.970506] (s_active){++++.+}, at: [<c10d4110>] > sysfs_get_active_two+0x16/0x36 > [...] > > Eric already provides a patch for this[1], but it still can't fix the > problem. I add the missing part of Eric's patch and send these two patches > together, hopefully we can fix the warning completely. > > 1. http://lkml.org/lkml/2010/1/10/282 > > > Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Reported-by: Larry Finger <Larry.Finger@lwfinger.net> > Reported-by: Miles Lane <miles.lane@gmail.com> > Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> > Signed-off-by: WANG Cong <amwang@redhat.com> > Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> > Cc: Tejun Heo <tj@kernel.org> > Cc: Greg Kroah-Hartman <gregkh@suse.de> Thanks for following up on this. I suspect we may want to create a separate class for each sysfs file instead of playing whack-a-mole and creating a subclass each time we have problems. I don't see why the rules for one sysfs file should be the same as for any other sysfs file. Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman @ 2010-01-29 8:38 ` Cong Wang 2010-01-29 13:44 ` Eric W. Biederman 2010-01-29 18:02 ` Peter Zijlstra 1 sibling, 1 reply; 39+ messages in thread From: Cong Wang @ 2010-01-29 8:38 UTC (permalink / raw) To: Eric W. Biederman Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm Eric W. Biederman wrote: > Amerigo Wang <amwang@redhat.com> writes: > >> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug. >> As reported by several people, it is something like: >> >> [ 6967.926563] ACPI: Preparing to enter system sleep state S3 >> [ 6967.956156] Disabling non-boot CPUs ... >> [ 6967.970401] >> [ 6967.970408] ============================================= >> [ 6967.970419] [ INFO: possible recursive locking detected ] >> [ 6967.970431] 2.6.33-rc2-git6 #27 >> [ 6967.970439] --------------------------------------------- >> [ 6967.970450] pm-suspend/22147 is trying to acquire lock: >> [ 6967.970460] (s_active){++++.+}, at: [<c10d2941>] >> sysfs_hash_and_remove+0x3d/0x4f >> [ 6967.970493] >> [ 6967.970497] but task is already holding lock: >> [ 6967.970506] (s_active){++++.+}, at: [<c10d4110>] >> sysfs_get_active_two+0x16/0x36 >> [...] >> >> Eric already provides a patch for this[1], but it still can't fix the >> problem. I add the missing part of Eric's patch and send these two patches >> together, hopefully we can fix the warning completely. >> >> 1. http://lkml.org/lkml/2010/1/10/282 >> >> >> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> >> Reported-by: Larry Finger <Larry.Finger@lwfinger.net> >> Reported-by: Miles Lane <miles.lane@gmail.com> >> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> >> Signed-off-by: WANG Cong <amwang@redhat.com> >> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> >> Cc: Tejun Heo <tj@kernel.org> >> Cc: Greg Kroah-Hartman <gregkh@suse.de> > > Thanks for following up on this. > > I suspect we may want to create a separate class for each sysfs file > instead of playing whack-a-mole and creating a subclass each time we > have problems. > > I don't see why the rules for one sysfs file should be the same as for > any other sysfs file. > I am confused, we don't know who created sysfs files unless we separate them by subclasses, the way of your patch is very straight ward. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 8:38 ` Cong Wang @ 2010-01-29 13:44 ` Eric W. Biederman 2010-01-29 14:22 ` Greg KH 0 siblings, 1 reply; 39+ messages in thread From: Eric W. Biederman @ 2010-01-29 13:44 UTC (permalink / raw) To: Cong Wang Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm Cong Wang <amwang@redhat.com> writes: > Eric W. Biederman wrote: >> Amerigo Wang <amwang@redhat.com> writes: >> >>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug. >>> As reported by several people, it is something like: >>> >>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3 >>> [ 6967.956156] Disabling non-boot CPUs ... >>> [ 6967.970401] >>> [ 6967.970408] ============================================= >>> [ 6967.970419] [ INFO: possible recursive locking detected ] >>> [ 6967.970431] 2.6.33-rc2-git6 #27 >>> [ 6967.970439] --------------------------------------------- >>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock: >>> [ 6967.970460] (s_active){++++.+}, at: [<c10d2941>] >>> sysfs_hash_and_remove+0x3d/0x4f >>> [ 6967.970493] >>> [ 6967.970497] but task is already holding lock: >>> [ 6967.970506] (s_active){++++.+}, at: [<c10d4110>] >>> sysfs_get_active_two+0x16/0x36 >>> [...] >>> >>> Eric already provides a patch for this[1], but it still can't fix the >>> problem. I add the missing part of Eric's patch and send these two patches >>> together, hopefully we can fix the warning completely. >>> >>> 1. http://lkml.org/lkml/2010/1/10/282 >>> >>> >>> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> >>> Reported-by: Larry Finger <Larry.Finger@lwfinger.net> >>> Reported-by: Miles Lane <miles.lane@gmail.com> >>> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> >>> Signed-off-by: WANG Cong <amwang@redhat.com> >>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> >>> Cc: Tejun Heo <tj@kernel.org> >>> Cc: Greg Kroah-Hartman <gregkh@suse.de> >> >> Thanks for following up on this. >> >> I suspect we may want to create a separate class for each sysfs file >> instead of playing whack-a-mole and creating a subclass each time we >> have problems. >> >> I don't see why the rules for one sysfs file should be the same as for >> any other sysfs file. >> > > I am confused, we don't know who created sysfs files unless we > separate them by subclasses, the way of your patch is very straight > ward. The assumption is that all entities in a class are used very similarly. What I was suggesting is that it may make sense, and be simpler to have a separate __key value and thus place each sysfs file in it's class. Doing that is a lot more for lockdep to track, but it would not produce the confusing false positives that we see now. From the reports I have seen we may have more that 16 subclasses in sysfs, and it will likely take us a while to find them all. Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 13:44 ` Eric W. Biederman @ 2010-01-29 14:22 ` Greg KH 2010-01-29 17:57 ` Peter Zijlstra 2010-01-29 20:25 ` Eric W. Biederman 0 siblings, 2 replies; 39+ messages in thread From: Greg KH @ 2010-01-29 14:22 UTC (permalink / raw) To: Eric W. Biederman Cc: Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Fri, Jan 29, 2010 at 05:44:07AM -0800, Eric W. Biederman wrote: > Cong Wang <amwang@redhat.com> writes: > > > Eric W. Biederman wrote: > >> Amerigo Wang <amwang@redhat.com> writes: > >> > >>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug. > >>> As reported by several people, it is something like: > >>> > >>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3 > >>> [ 6967.956156] Disabling non-boot CPUs ... > >>> [ 6967.970401] > >>> [ 6967.970408] ============================================= > >>> [ 6967.970419] [ INFO: possible recursive locking detected ] > >>> [ 6967.970431] 2.6.33-rc2-git6 #27 > >>> [ 6967.970439] --------------------------------------------- > >>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock: > >>> [ 6967.970460] (s_active){++++.+}, at: [<c10d2941>] > >>> sysfs_hash_and_remove+0x3d/0x4f > >>> [ 6967.970493] > >>> [ 6967.970497] but task is already holding lock: > >>> [ 6967.970506] (s_active){++++.+}, at: [<c10d4110>] > >>> sysfs_get_active_two+0x16/0x36 > >>> [...] > >>> > >>> Eric already provides a patch for this[1], but it still can't fix the > >>> problem. I add the missing part of Eric's patch and send these two patches > >>> together, hopefully we can fix the warning completely. > >>> > >>> 1. http://lkml.org/lkml/2010/1/10/282 > >>> > >>> > >>> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > >>> Reported-by: Larry Finger <Larry.Finger@lwfinger.net> > >>> Reported-by: Miles Lane <miles.lane@gmail.com> > >>> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> > >>> Signed-off-by: WANG Cong <amwang@redhat.com> > >>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> > >>> Cc: Tejun Heo <tj@kernel.org> > >>> Cc: Greg Kroah-Hartman <gregkh@suse.de> > >> > >> Thanks for following up on this. > >> > >> I suspect we may want to create a separate class for each sysfs file > >> instead of playing whack-a-mole and creating a subclass each time we > >> have problems. > >> > >> I don't see why the rules for one sysfs file should be the same as for > >> any other sysfs file. > >> > > > > I am confused, we don't know who created sysfs files unless we > > separate them by subclasses, the way of your patch is very straight > > ward. > > The assumption is that all entities in a class are used very similarly. > What I was suggesting is that it may make sense, and be simpler to have > a separate __key value and thus place each sysfs file in it's class. > > Doing that is a lot more for lockdep to track, but it would not produce > the confusing false positives that we see now. From the reports I have > seen we may have more that 16 subclasses in sysfs, and it will likely > take us a while to find them all. Heh, this whole mess is the very reason we didn't add lockdep support to the driver core. Nested devices that all look alike from the driver core, are really different objects and the locking lifetimes are separate, but lockdep can't see that. I suggest we just remove the original patch, as it seems to be causing way too many problems. Any objections to that? thanks, greg k-h ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 14:22 ` Greg KH @ 2010-01-29 17:57 ` Peter Zijlstra 2010-01-29 18:10 ` Greg KH 2010-01-29 20:25 ` Eric W. Biederman 1 sibling, 1 reply; 39+ messages in thread From: Peter Zijlstra @ 2010-01-29 17:57 UTC (permalink / raw) To: Greg KH Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote: > > Heh, this whole mess is the very reason we didn't add lockdep support to > the driver core. Nested devices that all look alike from the driver > core, are really different objects and the locking lifetimes are > separate, but lockdep can't see that. And here I through Alan Stern had a handle on making the driver core play nice. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 17:57 ` Peter Zijlstra @ 2010-01-29 18:10 ` Greg KH 2010-01-29 18:14 ` Peter Zijlstra 0 siblings, 1 reply; 39+ messages in thread From: Greg KH @ 2010-01-29 18:10 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote: > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote: > > > > Heh, this whole mess is the very reason we didn't add lockdep support to > > the driver core. Nested devices that all look alike from the driver > > core, are really different objects and the locking lifetimes are > > separate, but lockdep can't see that. > > And here I through Alan Stern had a handle on making the driver core > play nice. It's not the driver core that is the issue here, it's that lockdep can't handle the tree structure of devices that is represented in the kernel. I don't think it is a driver core problem, but rather, a lockdep issue. thanks, greg k-h ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 18:10 ` Greg KH @ 2010-01-29 18:14 ` Peter Zijlstra 2010-01-29 18:21 ` Greg KH 0 siblings, 1 reply; 39+ messages in thread From: Peter Zijlstra @ 2010-01-29 18:14 UTC (permalink / raw) To: Greg KH Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote: > On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote: > > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote: > > > > > > Heh, this whole mess is the very reason we didn't add lockdep support to > > > the driver core. Nested devices that all look alike from the driver > > > core, are really different objects and the locking lifetimes are > > > separate, but lockdep can't see that. > > > > And here I through Alan Stern had a handle on making the driver core > > play nice. > > It's not the driver core that is the issue here, it's that lockdep can't > handle the tree structure of devices that is represented in the kernel. > > I don't think it is a driver core problem, but rather, a lockdep issue. Right, we've been over that and I think I added enough lockdep annotations to make it work for the device tree. At least, Alan and I seemed to agree on that last time we talked about it. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 18:14 ` Peter Zijlstra @ 2010-01-29 18:21 ` Greg KH 2010-01-29 20:10 ` Peter Zijlstra 0 siblings, 1 reply; 39+ messages in thread From: Greg KH @ 2010-01-29 18:21 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Fri, Jan 29, 2010 at 07:14:20PM +0100, Peter Zijlstra wrote: > On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote: > > On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote: > > > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote: > > > > > > > > Heh, this whole mess is the very reason we didn't add lockdep support to > > > > the driver core. Nested devices that all look alike from the driver > > > > core, are really different objects and the locking lifetimes are > > > > separate, but lockdep can't see that. > > > > > > And here I through Alan Stern had a handle on making the driver core > > > play nice. > > > > It's not the driver core that is the issue here, it's that lockdep can't > > handle the tree structure of devices that is represented in the kernel. > > > > I don't think it is a driver core problem, but rather, a lockdep issue. > > Right, we've been over that and I think I added enough lockdep > annotations to make it work for the device tree. At least, Alan and I > seemed to agree on that last time we talked about it. Ah, I didn't realize that, very nice. If so, then this sysfs lock stuff should be able to use those annotations and we shouldn't have this issue, right? thanks, greg k-h ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 18:21 ` Greg KH @ 2010-01-29 20:10 ` Peter Zijlstra 2010-01-29 20:30 ` Eric W. Biederman 0 siblings, 1 reply; 39+ messages in thread From: Peter Zijlstra @ 2010-01-29 20:10 UTC (permalink / raw) To: Greg KH Cc: Eric W. Biederman, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Fri, 2010-01-29 at 10:21 -0800, Greg KH wrote: > On Fri, Jan 29, 2010 at 07:14:20PM +0100, Peter Zijlstra wrote: > > On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote: > > > On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote: > > > > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote: > > > > > > > > > > Heh, this whole mess is the very reason we didn't add lockdep support to > > > > > the driver core. Nested devices that all look alike from the driver > > > > > core, are really different objects and the locking lifetimes are > > > > > separate, but lockdep can't see that. > > > > > > > > And here I through Alan Stern had a handle on making the driver core > > > > play nice. > > > > > > It's not the driver core that is the issue here, it's that lockdep can't > > > handle the tree structure of devices that is represented in the kernel. > > > > > > I don't think it is a driver core problem, but rather, a lockdep issue. > > > > Right, we've been over that and I think I added enough lockdep > > annotations to make it work for the device tree. At least, Alan and I > > seemed to agree on that last time we talked about it. > > Ah, I didn't realize that, very nice. > > If so, then this sysfs lock stuff should be able to use those > annotations and we shouldn't have this issue, right? I really wouldn't know, I've not yet looked at sysfs to see what the particular issue is. But possibly, if you say the problem space is similar. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 20:10 ` Peter Zijlstra @ 2010-01-29 20:30 ` Eric W. Biederman 2010-02-04 11:38 ` Peter Zijlstra 0 siblings, 1 reply; 39+ messages in thread From: Eric W. Biederman @ 2010-01-29 20:30 UTC (permalink / raw) To: Peter Zijlstra Cc: Greg KH, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm Peter Zijlstra <peterz@infradead.org> writes: > On Fri, 2010-01-29 at 10:21 -0800, Greg KH wrote: >> On Fri, Jan 29, 2010 at 07:14:20PM +0100, Peter Zijlstra wrote: >> > On Fri, 2010-01-29 at 10:10 -0800, Greg KH wrote: >> > > On Fri, Jan 29, 2010 at 06:57:28PM +0100, Peter Zijlstra wrote: >> > > > On Fri, 2010-01-29 at 06:22 -0800, Greg KH wrote: >> > > > > >> > > > > Heh, this whole mess is the very reason we didn't add lockdep support to >> > > > > the driver core. Nested devices that all look alike from the driver >> > > > > core, are really different objects and the locking lifetimes are >> > > > > separate, but lockdep can't see that. >> > > > >> > > > And here I through Alan Stern had a handle on making the driver core >> > > > play nice. >> > > >> > > It's not the driver core that is the issue here, it's that lockdep can't >> > > handle the tree structure of devices that is represented in the kernel. >> > > >> > > I don't think it is a driver core problem, but rather, a lockdep issue. >> > >> > Right, we've been over that and I think I added enough lockdep >> > annotations to make it work for the device tree. At least, Alan and I >> > seemed to agree on that last time we talked about it. >> >> Ah, I didn't realize that, very nice. >> >> If so, then this sysfs lock stuff should be able to use those >> annotations and we shouldn't have this issue, right? > > I really wouldn't know, I've not yet looked at sysfs to see what the > particular issue is. But possibly, if you say the problem space is > similar. We get false positives when the code of a sysfs attribute synchronously removes other sysfs attributes. In general that is not safe due to hotplug etc, but there are specific instances of static sysfs entries like the pm_core where it appears to be safe. I am not familiar with the device core lockdep issues. Are they similar? Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 20:30 ` Eric W. Biederman @ 2010-02-04 11:38 ` Peter Zijlstra 2010-02-04 16:35 ` Alan Stern ` (2 more replies) 0 siblings, 3 replies; 39+ messages in thread From: Peter Zijlstra @ 2010-02-04 11:38 UTC (permalink / raw) To: Eric W. Biederman Cc: Greg KH, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm, Alan Stern On Fri, 2010-01-29 at 12:30 -0800, Eric W. Biederman wrote: > We get false positives when the code of a sysfs attribute > synchronously removes other sysfs attributes. In general that is not > safe due to hotplug etc, but there are specific instances of static > sysfs entries like the pm_core where it appears to be safe. > > I am not familiar with the device core lockdep issues. Are they similar? The device tree had the problem that we could basically hold a device lock and an unspecified number of parent locks (iirc this was due to device probing, where we hold the bus lock while probing/adding child device, recursively). If we place each dev->lock into the same class (which would naively happen), then this would lead to recursive lock warnings. The proposed solution for this is to create MAX_LOCK_DEPTH classes and assign them to the dev->lock depending on the depth in the device tree (Alan said that MAX_LOCK_DEPTH is sufficient for all practical cases). static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; device_add() or thereabouts would have something like: #ifdef CONFIG_PROVE_LOCKING BUG_ON(dev->depth >= MAX_LOCK_DEPTH); lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]); #endif Then there was a problem were we could lock all child devices while holding the parent device lock (forgot why though), this would, on taking the second child dev->lock, again lead to recursive lock warnings. We have an annotation for that: lock_nest_lock (currently only spin_lock_nest_lock exists, but mutex_lock_nest_lock is easily created), and this would allow you to do things like: mutex_lock(&parent->lock); for_each_device_child(child, parent) { mutex_lock_nest_lock(&child->lock, &parent->lock); ... } I hope this helps in figuring out the sysfs case.. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 11:38 ` Peter Zijlstra @ 2010-02-04 16:35 ` Alan Stern 2010-02-04 16:41 ` Peter Zijlstra ` (2 more replies) 2010-02-05 3:43 ` Cong Wang 2010-02-05 8:55 ` Eric W. Biederman 2 siblings, 3 replies; 39+ messages in thread From: Alan Stern @ 2010-02-04 16:35 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton Greg: You have accepted Thomas's patch "drivers/base: Convert dev->sem to mutex". It generates lockdep violations galore during device probing and removal! Luckily lockdep is smart enough only to print the first occurrence. Here's what I get early on during bootup: [ 0.149911] ACPI: EC: Look up EC in DSDT [ 0.170665] ACPI: Executed 1 blocks of module-level executable AML code [ 0.198111] ACPI: Interpreter enabled [ 0.198267] ACPI: (supports S0 S3 S4 S5) [ 0.198802] ACPI: Using IOAPIC for interrupt routing [ 0.266493] [ 0.266496] ============================================= [ 0.266775] [ INFO: possible recursive locking detected ] [ 0.266917] 2.6.33-rc6 #1 [ 0.267051] --------------------------------------------- [ 0.267192] swapper/1 is trying to acquire lock: [ 0.267332] (&dev->mutex){+.+...}, at: [<c11496be>] __driver_attach+0x38/0x63 [ 0.267683] [ 0.267685] but task is already holding lock: [ 0.267953] (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63 [ 0.268000] [ 0.268000] other info that might help us debug this: [ 0.268000] 1 lock held by swapper/1: [ 0.268000] #0: (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63 [ 0.268000] [ 0.268000] stack backtrace: [ 0.268000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #1 [ 0.268000] Call Trace: [ 0.268000] [<c11c819e>] ? printk+0xf/0x11 [ 0.268000] [<c1041c9b>] __lock_acquire+0x804/0xb47 [ 0.268000] [<c10b2026>] ? sysfs_addrm_finish+0x19/0xe2 [ 0.268000] [<c1042020>] lock_acquire+0x42/0x59 [ 0.268000] [<c11496be>] ? __driver_attach+0x38/0x63 [ 0.268000] [<c11c90c6>] __mutex_lock_common+0x39/0x38f [ 0.268000] [<c11496be>] ? __driver_attach+0x38/0x63 [ 0.268000] [<c11c94ab>] mutex_lock_nested+0x2b/0x33 [ 0.268000] [<c11496be>] ? __driver_attach+0x38/0x63 [ 0.268000] [<c11496be>] __driver_attach+0x38/0x63 [ 0.268000] [<c1148e0a>] bus_for_each_dev+0x3d/0x67 [ 0.268000] [<c11494cf>] driver_attach+0x14/0x16 [ 0.268000] [<c1149686>] ? __driver_attach+0x0/0x63 [ 0.268000] [<c11491c1>] bus_add_driver+0x92/0x1c5 [ 0.268000] [<c114990f>] driver_register+0x79/0xe0 [ 0.268000] [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c [ 0.268000] [<c131999f>] acpi_power_init+0x3f/0x5e [ 0.268000] [<c1319422>] acpi_init+0x28e/0x2c8 [ 0.268000] [<c1319194>] ? acpi_init+0x0/0x2c8 [ 0.268000] [<c1001139>] do_one_initcall+0x4c/0x136 [ 0.268000] [<c130130b>] kernel_init+0x11c/0x16d [ 0.268000] [<c13011ef>] ? kernel_init+0x0/0x16d [ 0.268000] [<c1002cba>] kernel_thread_helper+0x6/0x10 [ 0.268485] ACPI: Power Resource [GFAN] (on) On Thu, 4 Feb 2010, Peter Zijlstra wrote: > The device tree had the problem that we could basically hold a device > lock and an unspecified number of parent locks (iirc this was due to > device probing, where we hold the bus lock while probing/adding child > device, recursively). > > If we place each dev->lock into the same class (which would naively > happen), then this would lead to recursive lock warnings. The proposed > solution for this is to create MAX_LOCK_DEPTH classes and assign them to > the dev->lock depending on the depth in the device tree (Alan said that > MAX_LOCK_DEPTH is sufficient for all practical cases). > > static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; > > device_add() or thereabouts would have something like: > > #ifdef CONFIG_PROVE_LOCKING > BUG_ON(dev->depth >= MAX_LOCK_DEPTH); > lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]); > #endif Unfortunately this doesn't really work. Here is a patch implementing the scheme: Index: usb-2.6/drivers/base/core.c =================================================================== --- usb-2.6.orig/drivers/base/core.c +++ usb-2.6/drivers/base/core.c @@ -22,6 +22,7 @@ #include <linux/kallsyms.h> #include <linux/mutex.h> #include <linux/async.h> +#include <linux/sched.h> #include "base.h" #include "power/power.h" @@ -671,6 +672,26 @@ static void setup_parent(struct device * dev->kobj.parent = kobj; } +#ifdef CONFIG_PROVE_LOCKING +static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; + +static void setup_mutex_depth(struct device *dev, struct device *parent) +{ + int depth = 0; + + /* Dynamically determine the device's depth in the device tree */ + while (parent) { + ++depth; + parent = parent->parent; + } + BUG_ON(depth > MAX_LOCK_DEPTH); + lockdep_set_class(&dev->mutex, &dev_tree_classes[depth]); +} +#else +static inline void setup_mutex_depth(struct device *dev, + struct device *parent) {} +#endif + static int device_add_class_symlinks(struct device *dev) { int error; @@ -912,6 +933,7 @@ int device_add(struct device *dev) parent = get_device(dev->parent); setup_parent(dev, parent); + setup_mutex_depth(dev, parent); /* use parent numa_node */ if (parent) This doesn't address the fact that we really have multiple device trees (for example, class devices are handled separately from normal devices). With the above patch installed, I still get lockdep violations farther on during boot: [ 0.272332] pci_bus 0000:00: on NUMA node 0 [ 0.272355] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] [ 0.273503] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT] [ 0.279205] [ 0.279208] ============================================= [ 0.279485] [ INFO: possible recursive locking detected ] [ 0.279628] 2.6.33-rc6 #2 [ 0.279763] --------------------------------------------- [ 0.279905] swapper/1 is trying to acquire lock: [ 0.280000] (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c1149776>] device_attach+0x14/0x6e [ 0.280000] [ 0.280000] but task is already holding lock: [ 0.280000] (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63 [ 0.280000] [ 0.280000] other info that might help us debug this: [ 0.280000] 2 locks held by swapper/1: [ 0.280000] #0: (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63 [ 0.280000] #1: (&dev_tree_classes[depth]#3){+.+.+.}, at: [<c11496ea>] __driver_attach+0x38/0x63 [ 0.280000] [ 0.280000] stack backtrace: [ 0.280000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #2 [ 0.280000] Call Trace: [ 0.280000] [<c11c81ce>] ? printk+0xf/0x11 [ 0.280000] [<c1041c9b>] __lock_acquire+0x804/0xb47 [ 0.280000] [<c101a73d>] ? spin_unlock_irqrestore+0x8/0xa [ 0.280000] [<c101a891>] ? __wake_up+0x32/0x3b [ 0.280000] [<c1042020>] lock_acquire+0x42/0x59 [ 0.280000] [<c1149776>] ? device_attach+0x14/0x6e [ 0.280000] [<c11c90f6>] __mutex_lock_common+0x39/0x38f [ 0.280000] [<c1149776>] ? device_attach+0x14/0x6e [ 0.280000] [<c1040e2e>] ? trace_hardirqs_on+0xb/0xd [ 0.280000] [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a [ 0.280000] [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a [ 0.280000] [<c11c94db>] mutex_lock_nested+0x2b/0x33 [ 0.280000] [<c1149776>] ? device_attach+0x14/0x6e [ 0.280000] [<c1149776>] device_attach+0x14/0x6e [ 0.280000] [<c1148aa1>] bus_probe_device+0x1b/0x30 [ 0.280000] [<c1147b6c>] device_add+0x310/0x458 [ 0.280000] [<c10f96ac>] pci_bus_add_device+0xf/0x30 [ 0.280000] [<c10f96f0>] pci_bus_add_devices+0x23/0xdd [ 0.280000] [<c11c011b>] ? acpi_pci_root_add+0x1cf/0x1ff [ 0.280000] [<c11088a3>] acpi_pci_root_start+0x11/0x15 [ 0.280000] [<c1106370>] acpi_start_single_object+0x1e/0x3f [ 0.280000] [<c11064a9>] acpi_device_probe+0x78/0xf4 [ 0.280000] [<c1149632>] driver_probe_device+0x87/0x107 [ 0.280000] [<c11496f9>] __driver_attach+0x47/0x63 [ 0.280000] [<c1148e36>] bus_for_each_dev+0x3d/0x67 [ 0.280000] [<c11494fb>] driver_attach+0x14/0x16 [ 0.280000] [<c11496b2>] ? __driver_attach+0x0/0x63 [ 0.280000] [<c11491ed>] bus_add_driver+0x92/0x1c5 [ 0.280000] [<c1319798>] ? acpi_pci_root_init+0x0/0x25 [ 0.280000] [<c114993b>] driver_register+0x79/0xe0 [ 0.280000] [<c1319798>] ? acpi_pci_root_init+0x0/0x25 [ 0.280000] [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c [ 0.280000] [<c13197ae>] acpi_pci_root_init+0x16/0x25 [ 0.280000] [<c1001139>] do_one_initcall+0x4c/0x136 [ 0.280000] [<c130130b>] kernel_init+0x11c/0x16d [ 0.280000] [<c13011ef>] ? kernel_init+0x0/0x16d [ 0.280000] [<c1002cba>] kernel_thread_helper+0x6/0x10 [ 0.328206] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *10 11 12 14 15) [ 0.329223] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 10 11 12 14 15) > Then there was a problem were we could lock all child devices while > holding the parent device lock (forgot why though), this would, on > taking the second child dev->lock, again lead to recursive lock > warnings. AFAIK, the code that used to do this is no longer present. There may be other places where it is still done, but I'm not aware of any. However in view of the other difficulties, it still doesn't seem possible to make device mutexes work with lockdep. I suggest removing Thomas's patch. Alan Stern ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 16:35 ` Alan Stern @ 2010-02-04 16:41 ` Peter Zijlstra 2010-02-04 18:37 ` Alan Stern 2010-02-04 16:46 ` Peter Zijlstra 2010-02-04 16:46 ` Greg KH 2 siblings, 1 reply; 39+ messages in thread From: Peter Zijlstra @ 2010-02-04 16:41 UTC (permalink / raw) To: Alan Stern Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote: > Greg: > > You have accepted Thomas's patch "drivers/base: Convert dev->sem to > mutex". It generates lockdep violations galore during device probing > and removal! Luckily lockdep is smart enough only to print the first > occurrence. Here's what I get early on during bootup: <snip lockdep splat> > On Thu, 4 Feb 2010, Peter Zijlstra wrote: > > > The device tree had the problem that we could basically hold a device > > lock and an unspecified number of parent locks (iirc this was due to > > device probing, where we hold the bus lock while probing/adding child > > device, recursively). > > > > If we place each dev->lock into the same class (which would naively > > happen), then this would lead to recursive lock warnings. The proposed > > solution for this is to create MAX_LOCK_DEPTH classes and assign them to > > the dev->lock depending on the depth in the device tree (Alan said that > > MAX_LOCK_DEPTH is sufficient for all practical cases). > > > > static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; > > > > device_add() or thereabouts would have something like: > > > > #ifdef CONFIG_PROVE_LOCKING > > BUG_ON(dev->depth >= MAX_LOCK_DEPTH); > > lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]); > > #endif > > Unfortunately this doesn't really work. Here is a patch implementing > the scheme: > > Index: usb-2.6/drivers/base/core.c > =================================================================== > --- usb-2.6.orig/drivers/base/core.c > +++ usb-2.6/drivers/base/core.c > @@ -22,6 +22,7 @@ > #include <linux/kallsyms.h> > #include <linux/mutex.h> > #include <linux/async.h> > +#include <linux/sched.h> > > #include "base.h" > #include "power/power.h" > @@ -671,6 +672,26 @@ static void setup_parent(struct device * > dev->kobj.parent = kobj; > } > > +#ifdef CONFIG_PROVE_LOCKING > +static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; > + > +static void setup_mutex_depth(struct device *dev, struct device *parent) > +{ > + int depth = 0; > + > + /* Dynamically determine the device's depth in the device tree */ > + while (parent) { > + ++depth; > + parent = parent->parent; > + } > + BUG_ON(depth > MAX_LOCK_DEPTH); > + lockdep_set_class(&dev->mutex, &dev_tree_classes[depth]); > +} > +#else > +static inline void setup_mutex_depth(struct device *dev, > + struct device *parent) {} > +#endif > + > static int device_add_class_symlinks(struct device *dev) > { > int error; > @@ -912,6 +933,7 @@ int device_add(struct device *dev) > > parent = get_device(dev->parent); > setup_parent(dev, parent); > + setup_mutex_depth(dev, parent); > > /* use parent numa_node */ > if (parent) > > > This doesn't address the fact that we really have multiple device trees > (for example, class devices are handled separately from normal > devices). With the above patch installed, I still get lockdep > violations farther on during boot: <snip lockdep splat> Hmm, so you have multiple interacting trees? I had understood you only had a single device tree. So how many trees are there, is that fixed? Does the device know what tree it is going to end up in? If yes, then you can extend the setup_mutex_depth() function to pick a different class stack for each tree. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 16:41 ` Peter Zijlstra @ 2010-02-04 18:37 ` Alan Stern 2010-02-05 10:18 ` Peter Zijlstra 0 siblings, 1 reply; 39+ messages in thread From: Alan Stern @ 2010-02-04 18:37 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, 4 Feb 2010, Peter Zijlstra wrote: > > This doesn't address the fact that we really have multiple device trees > > (for example, class devices are handled separately from normal > > devices). With the above patch installed, I still get lockdep > > violations farther on during boot: > > <snip lockdep splat> > > Hmm, so you have multiple interacting trees? I had understood you only > had a single device tree. The real situation is kind of complicated, and I'm not familiar with all the details. But it's certainly true that a driver will want to work with (and lock!) multiple struct device's that don't have a parent-child relation in the tree. The simplest example is regular devices together with class devices, and another might be PCI devices together with their "shadow" ACPI devices. > So how many trees are there, is that fixed? > Does the device know what tree it is going to end up in? The driver generally knows, but AFAIK that information is not passed back to the driver core. At least, not directly -- you might say that it could be deduced from the parent pointer, assuming the core already knows all about the parent. > If yes, then you can extend the setup_mutex_depth() function to pick a > different class stack for each tree. Maybe this could be done. But for now perhaps a compromise is in order. We could make the switch from semaphores to mutexes while avoiding lockdep issues by assigning the device mutexes to a "don't-verify" class. Is there such a thing, or could it be added? Alan Stern ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 18:37 ` Alan Stern @ 2010-02-05 10:18 ` Peter Zijlstra 2010-02-05 15:30 ` Alan Stern 2010-02-08 15:38 ` Alan Stern 0 siblings, 2 replies; 39+ messages in thread From: Peter Zijlstra @ 2010-02-05 10:18 UTC (permalink / raw) To: Alan Stern Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, 2010-02-04 at 13:37 -0500, Alan Stern wrote: > On Thu, 4 Feb 2010, Peter Zijlstra wrote: > > > > This doesn't address the fact that we really have multiple device trees > > > (for example, class devices are handled separately from normal > > > devices). With the above patch installed, I still get lockdep > > > violations farther on during boot: > > > > <snip lockdep splat> > > > > Hmm, so you have multiple interacting trees? I had understood you only > > had a single device tree. > > The real situation is kind of complicated, and I'm not familiar with > all the details. But it's certainly true that a driver will want to > work with (and lock!) multiple struct device's that don't have a > parent-child relation in the tree. The simplest example is regular > devices together with class devices, and another might be PCI devices > together with their "shadow" ACPI devices. > > > So how many trees are there, is that fixed? > > Does the device know what tree it is going to end up in? > > The driver generally knows, but AFAIK that information is not passed > back to the driver core. At least, not directly -- you might say that > it could be deduced from the parent pointer, assuming the core already > knows all about the parent. > > > If yes, then you can extend the setup_mutex_depth() function to pick a > > different class stack for each tree. > > Maybe this could be done. Right, so this device stuff is much more complicated than I was led to believe ;-) So the device core doesn't know, so how are you guys making sure there really are no deadlocks hidden in there somewhere? > But for now perhaps a compromise is in > order. We could make the switch from semaphores to mutexes while > avoiding lockdep issues by assigning the device mutexes to a > "don't-verify" class. Is there such a thing, or could it be added? Something like the below might work, but it should go along with a checkpatch.pl mod to ensure we don't grow any new users (just don't feel like brushing up my perl fu enough to actually make sense of that script) --- include/linux/lockdep.h | 2 ++ kernel/lockdep.c | 5 +++++ 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index 9ccf0e2..4e30ab4 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -40,6 +40,8 @@ struct lock_class_key { struct lockdep_subclass_key subkeys[MAX_LOCKDEP_SUBCLASSES]; }; +extern struct lock_class_key __lockdep_no_validate__; + #define LOCKSTAT_POINTS 4 /* diff --git a/kernel/lockdep.c b/kernel/lockdep.c index c62ec14..af65a34 100644 --- a/kernel/lockdep.c +++ b/kernel/lockdep.c @@ -2716,6 +2716,8 @@ void lockdep_init_map(struct lockdep_map *lock, const char *name, } EXPORT_SYMBOL_GPL(lockdep_init_map); +struct lock_class_key __lockdep_no_validate__; + /* * This gets called for every mutex_lock*()/spin_lock*() operation. * We maintain the dependency maps and validate the locking attempt: @@ -2750,6 +2752,9 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass, return 0; } + if (lock->key == &__lockdep_no_validate__) + check = 1; + if (!subclass) class = lock->class_cache; /* ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-05 10:18 ` Peter Zijlstra @ 2010-02-05 15:30 ` Alan Stern 2010-02-05 15:41 ` Peter Zijlstra 2010-02-08 3:06 ` Cong Wang 2010-02-08 15:38 ` Alan Stern 1 sibling, 2 replies; 39+ messages in thread From: Alan Stern @ 2010-02-05 15:30 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Fri, 5 Feb 2010, Peter Zijlstra wrote: > Right, so this device stuff is much more complicated than I was led to > believe ;-) Haven't I told you all along that tree-structured locking is complicated? :-) > So the device core doesn't know, so how are you guys making sure there > really are no deadlocks hidden in there somewhere? In the code I've seen, deadlocks are avoided by always taking the locks in the same order. But who knows? Maybe there _are_ some hidden deadlocks lurking. For now we can't rely on lockdep to find them, though, because it gets sidetracked by all the false positives. > > But for now perhaps a compromise is in > > order. We could make the switch from semaphores to mutexes while > > avoiding lockdep issues by assigning the device mutexes to a > > "don't-verify" class. Is there such a thing, or could it be added? > > Something like the below might work, but it should go along with a > checkpatch.pl mod to ensure we don't grow any new users (just don't feel > like brushing up my perl fu enough to actually make sense of that > script) I'll try it out in the next few days, and if it looks good maybe the checkpatch maintainers can lend some assistance before it gets submitted. Alan Stern ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-05 15:30 ` Alan Stern @ 2010-02-05 15:41 ` Peter Zijlstra 2010-02-07 9:22 ` Dave Young 2010-02-08 3:06 ` Cong Wang 1 sibling, 1 reply; 39+ messages in thread From: Peter Zijlstra @ 2010-02-05 15:41 UTC (permalink / raw) To: Alan Stern Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote: > On Fri, 5 Feb 2010, Peter Zijlstra wrote: > > > Right, so this device stuff is much more complicated than I was led to > > believe ;-) > > Haven't I told you all along that tree-structured locking is > complicated? :-) Well, regular tree's aren't all that complicated, but multiple inter-locking trees is a whole different story indeed. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-05 15:41 ` Peter Zijlstra @ 2010-02-07 9:22 ` Dave Young 2010-02-08 3:08 ` Cong Wang 0 siblings, 1 reply; 39+ messages in thread From: Dave Young @ 2010-02-07 9:22 UTC (permalink / raw) To: Peter Zijlstra Cc: Alan Stern, Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote: > On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote: > > On Fri, 5 Feb 2010, Peter Zijlstra wrote: > > > > > Right, so this device stuff is much more complicated than I was led to > > > believe ;-) > > > > Haven't I told you all along that tree-structured locking is > > complicated? :-) > > Well, regular tree's aren't all that complicated, but multiple > inter-locking trees is a whole different story indeed. > I ever tried converting device semaphore to mutex, but failed with same issue. At least now there's no lockdep solution for it, so I recommend revert the mutex converting patch. following lockdep warning with rc6-mm1: [ 0.397123] [ 0.397124] ============================================= [ 0.397359] [ INFO: possible recursive locking detected ] [ 0.397480] 2.6.33-rc6-mm1 #1 [ 0.397596] --------------------------------------------- [ 0.397717] swapper/1 is trying to acquire lock: [ 0.397836] (&dev->mutex){+.+...}, at: [<c12662e4>] __driver_attach+0x38/0x63 [ 0.398162] [ 0.398162] but task is already holding lock: [ 0.398393] (&dev->mutex){+.+...}, at: [<c12662d8>] __driver_attach+0x2c/0x63 [ 0.399999] [ 0.399999] other info that might help us debug this: [ 0.399999] 1 lock held by swapper/1: [ 0.399999] #0: (&dev->mutex){+.+...}, at: [<c12662d8>] __driver_attach+0x2c/0x63 [ 0.399999] [ 0.399999] stack backtrace: [ 0.399999] Pid: 1, comm: swapper Not tainted 2.6.33-rc6-mm1 #1 [ 0.399999] Call Trace: [ 0.399999] [<c13d5851>] ? printk+0xf/0x11 [ 0.399999] [<c105460c>] __lock_acquire+0x880/0xc0d [ 0.399999] [<c13d66e4>] ? __mutex_unlock_slowpath+0xf0/0xff [ 0.399999] [<c1054a34>] lock_acquire+0x9b/0xb8 [ 0.399999] [<c12662e4>] ? __driver_attach+0x38/0x63 [ 0.399999] [<c13d685c>] __mutex_lock_common+0x35/0x2f2 [ 0.399999] [<c12662e4>] ? __driver_attach+0x38/0x63 [ 0.399999] [<c13d6bb7>] mutex_lock_nested+0x30/0x38 [ 0.399999] [<c12662e4>] ? __driver_attach+0x38/0x63 [ 0.399999] [<c12662e4>] __driver_attach+0x38/0x63 [ 0.399999] [<c1265a43>] bus_for_each_dev+0x3d/0x67 [ 0.399999] [<c1265f8f>] driver_attach+0x14/0x16 [ 0.399999] [<c12662ac>] ? __driver_attach+0x0/0x63 [ 0.399999] [<c126541c>] bus_add_driver+0xc4/0x20c [ 0.399999] [<c1266553>] driver_register+0x8b/0xeb [ 0.399999] [<c11d8e3a>] acpi_bus_register_driver+0x3a/0x3c [ 0.399999] [<c165dc9a>] acpi_ec_init+0x2b/0x4a [ 0.399999] [<c165daf8>] acpi_init+0x289/0x2c8 [ 0.399999] [<c165d86f>] ? acpi_init+0x0/0x2c8 [ 0.399999] [<c1001139>] do_one_initcall+0x4c/0x13f [ 0.399999] [<c163d315>] kernel_init+0x123/0x174 [ 0.399999] [<c163d1f2>] ? kernel_init+0x0/0x174 [ 0.399999] [<c10035ba>] kernel_thread_helper+0x6/0x10 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-07 9:22 ` Dave Young @ 2010-02-08 3:08 ` Cong Wang 2010-02-08 3:14 ` Dave Young 0 siblings, 1 reply; 39+ messages in thread From: Cong Wang @ 2010-02-08 3:08 UTC (permalink / raw) To: Dave Young Cc: Peter Zijlstra, Alan Stern, Eric W. Biederman, Greg KH, Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton Dave Young wrote: > On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote: >> On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote: >>> On Fri, 5 Feb 2010, Peter Zijlstra wrote: >>> >>>> Right, so this device stuff is much more complicated than I was led to >>>> believe ;-) >>> Haven't I told you all along that tree-structured locking is >>> complicated? :-) >> Well, regular tree's aren't all that complicated, but multiple >> inter-locking trees is a whole different story indeed. >> > > I ever tried converting device semaphore to mutex, but failed with same issue. > > At least now there's no lockdep solution for it, so I recommend revert > the mutex converting patch. > > following lockdep warning with rc6-mm1: > > [ 0.397123] > [ 0.397124] ============================================= > [ 0.397359] [ INFO: possible recursive locking detected ] > [ 0.397480] 2.6.33-rc6-mm1 #1 > [ 0.397596] --------------------------------------------- > [ 0.397717] swapper/1 is trying to acquire lock: > [ 0.397836] (&dev->mutex){+.+...}, at: [<c12662e4>] __driver_attach+0x38/0x63 > [ 0.398162] > [ 0.398162] but task is already holding lock: > [ 0.398393] (&dev->mutex){+.+...}, at: [<c12662d8>] __driver_attach+0x2c/0x63 > [ 0.399999] Alan already provided a patch for this issue earlier in this thread. Thanks. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-08 3:08 ` Cong Wang @ 2010-02-08 3:14 ` Dave Young 2010-02-08 3:30 ` Cong Wang 0 siblings, 1 reply; 39+ messages in thread From: Dave Young @ 2010-02-08 3:14 UTC (permalink / raw) To: Cong Wang Cc: Peter Zijlstra, Alan Stern, Eric W. Biederman, Greg KH, Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Mon, Feb 8, 2010 at 11:08 AM, Cong Wang <amwang@redhat.com> wrote: > Dave Young wrote: >> >> On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote: >>> >>> On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote: >>>> >>>> On Fri, 5 Feb 2010, Peter Zijlstra wrote: >>>> >>>>> Right, so this device stuff is much more complicated than I was led to >>>>> believe ;-) >>>> >>>> Haven't I told you all along that tree-structured locking is >>>> complicated? :-) >>> >>> Well, regular tree's aren't all that complicated, but multiple >>> inter-locking trees is a whole different story indeed. >>> >> >> I ever tried converting device semaphore to mutex, but failed with same >> issue. >> >> At least now there's no lockdep solution for it, so I recommend revert >> the mutex converting patch. >> >> following lockdep warning with rc6-mm1: >> >> [ 0.397123] [ 0.397124] >> ============================================= >> [ 0.397359] [ INFO: possible recursive locking detected ] >> [ 0.397480] 2.6.33-rc6-mm1 #1 >> [ 0.397596] --------------------------------------------- >> [ 0.397717] swapper/1 is trying to acquire lock: >> [ 0.397836] (&dev->mutex){+.+...}, at: [<c12662e4>] >> __driver_attach+0x38/0x63 >> [ 0.398162] [ 0.398162] but task is already holding lock: >> [ 0.398393] (&dev->mutex){+.+...}, at: [<c12662d8>] >> __driver_attach+0x2c/0x63 >> [ 0.399999] > > Alan already provided a patch for this issue earlier in this thread. Yes, but device locks can not be classified with regular tree style. Please read the whole thread. > > Thanks. > > -- Regards dave ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-08 3:14 ` Dave Young @ 2010-02-08 3:30 ` Cong Wang 0 siblings, 0 replies; 39+ messages in thread From: Cong Wang @ 2010-02-08 3:30 UTC (permalink / raw) To: Dave Young Cc: Peter Zijlstra, Alan Stern, Eric W. Biederman, Greg KH, Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton Dave Young wrote: > On Mon, Feb 8, 2010 at 11:08 AM, Cong Wang <amwang@redhat.com> wrote: >> Dave Young wrote: >>> On Fri, Feb 05, 2010 at 04:41:57PM +0100, Peter Zijlstra wrote: >>>> On Fri, 2010-02-05 at 10:30 -0500, Alan Stern wrote: >>>>> On Fri, 5 Feb 2010, Peter Zijlstra wrote: >>>>> >>>>>> Right, so this device stuff is much more complicated than I was led to >>>>>> believe ;-) >>>>> Haven't I told you all along that tree-structured locking is >>>>> complicated? :-) >>>> Well, regular tree's aren't all that complicated, but multiple >>>> inter-locking trees is a whole different story indeed. >>>> >>> I ever tried converting device semaphore to mutex, but failed with same >>> issue. >>> >>> At least now there's no lockdep solution for it, so I recommend revert >>> the mutex converting patch. >>> >>> following lockdep warning with rc6-mm1: >>> >>> [ 0.397123] [ 0.397124] >>> ============================================= >>> [ 0.397359] [ INFO: possible recursive locking detected ] >>> [ 0.397480] 2.6.33-rc6-mm1 #1 >>> [ 0.397596] --------------------------------------------- >>> [ 0.397717] swapper/1 is trying to acquire lock: >>> [ 0.397836] (&dev->mutex){+.+...}, at: [<c12662e4>] >>> __driver_attach+0x38/0x63 >>> [ 0.398162] [ 0.398162] but task is already holding lock: >>> [ 0.398393] (&dev->mutex){+.+...}, at: [<c12662d8>] >>> __driver_attach+0x2c/0x63 >>> [ 0.399999] >> Alan already provided a patch for this issue earlier in this thread. > > Yes, but device locks can not be classified with regular tree style. True, Alan mentioned the device trees could be more than one, which is the difference with the sysfs, I think, where we only have one tree. > Please read the whole thread. Surely I did. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-05 15:30 ` Alan Stern 2010-02-05 15:41 ` Peter Zijlstra @ 2010-02-08 3:06 ` Cong Wang 1 sibling, 0 replies; 39+ messages in thread From: Cong Wang @ 2010-02-08 3:06 UTC (permalink / raw) To: Alan Stern Cc: Peter Zijlstra, Eric W. Biederman, Greg KH, Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton Alan Stern wrote: > On Fri, 5 Feb 2010, Peter Zijlstra wrote: > >> Right, so this device stuff is much more complicated than I was led to >> believe ;-) > > Haven't I told you all along that tree-structured locking is > complicated? :-) > >> So the device core doesn't know, so how are you guys making sure there >> really are no deadlocks hidden in there somewhere? > > In the code I've seen, deadlocks are avoided by always taking the locks > in the same order. But who knows? Maybe there _are_ some hidden > deadlocks lurking. For now we can't rely on lockdep to find them, > though, because it gets sidetracked by all the false positives. > This is almost the same with the sysfs case... Thanks. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-05 10:18 ` Peter Zijlstra 2010-02-05 15:30 ` Alan Stern @ 2010-02-08 15:38 ` Alan Stern 1 sibling, 0 replies; 39+ messages in thread From: Alan Stern @ 2010-02-08 15:38 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Fri, 5 Feb 2010, Peter Zijlstra wrote: > > But for now perhaps a compromise is in > > order. We could make the switch from semaphores to mutexes while > > avoiding lockdep issues by assigning the device mutexes to a > > "don't-verify" class. Is there such a thing, or could it be added? > > Something like the below might work, but it should go along with a > checkpatch.pl mod to ensure we don't grow any new users (just don't feel > like brushing up my perl fu enough to actually make sense of that > script) I tried this out and it does work; no more lockdep warnings during bootup. The complete patch is below. But before we apply this, it might be worthwhile to spend a little time figuring out how many independent device trees there really are and whether they can be handled with separate depth-associated lockdep classes. It's not at all obvious that they can. Alan Stern Index: usb-2.6/include/linux/lockdep.h =================================================================== --- usb-2.6.orig/include/linux/lockdep.h +++ usb-2.6/include/linux/lockdep.h @@ -40,6 +40,8 @@ struct lock_class_key { struct lockdep_subclass_key subkeys[MAX_LOCKDEP_SUBCLASSES]; }; +extern struct lock_class_key __lockdep_no_validate__; + #define LOCKSTAT_POINTS 4 /* Index: usb-2.6/kernel/lockdep.c =================================================================== --- usb-2.6.orig/kernel/lockdep.c +++ usb-2.6/kernel/lockdep.c @@ -2716,6 +2716,8 @@ void lockdep_init_map(struct lockdep_map } EXPORT_SYMBOL_GPL(lockdep_init_map); +struct lock_class_key __lockdep_no_validate__; + /* * This gets called for every mutex_lock*()/spin_lock*() operation. * We maintain the dependency maps and validate the locking attempt: @@ -2750,6 +2752,9 @@ static int __lock_acquire(struct lockdep return 0; } + if (lock->key == &__lockdep_no_validate__) + check = 1; + if (!subclass) class = lock->class_cache; /* Index: usb-2.6/drivers/base/core.c =================================================================== --- usb-2.6.orig/drivers/base/core.c +++ usb-2.6/drivers/base/core.c @@ -559,6 +559,7 @@ void device_initialize(struct device *de kobject_init(&dev->kobj, &device_ktype); INIT_LIST_HEAD(&dev->dma_pools); mutex_init(&dev->mutex); + lockdep_set_class(&dev->mutex, &__lockdep_no_validate__); spin_lock_init(&dev->devres_lock); INIT_LIST_HEAD(&dev->devres_head); device_init_wakeup(dev, 0); ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 16:35 ` Alan Stern 2010-02-04 16:41 ` Peter Zijlstra @ 2010-02-04 16:46 ` Peter Zijlstra 2010-02-04 18:40 ` Alan Stern 2010-02-04 16:46 ` Greg KH 2 siblings, 1 reply; 39+ messages in thread From: Peter Zijlstra @ 2010-02-04 16:46 UTC (permalink / raw) To: Alan Stern Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote: > +#include <linux/sched.h> #include <linux/lockdep.h> should be sufficient I think. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 16:46 ` Peter Zijlstra @ 2010-02-04 18:40 ` Alan Stern 2010-02-05 3:09 ` Cong Wang 0 siblings, 1 reply; 39+ messages in thread From: Alan Stern @ 2010-02-04 18:40 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Greg KH, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, 4 Feb 2010, Peter Zijlstra wrote: > On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote: > > +#include <linux/sched.h> > > #include <linux/lockdep.h> > > should be sufficient I think. No, it's not. It leaves MAX_LOCK_DEPTH undeclared. Beats me why that symbol ended up in sched.h... Alan Stern ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 18:40 ` Alan Stern @ 2010-02-05 3:09 ` Cong Wang 2010-02-05 4:06 ` Alan Stern 0 siblings, 1 reply; 39+ messages in thread From: Cong Wang @ 2010-02-05 3:09 UTC (permalink / raw) To: Alan Stern Cc: Peter Zijlstra, Eric W. Biederman, Greg KH, Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton Alan Stern wrote: > On Thu, 4 Feb 2010, Peter Zijlstra wrote: > >> On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote: >>> +#include <linux/sched.h> >> #include <linux/lockdep.h> >> >> should be sufficient I think. > > No, it's not. It leaves MAX_LOCK_DEPTH undeclared. Beats me why that > symbol ended up in sched.h... > because task_struct->held_locks uses it, but it seems better to move it into lockdep.h... ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-05 3:09 ` Cong Wang @ 2010-02-05 4:06 ` Alan Stern 0 siblings, 0 replies; 39+ messages in thread From: Alan Stern @ 2010-02-05 4:06 UTC (permalink / raw) To: Cong Wang Cc: Peter Zijlstra, Eric W. Biederman, Greg KH, Thomas Gleixner, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Fri, 5 Feb 2010, Cong Wang wrote: > Alan Stern wrote: > > On Thu, 4 Feb 2010, Peter Zijlstra wrote: > > > >> On Thu, 2010-02-04 at 11:35 -0500, Alan Stern wrote: > >>> +#include <linux/sched.h> > >> #include <linux/lockdep.h> > >> > >> should be sufficient I think. > > > > No, it's not. It leaves MAX_LOCK_DEPTH undeclared. Beats me why that > > symbol ended up in sched.h... > > > > because task_struct->held_locks uses it, but it seems better to move > it into lockdep.h... Be my guest! :-) Alan Stern ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 16:35 ` Alan Stern 2010-02-04 16:41 ` Peter Zijlstra 2010-02-04 16:46 ` Peter Zijlstra @ 2010-02-04 16:46 ` Greg KH 2010-02-04 16:59 ` Thomas Gleixner 2 siblings, 1 reply; 39+ messages in thread From: Greg KH @ 2010-02-04 16:46 UTC (permalink / raw) To: Alan Stern Cc: Peter Zijlstra, Eric W. Biederman, Thomas Gleixner, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, Feb 04, 2010 at 11:35:28AM -0500, Alan Stern wrote: > Greg: > > You have accepted Thomas's patch "drivers/base: Convert dev->sem to > mutex". It generates lockdep violations galore during device probing > and removal! Luckily lockdep is smart enough only to print the first > occurrence. Here's what I get early on during bootup: Ugh, I forgot this is why we haven't done this yet. Thomas, you didn't see these warnings? > [ 0.149911] ACPI: EC: Look up EC in DSDT > [ 0.170665] ACPI: Executed 1 blocks of module-level executable AML code > [ 0.198111] ACPI: Interpreter enabled > [ 0.198267] ACPI: (supports S0 S3 S4 S5) > [ 0.198802] ACPI: Using IOAPIC for interrupt routing > [ 0.266493] > [ 0.266496] ============================================= > [ 0.266775] [ INFO: possible recursive locking detected ] > [ 0.266917] 2.6.33-rc6 #1 > [ 0.267051] --------------------------------------------- > [ 0.267192] swapper/1 is trying to acquire lock: > [ 0.267332] (&dev->mutex){+.+...}, at: [<c11496be>] __driver_attach+0x38/0x63 > [ 0.267683] > [ 0.267685] but task is already holding lock: > [ 0.267953] (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63 > [ 0.268000] > [ 0.268000] other info that might help us debug this: > [ 0.268000] 1 lock held by swapper/1: > [ 0.268000] #0: (&dev->mutex){+.+...}, at: [<c11496b2>] __driver_attach+0x2c/0x63 > [ 0.268000] > [ 0.268000] stack backtrace: > [ 0.268000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #1 > [ 0.268000] Call Trace: > [ 0.268000] [<c11c819e>] ? printk+0xf/0x11 > [ 0.268000] [<c1041c9b>] __lock_acquire+0x804/0xb47 > [ 0.268000] [<c10b2026>] ? sysfs_addrm_finish+0x19/0xe2 > [ 0.268000] [<c1042020>] lock_acquire+0x42/0x59 > [ 0.268000] [<c11496be>] ? __driver_attach+0x38/0x63 > [ 0.268000] [<c11c90c6>] __mutex_lock_common+0x39/0x38f > [ 0.268000] [<c11496be>] ? __driver_attach+0x38/0x63 > [ 0.268000] [<c11c94ab>] mutex_lock_nested+0x2b/0x33 > [ 0.268000] [<c11496be>] ? __driver_attach+0x38/0x63 > [ 0.268000] [<c11496be>] __driver_attach+0x38/0x63 > [ 0.268000] [<c1148e0a>] bus_for_each_dev+0x3d/0x67 > [ 0.268000] [<c11494cf>] driver_attach+0x14/0x16 > [ 0.268000] [<c1149686>] ? __driver_attach+0x0/0x63 > [ 0.268000] [<c11491c1>] bus_add_driver+0x92/0x1c5 > [ 0.268000] [<c114990f>] driver_register+0x79/0xe0 > [ 0.268000] [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c > [ 0.268000] [<c131999f>] acpi_power_init+0x3f/0x5e > [ 0.268000] [<c1319422>] acpi_init+0x28e/0x2c8 > [ 0.268000] [<c1319194>] ? acpi_init+0x0/0x2c8 > [ 0.268000] [<c1001139>] do_one_initcall+0x4c/0x136 > [ 0.268000] [<c130130b>] kernel_init+0x11c/0x16d > [ 0.268000] [<c13011ef>] ? kernel_init+0x0/0x16d > [ 0.268000] [<c1002cba>] kernel_thread_helper+0x6/0x10 > [ 0.268485] ACPI: Power Resource [GFAN] (on) > > > On Thu, 4 Feb 2010, Peter Zijlstra wrote: > > > The device tree had the problem that we could basically hold a device > > lock and an unspecified number of parent locks (iirc this was due to > > device probing, where we hold the bus lock while probing/adding child > > device, recursively). > > > > If we place each dev->lock into the same class (which would naively > > happen), then this would lead to recursive lock warnings. The proposed > > solution for this is to create MAX_LOCK_DEPTH classes and assign them to > > the dev->lock depending on the depth in the device tree (Alan said that > > MAX_LOCK_DEPTH is sufficient for all practical cases). > > > > static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; > > > > device_add() or thereabouts would have something like: > > > > #ifdef CONFIG_PROVE_LOCKING > > BUG_ON(dev->depth >= MAX_LOCK_DEPTH); > > lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]); > > #endif > > Unfortunately this doesn't really work. Here is a patch implementing > the scheme: > > Index: usb-2.6/drivers/base/core.c > =================================================================== > --- usb-2.6.orig/drivers/base/core.c > +++ usb-2.6/drivers/base/core.c > @@ -22,6 +22,7 @@ > #include <linux/kallsyms.h> > #include <linux/mutex.h> > #include <linux/async.h> > +#include <linux/sched.h> > > #include "base.h" > #include "power/power.h" > @@ -671,6 +672,26 @@ static void setup_parent(struct device * > dev->kobj.parent = kobj; > } > > +#ifdef CONFIG_PROVE_LOCKING > +static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; > + > +static void setup_mutex_depth(struct device *dev, struct device *parent) > +{ > + int depth = 0; > + > + /* Dynamically determine the device's depth in the device tree */ > + while (parent) { > + ++depth; > + parent = parent->parent; > + } > + BUG_ON(depth > MAX_LOCK_DEPTH); > + lockdep_set_class(&dev->mutex, &dev_tree_classes[depth]); > +} > +#else > +static inline void setup_mutex_depth(struct device *dev, > + struct device *parent) {} > +#endif > + > static int device_add_class_symlinks(struct device *dev) > { > int error; > @@ -912,6 +933,7 @@ int device_add(struct device *dev) > > parent = get_device(dev->parent); > setup_parent(dev, parent); > + setup_mutex_depth(dev, parent); > > /* use parent numa_node */ > if (parent) > > > This doesn't address the fact that we really have multiple device trees > (for example, class devices are handled separately from normal > devices). With the above patch installed, I still get lockdep > violations farther on during boot: > > [ 0.272332] pci_bus 0000:00: on NUMA node 0 > [ 0.272355] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] > [ 0.273503] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT] > [ 0.279205] > [ 0.279208] ============================================= > [ 0.279485] [ INFO: possible recursive locking detected ] > [ 0.279628] 2.6.33-rc6 #2 > [ 0.279763] --------------------------------------------- > [ 0.279905] swapper/1 is trying to acquire lock: > [ 0.280000] (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c1149776>] device_attach+0x14/0x6e > [ 0.280000] > [ 0.280000] but task is already holding lock: > [ 0.280000] (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63 > [ 0.280000] > [ 0.280000] other info that might help us debug this: > [ 0.280000] 2 locks held by swapper/1: > [ 0.280000] #0: (&dev_tree_classes[depth]#2){+.+.+.}, at: [<c11496de>] __driver_attach+0x2c/0x63 > [ 0.280000] #1: (&dev_tree_classes[depth]#3){+.+.+.}, at: [<c11496ea>] __driver_attach+0x38/0x63 > [ 0.280000] > [ 0.280000] stack backtrace: > [ 0.280000] Pid: 1, comm: swapper Not tainted 2.6.33-rc6 #2 > [ 0.280000] Call Trace: > [ 0.280000] [<c11c81ce>] ? printk+0xf/0x11 > [ 0.280000] [<c1041c9b>] __lock_acquire+0x804/0xb47 > [ 0.280000] [<c101a73d>] ? spin_unlock_irqrestore+0x8/0xa > [ 0.280000] [<c101a891>] ? __wake_up+0x32/0x3b > [ 0.280000] [<c1042020>] lock_acquire+0x42/0x59 > [ 0.280000] [<c1149776>] ? device_attach+0x14/0x6e > [ 0.280000] [<c11c90f6>] __mutex_lock_common+0x39/0x38f > [ 0.280000] [<c1149776>] ? device_attach+0x14/0x6e > [ 0.280000] [<c1040e2e>] ? trace_hardirqs_on+0xb/0xd > [ 0.280000] [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a > [ 0.280000] [<c10ed5a7>] ? kobject_uevent_env+0x2e9/0x30a > [ 0.280000] [<c11c94db>] mutex_lock_nested+0x2b/0x33 > [ 0.280000] [<c1149776>] ? device_attach+0x14/0x6e > [ 0.280000] [<c1149776>] device_attach+0x14/0x6e > [ 0.280000] [<c1148aa1>] bus_probe_device+0x1b/0x30 > [ 0.280000] [<c1147b6c>] device_add+0x310/0x458 > [ 0.280000] [<c10f96ac>] pci_bus_add_device+0xf/0x30 > [ 0.280000] [<c10f96f0>] pci_bus_add_devices+0x23/0xdd > [ 0.280000] [<c11c011b>] ? acpi_pci_root_add+0x1cf/0x1ff > [ 0.280000] [<c11088a3>] acpi_pci_root_start+0x11/0x15 > [ 0.280000] [<c1106370>] acpi_start_single_object+0x1e/0x3f > [ 0.280000] [<c11064a9>] acpi_device_probe+0x78/0xf4 > [ 0.280000] [<c1149632>] driver_probe_device+0x87/0x107 > [ 0.280000] [<c11496f9>] __driver_attach+0x47/0x63 > [ 0.280000] [<c1148e36>] bus_for_each_dev+0x3d/0x67 > [ 0.280000] [<c11494fb>] driver_attach+0x14/0x16 > [ 0.280000] [<c11496b2>] ? __driver_attach+0x0/0x63 > [ 0.280000] [<c11491ed>] bus_add_driver+0x92/0x1c5 > [ 0.280000] [<c1319798>] ? acpi_pci_root_init+0x0/0x25 > [ 0.280000] [<c114993b>] driver_register+0x79/0xe0 > [ 0.280000] [<c1319798>] ? acpi_pci_root_init+0x0/0x25 > [ 0.280000] [<c1106d32>] acpi_bus_register_driver+0x3a/0x3c > [ 0.280000] [<c13197ae>] acpi_pci_root_init+0x16/0x25 > [ 0.280000] [<c1001139>] do_one_initcall+0x4c/0x136 > [ 0.280000] [<c130130b>] kernel_init+0x11c/0x16d > [ 0.280000] [<c13011ef>] ? kernel_init+0x0/0x16d > [ 0.280000] [<c1002cba>] kernel_thread_helper+0x6/0x10 > [ 0.328206] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *10 11 12 14 15) > [ 0.329223] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 10 11 12 14 15) > > > > Then there was a problem were we could lock all child devices while > > holding the parent device lock (forgot why though), this would, on > > taking the second child dev->lock, again lead to recursive lock > > warnings. > > AFAIK, the code that used to do this is no longer present. There may > be other places where it is still done, but I'm not aware of any. > > However in view of the other difficulties, it still doesn't seem > possible to make device mutexes work with lockdep. I suggest removing > Thomas's patch. Unless Thomas or anyone else thinks of something that can solve these problems, I will do so. thanks, greg k-h ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 16:46 ` Greg KH @ 2010-02-04 16:59 ` Thomas Gleixner 2010-02-26 19:36 ` Alan Stern 0 siblings, 1 reply; 39+ messages in thread From: Thomas Gleixner @ 2010-02-04 16:59 UTC (permalink / raw) To: Greg KH Cc: Alan Stern, Peter Zijlstra, Eric W. Biederman, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, 4 Feb 2010, Greg KH wrote: > > However in view of the other difficulties, it still doesn't seem > > possible to make device mutexes work with lockdep. I suggest removing > > Thomas's patch. > > Unless Thomas or anyone else thinks of something that can solve these > problems, I will do so. Hmm, I have not seen those lockdep splats on -rt where we have the mutex conversion for quite some time already. Need to look at it. Thanks, tglx ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 16:59 ` Thomas Gleixner @ 2010-02-26 19:36 ` Alan Stern 2010-02-26 20:54 ` Thomas Gleixner 0 siblings, 1 reply; 39+ messages in thread From: Alan Stern @ 2010-02-26 19:36 UTC (permalink / raw) To: Thomas Gleixner Cc: Greg KH, Peter Zijlstra, Eric W. Biederman, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Thu, 4 Feb 2010, Thomas Gleixner wrote: > On Thu, 4 Feb 2010, Greg KH wrote: > > > However in view of the other difficulties, it still doesn't seem > > > possible to make device mutexes work with lockdep. I suggest removing > > > Thomas's patch. > > > > Unless Thomas or anyone else thinks of something that can solve these > > problems, I will do so. > > Hmm, I have not seen those lockdep splats on -rt where we have the > mutex conversion for quite some time already. Need to look at it. Thomas, did you ever figure out what was going on? It would be nice to know why the -rt kernels don't have any lockdep problems associated with the device mutexes. Alan Stern ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-26 19:36 ` Alan Stern @ 2010-02-26 20:54 ` Thomas Gleixner 0 siblings, 0 replies; 39+ messages in thread From: Thomas Gleixner @ 2010-02-26 20:54 UTC (permalink / raw) To: Alan Stern Cc: Greg KH, Peter Zijlstra, Eric W. Biederman, Cong Wang, Kernel development list, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, Andrew Morton On Fri, 26 Feb 2010, Alan Stern wrote: > On Thu, 4 Feb 2010, Thomas Gleixner wrote: > > > On Thu, 4 Feb 2010, Greg KH wrote: > > > > However in view of the other difficulties, it still doesn't seem > > > > possible to make device mutexes work with lockdep. I suggest removing > > > > Thomas's patch. > > > > > > Unless Thomas or anyone else thinks of something that can solve these > > > problems, I will do so. > > > > Hmm, I have not seen those lockdep splats on -rt where we have the > > mutex conversion for quite some time already. Need to look at it. > > Thomas, did you ever figure out what was going on? It would be nice to > know why the -rt kernels don't have any lockdep problems associated > with the device mutexes. Yes, it just did not happen on the systems which I tested with lockdep enabled, but I found a box where it triggered. Sorry should have tested better. :( Thanks, tglx ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 11:38 ` Peter Zijlstra 2010-02-04 16:35 ` Alan Stern @ 2010-02-05 3:43 ` Cong Wang 2010-02-05 8:55 ` Eric W. Biederman 2 siblings, 0 replies; 39+ messages in thread From: Cong Wang @ 2010-02-05 3:43 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric W. Biederman, Greg KH, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm, Alan Stern Peter Zijlstra wrote: > On Fri, 2010-01-29 at 12:30 -0800, Eric W. Biederman wrote: > >> We get false positives when the code of a sysfs attribute >> synchronously removes other sysfs attributes. In general that is not >> safe due to hotplug etc, but there are specific instances of static >> sysfs entries like the pm_core where it appears to be safe. >> >> I am not familiar with the device core lockdep issues. Are they similar? > > The device tree had the problem that we could basically hold a device > lock and an unspecified number of parent locks (iirc this was due to > device probing, where we hold the bus lock while probing/adding child > device, recursively). > > If we place each dev->lock into the same class (which would naively > happen), then this would lead to recursive lock warnings. The proposed > solution for this is to create MAX_LOCK_DEPTH classes and assign them to > the dev->lock depending on the depth in the device tree (Alan said that > MAX_LOCK_DEPTH is sufficient for all practical cases). > > static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; > > device_add() or thereabouts would have something like: > > #ifdef CONFIG_PROVE_LOCKING > BUG_ON(dev->depth >= MAX_LOCK_DEPTH); > lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]); > #endif > > Nice explanation! I see, we should set the class of the lock when after we holding it... I will update my patch. Thank you! ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-02-04 11:38 ` Peter Zijlstra 2010-02-04 16:35 ` Alan Stern 2010-02-05 3:43 ` Cong Wang @ 2010-02-05 8:55 ` Eric W. Biederman 2 siblings, 0 replies; 39+ messages in thread From: Eric W. Biederman @ 2010-02-05 8:55 UTC (permalink / raw) To: Peter Zijlstra Cc: Greg KH, Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm, Alan Stern Peter Zijlstra <peterz@infradead.org> writes: > On Fri, 2010-01-29 at 12:30 -0800, Eric W. Biederman wrote: > >> We get false positives when the code of a sysfs attribute >> synchronously removes other sysfs attributes. In general that is not >> safe due to hotplug etc, but there are specific instances of static >> sysfs entries like the pm_core where it appears to be safe. >> >> I am not familiar with the device core lockdep issues. Are they similar? > > The device tree had the problem that we could basically hold a device > lock and an unspecified number of parent locks (iirc this was due to > device probing, where we hold the bus lock while probing/adding child > device, recursively). > > If we place each dev->lock into the same class (which would naively > happen), then this would lead to recursive lock warnings. The proposed > solution for this is to create MAX_LOCK_DEPTH classes and assign them to > the dev->lock depending on the depth in the device tree (Alan said that > MAX_LOCK_DEPTH is sufficient for all practical cases). > > static struct lock_class_key dev_tree_classes[MAX_LOCK_DEPTH]; > > device_add() or thereabouts would have something like: > > #ifdef CONFIG_PROVE_LOCKING > BUG_ON(dev->depth >= MAX_LOCK_DEPTH); > lockdep_set_class(dev->lock, &dev_tree_classes[dev->depth]); > #endif > > > Then there was a problem were we could lock all child devices while > holding the parent device lock (forgot why though), this would, on > taking the second child dev->lock, again lead to recursive lock > warnings. > > We have an annotation for that: lock_nest_lock (currently only > spin_lock_nest_lock exists, but mutex_lock_nest_lock is easily created), > and this would allow you to do things like: > > mutex_lock(&parent->lock); > for_each_device_child(child, parent) { > mutex_lock_nest_lock(&child->lock, &parent->lock); > ... > } > > > I hope this helps in figuring out the sysfs case.. Interesting. The sysfs attributes in general don't have this problem because the common case is effectively a reader/writer lock where we can have all of the readers we want. In the sysfs case the classic problem is when an attribute grabs a lock and that lock is also held while the attribute is being deleted. I first found this with the rtnl_lock but there are other cases as well. The particularly tricky case to handle in lockdep is when that other lock also the s_active lock from another part of the tree. There is an entire Alice style rabbit whole there, I am trying to figure out. Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 14:22 ` Greg KH 2010-01-29 17:57 ` Peter Zijlstra @ 2010-01-29 20:25 ` Eric W. Biederman 2010-01-30 5:30 ` Greg KH 1 sibling, 1 reply; 39+ messages in thread From: Eric W. Biederman @ 2010-01-29 20:25 UTC (permalink / raw) To: Greg KH Cc: Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm Greg KH <gregkh@suse.de> writes: > Heh, this whole mess is the very reason we didn't add lockdep support to > the driver core. Nested devices that all look alike from the driver > core, are really different objects and the locking lifetimes are > separate, but lockdep can't see that. > > I suggest we just remove the original patch, as it seems to be causing > way too many problems. > > Any objections to that? I think the hit rate for real problems has been about 25-50%. Of the false positives a lot of those have been, code that is at least questionable. Furthermore there are problems we can find this way that we won't know about any other way. Unfortunately I haven't had much time to do anything kernel related lately, or I would have done more with this. My comment was about simply about finding a good way to increase the signal to noise ration so investigations can reasonably start with the presumption that code lockdep is complaining about real problems. The deadlocks that we can hit in sysfs are very nasty to find, they have persisted for years, and they pop back up after they are fixed. So far the pain from lockdep annotations seems a lot lower. Right now annotating with subclasses as Amerigo is attempting will work, and remove the false positives. I was simply hoping to find a faster way to get there. So yes, I do object to removing the original patch. Let's put in the work to find a good path to remove the handful of cases that cause false positives. It's a shame we aren't getting stack overflow errors on the same paths that are removing sysfs attributes from the callback handlers of sysfs attributes, or we could just rule out that questionable practice and call all of the lockdep warnings valid. Unfortunately that would just be the tail wagging the horse. Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 20:25 ` Eric W. Biederman @ 2010-01-30 5:30 ` Greg KH 0 siblings, 0 replies; 39+ messages in thread From: Greg KH @ 2010-01-30 5:30 UTC (permalink / raw) To: Eric W. Biederman Cc: Cong Wang, linux-kernel, Tejun Heo, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Fri, Jan 29, 2010 at 12:25:22PM -0800, Eric W. Biederman wrote: > Greg KH <gregkh@suse.de> writes: > > > Heh, this whole mess is the very reason we didn't add lockdep support to > > the driver core. Nested devices that all look alike from the driver > > core, are really different objects and the locking lifetimes are > > separate, but lockdep can't see that. > > > > I suggest we just remove the original patch, as it seems to be causing > > way too many problems. > > > > Any objections to that? > > I think the hit rate for real problems has been about 25-50%. Of the > false positives a lot of those have been, code that is at least > questionable. > > Furthermore there are problems we can find this way that we won't know > about any other way. Unfortunately I haven't had much time to do > anything kernel related lately, or I would have done more with this. > My comment was about simply about finding a good way to increase the > signal to noise ration so investigations can reasonably start with the > presumption that code lockdep is complaining about real problems. > > The deadlocks that we can hit in sysfs are very nasty to find, they > have persisted for years, and they pop back up after they are fixed. > So far the pain from lockdep annotations seems a lot lower. > > Right now annotating with subclasses as Amerigo is attempting will work, > and remove the false positives. I was simply hoping to find a faster > way to get there. > > So yes, I do object to removing the original patch. Let's put in the > work to find a good path to remove the handful of cases that cause > false positives. Ok, that sounds good to me. thanks, greg k-h ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Patch 0/2] sysfs: fix s_active lockdep warning 2010-01-29 7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman 2010-01-29 8:38 ` Cong Wang @ 2010-01-29 18:02 ` Peter Zijlstra 1 sibling, 0 replies; 39+ messages in thread From: Peter Zijlstra @ 2010-01-29 18:02 UTC (permalink / raw) To: Eric W. Biederman Cc: Amerigo Wang, linux-kernel, Tejun Heo, Greg Kroah-Hartman, Miles Lane, Heiko Carstens, Benjamin Herrenschmidt, Larry Finger, akpm On Thu, 2010-01-28 at 23:21 -0800, Eric W. Biederman wrote: > > I suspect we may want to create a separate class for each sysfs file > instead of playing whack-a-mole and creating a subclass each time we > have problems. Well, you can make a class every time you have a problem, I don't think you can make a class for each file since classes have to have static storage, also subclasses are limited to 8. ^ permalink raw reply [flat|nested] 39+ messages in thread
end of thread, other threads:[~2010-02-26 20:55 UTC | newest] Thread overview: 39+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-29 7:01 [Patch 0/2] sysfs: fix s_active lockdep warning Amerigo Wang 2010-01-29 7:02 ` [Patch 1/2] sysfs: add support for lockdep subclasses to s_active Amerigo Wang 2010-01-29 7:02 ` [PATCH 2/2] sysfs: fix the incomplete part of subclass support for s_active Amerigo Wang 2010-01-29 7:21 ` [Patch 0/2] sysfs: fix s_active lockdep warning Eric W. Biederman 2010-01-29 8:38 ` Cong Wang 2010-01-29 13:44 ` Eric W. Biederman 2010-01-29 14:22 ` Greg KH 2010-01-29 17:57 ` Peter Zijlstra 2010-01-29 18:10 ` Greg KH 2010-01-29 18:14 ` Peter Zijlstra 2010-01-29 18:21 ` Greg KH 2010-01-29 20:10 ` Peter Zijlstra 2010-01-29 20:30 ` Eric W. Biederman 2010-02-04 11:38 ` Peter Zijlstra 2010-02-04 16:35 ` Alan Stern 2010-02-04 16:41 ` Peter Zijlstra 2010-02-04 18:37 ` Alan Stern 2010-02-05 10:18 ` Peter Zijlstra 2010-02-05 15:30 ` Alan Stern 2010-02-05 15:41 ` Peter Zijlstra 2010-02-07 9:22 ` Dave Young 2010-02-08 3:08 ` Cong Wang 2010-02-08 3:14 ` Dave Young 2010-02-08 3:30 ` Cong Wang 2010-02-08 3:06 ` Cong Wang 2010-02-08 15:38 ` Alan Stern 2010-02-04 16:46 ` Peter Zijlstra 2010-02-04 18:40 ` Alan Stern 2010-02-05 3:09 ` Cong Wang 2010-02-05 4:06 ` Alan Stern 2010-02-04 16:46 ` Greg KH 2010-02-04 16:59 ` Thomas Gleixner 2010-02-26 19:36 ` Alan Stern 2010-02-26 20:54 ` Thomas Gleixner 2010-02-05 3:43 ` Cong Wang 2010-02-05 8:55 ` Eric W. Biederman 2010-01-29 20:25 ` Eric W. Biederman 2010-01-30 5:30 ` Greg KH 2010-01-29 18:02 ` Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox