public inbox for linux-kernel@vger.kernel.org
* [Patch 1/2] sysfs: add lockdep class support to s_active
@ 2010-02-08  9:51 Amerigo Wang
  2010-02-08  9:52 ` [Patch 2/2] block: add sysfs lockdep class for iosched Amerigo Wang
  0 siblings, 1 reply; 4+ messages in thread
From: Amerigo Wang @ 2010-02-08  9:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Greg Kroah-Hartman, Peter Zijlstra, Eric W. Biederman,
	Heiko Carstens, Jens Axboe, Miles Lane, Larry Finger,
	Amerigo Wang, Hugh Dickins, akpm

Recently we hit a lockdep warning from sysfs during s2ram.
As reported by several people, it looks like this:

[ 6967.926563] ACPI: Preparing to enter system sleep state S3
[ 6967.956156] Disabling non-boot CPUs ...
[ 6967.970401]
[ 6967.970408] =============================================
[ 6967.970419] [ INFO: possible recursive locking detected ]
[ 6967.970431] 2.6.33-rc2-git6 #27
[ 6967.970439] ---------------------------------------------
[ 6967.970450] pm-suspend/22147 is trying to acquire lock:
[ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>] sysfs_hash_and_remove+0x3d/0x4f
[ 6967.970493]
[ 6967.970497] but task is already holding lock:
[ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>] sysfs_get_active_two+0x16/0x36
[...]

Eric already posted a patch for this [1], but it does not fix the
problem. Based on his work and Peter's suggestion, I wrote this patch;
hopefully it fixes the warning completely.

This patch puts the sysfs s_active locks into two lockdep classes, one
for the PM attributes and one for everything else, so that lockdep can
distinguish them.
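
To illustrate the idea, here is a minimal userspace sketch. It is a toy
model, not kernel code: all toy_* names are made up for this example, and
the check is a deliberately simplified stand-in for lockdep's "possible
recursive locking" test, which works on lock classes rather than on
individual lock instances.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct lock_class_key: only its address matters. */
struct toy_class_key { char dummy; };

enum toy_attr_class {
	TOY_ATTR_NORMAL,
	TOY_ATTR_PM_CONTROL,
	TOY_NR_CLASSES,
};

/* One static key per class, mirroring sysfs_classes[] in this patch. */
static struct toy_class_key toy_classes[TOY_NR_CLASSES];

struct toy_dirent {
	const char *name;
	const struct toy_class_key *key;	/* class of this node's s_active */
};

static void toy_dirent_init(struct toy_dirent *sd, const char *name,
			    enum toy_attr_class cls)
{
	sd->name = name;
	sd->key = &toy_classes[cls];
}

/*
 * Simplified recursion check: complain when the lock being acquired has
 * the same class as a lock that is already held.
 */
static bool toy_recursion_reported(const struct toy_dirent *held,
				   const struct toy_dirent *next)
{
	return held->key == next->key;
}
```

With a PM node held and a normal node acquired, the toy check stays quiet;
with two normal nodes it still complains, which is the behaviour we want
to keep for real recursion.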

Using a workqueue to do the cleanup is another option, as Eric pointed
out, but I am not sure it is better than this approach. It depends on
whether we want to eliminate all similar cases that hold locks of the
same class, or just this one case. Please comment.
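
For comparison, the workqueue alternative could look roughly like the
userspace sketch below. This is only a toy model under my own assumptions:
the toy_* names and the fixed-size queue are invented here and are not the
kernel workqueue API. The point it shows is that the removal runs after
the store handler has dropped its active reference, so the two never nest.

```c
#include <stddef.h>

#define TOY_MAX_WORK 8

typedef void (*toy_work_fn)(void *arg);

struct toy_work {
	toy_work_fn fn;
	void *arg;
};

struct toy_workqueue {
	struct toy_work items[TOY_MAX_WORK];
	size_t count;
};

/* s_active references held by the current (toy) context. */
static int toy_held_refs;

struct toy_node {
	int removed;
	int refs_held_at_removal;
};

static int toy_queue_work(struct toy_workqueue *wq, toy_work_fn fn, void *arg)
{
	if (wq->count == TOY_MAX_WORK)
		return -1;
	wq->items[wq->count].fn = fn;
	wq->items[wq->count].arg = arg;
	wq->count++;
	return 0;
}

/* Run all queued work in order, then empty the queue. */
static void toy_flush(struct toy_workqueue *wq)
{
	size_t i;

	for (i = 0; i < wq->count; i++)
		wq->items[i].fn(wq->items[i].arg);
	wq->count = 0;
}

/* Deferred removal: records how many references were held when it ran. */
static void toy_remove_node(void *arg)
{
	struct toy_node *n = arg;

	n->refs_held_at_removal = toy_held_refs;
	n->removed = 1;
}

/* Store handler: holds a reference, but only queues the removal. */
static int toy_store_handler(struct toy_workqueue *wq, struct toy_node *victim)
{
	int rc;

	toy_held_refs++;
	rc = toy_queue_work(wq, toy_remove_node, victim);
	toy_held_refs--;
	return rc;
}
```

The cost of this design is that the removal becomes asynchronous, so a
caller that needs the files gone before it continues has to flush the
queue explicitly.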

I tested this patch, and it fixes the problem.

1. http://lkml.org/lkml/2010/1/10/282


Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Miles Lane <miles.lane@gmail.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/sysfs/dir.c        |    1 -
 fs/sysfs/file.c       |    7 +++++++
 fs/sysfs/sysfs.h      |   11 -----------
 include/linux/sysfs.h |    7 +++++++
 kernel/power/power.h  |   15 ++++++++-------
 5 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 699f371..d7de269 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -354,7 +354,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_active, 0);
-	sysfs_dirent_init_lockdep(sd);
 
 	sd->s_name = name;
 	sd->s_mode = mode;
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index dc30d9e..97e397a 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -24,6 +24,8 @@
 
 #include "sysfs.h"
 
+static struct lock_class_key sysfs_classes[SYSFS_NR_CLASSES];
+
 /* used in crash dumps to help with debugging */
 static char last_sysfs_file[PATH_MAX];
 void sysfs_printk_last_file(void)
@@ -504,11 +506,16 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
+	int class;
 
 	sd = sysfs_new_dirent(attr->name, mode, type);
 	if (!sd)
 		return -ENOMEM;
 	sd->s_attr.attr = (void *)attr;
+	class = SYSFS_ATTR_NORMAL;
+	if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
+		class = sd->s_attr.attr->class;
+	lockdep_set_class_and_name(sd, &sysfs_classes[class], "s_active");
 
 	sysfs_addrm_start(&acxt, dir_sd);
 	rc = sysfs_add_one(&acxt, sd);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index cdd9377..dde4d73 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -88,17 +88,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 	return sd->s_flags & SYSFS_TYPE_MASK;
 }
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-#define sysfs_dirent_init_lockdep(sd)				\
-do {								\
-	static struct lock_class_key __key;			\
-								\
-	lockdep_init_map(&sd->dep_map, "s_active", &__key, 0);	\
-} while(0)
-#else
-#define sysfs_dirent_init_lockdep(sd) do {} while(0)
-#endif
-
 /*
  * Context structure to be used while adding/removing nodes.
  */
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index cfa8308..2b91b74 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -20,6 +20,12 @@
 struct kobject;
 struct module;
 
+enum sysfs_attr_lock_class {
+	SYSFS_ATTR_NORMAL,
+	SYSFS_ATTR_PM_CONTROL,
+	SYSFS_NR_CLASSES,
+};
+
 /* FIXME
  * The *owner field is no longer used.
  * x86 tree has been cleaned up. The owner
@@ -29,6 +35,7 @@ struct attribute {
 	const char		*name;
 	struct module		*owner;
 	mode_t			mode;
+	enum sysfs_attr_lock_class	class;
 };
 
 struct attribute_group {
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 46c5a26..67a6fe7 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -54,13 +54,14 @@ extern int hibernation_platform_enter(void);
 extern int pfn_is_nosave(unsigned long);
 
 #define power_attr(_name) \
-static struct kobj_attribute _name##_attr = {	\
-	.attr	= {				\
-		.name = __stringify(_name),	\
-		.mode = 0644,			\
-	},					\
-	.show	= _name##_show,			\
-	.store	= _name##_store,		\
+static struct kobj_attribute _name##_attr = {		\
+	.attr	= {					\
+		.name = __stringify(_name),		\
+		.mode = 0644,				\
+		.class = SYSFS_ATTR_PM_CONTROL,		\
+	},						\
+	.show	= _name##_show,				\
+	.store	= _name##_store,			\
 }
 
 /* Preferred image size in bytes (default 500 MB) */
-- 
1.5.5.6



* [Patch 2/2] block: add sysfs lockdep class for iosched
  2010-02-08  9:51 [Patch 1/2] sysfs: add lockdep class support to s_active Amerigo Wang
@ 2010-02-08  9:52 ` Amerigo Wang
  2010-02-08 20:50   ` Larry Finger
  0 siblings, 1 reply; 4+ messages in thread
From: Amerigo Wang @ 2010-02-08  9:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Greg Kroah-Hartman, Peter Zijlstra, Eric W. Biederman,
	Heiko Carstens, Jens Axboe, Miles Lane, Larry Finger,
	Amerigo Wang, Hugh Dickins, akpm


This is similar to the previous PM case: in iosched, we hold an s_active
lock while storing to "scheduler", and at the same time we want to
remove the "iosched/*" files.

This patch depends on the previous one. I tested it on my machine,
and it fixes the problem.
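
The patch replaces the open-coded queue_sysfs_entry definitions with two
macros that also set the lock class. A standalone userspace sketch of that
token-pasting pattern follows. The toy_* types, the 0444 mode and the
fixed value 512 are assumptions made for the example and are not taken
from blk-sysfs.c.

```c
#include <stddef.h>
#include <stdio.h>

enum toy_lock_class {
	TOY_ATTR_NORMAL,
	TOY_ATTR_IOSCHED,
};

struct toy_attr {
	const char *name;
	unsigned int mode;
	enum toy_lock_class cls;
};

struct toy_entry {
	struct toy_attr attr;
	int (*show)(char *buf, size_t len);
};

/*
 * Generate a read-only entry named <_name>_entry whose show callback is
 * <_name>_show. The callback must be declared before the macro is used.
 */
#define toy_ro_attr(_name, _filename)				\
static struct toy_entry _name##_entry = {			\
	.attr = {						\
		.name = _filename,				\
		.mode = 0444,					\
		.cls = TOY_ATTR_IOSCHED,			\
	},							\
	.show = _name##_show,					\
}

/* Hypothetical show callback: prints a fixed value. */
static int toy_io_min_show(char *buf, size_t len)
{
	return snprintf(buf, len, "%d\n", 512);
}

toy_ro_attr(toy_io_min, "minimum_io_size");
```

Because the class is set inside the macro, every entry generated this way
gets the same class, which is also what the macros in this patch do for
the queue attributes they cover.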

Reported-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>

---
 block/blk-sysfs.c     |  120 +++++++++++++++----------------------------------
 include/linux/sysfs.h |    1 +
 2 files changed, 38 insertions(+), 83 deletions(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 8606c95..f863d4d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -6,6 +6,7 @@
 #include <linux/bio.h>
 #include <linux/blkdev.h>
 #include <linux/blktrace_api.h>
+#include <linux/sysfs.h>
 
 #include "blk.h"
 
@@ -254,105 +255,58 @@ static ssize_t queue_iostats_store(struct request_queue *q, const char *page,
 	return ret;
 }
 
-static struct queue_sysfs_entry queue_requests_entry = {
-	.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_requests_show,
-	.store = queue_requests_store,
-};
-
-static struct queue_sysfs_entry queue_ra_entry = {
-	.attr = {.name = "read_ahead_kb", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_ra_show,
-	.store = queue_ra_store,
-};
+#define queue_sysfs_rw_attr(_name, _filename)			\
+static struct queue_sysfs_entry _name##_entry = {		\
+	.attr = {						\
+		.name = _filename,				\
+		.mode = S_IRUGO | S_IWUSR,			\
+		.class = SYSFS_ATTR_IOSCHED,			\
+	},							\
+	.show = _name##_show,					\
+	.store = _name##_store,					\
+}
 
-static struct queue_sysfs_entry queue_max_sectors_entry = {
-	.attr = {.name = "max_sectors_kb", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_max_sectors_show,
-	.store = queue_max_sectors_store,
-};
+#define queue_sysfs_ro_attr(_name, _filename)			\
+static struct queue_sysfs_entry _name##_entry = {		\
+	.attr = {						\
+		.name = _filename,				\
+		.mode = S_IRUGO,				\
+		.class = SYSFS_ATTR_IOSCHED,			\
+	},							\
+	.show = _name##_show,					\
+}
 
-static struct queue_sysfs_entry queue_max_hw_sectors_entry = {
-	.attr = {.name = "max_hw_sectors_kb", .mode = S_IRUGO },
-	.show = queue_max_hw_sectors_show,
-};
 
-static struct queue_sysfs_entry queue_iosched_entry = {
-	.attr = {.name = "scheduler", .mode = S_IRUGO | S_IWUSR },
-	.show = elv_iosched_show,
-	.store = elv_iosched_store,
-};
+queue_sysfs_rw_attr(queue_requests, "nr_requests");
+queue_sysfs_rw_attr(queue_ra, "read_ahead_kb");
+queue_sysfs_rw_attr(queue_max_sectors, "max_sectors_kb");
+queue_sysfs_ro_attr(queue_max_hw_sectors, "max_hw_sectors_kb");
+queue_sysfs_rw_attr(elv_iosched, "scheduler");
+queue_sysfs_ro_attr(queue_logical_block_size, "logical_block_size");
 
 static struct queue_sysfs_entry queue_hw_sector_size_entry = {
 	.attr = {.name = "hw_sector_size", .mode = S_IRUGO },
 	.show = queue_logical_block_size_show,
 };
 
-static struct queue_sysfs_entry queue_logical_block_size_entry = {
-	.attr = {.name = "logical_block_size", .mode = S_IRUGO },
-	.show = queue_logical_block_size_show,
-};
-
-static struct queue_sysfs_entry queue_physical_block_size_entry = {
-	.attr = {.name = "physical_block_size", .mode = S_IRUGO },
-	.show = queue_physical_block_size_show,
-};
+queue_sysfs_ro_attr(queue_physical_block_size, "physical_block_size");
+queue_sysfs_ro_attr(queue_io_min, "minimum_io_size");
+queue_sysfs_ro_attr(queue_io_opt, "optimal_io_size");
+queue_sysfs_ro_attr(queue_discard_granularity, "discard_granularity");
+queue_sysfs_ro_attr(queue_discard_max, "discard_max_bytes");
+queue_sysfs_ro_attr(queue_discard_zeroes_data, "discard_zeroes_data");
 
-static struct queue_sysfs_entry queue_io_min_entry = {
-	.attr = {.name = "minimum_io_size", .mode = S_IRUGO },
-	.show = queue_io_min_show,
-};
-
-static struct queue_sysfs_entry queue_io_opt_entry = {
-	.attr = {.name = "optimal_io_size", .mode = S_IRUGO },
-	.show = queue_io_opt_show,
-};
-
-static struct queue_sysfs_entry queue_discard_granularity_entry = {
-	.attr = {.name = "discard_granularity", .mode = S_IRUGO },
-	.show = queue_discard_granularity_show,
-};
-
-static struct queue_sysfs_entry queue_discard_max_entry = {
-	.attr = {.name = "discard_max_bytes", .mode = S_IRUGO },
-	.show = queue_discard_max_show,
-};
-
-static struct queue_sysfs_entry queue_discard_zeroes_data_entry = {
-	.attr = {.name = "discard_zeroes_data", .mode = S_IRUGO },
-	.show = queue_discard_zeroes_data_show,
-};
-
-static struct queue_sysfs_entry queue_nonrot_entry = {
-	.attr = {.name = "rotational", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_nonrot_show,
-	.store = queue_nonrot_store,
-};
-
-static struct queue_sysfs_entry queue_nomerges_entry = {
-	.attr = {.name = "nomerges", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_nomerges_show,
-	.store = queue_nomerges_store,
-};
-
-static struct queue_sysfs_entry queue_rq_affinity_entry = {
-	.attr = {.name = "rq_affinity", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_rq_affinity_show,
-	.store = queue_rq_affinity_store,
-};
-
-static struct queue_sysfs_entry queue_iostats_entry = {
-	.attr = {.name = "iostats", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_iostats_show,
-	.store = queue_iostats_store,
-};
+queue_sysfs_rw_attr(queue_nonrot, "rotational");
+queue_sysfs_rw_attr(queue_nomerges, "nomerges");
+queue_sysfs_rw_attr(queue_rq_affinity, "rq_affinity");
+queue_sysfs_rw_attr(queue_iostats, "iostats");
 
 static struct attribute *default_attrs[] = {
 	&queue_requests_entry.attr,
 	&queue_ra_entry.attr,
 	&queue_max_hw_sectors_entry.attr,
 	&queue_max_sectors_entry.attr,
-	&queue_iosched_entry.attr,
+	&elv_iosched_entry.attr,
 	&queue_hw_sector_size_entry.attr,
 	&queue_logical_block_size_entry.attr,
 	&queue_physical_block_size_entry.attr,
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 2b91b74..3a91008 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -23,6 +23,7 @@ struct module;
 enum sysfs_attr_lock_class {
 	SYSFS_ATTR_NORMAL,
 	SYSFS_ATTR_PM_CONTROL,
+	SYSFS_ATTR_IOSCHED,
 	SYSFS_NR_CLASSES,
 };
 
-- 
1.5.5.6



* Re: [Patch 2/2] block: add sysfs lockdep class for iosched
  2010-02-08  9:52 ` [Patch 2/2] block: add sysfs lockdep class for iosched Amerigo Wang
@ 2010-02-08 20:50   ` Larry Finger
  2010-02-09  2:56     ` Cong Wang
  0 siblings, 1 reply; 4+ messages in thread
From: Larry Finger @ 2010-02-08 20:50 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Peter Zijlstra,
	Eric W. Biederman, Heiko Carstens, Jens Axboe, Miles Lane,
	Hugh Dickins, akpm

On 02/08/2010 03:52 AM, Amerigo Wang wrote:
> Similar to the previous PM case, in iosched, we hold an s_active
> lock to store "scheduler", meanwhile we want to remove "iosched/*"
> files.
> 
> This patch depends on the previous one. I tested it on my machine,
> it fixes the problem.
> 
> Reported-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> Signed-off-by: WANG Cong <amwang@redhat.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>

After applying the 2 patches to 2.6.33-rc7, I get the following:

ACPI: bus type pci registered
PCI: MMCONFIG for domain 0000 [bus 00-09] at [mem 0xe0000000-0xe09fffff] (base 0xe0000000)
PCI: MMCONFIG at [mem 0xe0000000-0xe09fffff] reserved in E820
PCI: Using configuration type 1 for base access
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 1, comm: swapper Not tainted 2.6.33-rc7-Linus-00010-g6339204-dirty #181
Call Trace:
 [<ffffffff8107c6e6>] __lock_acquire+0xf86/0x1d30
 [<ffffffff81078e7f>] ? lockdep_init_map+0x5f/0x5d0
 [<ffffffff8107d52b>] lock_acquire+0x9b/0x120
 [<ffffffff81167a93>] ? sysfs_addrm_finish+0x43/0x70
 [<ffffffff81167243>] sysfs_deactivate+0xc3/0x110
 [<ffffffff81167a93>] ? sysfs_addrm_finish+0x43/0x70
 [<ffffffff813124d3>] ? mutex_lock_nested+0x243/0x300
 [<ffffffff81167a93>] sysfs_addrm_finish+0x43/0x70
 [<ffffffff81167af6>] remove_dir+0x36/0x40
 [<ffffffff81167b09>] sysfs_remove_subdir+0x9/0x10
 [<ffffffff81168ff6>] sysfs_remove_group+0x66/0xf0
 [<ffffffff81861555>] param_sysfs_init+0x102/0x277
 [<ffffffff8124a5bd>] ? sysdev_create_file+0xd/0x10
 [<ffffffff8130fe46>] ? register_cpu+0xa3/0xa5
 [<ffffffff81861453>] ? param_sysfs_init+0x0/0x277
 [<ffffffff810001d7>] do_one_initcall+0x37/0x190
 [<ffffffff8184c6d0>] kernel_init+0x14f/0x1a5
 [<ffffffff81003bd4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8131417c>] ? restore_args+0x0/0x30
 [<ffffffff8184c581>] ? kernel_init+0x0/0x1a5
 [<ffffffff81003bd0>] ? kernel_thread_helper+0x0/0x10

This dump does not occur with standard 2.6.33-rc7. As the above turns off the
locking correctness validator, I cannot really test to see what happens when
suspending.

Larry


* Re: [Patch 2/2] block: add sysfs lockdep class for iosched
  2010-02-08 20:50   ` Larry Finger
@ 2010-02-09  2:56     ` Cong Wang
  0 siblings, 0 replies; 4+ messages in thread
From: Cong Wang @ 2010-02-09  2:56 UTC (permalink / raw)
  To: Larry Finger
  Cc: linux-kernel, Tejun Heo, Greg Kroah-Hartman, Peter Zijlstra,
	Eric W. Biederman, Heiko Carstens, Jens Axboe, Miles Lane,
	Hugh Dickins, akpm

Larry Finger wrote:
> On 02/08/2010 03:52 AM, Amerigo Wang wrote:
>> Similar to the previous PM case, in iosched, we hold an s_active
>> lock to store "scheduler", meanwhile we want to remove "iosched/*"
>> files.
>>
>> This patch depends on the previous one. I tested it on my machine,
>> it fixes the problem.
>>
>> Reported-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
>> Signed-off-by: WANG Cong <amwang@redhat.com>
>> Cc: Jens Axboe <jens.axboe@oracle.com>
> 
> After applying the 2 patches to 2.6.33-rc7, I get the following:
> 
> ACPI: bus type pci registered
> PCI: MMCONFIG for domain 0000 [bus 00-09] at [mem 0xe0000000-0xe09fffff] (base 0xe0000000)
> PCI: MMCONFIG at [mem 0xe0000000-0xe09fffff] reserved in E820
> PCI: Using configuration type 1 for base access
> INFO: trying to register non-static key.
> the code is fine but needs lockdep annotation.
> turning off the locking correctness validator.
> Pid: 1, comm: swapper Not tainted 2.6.33-rc7-Linus-00010-g6339204-dirty #181
> Call Trace:
>  [<ffffffff8107c6e6>] __lock_acquire+0xf86/0x1d30
>  [<ffffffff81078e7f>] ? lockdep_init_map+0x5f/0x5d0
>  [<ffffffff8107d52b>] lock_acquire+0x9b/0x120
>  [<ffffffff81167a93>] ? sysfs_addrm_finish+0x43/0x70
>  [<ffffffff81167243>] sysfs_deactivate+0xc3/0x110
>  [<ffffffff81167a93>] ? sysfs_addrm_finish+0x43/0x70
>  [<ffffffff813124d3>] ? mutex_lock_nested+0x243/0x300
>  [<ffffffff81167a93>] sysfs_addrm_finish+0x43/0x70
>  [<ffffffff81167af6>] remove_dir+0x36/0x40
>  [<ffffffff81167b09>] sysfs_remove_subdir+0x9/0x10
>  [<ffffffff81168ff6>] sysfs_remove_group+0x66/0xf0
>  [<ffffffff81861555>] param_sysfs_init+0x102/0x277
>  [<ffffffff8124a5bd>] ? sysdev_create_file+0xd/0x10
>  [<ffffffff8130fe46>] ? register_cpu+0xa3/0xa5
>  [<ffffffff81861453>] ? param_sysfs_init+0x0/0x277
>  [<ffffffff810001d7>] do_one_initcall+0x37/0x190
>  [<ffffffff8184c6d0>] kernel_init+0x14f/0x1a5
>  [<ffffffff81003bd4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff8131417c>] ? restore_args+0x0/0x30
>  [<ffffffff8184c581>] ? kernel_init+0x0/0x1a5
>  [<ffffffff81003bd0>] ? kernel_thread_helper+0x0/0x10
> 
> This dump does not occur with standard 2.6.33-rc7. As the above turns off the
> locking correctness validator, I cannot really test to see what happens when
> suspending.
> 

Ouch! I forgot to add the annotations to sysfs dirs...

Thanks much for the report, I will send an updated version soon!
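
To spell out what went wrong, here is a toy userspace model of the
problem. It assumes my reading of the trace is right: patch 1/2 removed
the annotation from sysfs_new_dirent() and re-added it only on the file
path, so directories reach lock_acquire() without a registered key. All
toy_* names are invented for this sketch, and the fix shown is only one
possible shape, not the updated patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct lock_class_key. */
struct toy_class_key { char dummy; };

enum toy_type { TOY_DIR, TOY_FILE };

struct toy_dirent {
	enum toy_type type;
	const struct toy_class_key *key;	/* NULL: never annotated */
};

static struct toy_class_key toy_default_key;

/* Constructor as in patch 1/2: it no longer annotates the node. */
static void toy_new_dirent(struct toy_dirent *sd, enum toy_type type)
{
	sd->type = type;
	sd->key = NULL;
}

/* File path: annotated after construction, as in patch 1/2. */
static void toy_add_file(struct toy_dirent *sd)
{
	toy_new_dirent(sd, TOY_FILE);
	sd->key = &toy_default_key;
}

/* Directory path as posted: the annotation is missing. */
static void toy_add_dir_as_posted(struct toy_dirent *sd)
{
	toy_new_dirent(sd, TOY_DIR);
}

/* One possible fix: give directories a static key as well. */
static void toy_add_dir_annotated(struct toy_dirent *sd)
{
	toy_new_dirent(sd, TOY_DIR);
	sd->key = &toy_default_key;
}

/* Toy version of the validator's check for a usable static key. */
static bool toy_lock_acquire_ok(const struct toy_dirent *sd)
{
	return sd->key != NULL;
}
```

In this model, removing a directory created by toy_add_dir_as_posted()
fails the check, which corresponds to the sysfs_remove_group() path in
the trace above.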


