public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET] sysfs: sysfs rework, take #2
@ 2007-04-28 13:39 Tejun Heo
  2007-04-28 13:39 ` [PATCH 01/21] idr: fix obscure bug in allocation path Tejun Heo
                   ` (22 more replies)
  0 siblings, 23 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun

Subject: [PATCHSET] sysfs: sysfs rework, take #2

Hello, all.

This is respin of syswork rework patchset in gregkh patch series.
Changes are...

* ida implemented and used to allocate sysfs_dirent inode numbers
  instead of using pointer values which caused problems on certain
  configurations including 32bit userland on 64bit kernel.

* reordered patches a bit such that minor updates added later are now
  in the front of the patchset.

* 09-sysfs-flatten-and-fix-sysfs_rename_dir-error_handling added.

* 12-sysfs-add-sysfs_dirent-s_name now updates syfs_rename_dir() such
  that the renaming is reflected in sd->s_name.  This fixes netif
  rename problem.

* lockdep annotation updated to handle sysfs nodes deleting other
  nodes properlyand merged into 17-sysfs-implement-sysfs_dirent-
  active-references-and-immediate-disconnect

* patches which fix fallouts of kill-attr-owner conversion were added
  19-sysfs-kill-unnecessary-attribute-owner.  Source tree is scanned
  with grep and brace matching perl script and more fallouts have been
  converted.

* reimplement-sysfs_drop_dentry() patches are completely redone.  This
  was necessary because there was no way to look up the dentry
  matching a sysfs_dirent for shadow directories, not without getting
  the hands all dirty with dcache internals.  Please read descriptions
  of patch #20 and 21 for more info.  This fixes two races conditions
  around sysfs dentry/inode reclamation which triggers BUG_ON().

All the known regrssions are fixed.  I think I've got all the
attr->owner but haven't verified with cross compiling yet, just
allyesconfig on x86-64 and i386.  I'll try to setup some cross compile
environments tomorrow and follow up if I can find more fallouts.

The patches are against Linus's master branch of the day -
b9099ff63c75216d6ca10bce5a1abcd9293c27e6

Thanks.

--
tejun



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 01/21] idr: fix obscure bug in allocation path
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 02/21] idr: separate out idr_mark_full() Tejun Heo
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

In sub_alloc(), when bitmap search fails, it goes up one level to
continue search.  This is done by updating the id cursor and searching
the upper level again.  If the cursor was at the end of the upper
level, we need to go further than that.

This wasn't implemented and when that happens the part of the cursor
which indexes into the upper level wraps and sub_alloc() ends up
searching the wrong bitmap.  It allocates id which doesn't match the
actual slot.

This patch fixes this by restarting from the top if the search needs
to go higher than one level.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 lib/idr.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/lib/idr.c b/lib/idr.c
index 305117c..7b5a59c 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -100,10 +100,11 @@ static int sub_alloc(struct idr *idp, void *ptr, int *starting_id)
 	int n, m, sh;
 	struct idr_layer *p, *new;
 	struct idr_layer *pa[MAX_LEVEL];
-	int l, id;
+	int l, id, oid;
 	long bm;
 
 	id = *starting_id;
+ restart:
 	p = idp->top;
 	l = idp->layers;
 	pa[l--] = NULL;
@@ -117,12 +118,23 @@ static int sub_alloc(struct idr *idp, void *ptr, int *starting_id)
 		if (m == IDR_SIZE) {
 			/* no space available go back to previous layer. */
 			l++;
+			oid = id;
 			id = (id | ((1 << (IDR_BITS * l)) - 1)) + 1;
+
+			/* if already at the top layer, we need to grow */
 			if (!(p = pa[l])) {
 				*starting_id = id;
 				return -2;
 			}
-			continue;
+
+			/* If we need to go up one layer, continue the
+			 * loop; otherwise, restart from the top.
+			 */
+			sh = IDR_BITS * (l + 1);
+			if (oid >> sh == id >> sh)
+				continue;
+			else
+				goto restart;
 		}
 		if (m != n) {
 			sh = IDR_BITS*l;
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 02/21] idr: separate out idr_mark_full()
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
  2007-04-28 13:39 ` [PATCH 01/21] idr: fix obscure bug in allocation path Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 06/21] sysfs: make sysfs_put() ignore NULL sd Tejun Heo
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Separate out idr_mark_full() from sub_alloc() and make marking the
allocated slot full the responsibility of idr_get_new_above_int().

Allocation part of idr_get_new_above_int() is renamed to
idr_get_empty_slot().  New idr_get_new_above_int() allocates a slot
using the function, install the user pointer and marks it full using
idr_mark_full().

This change doesn't introduce any behavior change.  This will be
used by ida.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 lib/idr.c |   71 +++++++++++++++++++++++++++++++++++++++---------------------
 1 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/lib/idr.c b/lib/idr.c
index 7b5a59c..30b33e2 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -70,6 +70,26 @@ static void free_layer(struct idr *idp, struct idr_layer *p)
 	spin_unlock_irqrestore(&idp->lock, flags);
 }
 
+static void idr_mark_full(struct idr_layer **pa, int id)
+{
+	struct idr_layer *p = pa[0];
+	int l = 0;
+
+	__set_bit(id & IDR_MASK, &p->bitmap);
+	/*
+	 * If this layer is full mark the bit in the layer above to
+	 * show that this part of the radix tree is full.  This may
+	 * complete the layer above and require walking up the radix
+	 * tree.
+	 */
+	while (p->bitmap == IDR_FULL) {
+		if (!(p = pa[++l]))
+			break;
+		id = id >> IDR_BITS;
+		__set_bit((id & IDR_MASK), &p->bitmap);
+	}
+}
+
 /**
  * idr_pre_get - reserver resources for idr allocation
  * @idp:	idr handle
@@ -95,11 +115,10 @@ int idr_pre_get(struct idr *idp, gfp_t gfp_mask)
 }
 EXPORT_SYMBOL(idr_pre_get);
 
-static int sub_alloc(struct idr *idp, void *ptr, int *starting_id)
+static int sub_alloc(struct idr *idp, int *starting_id, struct idr_layer **pa)
 {
 	int n, m, sh;
 	struct idr_layer *p, *new;
-	struct idr_layer *pa[MAX_LEVEL];
 	int l, id, oid;
 	long bm;
 
@@ -156,30 +175,13 @@ static int sub_alloc(struct idr *idp, void *ptr, int *starting_id)
 		pa[l--] = p;
 		p = p->ary[m];
 	}
-	/*
-	 * We have reached the leaf node, plant the
-	 * users pointer and return the raw id.
-	 */
-	p->ary[m] = (struct idr_layer *)ptr;
-	__set_bit(m, &p->bitmap);
-	p->count++;
-	/*
-	 * If this layer is full mark the bit in the layer above
-	 * to show that this part of the radix tree is full.
-	 * This may complete the layer above and require walking
-	 * up the radix tree.
-	 */
-	n = id;
-	while (p->bitmap == IDR_FULL) {
-		if (!(p = pa[++l]))
-			break;
-		n = n >> IDR_BITS;
-		__set_bit((n & IDR_MASK), &p->bitmap);
-	}
-	return(id);
+
+	pa[l] = p;
+	return id;
 }
 
-static int idr_get_new_above_int(struct idr *idp, void *ptr, int starting_id)
+static int idr_get_empty_slot(struct idr *idp, int starting_id,
+			      struct idr_layer **pa)
 {
 	struct idr_layer *p, *new;
 	int layers, v, id;
@@ -225,12 +227,31 @@ build_up:
 	}
 	idp->top = p;
 	idp->layers = layers;
-	v = sub_alloc(idp, ptr, &id);
+	v = sub_alloc(idp, &id, pa);
 	if (v == -2)
 		goto build_up;
 	return(v);
 }
 
+static int idr_get_new_above_int(struct idr *idp, void *ptr, int starting_id)
+{
+	struct idr_layer *pa[MAX_LEVEL];
+	int id;
+
+	id = idr_get_empty_slot(idp, starting_id, pa);
+	if (id >= 0) {
+		/*
+		 * Successfully found an empty slot.  Install the user
+		 * pointer and mark the slot full.
+		 */
+		pa[0]->ary[id & IDR_MASK] = (struct idr_layer *)ptr;
+		pa[0]->count++;
+		idr_mark_full(pa, id);
+	}
+
+	return id;
+}
+
 /**
  * idr_get_new_above - allocate new idr entry above or equal to a start id
  * @idp: idr handle
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 05/21] sysfs: allocate inode number using ida
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (5 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 07/21] sysfs: fix error handling in binattr write() Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 11/21] sysfs: add sysfs_dirent->s_parent Tejun Heo
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Inode number handling was incorrect in two ways.

1. sysfs uses the inode number allocated by new_inode() and never
   hashes it.  When reporting the inode number, it uses iunique() if
   inode is inaccessible.  This is incorrect because iunique() assumes
   the inodes are hashed.  This can cause duplicate inode numbers and
   the condition is likely to happen because new_inode() and iunique()
   use separate increasing static counters to scan for empty slot.

2. sysfs_dirent->s_dentry can go away anytime and can't be referenced
   unless the caller knows the dentry is not and not going to be
   deleted.

This patch adds sysfs_dirent->s_ino and uses ida to give each sd a
unique inode number on allocation which is freed when the sd is
released.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c   |   43 ++++++++++++++++++++++++++++++++++++++-----
 fs/sysfs/inode.c |    1 +
 fs/sysfs/mount.c |    1 +
 fs/sysfs/sysfs.h |    1 +
 4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 4dc4322..ada8405 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -9,11 +9,41 @@
 #include <linux/module.h>
 #include <linux/kobject.h>
 #include <linux/namei.h>
+#include <linux/idr.h>
 #include <asm/semaphore.h>
 #include "sysfs.h"
 
 DECLARE_RWSEM(sysfs_rename_sem);
 
+static spinlock_t sysfs_ino_lock = SPIN_LOCK_UNLOCKED;
+static DEFINE_IDA(sysfs_ino_ida);
+
+int sysfs_alloc_ino(ino_t *pino)
+{
+	int ino, rc;
+
+ retry:
+	spin_lock(&sysfs_ino_lock);
+	rc = ida_get_new_above(&sysfs_ino_ida, 2, &ino);
+	spin_unlock(&sysfs_ino_lock);
+
+	if (rc == -EAGAIN) {
+		if (ida_pre_get(&sysfs_ino_ida, GFP_KERNEL))
+			goto retry;
+		rc = -ENOMEM;
+	}
+
+	*pino = ino;
+	return rc;
+}
+
+static void sysfs_free_ino(ino_t ino)
+{
+	spin_lock(&sysfs_ino_lock);
+	ida_remove(&sysfs_ino_ida, ino);
+	spin_unlock(&sysfs_ino_lock);
+}
+
 void release_sysfs_dirent(struct sysfs_dirent * sd)
 {
 	if (sd->s_type & SYSFS_KOBJ_LINK) {
@@ -23,6 +53,7 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 		kfree(sl);
 	}
 	kfree(sd->s_iattr);
+	sysfs_free_ino(sd->s_ino);
 	kmem_cache_free(sysfs_dir_cachep, sd);
 }
 
@@ -53,6 +84,11 @@ static struct sysfs_dirent * __sysfs_new_dirent(void * element)
 	if (!sd)
 		return NULL;
 
+	if (sysfs_alloc_ino(&sd->s_ino)) {
+		kmem_cache_free(sysfs_dir_cachep, sd);
+		return NULL;
+	}
+
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_event, 1);
 	INIT_LIST_HEAD(&sd->s_children);
@@ -521,7 +557,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 
 	switch (i) {
 		case 0:
-			ino = dentry->d_inode->i_ino;
+			ino = parent_sd->s_ino;
 			if (filldir(dirent, ".", 1, i, ino, DT_DIR) < 0)
 				break;
 			filp->f_pos++;
@@ -550,10 +586,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 
 				name = sysfs_get_name(next);
 				len = strlen(name);
-				if (next->s_dentry)
-					ino = next->s_dentry->d_inode->i_ino;
-				else
-					ino = iunique(sysfs_sb, 2);
+				ino = next->s_ino;
 
 				if (filldir(dirent, name, len, filp->f_pos, ino,
 						 dt_type(next)) < 0)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 4de5c6b..7c1f4e8 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -140,6 +140,7 @@ struct inode * sysfs_new_inode(mode_t mode, struct sysfs_dirent * sd)
 		inode->i_mapping->a_ops = &sysfs_aops;
 		inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 		inode->i_op = &sysfs_inode_operations;
+		inode->i_ino = sd->s_ino;
 		lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
 		if (sd->s_iattr) {
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 23a48a3..c1bafef 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -32,6 +32,7 @@ static struct sysfs_dirent sysfs_root = {
 	.s_children	= LIST_HEAD_INIT(sysfs_root.s_children),
 	.s_element	= NULL,
 	.s_type		= SYSFS_ROOT,
+	.s_ino		= 1,
 	.s_iattr	= NULL,
 };
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 3b8aae0..1d61027 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -5,6 +5,7 @@ struct sysfs_dirent {
 	void 			* s_element;
 	int			s_type;
 	umode_t			s_mode;
+	ino_t			s_ino;
 	struct dentry		* s_dentry;
 	struct iattr		* s_iattr;
 	atomic_t		s_event;
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 04/21] sysfs: move release_sysfs_dirent() to dir.c
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (3 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 03/21] ida: implement idr based id allocator Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 07/21] sysfs: fix error handling in binattr write() Tejun Heo
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

There is no reason this function should be inlined and soon to follow
sysfs object reference simplification will make it heavier.  Move it
to dir.c.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c   |   12 ++++++++++++
 fs/sysfs/sysfs.h |   13 +------------
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 85a6686..4dc4322 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -14,6 +14,18 @@
 
 DECLARE_RWSEM(sysfs_rename_sem);
 
+void release_sysfs_dirent(struct sysfs_dirent * sd)
+{
+	if (sd->s_type & SYSFS_KOBJ_LINK) {
+		struct sysfs_symlink * sl = sd->s_element;
+		kfree(sl->link_name);
+		kobject_put(sl->target_kobj);
+		kfree(sl);
+	}
+	kfree(sd->s_iattr);
+	kmem_cache_free(sysfs_dir_cachep, sd);
+}
+
 static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index a77c57e..3b8aae0 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -17,6 +17,7 @@ extern void sysfs_delete_inode(struct inode *inode);
 extern struct inode * sysfs_new_inode(mode_t mode, struct sysfs_dirent *);
 extern int sysfs_create(struct dentry *, int mode, int (*init)(struct inode *));
 
+extern void release_sysfs_dirent(struct sysfs_dirent * sd);
 extern int sysfs_dirent_exist(struct sysfs_dirent *, const unsigned char *);
 extern int sysfs_make_dirent(struct sysfs_dirent *, struct dentry *, void *,
 				umode_t, int);
@@ -97,18 +98,6 @@ static inline struct kobject *sysfs_get_kobject(struct dentry *dentry)
 	return kobj;
 }
 
-static inline void release_sysfs_dirent(struct sysfs_dirent * sd)
-{
-	if (sd->s_type & SYSFS_KOBJ_LINK) {
-		struct sysfs_symlink * sl = sd->s_element;
-		kfree(sl->link_name);
-		kobject_put(sl->target_kobj);
-		kfree(sl);
-	}
-	kfree(sd->s_iattr);
-	kmem_cache_free(sysfs_dir_cachep, sd);
-}
-
 static inline struct sysfs_dirent * sysfs_get(struct sysfs_dirent * sd)
 {
 	if (sd) {
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 03/21] ida: implement idr based id allocator
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (2 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 06/21] sysfs: make sysfs_put() ignore NULL sd Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 04/21] sysfs: move release_sysfs_dirent() to dir.c Tejun Heo
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Implement idr based id allocator.  ida is used the same way idr is
used but lacks id -> ptr translation and thus consumes much less
memory.  struct ida_bitmap is attached as leaf nodes to idr tree which
is managed by the idr code.  Each ida_bitmap is 128bytes long and
contains slightly less than a thousand slots.

ida is more aggressive with releasing extra resources acquired using
ida_pre_get().  After every successful id allocation, ida frees one
reserved idr_layer if possible.  Reserved ida_bitmap is not freed
automatically but only one ida_bitmap is reserved and it's almost
always used right away.  Under most circumstances, ida won't hold on
to memory for too long which isn't actively used.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 include/linux/idr.h |   29 ++++++
 lib/idr.c           |  245 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 274 insertions(+), 0 deletions(-)

diff --git a/include/linux/idr.h b/include/linux/idr.h
index 8268034..915572f 100644
--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -83,4 +83,33 @@ void idr_remove(struct idr *idp, int id);
 void idr_destroy(struct idr *idp);
 void idr_init(struct idr *idp);
 
+
+/*
+ * IDA - IDR based id allocator, use when translation from id to
+ * pointer isn't necessary.
+ */
+#define IDA_CHUNK_SIZE		128	/* 128 bytes per chunk */
+#define IDA_BITMAP_LONGS	(128 / sizeof(long) - 1)
+#define IDA_BITMAP_BITS		(IDA_BITMAP_LONGS * sizeof(long) * 8)
+
+struct ida_bitmap {
+	long			nr_busy;
+	unsigned long		bitmap[IDA_BITMAP_LONGS];
+};
+
+struct ida {
+	struct idr		idr;
+	struct ida_bitmap	*free_bitmap;
+};
+
+#define IDA_INIT(name)		{ .idr = IDR_INIT(name), .free_bitmap = NULL, }
+#define DEFINE_IDA(name)	struct ida name = IDA_INIT(name)
+
+int ida_pre_get(struct ida *ida, gfp_t gfp_mask);
+int ida_get_new_above(struct ida *ida, int starting_id, int *p_id);
+int ida_get_new(struct ida *ida, int *p_id);
+void ida_remove(struct ida *ida, int id);
+void ida_destroy(struct ida *ida);
+void ida_init(struct ida *ida);
+
 #endif /* __IDR_H__ */
diff --git a/lib/idr.c b/lib/idr.c
index 30b33e2..b98f01a 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -506,3 +506,248 @@ void idr_init(struct idr *idp)
 	spin_lock_init(&idp->lock);
 }
 EXPORT_SYMBOL(idr_init);
+
+
+/*
+ * IDA - IDR based ID allocator
+ *
+ * this is id allocator without id -> pointer translation.  Memory
+ * usage is much lower than full blown idr because each id only
+ * occupies a bit.  ida uses a custom leaf node which contains
+ * IDA_BITMAP_BITS slots.
+ *
+ * 2007-04-25  written by Tejun Heo <htejun@gmail.com>
+ */
+
+static void free_bitmap(struct ida *ida, struct ida_bitmap *bitmap)
+{
+	unsigned long flags;
+
+	if (!ida->free_bitmap) {
+		spin_lock_irqsave(&ida->idr.lock, flags);
+		if (!ida->free_bitmap) {
+			ida->free_bitmap = bitmap;
+			bitmap = NULL;
+		}
+		spin_unlock_irqrestore(&ida->idr.lock, flags);
+	}
+
+	kfree(bitmap);
+}
+
+/**
+ * ida_pre_get - reserve resources for ida allocation
+ * @ida:	ida handle
+ * @gfp_mask:	memory allocation flag
+ *
+ * This function should be called prior to locking and calling the
+ * following function.  It preallocates enough memory to satisfy the
+ * worst possible allocation.
+ *
+ * If the system is REALLY out of memory this function returns 0,
+ * otherwise 1.
+ */
+int ida_pre_get(struct ida *ida, gfp_t gfp_mask)
+{
+	/* allocate idr_layers */
+	if (!idr_pre_get(&ida->idr, gfp_mask))
+		return 0;
+
+	/* allocate free_bitmap */
+	if (!ida->free_bitmap) {
+		struct ida_bitmap *bitmap;
+
+		bitmap = kmalloc(sizeof(struct ida_bitmap), gfp_mask);
+		if (!bitmap)
+			return 0;
+
+		free_bitmap(ida, bitmap);
+	}
+
+	return 1;
+}
+EXPORT_SYMBOL(ida_pre_get);
+
+/**
+ * ida_get_new_above - allocate new ID above or equal to a start id
+ * @ida:	ida handle
+ * @staring_id:	id to start search at
+ * @p_id:	pointer to the allocated handle
+ *
+ * Allocate new ID above or equal to @ida.  It should be called with
+ * any required locks.
+ *
+ * If memory is required, it will return -EAGAIN, you should unlock
+ * and go back to the ida_pre_get() call.  If the ida is full, it will
+ * return -ENOSPC.
+ *
+ * @p_id returns a value in the range 0 ... 0x7fffffff.
+ */
+int ida_get_new_above(struct ida *ida, int starting_id, int *p_id)
+{
+	struct idr_layer *pa[MAX_LEVEL];
+	struct ida_bitmap *bitmap;
+	unsigned long flags;
+	int idr_id = starting_id / IDA_BITMAP_BITS;
+	int offset = starting_id % IDA_BITMAP_BITS;
+	int t, id;
+
+ restart:
+	/* get vacant slot */
+	t = idr_get_empty_slot(&ida->idr, idr_id, pa);
+	if (t < 0) {
+		if (t == -1)
+			return -EAGAIN;
+		else /* will be -3 */
+			return -ENOSPC;
+	}
+
+	if (t * IDA_BITMAP_BITS >= MAX_ID_BIT)
+		return -ENOSPC;
+
+	if (t != idr_id)
+		offset = 0;
+	idr_id = t;
+
+	/* if bitmap isn't there, create a new one */
+	bitmap = (void *)pa[0]->ary[idr_id & IDR_MASK];
+	if (!bitmap) {
+		spin_lock_irqsave(&ida->idr.lock, flags);
+		bitmap = ida->free_bitmap;
+		ida->free_bitmap = NULL;
+		spin_unlock_irqrestore(&ida->idr.lock, flags);
+
+		if (!bitmap)
+			return -EAGAIN;
+
+		memset(bitmap, 0, sizeof(struct ida_bitmap));
+		pa[0]->ary[idr_id & IDR_MASK] = (void *)bitmap;
+		pa[0]->count++;
+	}
+
+	/* lookup for empty slot */
+	t = find_next_zero_bit(bitmap->bitmap, IDA_BITMAP_BITS, offset);
+	if (t == IDA_BITMAP_BITS) {
+		/* no empty slot after offset, continue to the next chunk */
+		idr_id++;
+		offset = 0;
+		goto restart;
+	}
+
+	id = idr_id * IDA_BITMAP_BITS + t;
+	if (id >= MAX_ID_BIT)
+		return -ENOSPC;
+
+	__set_bit(t, bitmap->bitmap);
+	if (++bitmap->nr_busy == IDA_BITMAP_BITS)
+		idr_mark_full(pa, idr_id);
+
+	*p_id = id;
+
+	/* Each leaf node can handle nearly a thousand slots and the
+	 * whole idea of ida is to have small memory foot print.
+	 * Throw away extra resources one by one after each successful
+	 * allocation.
+	 */
+	if (ida->idr.id_free_cnt || ida->free_bitmap) {
+		struct idr_layer *p = alloc_layer(&ida->idr);
+		if (p)
+			kmem_cache_free(idr_layer_cache, p);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(ida_get_new_above);
+
+/**
+ * ida_get_new - allocate new ID
+ * @ida:	idr handle
+ * @p_id:	pointer to the allocated handle
+ *
+ * Allocate new ID.  It should be called with any required locks.
+ *
+ * If memory is required, it will return -EAGAIN, you should unlock
+ * and go back to the idr_pre_get() call.  If the idr is full, it will
+ * return -ENOSPC.
+ *
+ * @id returns a value in the range 0 ... 0x7fffffff.
+ */
+int ida_get_new(struct ida *ida, int *p_id)
+{
+	return ida_get_new_above(ida, 0, p_id);
+}
+EXPORT_SYMBOL(ida_get_new);
+
+/**
+ * ida_remove - remove the given ID
+ * @ida:	ida handle
+ * @id:		ID to free
+ */
+void ida_remove(struct ida *ida, int id)
+{
+	struct idr_layer *p = ida->idr.top;
+	int shift = (ida->idr.layers - 1) * IDR_BITS;
+	int idr_id = id / IDA_BITMAP_BITS;
+	int offset = id % IDA_BITMAP_BITS;
+	int n;
+	struct ida_bitmap *bitmap;
+
+	/* clear full bits while looking up the leaf idr_layer */
+	while ((shift > 0) && p) {
+		n = (idr_id >> shift) & IDR_MASK;
+		__clear_bit(n, &p->bitmap);
+		p = p->ary[n];
+		shift -= IDR_BITS;
+	}
+
+	if (p == NULL)
+		goto err;
+
+	n = idr_id & IDR_MASK;
+	__clear_bit(n, &p->bitmap);
+
+	bitmap = (void *)p->ary[n];
+	if (!test_bit(offset, bitmap->bitmap))
+		goto err;
+
+	/* update bitmap and remove it if empty */
+	__clear_bit(offset, bitmap->bitmap);
+	if (--bitmap->nr_busy == 0) {
+		__set_bit(n, &p->bitmap);	/* to please idr_remove() */
+		idr_remove(&ida->idr, idr_id);
+		free_bitmap(ida, bitmap);
+	}
+
+	return;
+
+ err:
+	printk(KERN_WARNING
+	       "ida_remove called for id=%d which is not allocated.\n", id);
+}
+EXPORT_SYMBOL(ida_remove);
+
+/**
+ * ida_destroy - release all cached layers within an ida tree
+ * ida:		ida handle
+ */
+void ida_destroy(struct ida *ida)
+{
+	idr_destroy(&ida->idr);
+	kfree(ida->free_bitmap);
+}
+EXPORT_SYMBOL(ida_destroy);
+
+/**
+ * ida_init - initialize ida handle
+ * @ida:	ida handle
+ *
+ * This function is use to set up the handle (@ida) that you will pass
+ * to the rest of the functions.
+ */
+void ida_init(struct ida *ida)
+{
+	memset(ida, 0, sizeof(struct ida));
+	idr_init(&ida->idr);
+
+}
+EXPORT_SYMBOL(ida_init);
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 07/21] sysfs: fix error handling in binattr write()
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (4 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 04/21] sysfs: move release_sysfs_dirent() to dir.c Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 05/21] sysfs: allocate inode number using ida Tejun Heo
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Error handling in fs/sysfs/bin.c:write() was wrong because size_t
count is used to receive return value from flush_write() which is
negative on failure.

This patch updates write() such that int variable is used instead.
read() is updated the same way for consistency.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/bin.c |   21 ++++++++-------------
 1 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 8ea2a51..606267a 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -33,16 +33,13 @@ fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
 }
 
 static ssize_t
-read(struct file * file, char __user * userbuf, size_t count, loff_t * off)
+read(struct file *file, char __user *userbuf, size_t bytes, loff_t *off)
 {
 	char *buffer = file->private_data;
 	struct dentry *dentry = file->f_path.dentry;
 	int size = dentry->d_inode->i_size;
 	loff_t offs = *off;
-	int ret;
-
-	if (count > PAGE_SIZE)
-		count = PAGE_SIZE;
+	int count = min_t(size_t, bytes, PAGE_SIZE);
 
 	if (size) {
 		if (offs > size)
@@ -51,10 +48,9 @@ read(struct file * file, char __user * userbuf, size_t count, loff_t * off)
 			count = size - offs;
 	}
 
-	ret = fill_read(dentry, buffer, offs, count);
-	if (ret < 0) 
-		return ret;
-	count = ret;
+	count = fill_read(dentry, buffer, offs, count);
+	if (count < 0)
+		return count;
 
 	if (copy_to_user(userbuf, buffer, count))
 		return -EFAULT;
@@ -78,16 +74,15 @@ flush_write(struct dentry *dentry, char *buffer, loff_t offset, size_t count)
 	return attr->write(kobj, buffer, offset, count);
 }
 
-static ssize_t write(struct file * file, const char __user * userbuf,
-		     size_t count, loff_t * off)
+static ssize_t write(struct file *file, const char __user *userbuf,
+		     size_t bytes, loff_t *off)
 {
 	char *buffer = file->private_data;
 	struct dentry *dentry = file->f_path.dentry;
 	int size = dentry->d_inode->i_size;
 	loff_t offs = *off;
+	int count = min_t(size_t, bytes, PAGE_SIZE);
 
-	if (count > PAGE_SIZE)
-		count = PAGE_SIZE;
 	if (size) {
 		if (offs > size)
 			return 0;
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 06/21] sysfs: make sysfs_put() ignore NULL sd
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
  2007-04-28 13:39 ` [PATCH 01/21] idr: fix obscure bug in allocation path Tejun Heo
  2007-04-28 13:39 ` [PATCH 02/21] idr: separate out idr_mark_full() Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 03/21] ida: implement idr based id allocator Tejun Heo
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Make sysfs_put() ignore NULL sd instead of oopsing.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/sysfs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 1d61027..a3ca6d6 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -110,7 +110,7 @@ static inline struct sysfs_dirent * sysfs_get(struct sysfs_dirent * sd)
 
 static inline void sysfs_put(struct sysfs_dirent * sd)
 {
-	if (atomic_dec_and_test(&sd->s_count))
+	if (sd && atomic_dec_and_test(&sd->s_count))
 		release_sysfs_dirent(sd);
 }
 
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 12/21] sysfs: add sysfs_dirent->s_name
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (10 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 09/21] sysfs: flatten and fix sysfs_rename_dir() error handling Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 14/21] sysfs: implement kobj_sysfs_assoc_lock Tejun Heo
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Add s_name to sysfs_dirent.  This is to further reduce dependency to
the associated dentry.  Name is copied for directories and symlinks
but not for attributes.

Where possible, name dereferences are converted to use sd->s_name.
sysfs_symlink->link_name and sysfs_get_name() are unused now and
removed.

This change allows symlink to be implemented using sysfs_dirent tree
proper, which is the last remaining dentry-dependent sysfs walk.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c        |   67 ++++++++++++++++++++++++++++++++-----------------
 fs/sysfs/file.c       |    2 +-
 fs/sysfs/inode.c      |   33 +-----------------------
 fs/sysfs/symlink.c    |    8 +-----
 fs/sysfs/sysfs.h      |    7 ++---
 include/linux/sysfs.h |    1 +
 6 files changed, 51 insertions(+), 67 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 609a2f5..e2ed341 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -53,10 +53,11 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 
 	if (sd->s_type & SYSFS_KOBJ_LINK) {
 		struct sysfs_symlink * sl = sd->s_element;
-		kfree(sl->link_name);
 		kobject_put(sl->target_kobj);
 		kfree(sl);
 	}
+	if (sd->s_type & SYSFS_COPY_NAME)
+		kfree(sd->s_name);
 	kfree(sd->s_iattr);
 	sysfs_free_ino(sd->s_ino);
 	kmem_cache_free(sysfs_dir_cachep, sd);
@@ -82,29 +83,41 @@ static struct dentry_operations sysfs_dentry_ops = {
 	.d_iput		= sysfs_d_iput,
 };
 
-struct sysfs_dirent *sysfs_new_dirent(void *element, umode_t mode, int type)
+struct sysfs_dirent *sysfs_new_dirent(const char *name, void *element,
+				      umode_t mode, int type)
 {
-	struct sysfs_dirent * sd;
+	char *dup_name = NULL;
+	struct sysfs_dirent *sd = NULL;
+
+	if (type & SYSFS_COPY_NAME) {
+		name = dup_name = kstrdup(name, GFP_KERNEL);
+		if (!name)
+			goto err_out;
+	}
 
 	sd = kmem_cache_zalloc(sysfs_dir_cachep, GFP_KERNEL);
 	if (!sd)
-		return NULL;
+		goto err_out;
 
-	if (sysfs_alloc_ino(&sd->s_ino)) {
-		kmem_cache_free(sysfs_dir_cachep, sd);
-		return NULL;
-	}
+	if (sysfs_alloc_ino(&sd->s_ino))
+		goto err_out;
 
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_event, 1);
 	INIT_LIST_HEAD(&sd->s_children);
 	INIT_LIST_HEAD(&sd->s_sibling);
 
+	sd->s_name = name;
 	sd->s_element = element;
 	sd->s_mode = mode;
 	sd->s_type = type;
 
 	return sd;
+
+ err_out:
+	kfree(dup_name);
+	kmem_cache_free(sysfs_dir_cachep, sd);
+	return NULL;
 }
 
 void sysfs_attach_dirent(struct sysfs_dirent *sd,
@@ -136,8 +149,7 @@ int sysfs_dirent_exist(struct sysfs_dirent *parent_sd,
 
 	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
 		if (sd->s_element) {
-			const unsigned char *existing = sysfs_get_name(sd);
-			if (strcmp(existing, new))
+			if (strcmp(sd->s_name, new))
 				continue;
 			else
 				return -EEXIST;
@@ -191,7 +203,7 @@ static int create_dir(struct kobject *kobj, struct dentry *parent,
 		goto out_dput;
 
 	error = -ENOMEM;
-	sd = sysfs_new_dirent(kobj, mode, SYSFS_DIR);
+	sd = sysfs_new_dirent(name, kobj, mode, SYSFS_DIR);
 	if (!sd)
 		goto out_drop;
 	sysfs_attach_dirent(sd, parent->d_fsdata, dentry);
@@ -316,9 +328,7 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 
 	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
 		if (sd->s_type & SYSFS_NOT_PINNED) {
-			const unsigned char * name = sysfs_get_name(sd);
-
-			if (strcmp(name, dentry->d_name.name))
+			if (strcmp(sd->s_name, dentry->d_name.name))
 				continue;
 
 			if (sd->s_type & SYSFS_KOBJ_LINK)
@@ -409,9 +419,11 @@ void sysfs_remove_dir(struct kobject * kobj)
 int sysfs_rename_dir(struct kobject * kobj, struct dentry *new_parent,
 		     const char *new_name)
 {
+	struct sysfs_dirent *sd = kobj->dentry->d_fsdata;
+	struct sysfs_dirent *parent_sd = new_parent->d_fsdata;
+	struct dentry *new_dentry;
+	char *dup_name;
 	int error;
-	struct dentry * new_dentry;
-	struct sysfs_dirent *sd, *parent_sd;
 
 	if (!new_parent)
 		return -EFAULT;
@@ -439,22 +451,31 @@ int sysfs_rename_dir(struct kobject * kobj, struct dentry *new_parent,
 	if (new_dentry->d_inode)
 		goto out_dput;
 
+	/* rename kobject and sysfs_dirent */
+	error = -ENOMEM;
+	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
+	if (!new_name)
+		goto out_drop;
+
 	error = kobject_set_name(kobj, "%s", new_name);
 	if (error)
-		goto out_drop;
+		goto out_free;
 
+	kfree(sd->s_name);
+	sd->s_name = new_name;
+
+	/* move under the new parent */
 	d_add(new_dentry, NULL);
 	d_move(kobj->dentry, new_dentry);
 
-	sd = kobj->dentry->d_fsdata;
-	parent_sd = new_parent->d_fsdata;
-
 	list_del_init(&sd->s_sibling);
 	list_add(&sd->s_sibling, &parent_sd->s_children);
 
 	error = 0;
 	goto out_unlock;
 
+ out_free:
+	kfree(dup_name);
  out_drop:
 	d_drop(new_dentry);
  out_dput:
@@ -517,7 +538,7 @@ static int sysfs_dir_open(struct inode *inode, struct file *file)
 	struct sysfs_dirent * sd;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	sd = sysfs_new_dirent(NULL, 0, 0);
+	sd = sysfs_new_dirent("_DIR_", NULL, 0, 0);
 	if (sd)
 		sysfs_attach_dirent(sd, parent_sd, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
@@ -587,7 +608,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 				if (!next->s_element)
 					continue;
 
-				name = sysfs_get_name(next);
+				name = next->s_name;
 				len = strlen(name);
 				ino = next->s_ino;
 
@@ -699,7 +720,7 @@ struct dentry *sysfs_create_shadow_dir(struct kobject *kobj)
 	if (!shadow)
 		goto nomem;
 
-	sd = sysfs_new_dirent(kobj, inode->i_mode, SYSFS_DIR);
+	sd = sysfs_new_dirent("_SHADOW_", kobj, inode->i_mode, SYSFS_DIR);
 	if (!sd)
 		goto nomem;
 	/* point to parent_sd but don't attach to it */
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 80bd3b7..fe7db4e 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -484,7 +484,7 @@ int sysfs_add_file(struct dentry * dir, const struct attribute * attr, int type)
 		goto out_unlock;
 	}
 
-	sd = sysfs_new_dirent((void *)attr, mode, type);
+	sd = sysfs_new_dirent(attr->name, (void *)attr, mode, type);
 	if (!sd) {
 		error = -ENOMEM;
 		goto out_unlock;
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 7c1f4e8..5d4d32b 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -190,37 +190,6 @@ int sysfs_create(struct dentry * dentry, int mode, int (*init)(struct inode *))
 	return error;
 }
 
-/*
- * Get the name for corresponding element represented by the given sysfs_dirent
- */
-const unsigned char * sysfs_get_name(struct sysfs_dirent *sd)
-{
-	struct attribute * attr;
-	struct bin_attribute * bin_attr;
-	struct sysfs_symlink  * sl;
-
-	BUG_ON(!sd || !sd->s_element);
-
-	switch (sd->s_type) {
-		case SYSFS_DIR:
-			/* Always have a dentry so use that */
-			return sd->s_dentry->d_name.name;
-
-		case SYSFS_KOBJ_ATTR:
-			attr = sd->s_element;
-			return attr->name;
-
-		case SYSFS_KOBJ_BIN_ATTR:
-			bin_attr = sd->s_element;
-			return bin_attr->attr.name;
-
-		case SYSFS_KOBJ_LINK:
-			sl = sd->s_element;
-			return sl->link_name;
-	}
-	return NULL;
-}
-
 static inline void orphan_all_buffers(struct inode *node)
 {
 	struct sysfs_buffer_collection *set;
@@ -288,7 +257,7 @@ int sysfs_hash_and_remove(struct dentry * dir, const char * name)
 	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
 		if (!sd->s_element)
 			continue;
-		if (!strcmp(sysfs_get_name(sd), name)) {
+		if (!strcmp(sd->s_name, name)) {
 			list_del_init(&sd->s_sibling);
 			sysfs_drop_dentry(sd, dir);
 			sysfs_put(sd);
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index d96bb9c..c728204 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -57,14 +57,9 @@ static int sysfs_add_link(struct dentry * parent, const char * name, struct kobj
 	if (!sl)
 		goto err_out;
 
-	sl->link_name = kmalloc(strlen(name) + 1, GFP_KERNEL);
-	if (!sl->link_name)
-		goto err_out;
-
-	strcpy(sl->link_name, name);
 	sl->target_kobj = kobject_get(target);
 
-	sd = sysfs_new_dirent(sl, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
+	sd = sysfs_new_dirent(name, sl, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
 	if (!sd)
 		goto err_out;
 	sysfs_attach_dirent(sd, parent_sd, NULL);
@@ -74,7 +69,6 @@ static int sysfs_add_link(struct dentry * parent, const char * name, struct kobj
  err_out:
 	if (sl) {
 		kobject_put(sl->target_kobj);
-		kfree(sl->link_name);
 		kfree(sl);
 	}
 	return error;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 0051ae2..9d58b0f 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -3,6 +3,7 @@ struct sysfs_dirent {
 	struct sysfs_dirent	* s_parent;
 	struct list_head	s_sibling;
 	struct list_head	s_children;
+	const char		* s_name;
 	void 			* s_element;
 	int			s_type;
 	umode_t			s_mode;
@@ -21,8 +22,8 @@ extern int sysfs_create(struct dentry *, int mode, int (*init)(struct inode *));
 
 extern void release_sysfs_dirent(struct sysfs_dirent * sd);
 extern int sysfs_dirent_exist(struct sysfs_dirent *, const unsigned char *);
-extern struct sysfs_dirent *sysfs_new_dirent(void *element, umode_t mode,
-					     int type);
+extern struct sysfs_dirent *sysfs_new_dirent(const char *name, void *element,
+					     umode_t mode, int type);
 extern void sysfs_attach_dirent(struct sysfs_dirent *sd,
 				struct sysfs_dirent *parent_sd,
 				struct dentry *dentry);
@@ -34,7 +35,6 @@ extern struct sysfs_dirent *sysfs_find(struct sysfs_dirent *dir, const char * na
 extern int sysfs_create_subdir(struct kobject *, const char *, struct dentry **);
 extern void sysfs_remove_subdir(struct dentry *);
 
-extern const unsigned char * sysfs_get_name(struct sysfs_dirent *sd);
 extern void sysfs_drop_dentry(struct sysfs_dirent *sd, struct dentry *parent);
 extern int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
 
@@ -47,7 +47,6 @@ extern const struct inode_operations sysfs_dir_inode_operations;
 extern const struct inode_operations sysfs_symlink_inode_operations;
 
 struct sysfs_symlink {
-	char * link_name;
 	struct kobject * target_kobj;
 };
 
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 7d5d1ec..2f86b08 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -76,6 +76,7 @@ struct sysfs_ops {
 #define SYSFS_KOBJ_BIN_ATTR	0x0008
 #define SYSFS_KOBJ_LINK 	0x0020
 #define SYSFS_NOT_PINNED	(SYSFS_KOBJ_ATTR | SYSFS_KOBJ_BIN_ATTR | SYSFS_KOBJ_LINK)
+#define SYSFS_COPY_NAME		(SYSFS_DIR | SYSFS_KOBJ_LINK)
 
 #ifdef CONFIG_SYSFS
 
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 11/21] sysfs: add sysfs_dirent->s_parent
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (6 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 05/21] sysfs: allocate inode number using ida Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 08/21] sysfs: flatten cleanup paths in sysfs_add_link() and create_dir() Tejun Heo
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Add sysfs_dirent->s_parent.  With this patch, each sd points to and
holds a reference to its parent.  This allows walking sysfs tree
without referencing sd->s_dentry which can go away anytime if the user
doesn't control when it's deleted.

sd->s_parent is initialized and parent is referenced in
sysfs_attach_dirent().  Reference to parent is released when the sd is
released, so as long as reference to a sd is held, s_parent can be
followed.

dentry walk in sysfs_readdir() is convereted to s_parent walk.

This will be used to reimplement symlink such that it uses only
sysfs_dirent tree.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c   |   30 +++++++++++++++++++++++-------
 fs/sysfs/mount.c |    1 +
 fs/sysfs/sysfs.h |    1 +
 3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index a339e47..609a2f5 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -46,6 +46,11 @@ static void sysfs_free_ino(ino_t ino)
 
 void release_sysfs_dirent(struct sysfs_dirent * sd)
 {
+	struct sysfs_dirent *parent_sd;
+
+ repeat:
+	parent_sd = sd->s_parent;
+
 	if (sd->s_type & SYSFS_KOBJ_LINK) {
 		struct sysfs_symlink * sl = sd->s_element;
 		kfree(sl->link_name);
@@ -55,6 +60,10 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 	kfree(sd->s_iattr);
 	sysfs_free_ino(sd->s_ino);
 	kmem_cache_free(sysfs_dir_cachep, sd);
+
+	sd = parent_sd;
+	if (sd && atomic_dec_and_test(&sd->s_count))
+		goto repeat;
 }
 
 static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
@@ -107,8 +116,10 @@ void sysfs_attach_dirent(struct sysfs_dirent *sd,
 		dentry->d_op = &sysfs_dentry_ops;
 	}
 
-	if (parent_sd)
+	if (parent_sd) {
+		sd->s_parent = sysfs_get(parent_sd);
 		list_add(&sd->s_sibling, &parent_sd->s_children);
+	}
 }
 
 /*
@@ -553,7 +564,10 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 			i++;
 			/* fallthrough */
 		case 1:
-			ino = parent_ino(dentry);
+			if (parent_sd->s_parent)
+				ino = parent_sd->s_parent->s_ino;
+			else
+				ino = parent_sd->s_ino;
 			if (filldir(dirent, "..", 2, i, ino, DT_DIR) < 0)
 				break;
 			filp->f_pos++;
@@ -670,13 +684,13 @@ int sysfs_make_shadowed_dir(struct kobject *kobj,
 
 struct dentry *sysfs_create_shadow_dir(struct kobject *kobj)
 {
+	struct dentry *dir = kobj->dentry;
+	struct inode *inode = dir->d_inode;
+	struct dentry *parent = dir->d_parent;
+	struct sysfs_dirent *parent_sd = parent->d_fsdata;
+	struct dentry *shadow;
 	struct sysfs_dirent *sd;
-	struct dentry *parent, *dir, *shadow;
-	struct inode *inode;
 
-	dir = kobj->dentry;
-	inode = dir->d_inode;
-	parent = dir->d_parent;
 	shadow = ERR_PTR(-EINVAL);
 	if (!sysfs_is_shadowed_inode(inode))
 		goto out;
@@ -688,6 +702,8 @@ struct dentry *sysfs_create_shadow_dir(struct kobject *kobj)
 	sd = sysfs_new_dirent(kobj, inode->i_mode, SYSFS_DIR);
 	if (!sd)
 		goto nomem;
+	/* point to parent_sd but don't attach to it */
+	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_attach_dirent(sd, NULL, shadow);
 
 	d_instantiate(shadow, igrab(inode));
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index c1bafef..bc613f6 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -28,6 +28,7 @@ static const struct super_operations sysfs_ops = {
 };
 
 static struct sysfs_dirent sysfs_root = {
+	.s_count	= ATOMIC_INIT(1),
 	.s_sibling	= LIST_HEAD_INIT(sysfs_root.s_sibling),
 	.s_children	= LIST_HEAD_INIT(sysfs_root.s_children),
 	.s_element	= NULL,
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index c490156..0051ae2 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -1,5 +1,6 @@
 struct sysfs_dirent {
 	atomic_t		s_count;
+	struct sysfs_dirent	* s_parent;
 	struct list_head	s_sibling;
 	struct list_head	s_children;
 	void 			* s_element;
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 10/21] sysfs: consolidate sysfs_dirent creation functions
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (8 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 08/21] sysfs: flatten cleanup paths in sysfs_add_link() and create_dir() Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 09/21] sysfs: flatten and fix sysfs_rename_dir() error handling Tejun Heo
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Currently there are four functions to create sysfs_dirent -
__sysfs_new_dirent(), sysfs_new_dirent(), __sysfs_make_dirent() and
sysfs_make_dirent().  Other than sysfs_make_dirent(), no function has
two users if calls to implement other functions are excluded.

This patch consolidates sysfs_dirent creation functions into the
following two.

* sysfs_new_dirent() : allocate and initialize
* sysfs_attach_dirent() : attach to sysfs_dirent hierarchy and/or
			  associate with dentry

This simplifies interface and gives callers more flexibility.  This is
in preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c     |   82 ++++++++++++++++------------------------------------
 fs/sysfs/file.c    |   21 ++++++++++---
 fs/sysfs/symlink.c |    7 ++--
 fs/sysfs/sysfs.h   |    7 +++-
 4 files changed, 50 insertions(+), 67 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index df057da..a339e47 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -73,10 +73,7 @@ static struct dentry_operations sysfs_dentry_ops = {
 	.d_iput		= sysfs_d_iput,
 };
 
-/*
- * Allocates a new sysfs_dirent and links it to the parent sysfs_dirent
- */
-static struct sysfs_dirent * __sysfs_new_dirent(void * element)
+struct sysfs_dirent *sysfs_new_dirent(void *element, umode_t mode, int type)
 {
 	struct sysfs_dirent * sd;
 
@@ -93,25 +90,25 @@ static struct sysfs_dirent * __sysfs_new_dirent(void * element)
 	atomic_set(&sd->s_event, 1);
 	INIT_LIST_HEAD(&sd->s_children);
 	INIT_LIST_HEAD(&sd->s_sibling);
+
 	sd->s_element = element;
+	sd->s_mode = mode;
+	sd->s_type = type;
 
 	return sd;
 }
 
-static void __sysfs_list_dirent(struct sysfs_dirent *parent_sd,
-			      struct sysfs_dirent *sd)
+void sysfs_attach_dirent(struct sysfs_dirent *sd,
+			 struct sysfs_dirent *parent_sd, struct dentry *dentry)
 {
-	if (sd)
-		list_add(&sd->s_sibling, &parent_sd->s_children);
-}
+	if (dentry) {
+		sd->s_dentry = dentry;
+		dentry->d_fsdata = sysfs_get(sd);
+		dentry->d_op = &sysfs_dentry_ops;
+	}
 
-static struct sysfs_dirent * sysfs_new_dirent(struct sysfs_dirent *parent_sd,
-						void * element)
-{
-	struct sysfs_dirent *sd;
-	sd = __sysfs_new_dirent(element);
-	__sysfs_list_dirent(parent_sd, sd);
-	return sd;
+	if (parent_sd)
+		list_add(&sd->s_sibling, &parent_sd->s_children);
 }
 
 /*
@@ -139,39 +136,6 @@ int sysfs_dirent_exist(struct sysfs_dirent *parent_sd,
 	return 0;
 }
 
-
-static struct sysfs_dirent *
-__sysfs_make_dirent(struct dentry *dentry, void *element, mode_t mode, int type)
-{
-	struct sysfs_dirent * sd;
-
-	sd = __sysfs_new_dirent(element);
-	if (!sd)
-		goto out;
-
-	sd->s_mode = mode;
-	sd->s_type = type;
-	sd->s_dentry = dentry;
-	if (dentry) {
-		dentry->d_fsdata = sysfs_get(sd);
-		dentry->d_op = &sysfs_dentry_ops;
-	}
-
-out:
-	return sd;
-}
-
-int sysfs_make_dirent(struct sysfs_dirent * parent_sd, struct dentry * dentry,
-			void * element, umode_t mode, int type)
-{
-	struct sysfs_dirent *sd;
-
-	sd = __sysfs_make_dirent(dentry, element, mode, type);
-	__sysfs_list_dirent(parent_sd, sd);
-
-	return sd ? 0 : -ENOMEM;
-}
-
 static int init_dir(struct inode * inode)
 {
 	inode->i_op = &sysfs_dir_inode_operations;
@@ -215,10 +179,11 @@ static int create_dir(struct kobject *kobj, struct dentry *parent,
 	if (sysfs_dirent_exist(parent->d_fsdata, name))
 		goto out_dput;
 
-	error = sysfs_make_dirent(parent->d_fsdata, dentry, kobj, mode,
-				  SYSFS_DIR);
-	if (error)
+	error = -ENOMEM;
+	sd = sysfs_new_dirent(kobj, mode, SYSFS_DIR);
+	if (!sd)
 		goto out_drop;
+	sysfs_attach_dirent(sd, parent->d_fsdata, dentry);
 
 	error = sysfs_create(dentry, mode, init_dir);
 	if (error)
@@ -233,7 +198,6 @@ static int create_dir(struct kobject *kobj, struct dentry *parent,
 	goto out_dput;
 
  out_sput:
-	sd = dentry->d_fsdata;
 	list_del_init(&sd->s_sibling);
 	sysfs_put(sd);
  out_drop:
@@ -539,13 +503,16 @@ static int sysfs_dir_open(struct inode *inode, struct file *file)
 {
 	struct dentry * dentry = file->f_path.dentry;
 	struct sysfs_dirent * parent_sd = dentry->d_fsdata;
+	struct sysfs_dirent * sd;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	file->private_data = sysfs_new_dirent(parent_sd, NULL);
+	sd = sysfs_new_dirent(NULL, 0, 0);
+	if (sd)
+		sysfs_attach_dirent(sd, parent_sd, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 
-	return file->private_data ? 0 : -ENOMEM;
-
+	file->private_data = sd;
+	return sd ? 0 : -ENOMEM;
 }
 
 static int sysfs_dir_close(struct inode *inode, struct file *file)
@@ -718,9 +685,10 @@ struct dentry *sysfs_create_shadow_dir(struct kobject *kobj)
 	if (!shadow)
 		goto nomem;
 
-	sd = __sysfs_make_dirent(shadow, kobj, inode->i_mode, SYSFS_DIR);
+	sd = sysfs_new_dirent(kobj, inode->i_mode, SYSFS_DIR);
 	if (!sd)
 		goto nomem;
+	sysfs_attach_dirent(sd, NULL, shadow);
 
 	d_instantiate(shadow, igrab(inode));
 	inc_nlink(inode);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index db0413a..80bd3b7 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -474,14 +474,25 @@ int sysfs_add_file(struct dentry * dir, const struct attribute * attr, int type)
 {
 	struct sysfs_dirent * parent_sd = dir->d_fsdata;
 	umode_t mode = (attr->mode & S_IALLUGO) | S_IFREG;
-	int error = -EEXIST;
+	struct sysfs_dirent *sd;
+	int error = 0;
 
 	mutex_lock(&dir->d_inode->i_mutex);
-	if (!sysfs_dirent_exist(parent_sd, attr->name))
-		error = sysfs_make_dirent(parent_sd, NULL, (void *)attr,
-					  mode, type);
-	mutex_unlock(&dir->d_inode->i_mutex);
 
+	if (sysfs_dirent_exist(parent_sd, attr->name)) {
+		error = -EEXIST;
+		goto out_unlock;
+	}
+
+	sd = sysfs_new_dirent((void *)attr, mode, type);
+	if (!sd) {
+		error = -ENOMEM;
+		goto out_unlock;
+	}
+	sysfs_attach_dirent(sd, parent_sd, NULL);
+
+ out_unlock:
+	mutex_unlock(&dir->d_inode->i_mutex);
 	return error;
 }
 
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index b463f17..d96bb9c 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -49,6 +49,7 @@ static int sysfs_add_link(struct dentry * parent, const char * name, struct kobj
 {
 	struct sysfs_dirent * parent_sd = parent->d_fsdata;
 	struct sysfs_symlink * sl;
+	struct sysfs_dirent * sd;
 	int error;
 
 	error = -ENOMEM;
@@ -63,10 +64,10 @@ static int sysfs_add_link(struct dentry * parent, const char * name, struct kobj
 	strcpy(sl->link_name, name);
 	sl->target_kobj = kobject_get(target);
 
-	error = sysfs_make_dirent(parent_sd, NULL, sl, S_IFLNK|S_IRWXUGO,
-				SYSFS_KOBJ_LINK);
-	if (error)
+	sd = sysfs_new_dirent(sl, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
+	if (!sd)
 		goto err_out;
+	sysfs_attach_dirent(sd, parent_sd, NULL);
 
 	return 0;
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index a3ca6d6..c490156 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -20,8 +20,11 @@ extern int sysfs_create(struct dentry *, int mode, int (*init)(struct inode *));
 
 extern void release_sysfs_dirent(struct sysfs_dirent * sd);
 extern int sysfs_dirent_exist(struct sysfs_dirent *, const unsigned char *);
-extern int sysfs_make_dirent(struct sysfs_dirent *, struct dentry *, void *,
-				umode_t, int);
+extern struct sysfs_dirent *sysfs_new_dirent(void *element, umode_t mode,
+					     int type);
+extern void sysfs_attach_dirent(struct sysfs_dirent *sd,
+				struct sysfs_dirent *parent_sd,
+				struct dentry *dentry);
 
 extern int sysfs_add_file(struct dentry *, const struct attribute *, int);
 extern int sysfs_hash_and_remove(struct dentry * dir, const char * name);
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 09/21] sysfs: flatten and fix sysfs_rename_dir() error handling
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (9 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 10/21] sysfs: consolidate sysfs_dirent creation functions Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 12/21] sysfs: add sysfs_dirent->s_name Tejun Heo
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Error handling in sysfs_rename_dir() was broken.

* When lookup_one_len() fails, 0 is returned.

* If parent inode check fails, returns with inode mutex and rename
  rwsem held.

This patch fixes the above bugs and flattens error handling such that
it's more readable and easier to modify.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c |   73 +++++++++++++++++++++++++++++++------------------------
 1 files changed, 41 insertions(+), 32 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 0dca2c4..df057da 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -434,8 +434,9 @@ void sysfs_remove_dir(struct kobject * kobj)
 int sysfs_rename_dir(struct kobject * kobj, struct dentry *new_parent,
 		     const char *new_name)
 {
-	int error = 0;
+	int error;
 	struct dentry * new_dentry;
+	struct sysfs_dirent *sd, *parent_sd;
 
 	if (!new_parent)
 		return -EFAULT;
@@ -444,40 +445,48 @@ int sysfs_rename_dir(struct kobject * kobj, struct dentry *new_parent,
 	mutex_lock(&new_parent->d_inode->i_mutex);
 
 	new_dentry = lookup_one_len(new_name, new_parent, strlen(new_name));
-	if (!IS_ERR(new_dentry)) {
-		/* By allowing two different directories with the
-		 * same d_parent we allow this routine to move
-		 * between different shadows of the same directory
-		 */
-		if (kobj->dentry->d_parent->d_inode != new_parent->d_inode)
-			return -EINVAL;
-		else if (new_dentry->d_parent->d_inode != new_parent->d_inode)
-			error = -EINVAL;
-		else if (new_dentry == kobj->dentry)
-			error = -EINVAL;
-		else if (!new_dentry->d_inode) {
-			error = kobject_set_name(kobj, "%s", new_name);
-			if (!error) {
-				struct sysfs_dirent *sd, *parent_sd;
-
-				d_add(new_dentry, NULL);
-				d_move(kobj->dentry, new_dentry);
-
-				sd = kobj->dentry->d_fsdata;
-				parent_sd = new_parent->d_fsdata;
-
-				list_del_init(&sd->s_sibling);
-				list_add(&sd->s_sibling, &parent_sd->s_children);
-			}
-			else
-				d_drop(new_dentry);
-		} else
-			error = -EEXIST;
-		dput(new_dentry);
+	if (IS_ERR(new_dentry)) {
+		error = PTR_ERR(new_dentry);
+		goto out_unlock;
 	}
+
+	/* By allowing two different directories with the same
+	 * d_parent we allow this routine to move between different
+	 * shadows of the same directory
+	 */
+	error = -EINVAL;
+	if (kobj->dentry->d_parent->d_inode != new_parent->d_inode ||
+	    new_dentry->d_parent->d_inode != new_parent->d_inode ||
+	    new_dentry == kobj->dentry)
+		goto out_dput;
+
+	error = -EEXIST;
+	if (new_dentry->d_inode)
+		goto out_dput;
+
+	error = kobject_set_name(kobj, "%s", new_name);
+	if (error)
+		goto out_drop;
+
+	d_add(new_dentry, NULL);
+	d_move(kobj->dentry, new_dentry);
+
+	sd = kobj->dentry->d_fsdata;
+	parent_sd = new_parent->d_fsdata;
+
+	list_del_init(&sd->s_sibling);
+	list_add(&sd->s_sibling, &parent_sd->s_children);
+
+	error = 0;
+	goto out_unlock;
+
+ out_drop:
+	d_drop(new_dentry);
+ out_dput:
+	dput(new_dentry);
+ out_unlock:
 	mutex_unlock(&new_parent->d_inode->i_mutex);
 	up_write(&sysfs_rename_sem);
-
 	return error;
 }
 
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 08/21] sysfs: flatten cleanup paths in sysfs_add_link() and create_dir()
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (7 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 11/21] sysfs: add sysfs_dirent->s_parent Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 10/21] sysfs: consolidate sysfs_dirent creation functions Tejun Heo
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Flatten cleanup paths in sysfs_add_link() and create_dir() to improve
readability and ease further changes to these functions.  This is in
preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c     |   73 ++++++++++++++++++++++++++++++---------------------
 fs/sysfs/symlink.c |   27 ++++++++++--------
 2 files changed, 58 insertions(+), 42 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index ada8405..0dca2c4 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -195,40 +195,53 @@ static int init_symlink(struct inode * inode)
 	return 0;
 }
 
-static int create_dir(struct kobject * k, struct dentry * p,
-		      const char * n, struct dentry ** d)
+static int create_dir(struct kobject *kobj, struct dentry *parent,
+		      const char *name, struct dentry **p_dentry)
 {
 	int error;
 	umode_t mode = S_IFDIR| S_IRWXU | S_IRUGO | S_IXUGO;
+	struct dentry *dentry;
+	struct sysfs_dirent *sd;
 
-	mutex_lock(&p->d_inode->i_mutex);
-	*d = lookup_one_len(n, p, strlen(n));
-	if (!IS_ERR(*d)) {
- 		if (sysfs_dirent_exist(p->d_fsdata, n))
-  			error = -EEXIST;
-  		else
-			error = sysfs_make_dirent(p->d_fsdata, *d, k, mode,
-								SYSFS_DIR);
-		if (!error) {
-			error = sysfs_create(*d, mode, init_dir);
-			if (!error) {
-				inc_nlink(p->d_inode);
-				(*d)->d_op = &sysfs_dentry_ops;
-				d_rehash(*d);
-			}
-		}
-		if (error && (error != -EEXIST)) {
-			struct sysfs_dirent *sd = (*d)->d_fsdata;
-			if (sd) {
- 				list_del_init(&sd->s_sibling);
-				sysfs_put(sd);
-			}
-			d_drop(*d);
-		}
-		dput(*d);
-	} else
-		error = PTR_ERR(*d);
-	mutex_unlock(&p->d_inode->i_mutex);
+	mutex_lock(&parent->d_inode->i_mutex);
+
+	dentry = lookup_one_len(name, parent, strlen(name));
+	if (IS_ERR(dentry)) {
+		error = PTR_ERR(dentry);
+		goto out_unlock;
+	}
+
+	error = -EEXIST;
+	if (sysfs_dirent_exist(parent->d_fsdata, name))
+		goto out_dput;
+
+	error = sysfs_make_dirent(parent->d_fsdata, dentry, kobj, mode,
+				  SYSFS_DIR);
+	if (error)
+		goto out_drop;
+
+	error = sysfs_create(dentry, mode, init_dir);
+	if (error)
+		goto out_sput;
+
+	inc_nlink(parent->d_inode);
+	dentry->d_op = &sysfs_dentry_ops;
+	d_rehash(dentry);
+
+	*p_dentry = dentry;
+	error = 0;
+	goto out_dput;
+
+ out_sput:
+	sd = dentry->d_fsdata;
+	list_del_init(&sd->s_sibling);
+	sysfs_put(sd);
+ out_drop:
+	d_drop(dentry);
+ out_dput:
+	dput(dentry);
+ out_unlock:
+	mutex_unlock(&parent->d_inode->i_mutex);
 	return error;
 }
 
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 7b9c5bf..b463f17 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -49,30 +49,33 @@ static int sysfs_add_link(struct dentry * parent, const char * name, struct kobj
 {
 	struct sysfs_dirent * parent_sd = parent->d_fsdata;
 	struct sysfs_symlink * sl;
-	int error = 0;
+	int error;
 
 	error = -ENOMEM;
-	sl = kmalloc(sizeof(*sl), GFP_KERNEL);
+	sl = kzalloc(sizeof(*sl), GFP_KERNEL);
 	if (!sl)
-		goto exit1;
+		goto err_out;
 
 	sl->link_name = kmalloc(strlen(name) + 1, GFP_KERNEL);
 	if (!sl->link_name)
-		goto exit2;
+		goto err_out;
 
 	strcpy(sl->link_name, name);
 	sl->target_kobj = kobject_get(target);
 
 	error = sysfs_make_dirent(parent_sd, NULL, sl, S_IFLNK|S_IRWXUGO,
 				SYSFS_KOBJ_LINK);
-	if (!error)
-		return 0;
-
-	kobject_put(target);
-	kfree(sl->link_name);
-exit2:
-	kfree(sl);
-exit1:
+	if (error)
+		goto err_out;
+
+	return 0;
+
+ err_out:
+	if (sl) {
+		kobject_put(sl->target_kobj);
+		kfree(sl->link_name);
+		kfree(sl);
+	}
 	return error;
 }
 
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 20/21] sysfs: separate out sysfs_attach_dentry()
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (16 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 13/21] sysfs: make sysfs_dirent->s_element a union Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 17/21] sysfs: implement sysfs_dirent active reference and immediate disconnect Tejun Heo
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Consolidate sd <-> dentry association into sysfs_attach_dentry() and
call it after dentry and inode are properly set up.  This is in
preparation of sysfs_drop_dentry() updates.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c   |   48 +++++++++++++++++++++---------------------------
 fs/sysfs/inode.c |    4 ++--
 fs/sysfs/sysfs.h |    3 ++-
 3 files changed, 25 insertions(+), 30 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 265042a..1a45a1d 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -130,14 +130,19 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 	return NULL;
 }
 
+static void sysfs_attach_dentry(struct sysfs_dirent *sd, struct dentry *dentry)
+{
+	dentry->d_op = &sysfs_dentry_ops;
+	dentry->d_fsdata = sysfs_get(sd);
+	sd->s_dentry = dentry;
+	d_rehash(dentry);
+}
+
 void sysfs_attach_dirent(struct sysfs_dirent *sd,
 			 struct sysfs_dirent *parent_sd, struct dentry *dentry)
 {
-	if (dentry) {
-		sd->s_dentry = dentry;
-		dentry->d_fsdata = sysfs_get(sd);
-		dentry->d_op = &sysfs_dentry_ops;
-	}
+	if (dentry)
+		sysfs_attach_dentry(sd, dentry);
 
 	if (parent_sd) {
 		sd->s_parent = sysfs_get(parent_sd);
@@ -217,15 +222,13 @@ static int create_dir(struct kobject *kobj, struct dentry *parent,
 	if (!sd)
 		goto out_drop;
 	sd->s_elem.dir.kobj = kobj;
-	sysfs_attach_dirent(sd, parent->d_fsdata, dentry);
 
-	error = sysfs_create(dentry, mode, init_dir);
+	error = sysfs_create(sd, dentry, mode, init_dir);
 	if (error)
 		goto out_sput;
 
 	inc_nlink(parent->d_inode);
-	dentry->d_op = &sysfs_dentry_ops;
-	d_rehash(dentry);
+	sysfs_attach_dirent(sd, parent->d_fsdata, dentry);
 
 	*p_dentry = dentry;
 	error = 0;
@@ -296,36 +299,28 @@ static int sysfs_attach_attr(struct sysfs_dirent * sd, struct dentry * dentry)
                 init = init_file;
         }
 
-	dentry->d_fsdata = sysfs_get(sd);
-	sd->s_dentry = dentry;
-	error = sysfs_create(dentry, (attr->mode & S_IALLUGO) | S_IFREG, init);
-	if (error) {
-		sysfs_put(sd);
+	error = sysfs_create(sd, dentry,
+			     (attr->mode & S_IALLUGO) | S_IFREG, init);
+	if (error)
 		return error;
-	}
 
         if (bin_attr) {
 		dentry->d_inode->i_size = bin_attr->size;
 		dentry->d_inode->i_fop = &bin_fops;
 	}
-	dentry->d_op = &sysfs_dentry_ops;
-	d_rehash(dentry);
+
+	sysfs_attach_dentry(sd, dentry);
 
 	return 0;
 }
 
 static int sysfs_attach_link(struct sysfs_dirent * sd, struct dentry * dentry)
 {
-	int err = 0;
+	int err;
 
-	dentry->d_fsdata = sysfs_get(sd);
-	sd->s_dentry = dentry;
-	err = sysfs_create(dentry, S_IFLNK|S_IRWXUGO, init_symlink);
-	if (!err) {
-		dentry->d_op = &sysfs_dentry_ops;
-		d_rehash(dentry);
-	} else
-		sysfs_put(sd);
+	err = sysfs_create(sd, dentry, S_IFLNK|S_IRWXUGO, init_symlink);
+	if (!err)
+		sysfs_attach_dentry(sd, dentry);
 
 	return err;
 }
@@ -755,7 +750,6 @@ struct dentry *sysfs_create_shadow_dir(struct kobject *kobj)
 	d_instantiate(shadow, igrab(inode));
 	inc_nlink(inode);
 	inc_nlink(parent->d_inode);
-	shadow->d_op = &sysfs_dentry_ops;
 
 	dget(shadow);		/* Extra count - pin the dentry in core */
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index b72d18a..10aa33f 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -155,13 +155,13 @@ struct inode * sysfs_new_inode(mode_t mode, struct sysfs_dirent * sd)
 	return inode;
 }
 
-int sysfs_create(struct dentry * dentry, int mode, int (*init)(struct inode *))
+int sysfs_create(struct sysfs_dirent *sd, struct dentry *dentry, int mode,
+		 int (*init)(struct inode *))
 {
 	int error = 0;
 	struct inode * inode = NULL;
 	if (dentry) {
 		if (!dentry->d_inode) {
-			struct sysfs_dirent * sd = dentry->d_fsdata;
 			if ((inode = sysfs_new_inode(mode, sd))) {
 				if (dentry->d_parent && dentry->d_parent->d_inode) {
 					struct inode *p_inode = dentry->d_parent->d_inode;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 537c64b..06a237e 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -58,7 +58,8 @@ extern struct kmem_cache *sysfs_dir_cachep;
 
 extern void sysfs_delete_inode(struct inode *inode);
 extern struct inode * sysfs_new_inode(mode_t mode, struct sysfs_dirent *);
-extern int sysfs_create(struct dentry *, int mode, int (*init)(struct inode *));
+extern int sysfs_create(struct sysfs_dirent *sd, struct dentry *dentry,
+			int mode, int (*init)(struct inode *));
 
 extern void release_sysfs_dirent(struct sysfs_dirent * sd);
 extern int sysfs_dirent_exist(struct sysfs_dirent *, const unsigned char *);
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 16/21] sysfs: implement bin_buffer
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (12 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 14/21] sysfs: implement kobj_sysfs_assoc_lock Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-05-01 14:25   ` Satyam Sharma
  2007-04-28 13:39 ` [PATCH 18/21] sysfs: kill attribute file orphaning Tejun Heo
                   ` (8 subsequent siblings)
  22 siblings, 1 reply; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Implement bin_buffer which contains a mutex and pointer to PAGE_SIZE
buffer to properly synchronize accesses to per-openfile buffer and
prepare for immediate-kobj-disconnect.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/bin.c |   64 ++++++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 49 insertions(+), 15 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 67a0d50..5dc47fe 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -20,6 +20,11 @@
 
 #include "sysfs.h"
 
+struct bin_buffer {
+	struct mutex	mutex;
+	void		*buffer;
+};
+
 static int
 fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
 {
@@ -36,7 +41,7 @@ fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
 static ssize_t
 read(struct file *file, char __user *userbuf, size_t bytes, loff_t *off)
 {
-	char *buffer = file->private_data;
+	struct bin_buffer *bb = file->private_data;
 	struct dentry *dentry = file->f_path.dentry;
 	int size = dentry->d_inode->i_size;
 	loff_t offs = *off;
@@ -49,17 +54,23 @@ read(struct file *file, char __user *userbuf, size_t bytes, loff_t *off)
 			count = size - offs;
 	}
 
-	count = fill_read(dentry, buffer, offs, count);
+	mutex_lock(&bb->mutex);
+
+	count = fill_read(dentry, bb->buffer, offs, count);
 	if (count < 0)
-		return count;
+		goto out_unlock;
 
-	if (copy_to_user(userbuf, buffer, count))
-		return -EFAULT;
+	if (copy_to_user(userbuf, bb->buffer, count)) {
+		count = -EFAULT;
+		goto out_unlock;
+	}
 
 	pr_debug("offs = %lld, *off = %lld, count = %d\n", offs, *off, count);
 
 	*off = offs + count;
 
+ out_unlock:
+	mutex_unlock(&bb->mutex);
 	return count;
 }
 
@@ -79,7 +90,7 @@ flush_write(struct dentry *dentry, char *buffer, loff_t offset, size_t count)
 static ssize_t write(struct file *file, const char __user *userbuf,
 		     size_t bytes, loff_t *off)
 {
-	char *buffer = file->private_data;
+	struct bin_buffer *bb = file->private_data;
 	struct dentry *dentry = file->f_path.dentry;
 	int size = dentry->d_inode->i_size;
 	loff_t offs = *off;
@@ -92,25 +103,38 @@ static ssize_t write(struct file *file, const char __user *userbuf,
 			count = size - offs;
 	}
 
-	if (copy_from_user(buffer, userbuf, count))
-		return -EFAULT;
+	mutex_lock(&bb->mutex);
+
+	if (copy_from_user(bb->buffer, userbuf, count)) {
+		count = -EFAULT;
+		goto out_unlock;
+	}
 
-	count = flush_write(dentry, buffer, offs, count);
+	count = flush_write(dentry, bb->buffer, offs, count);
 	if (count > 0)
 		*off = offs + count;
+
+ out_unlock:
+	mutex_unlock(&bb->mutex);
 	return count;
 }
 
 static int mmap(struct file *file, struct vm_area_struct *vma)
 {
+	struct bin_buffer *bb = file->private_data;
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	struct kobject *kobj = to_kobj(file->f_path.dentry->d_parent);
+	int rc;
 
 	if (!attr->mmap)
 		return -EINVAL;
 
-	return attr->mmap(kobj, attr, vma);
+	mutex_lock(&bb->mutex);
+	rc = attr->mmap(kobj, attr, vma);
+	mutex_unlock(&bb->mutex);
+
+	return rc;
 }
 
 static int open(struct inode * inode, struct file * file)
@@ -118,6 +142,7 @@ static int open(struct inode * inode, struct file * file)
 	struct kobject *kobj = sysfs_get_kobject(file->f_path.dentry->d_parent);
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
+	struct bin_buffer *bb = NULL;
 	int error = -EINVAL;
 
 	if (!kobj || !attr)
@@ -135,14 +160,22 @@ static int open(struct inode * inode, struct file * file)
 		goto Error;
 
 	error = -ENOMEM;
-	file->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
-	if (!file->private_data)
+	bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+	if (!bb)
 		goto Error;
 
+	bb->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!bb->buffer)
+		goto Error;
+
+	mutex_init(&bb->mutex);
+	file->private_data = bb;
+
 	error = 0;
-    goto Done;
+	goto Done;
 
  Error:
+	kfree(bb);
 	module_put(attr->attr.owner);
  Done:
 	if (error)
@@ -155,11 +188,12 @@ static int release(struct inode * inode, struct file * file)
 	struct kobject * kobj = to_kobj(file->f_path.dentry->d_parent);
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
-	u8 * buffer = file->private_data;
+	struct bin_buffer *bb = file->private_data;
 
 	kobject_put(kobj);
 	module_put(attr->attr.owner);
-	kfree(buffer);
+	kfree(bb->buffer);
+	kfree(bb);
 	return 0;
 }
 
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 17/21] sysfs: implement sysfs_dirent active reference and immediate disconnect
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (17 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 20/21] sysfs: separate out sysfs_attach_dentry() Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 19/21] sysfs: kill unnecessary attribute->owner Tejun Heo
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

[PATCH] sysfs: implement sysfs_dirent active reference and immediate disconnect

Opening a sysfs node references its associated kobject, so userland
can arbitrarily prolong lifetime of a kobject which complicates
lifetime rules in drivers.  This patch implements active reference and
makes the association between kobject and sysfs immediately breakable.

Now each sysfs_dirent has two reference counts - s_count and s_active.
s_count is a regular reference count which guarantees that the
containing sysfs_dirent is accessible.  As long as s_count reference
is held, all sysfs internal fields in sysfs_dirent are accessible
including s_parent and s_name.

The newly added s_active is active reference count.  This is acquired
by invoking sysfs_get_active() and it's the caller's responsibility to
ensure sysfs_dirent itself is accessible (should be holding s_count
one way or the other).  Dereferencing sysfs_dirent to access objects
out of sysfs proper requires active reference.  This includes access
to the associated kobjects, attributes and ops.

The active references can be drained and denied by calling
sysfs_deactivate().  All active sysfs_dirents must be deactivated
after deletion but before the default reference is dropped.  This
enables immediate disconnect of sysfs nodes.  Once a sysfs_dirent is
deleted, it won't access any entity external to sysfs proper.

Because attr/bin_attr ops access both the node itself and its parent
for kobject, they need to hold active references to both.
sysfs_get/put_active_two() helpers are provided to help grabbing both
references.  Parent's is acquired first and released last.

Unlike other operations, mmapped area lingers on after mmap() is
finished and the module implement implementing it and kobj need to
stay referenced till all the mapped pages are gone.  This is
accomplished by holding one set of active references to the bin_attr
and its parent if there have been any mmap during lifetime of an
openfile.  The references are dropped when the openfile is released.

This change makes sysfs lifetime rules independent from both kobject's
and module's.  It not only fixes several race conditions caused by
sysfs not holding onto the proper module when referencing kobject, but
also helps fixing and simplifying lifetime management in driver model
and drivers by taking sysfs out of the equation.

Please read the following message for more info.

  http://article.gmane.org/gmane.linux.kernel/510293

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/bin.c   |   95 ++++++++++++++++++++++++++--------------
 fs/sysfs/dir.c   |   28 ++++++++++-
 fs/sysfs/file.c  |  130 +++++++++++++++++++++++++++++++----------------------
 fs/sysfs/inode.c |    8 +++-
 fs/sysfs/sysfs.h |  123 ++++++++++++++++++++++++++++++++++++++++++---------
 5 files changed, 271 insertions(+), 113 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 5dc47fe..618b8ae 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -23,6 +23,7 @@
 struct bin_buffer {
 	struct mutex	mutex;
 	void		*buffer;
+	int		mmapped;
 };
 
 static int
@@ -30,12 +31,20 @@ fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
 {
 	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
-	struct kobject * kobj = to_kobj(dentry->d_parent);
+	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
+	int rc;
+
+	/* need attr_sd for attr, its parent for kobj */
+	if (!sysfs_get_active_two(attr_sd))
+		return -ENODEV;
 
-	if (!attr->read)
-		return -EIO;
+	rc = -EIO;
+	if (attr->read)
+		rc = attr->read(kobj, buffer, off, count);
 
-	return attr->read(kobj, buffer, off, count);
+	sysfs_put_active_two(attr_sd);
+
+	return rc;
 }
 
 static ssize_t
@@ -79,12 +88,20 @@ flush_write(struct dentry *dentry, char *buffer, loff_t offset, size_t count)
 {
 	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
-	struct kobject *kobj = to_kobj(dentry->d_parent);
+	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
+	int rc;
+
+	/* need attr_sd for attr, its parent for kobj */
+	if (!sysfs_get_active_two(attr_sd))
+		return -ENODEV;
 
-	if (!attr->write)
-		return -EIO;
+	rc = -EIO;
+	if (attr->write)
+		rc = attr->write(kobj, buffer, offset, count);
 
-	return attr->write(kobj, buffer, offset, count);
+	sysfs_put_active_two(attr_sd);
+
+	return rc;
 }
 
 static ssize_t write(struct file *file, const char __user *userbuf,
@@ -124,14 +141,24 @@ static int mmap(struct file *file, struct vm_area_struct *vma)
 	struct bin_buffer *bb = file->private_data;
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
-	struct kobject *kobj = to_kobj(file->f_path.dentry->d_parent);
+	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
 	int rc;
 
-	if (!attr->mmap)
-		return -EINVAL;
-
 	mutex_lock(&bb->mutex);
-	rc = attr->mmap(kobj, attr, vma);
+
+	/* need attr_sd for attr, its parent for kobj */
+	if (!sysfs_get_active_two(attr_sd))
+		return -ENODEV;
+
+	rc = -EINVAL;
+	if (attr->mmap)
+		rc = attr->mmap(kobj, attr, vma);
+
+	if (rc == 0 && !bb->mmapped)
+		bb->mmapped = 1;
+	else
+		sysfs_put_active_two(attr_sd);
+
 	mutex_unlock(&bb->mutex);
 
 	return rc;
@@ -139,58 +166,60 @@ static int mmap(struct file *file, struct vm_area_struct *vma)
 
 static int open(struct inode * inode, struct file * file)
 {
-	struct kobject *kobj = sysfs_get_kobject(file->f_path.dentry->d_parent);
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	struct bin_buffer *bb = NULL;
-	int error = -EINVAL;
+	int error;
 
-	if (!kobj || !attr)
-		goto Done;
+	/* need attr_sd for attr */
+	if (!sysfs_get_active(attr_sd))
+		return -ENODEV;
 
-	/* Grab the module reference for this attribute if we have one */
+	/* Grab the module reference for this attribute */
 	error = -ENODEV;
-	if (!try_module_get(attr->attr.owner)) 
-		goto Done;
+	if (!try_module_get(attr->attr.owner))
+		goto err_sput;
 
 	error = -EACCES;
 	if ((file->f_mode & FMODE_WRITE) && !(attr->write || attr->mmap))
-		goto Error;
+		goto err_mput;
 	if ((file->f_mode & FMODE_READ) && !(attr->read || attr->mmap))
-		goto Error;
+		goto err_mput;
 
 	error = -ENOMEM;
 	bb = kzalloc(sizeof(*bb), GFP_KERNEL);
 	if (!bb)
-		goto Error;
+		goto err_mput;
 
 	bb->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
 	if (!bb->buffer)
-		goto Error;
+		goto err_mput;
 
 	mutex_init(&bb->mutex);
 	file->private_data = bb;
 
-	error = 0;
-	goto Done;
+	/* open succeeded, put active reference and pin attr_sd */
+	sysfs_put_active(attr_sd);
+	sysfs_get(attr_sd);
+	return 0;
 
- Error:
-	kfree(bb);
+ err_mput:
 	module_put(attr->attr.owner);
- Done:
-	if (error)
-		kobject_put(kobj);
+ err_sput:
+	sysfs_put_active(attr_sd);
+	kfree(bb);
 	return error;
 }
 
 static int release(struct inode * inode, struct file * file)
 {
-	struct kobject * kobj = to_kobj(file->f_path.dentry->d_parent);
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	struct bin_buffer *bb = file->private_data;
 
-	kobject_put(kobj);
+	if (bb->mmapped)
+		sysfs_put_active_two(attr_sd);
+	sysfs_put(attr_sd);
 	module_put(attr->attr.owner);
 	kfree(bb->buffer);
 	kfree(bb);
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 36d47d3..265042a 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -52,6 +52,19 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
  repeat:
 	parent_sd = sd->s_parent;
 
+	/* If @sd is being released after deletion, s_active is write
+	 * locked.  If @sd is cursor for directory walk or being
+	 * released prematurely, s_active has no reader or writer.
+	 *
+	 * sysfs_deactivate() lies to lockdep that s_active is
+	 * unlocked immediately.  Lie one more time to cover the
+	 * previous lie.
+	 */
+	if (!down_write_trylock(&sd->s_active))
+		rwsem_acquire(&sd->s_active.dep_map,
+			      SYSFS_S_ACTIVE_DEACTIVATE, 0, _RET_IP_);
+	up_write(&sd->s_active);
+
 	if (sd->s_type & SYSFS_KOBJ_LINK)
 		sysfs_put(sd->s_elem.symlink.target_sd);
 	if (sd->s_type & SYSFS_COPY_NAME)
@@ -101,6 +114,7 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_event, 1);
+	init_rwsem(&sd->s_active);
 	INIT_LIST_HEAD(&sd->s_children);
 	INIT_LIST_HEAD(&sd->s_sibling);
 
@@ -353,7 +367,6 @@ static void remove_dir(struct dentry * d)
 	d_delete(d);
 	sd = d->d_fsdata;
  	list_del_init(&sd->s_sibling);
-	sysfs_put(sd);
 	if (d->d_inode)
 		simple_rmdir(parent->d_inode,d);
 
@@ -362,6 +375,9 @@ static void remove_dir(struct dentry * d)
 
 	mutex_unlock(&parent->d_inode->i_mutex);
 	dput(parent);
+
+	sysfs_deactivate(sd);
+	sysfs_put(sd);
 }
 
 void sysfs_remove_subdir(struct dentry * d)
@@ -372,6 +388,7 @@ void sysfs_remove_subdir(struct dentry * d)
 
 static void __sysfs_remove_dir(struct dentry *dentry)
 {
+	LIST_HEAD(removed);
 	struct sysfs_dirent * parent_sd;
 	struct sysfs_dirent * sd, * tmp;
 
@@ -385,12 +402,17 @@ static void __sysfs_remove_dir(struct dentry *dentry)
 	list_for_each_entry_safe(sd, tmp, &parent_sd->s_children, s_sibling) {
 		if (!sd->s_type || !(sd->s_type & SYSFS_NOT_PINNED))
 			continue;
-		list_del_init(&sd->s_sibling);
+		list_move(&sd->s_sibling, &removed);
 		sysfs_drop_dentry(sd, dentry);
-		sysfs_put(sd);
 	}
 	mutex_unlock(&dentry->d_inode->i_mutex);
 
+	list_for_each_entry_safe(sd, tmp, &removed, s_sibling) {
+		list_del_init(&sd->s_sibling);
+		sysfs_deactivate(sd);
+		sysfs_put(sd);
+	}
+
 	remove_dir(dentry);
 	/**
 	 * Drop reference from dget() on entrance.
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index b52b4da..df0e0c0 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -88,8 +88,8 @@ remove_from_collection(struct sysfs_buffer *buffer, struct inode *node)
  */
 static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer)
 {
-	struct sysfs_dirent * sd = dentry->d_fsdata;
-	struct kobject * kobj = to_kobj(dentry->d_parent);
+	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
+	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
 	struct sysfs_ops * ops = buffer->ops;
 	int ret = 0;
 	ssize_t count;
@@ -99,8 +99,15 @@ static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer
 	if (!buffer->page)
 		return -ENOMEM;
 
-	buffer->event = atomic_read(&sd->s_event);
-	count = ops->show(kobj, sd->s_elem.attr.attr, buffer->page);
+	/* need attr_sd for attr and ops, its parent for kobj */
+	if (!sysfs_get_active_two(attr_sd))
+		return -ENODEV;
+
+	buffer->event = atomic_read(&attr_sd->s_event);
+	count = ops->show(kobj, attr_sd->s_elem.attr.attr, buffer->page);
+
+	sysfs_put_active_two(attr_sd);
+
 	BUG_ON(count > (ssize_t)PAGE_SIZE);
 	if (count >= 0) {
 		buffer->needs_read_fill = 0;
@@ -225,14 +232,23 @@ fill_write_buffer(struct sysfs_buffer * buffer, const char __user * buf, size_t
  *	passing the buffer that we acquired in fill_write_buffer().
  */
 
-static int 
+static int
 flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t count)
 {
 	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
-	struct kobject * kobj = to_kobj(dentry->d_parent);
+	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
 	struct sysfs_ops * ops = buffer->ops;
+	int rc;
+
+	/* need attr_sd for attr and ops, its parent for kobj */
+	if (!sysfs_get_active_two(attr_sd))
+		return -ENODEV;
+
+	rc = ops->store(kobj, attr_sd->s_elem.attr.attr, buffer->page, count);
+
+	sysfs_put_active_two(attr_sd);
 
-	return ops->store(kobj, attr_sd->s_elem.attr.attr, buffer->page, count);
+	return rc;
 }
 
 
@@ -276,22 +292,22 @@ out:
 
 static int sysfs_open_file(struct inode *inode, struct file *file)
 {
-	struct kobject *kobj = sysfs_get_kobject(file->f_path.dentry->d_parent);
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct attribute *attr = attr_sd->s_elem.attr.attr;
+	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
 	struct sysfs_buffer_collection *set;
 	struct sysfs_buffer * buffer;
 	struct sysfs_ops * ops = NULL;
-	int error = 0;
+	int error;
 
-	if (!kobj || !attr)
-		goto Einval;
+	/* need attr_sd for attr and ops, its parent for kobj */
+	if (!sysfs_get_active_two(attr_sd))
+		return -ENODEV;
 
-	/* Grab the module reference for this attribute if we have one */
-	if (!try_module_get(attr->owner)) {
-		error = -ENODEV;
-		goto Done;
-	}
+	/* Grab the module reference for this attribute */
+	error = -ENODEV;
+	if (!try_module_get(attr->owner))
+		goto err_sput;
 
 	/* if the kobject has no ktype, then we assume that it is a subsystem
 	 * itself, and use ops for it.
@@ -306,30 +322,30 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	/* No sysfs operations, either from having no subsystem,
 	 * or the subsystem have no operations.
 	 */
+	error = -EACCES;
 	if (!ops)
-		goto Eaccess;
+		goto err_mput;
 
 	/* make sure we have a collection to add our buffers to */
 	mutex_lock(&inode->i_mutex);
 	if (!(set = inode->i_private)) {
-		if (!(set = inode->i_private = kmalloc(sizeof(struct sysfs_buffer_collection), GFP_KERNEL))) {
-			error = -ENOMEM;
-			goto Done;
-		} else {
+		error = -ENOMEM;
+		if (!(set = inode->i_private = kmalloc(sizeof(struct sysfs_buffer_collection), GFP_KERNEL)))
+			goto err_mput;
+		else
 			INIT_LIST_HEAD(&set->associates);
-		}
 	}
 	mutex_unlock(&inode->i_mutex);
 
+	error = -EACCES;
+
 	/* File needs write support.
 	 * The inode's perms must say it's ok, 
 	 * and we must have a store method.
 	 */
 	if (file->f_mode & FMODE_WRITE) {
-
 		if (!(inode->i_mode & S_IWUGO) || !ops->store)
-			goto Eaccess;
-
+			goto err_mput;
 	}
 
 	/* File needs read support.
@@ -338,46 +354,45 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	 */
 	if (file->f_mode & FMODE_READ) {
 		if (!(inode->i_mode & S_IRUGO) || !ops->show)
-			goto Eaccess;
+			goto err_mput;
 	}
 
 	/* No error? Great, allocate a buffer for the file, and store it
 	 * it in file->private_data for easy access.
 	 */
+	error = -ENOMEM;
 	buffer = kzalloc(sizeof(struct sysfs_buffer), GFP_KERNEL);
-	if (buffer) {
-		INIT_LIST_HEAD(&buffer->associates);
-		init_MUTEX(&buffer->sem);
-		buffer->needs_read_fill = 1;
-		buffer->ops = ops;
-		add_to_collection(buffer, inode);
-		file->private_data = buffer;
-	} else
-		error = -ENOMEM;
-	goto Done;
+	if (!buffer)
+		goto err_mput;
 
- Einval:
-	error = -EINVAL;
-	goto Done;
- Eaccess:
-	error = -EACCES;
+	INIT_LIST_HEAD(&buffer->associates);
+	init_MUTEX(&buffer->sem);
+	buffer->needs_read_fill = 1;
+	buffer->ops = ops;
+	add_to_collection(buffer, inode);
+	file->private_data = buffer;
+
+	/* open succeeded, put active references and pin attr_sd */
+	sysfs_put_active_two(attr_sd);
+	sysfs_get(attr_sd);
+	return 0;
+
+ err_mput:
 	module_put(attr->owner);
- Done:
-	if (error)
-		kobject_put(kobj);
+ err_sput:
+	sysfs_put_active_two(attr_sd);
 	return error;
 }
 
 static int sysfs_release(struct inode * inode, struct file * filp)
 {
-	struct kobject * kobj = to_kobj(filp->f_path.dentry->d_parent);
 	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
 	struct attribute *attr = attr_sd->s_elem.attr.attr;
 	struct sysfs_buffer * buffer = filp->private_data;
 
 	if (buffer)
 		remove_from_collection(buffer, inode);
-	kobject_put(kobj);
+	sysfs_put(attr_sd);
 	/* After this point, attr should not be accessed. */
 	module_put(attr->owner);
 
@@ -406,18 +421,25 @@ static int sysfs_release(struct inode * inode, struct file * filp)
 static unsigned int sysfs_poll(struct file *filp, poll_table *wait)
 {
 	struct sysfs_buffer * buffer = filp->private_data;
-	struct kobject * kobj = to_kobj(filp->f_path.dentry->d_parent);
-	struct sysfs_dirent * sd = filp->f_path.dentry->d_fsdata;
-	int res = 0;
+	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
+	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
+
+	/* need parent for the kobj, grab both */
+	if (!sysfs_get_active_two(attr_sd))
+		goto trigger;
 
 	poll_wait(filp, &kobj->poll, wait);
 
-	if (buffer->event != atomic_read(&sd->s_event)) {
-		res = POLLERR|POLLPRI;
-		buffer->needs_read_fill = 1;
-	}
+	sysfs_put_active_two(attr_sd);
 
-	return res;
+	if (buffer->event != atomic_read(&attr_sd->s_event))
+		goto trigger;
+
+	return 0;
+
+ trigger:
+	buffer->needs_read_fill = 1;
+	return POLLERR|POLLPRI;
 }
 
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 9085dc4..a222064 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -260,12 +260,16 @@ int sysfs_hash_and_remove(struct dentry * dir, const char * name)
 		if (!strcmp(sd->s_name, name)) {
 			list_del_init(&sd->s_sibling);
 			sysfs_drop_dentry(sd, dir);
-			sysfs_put(sd);
 			found = 1;
 			break;
 		}
 	}
 	mutex_unlock(&dir->d_inode->i_mutex);
 
-	return found ? 0 : -ENOENT;
+	if (!found)
+		return -ENOENT;
+
+	sysfs_deactivate(sd);
+	sysfs_put(sd);
+	return 0;
 }
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index e1a213d..47d5c5a 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -14,8 +14,14 @@ struct sysfs_elem_bin_attr {
 	struct bin_attribute	* bin_attr;
 };
 
+/*
+ * As long as s_count reference is held, the sysfs_dirent itself is
+ * accessible.  Dereferencing s_elem or any other outer entity
+ * requires s_active reference.
+ */
 struct sysfs_dirent {
 	atomic_t		s_count;
+	struct rw_semaphore	s_active;
 	struct sysfs_dirent	* s_parent;
 	struct list_head	s_sibling;
 	struct list_head	s_children;
@@ -36,6 +42,17 @@ struct sysfs_dirent {
 	atomic_t		s_event;
 };
 
+/*
+ * A sysfs file which deletes another file when written to need to
+ * write lock the s_active of the victim while its s_active is read
+ * locked for the write operation.  Tell lockdep that this is okay.
+ */
+enum sysfs_s_active_class
+{
+	SYSFS_S_ACTIVE_NORMAL,		/* file r/w access, etc - default */
+	SYSFS_S_ACTIVE_DEACTIVATE,	/* file deactivation */
+};
+
 extern struct vfsmount * sysfs_mount;
 extern struct kmem_cache *sysfs_dir_cachep;
 
@@ -86,43 +103,107 @@ struct sysfs_buffer_collection {
 	struct list_head	associates;
 };
 
-static inline struct kobject * to_kobj(struct dentry * dentry)
+static inline struct sysfs_dirent * sysfs_get(struct sysfs_dirent * sd)
 {
-	struct sysfs_dirent * sd = dentry->d_fsdata;
-	return sd->s_elem.dir.kobj;
+	if (sd) {
+		WARN_ON(!atomic_read(&sd->s_count));
+		atomic_inc(&sd->s_count);
+	}
+	return sd;
 }
 
-static inline struct kobject *sysfs_get_kobject(struct dentry *dentry)
+static inline void sysfs_put(struct sysfs_dirent * sd)
 {
-	struct kobject * kobj = NULL;
-
-	spin_lock(&dcache_lock);
-	if (!d_unhashed(dentry)) {
-		struct sysfs_dirent * sd = dentry->d_fsdata;
-
-		if (sd->s_type & SYSFS_KOBJ_LINK)
-			sd = sd->s_elem.symlink.target_sd;
+	if (sd && atomic_dec_and_test(&sd->s_count))
+		release_sysfs_dirent(sd);
+}
 
-		kobj = kobject_get(sd->s_elem.dir.kobj);
+/**
+ *	sysfs_get_active - get an active reference to sysfs_dirent
+ *	@sd: sysfs_dirent to get an active reference to
+ *
+ *	Get an active reference of @sd.  This function is noop if @sd
+ *	is NULL.
+ *
+ *	RETURNS:
+ *	Pointer to @sd on success, NULL on failure.
+ */
+static inline struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
+{
+	if (sd) {
+		if (unlikely(!down_read_trylock(&sd->s_active)))
+			sd = NULL;
 	}
-	spin_unlock(&dcache_lock);
+	return sd;
+}
 
-	return kobj;
+/**
+ *	sysfs_put_active - put an active reference to sysfs_dirent
+ *	@sd: sysfs_dirent to put an active reference to
+ *
+ *	Put an active reference to @sd.  This function is noop if @sd
+ *	is NULL.
+ */
+static inline void sysfs_put_active(struct sysfs_dirent *sd)
+{
+	if (sd)
+		up_read(&sd->s_active);
 }
 
-static inline struct sysfs_dirent * sysfs_get(struct sysfs_dirent * sd)
+/**
+ *	sysfs_get_active_two - get active references to sysfs_dirent and parent
+ *	@sd: sysfs_dirent of interest
+ *
+ *	Get active reference to @sd and its parent.  Parent's active
+ *	reference is grabbed first.  This function is noop if @sd is
+ *	NULL.
+ *
+ *	RETURNS:
+ *	Pointer to @sd on success, NULL on failure.
+ */
+static inline struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd)
 {
 	if (sd) {
-		WARN_ON(!atomic_read(&sd->s_count));
-		atomic_inc(&sd->s_count);
+		if (sd->s_parent && unlikely(!sysfs_get_active(sd->s_parent)))
+			return NULL;
+		if (unlikely(!sysfs_get_active(sd))) {
+			sysfs_put_active(sd->s_parent);
+			return NULL;
+		}
 	}
 	return sd;
 }
 
-static inline void sysfs_put(struct sysfs_dirent * sd)
+/**
+ *	sysfs_put_active_two - put active references to sysfs_dirent and parent
+ *	@sd: sysfs_dirent of interest
+ *
+ *	Put active references to @sd and its parent.  This function is
+ *	noop if @sd is NULL.
+ */
+static inline void sysfs_put_active_two(struct sysfs_dirent *sd)
 {
-	if (sd && atomic_dec_and_test(&sd->s_count))
-		release_sysfs_dirent(sd);
+	if (sd) {
+		sysfs_put_active(sd);
+		sysfs_put_active(sd->s_parent);
+	}
+}
+
+/**
+ *	sysfs_deactivate - deactivate sysfs_dirent
+ *	@sd: sysfs_dirent to deactivate
+ *
+ *	Deny new active references and drain existing ones.  s_active
+ *	will be unlocked when the sysfs_dirent is released.
+ */
+static inline void sysfs_deactivate(struct sysfs_dirent *sd)
+{
+	down_write_nested(&sd->s_active, SYSFS_S_ACTIVE_DEACTIVATE);
+
+	/* s_active will be unlocked by the thread doing the final put
+	 * on @sd.  Lie to lockdep.
+	 */
+	rwsem_release(&sd->s_active.dep_map, 1, _RET_IP_);
 }
 
 static inline int sysfs_is_shadowed_inode(struct inode *inode)
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 15/21] sysfs: reimplement symlink using sysfs_dirent tree
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (14 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 18/21] sysfs: kill attribute file orphaning Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 13/21] sysfs: make sysfs_dirent->s_element a union Tejun Heo
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

sysfs symlink is implemented by referencing dentry and kobject from
sysfs_dirent - symlink entry references kobject, dentry is used to
walk the tree.  This complicates object lifetimes rules and is
dangerous - for example, there is no way to tell to which module the
target of a symlink belongs and referencing that kobject can make it
linger after the module is gone.

This patch reimplements symlink using only sysfs_dirent tree.  sd for
a symlink points and holds reference to the target sysfs_dirent and
all walking is done using sysfs_dirent tree.  Simpler and safer.

Please read the following message for more info.

  http://article.gmane.org/gmane.linux.kernel/510293

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c     |    2 +-
 fs/sysfs/symlink.c |   88 +++++++++++++++++++++++++++------------------------
 fs/sysfs/sysfs.h   |    9 +++--
 3 files changed, 53 insertions(+), 46 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 2fddf8a..36d47d3 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -53,7 +53,7 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 	parent_sd = sd->s_parent;
 
 	if (sd->s_type & SYSFS_KOBJ_LINK)
-		kobject_put(sd->s_elem.symlink.target_kobj);
+		sysfs_put(sd->s_elem.symlink.target_sd);
 	if (sd->s_type & SYSFS_COPY_NAME)
 		kfree(sd->s_name);
 	kfree(sd->s_iattr);
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 27df635..ff605d3 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -11,50 +11,49 @@
 
 #include "sysfs.h"
 
-static int object_depth(struct kobject * kobj)
+static int object_depth(struct sysfs_dirent *sd)
 {
-	struct kobject * p = kobj;
 	int depth = 0;
-	do { depth++; } while ((p = p->parent));
+
+	for (; sd->s_parent; sd = sd->s_parent)
+		depth++;
+
 	return depth;
 }
 
-static int object_path_length(struct kobject * kobj)
+static int object_path_length(struct sysfs_dirent * sd)
 {
-	struct kobject * p = kobj;
 	int length = 1;
-	do {
-		length += strlen(kobject_name(p)) + 1;
-		p = p->parent;
-	} while (p);
+
+	for (; sd->s_parent; sd = sd->s_parent)
+		length += strlen(sd->s_name) + 1;
+
 	return length;
 }
 
-static void fill_object_path(struct kobject * kobj, char * buffer, int length)
+static void fill_object_path(struct sysfs_dirent *sd, char *buffer, int length)
 {
-	struct kobject * p;
-
 	--length;
-	for (p = kobj; p; p = p->parent) {
-		int cur = strlen(kobject_name(p));
+	for (; sd->s_parent; sd = sd->s_parent) {
+		int cur = strlen(sd->s_name);
 
 		/* back up enough to print this bus id with '/' */
 		length -= cur;
-		strncpy(buffer + length,kobject_name(p),cur);
+		strncpy(buffer + length, sd->s_name, cur);
 		*(buffer + --length) = '/';
 	}
 }
 
-static int sysfs_add_link(struct dentry * parent, const char * name, struct kobject * target)
+static int sysfs_add_link(struct sysfs_dirent * parent_sd, const char * name,
+			  struct sysfs_dirent * target_sd)
 {
-	struct sysfs_dirent * parent_sd = parent->d_fsdata;
 	struct sysfs_dirent * sd;
 
 	sd = sysfs_new_dirent(name, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
 	if (!sd)
 		return -ENOMEM;
 
-	sd->s_elem.symlink.target_kobj = kobject_get(target);
+	sd->s_elem.symlink.target_sd = target_sd;
 	sysfs_attach_dirent(sd, parent_sd, NULL);
 	return 0;
 }
@@ -68,6 +67,8 @@ static int sysfs_add_link(struct dentry * parent, const char * name, struct kobj
 int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char * name)
 {
 	struct dentry *dentry = NULL;
+	struct sysfs_dirent *parent_sd = NULL;
+	struct sysfs_dirent *target_sd = NULL;
 	int error = -EEXIST;
 
 	BUG_ON(!name);
@@ -80,11 +81,27 @@ int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char
 
 	if (!dentry)
 		return -EFAULT;
+	parent_sd = dentry->d_fsdata;
+
+	/* target->dentry can go away beneath us but is protected with
+	 * kobj_sysfs_assoc_lock.  Fetch target_sd from it.
+	 */
+	spin_lock(&kobj_sysfs_assoc_lock);
+	if (target->dentry)
+		target_sd = sysfs_get(target->dentry->d_fsdata);
+	spin_unlock(&kobj_sysfs_assoc_lock);
+
+	if (!target_sd)
+		return -ENOENT;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
 	if (!sysfs_dirent_exist(dentry->d_fsdata, name))
-		error = sysfs_add_link(dentry, name, target);
+		error = sysfs_add_link(parent_sd, name, target_sd);
 	mutex_unlock(&dentry->d_inode->i_mutex);
+
+	if (error)
+		sysfs_put(target_sd);
+
 	return error;
 }
 
@@ -100,14 +117,14 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
 	sysfs_hash_and_remove(kobj->dentry,name);
 }
 
-static int sysfs_get_target_path(struct kobject * kobj, struct kobject * target,
-				 char *path)
+static int sysfs_get_target_path(struct sysfs_dirent * parent_sd,
+				 struct sysfs_dirent * target_sd, char *path)
 {
 	char * s;
 	int depth, size;
 
-	depth = object_depth(kobj);
-	size = object_path_length(target) + depth * 3 - 1;
+	depth = object_depth(parent_sd);
+	size = object_path_length(target_sd) + depth * 3 - 1;
 	if (size > PATH_MAX)
 		return -ENAMETOOLONG;
 
@@ -116,7 +133,7 @@ static int sysfs_get_target_path(struct kobject * kobj, struct kobject * target,
 	for (s = path; depth--; s += 3)
 		strcpy(s,"../");
 
-	fill_object_path(target, path, size);
+	fill_object_path(target_sd, path, size);
 	pr_debug("%s: path = '%s'\n", __FUNCTION__, path);
 
 	return 0;
@@ -124,27 +141,16 @@ static int sysfs_get_target_path(struct kobject * kobj, struct kobject * target,
 
 static int sysfs_getlink(struct dentry *dentry, char * path)
 {
-	struct kobject *kobj, *target_kobj;
-	int error = 0;
-
-	kobj = sysfs_get_kobject(dentry->d_parent);
-	if (!kobj)
-		return -EINVAL;
-
-	target_kobj = sysfs_get_kobject(dentry);
-	if (!target_kobj) {
-		kobject_put(kobj);
-		return -EINVAL;
-	}
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	struct sysfs_dirent *parent_sd = sd->s_parent;
+	struct sysfs_dirent *target_sd = sd->s_elem.symlink.target_sd;
+	int error;
 
 	down_read(&sysfs_rename_sem);
-	error = sysfs_get_target_path(kobj, target_kobj, path);
+	error = sysfs_get_target_path(parent_sd, target_sd, path);
 	up_read(&sysfs_rename_sem);
-	
-	kobject_put(kobj);
-	kobject_put(target_kobj);
-	return error;
 
+	return error;
 }
 
 static void *sysfs_follow_link(struct dentry *dentry, struct nameidata *nd)
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index e37b6e8..e1a213d 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -3,7 +3,7 @@ struct sysfs_elem_dir {
 };
 
 struct sysfs_elem_symlink {
-	struct kobject		* target_kobj;
+	struct sysfs_dirent	* target_sd;
 };
 
 struct sysfs_elem_attr {
@@ -99,10 +99,11 @@ static inline struct kobject *sysfs_get_kobject(struct dentry *dentry)
 	spin_lock(&dcache_lock);
 	if (!d_unhashed(dentry)) {
 		struct sysfs_dirent * sd = dentry->d_fsdata;
+
 		if (sd->s_type & SYSFS_KOBJ_LINK)
-			kobj = kobject_get(sd->s_elem.symlink.target_kobj);
-		else
-			kobj = kobject_get(sd->s_elem.dir.kobj);
+			sd = sd->s_elem.symlink.target_sd;
+
+		kobj = kobject_get(sd->s_elem.dir.kobj);
 	}
 	spin_unlock(&dcache_lock);
 
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 13/21] sysfs: make sysfs_dirent->s_element a union
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (15 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 15/21] sysfs: reimplement symlink using sysfs_dirent tree Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 20/21] sysfs: separate out sysfs_attach_dentry() Tejun Heo
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Make sd->s_element a union of sysfs_elem_{dir|symlink|attr|bin_attr}
and rename it to s_elem.  This is to achieve...

* some level of type checking : changing symlink to point to
  sysfs_dirent instead of kobject is much safer and less painful now.
* easier / standardized dereferencing
* allow sysfs_elem_* to contain more than one entry

Where possible, pointer is obtained by directly deferencing from sd
instead of going through other entities.  This reduces dependencies to
dentry, inode and kobject.  to_attr() and to_bin_attr() are unused now
and removed.

This is in preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/bin.c     |   18 ++++++++++------
 fs/sysfs/dir.c     |   31 +++++++++++++---------------
 fs/sysfs/file.c    |   19 +++++++++--------
 fs/sysfs/inode.c   |    2 +-
 fs/sysfs/mount.c   |    1 -
 fs/sysfs/symlink.c |   23 +++-----------------
 fs/sysfs/sysfs.h   |   56 ++++++++++++++++++++++++++++-----------------------
 7 files changed, 71 insertions(+), 79 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 606267a..67a0d50 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -23,7 +23,8 @@
 static int
 fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
 {
-	struct bin_attribute * attr = to_bin_attr(dentry);
+	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	struct kobject * kobj = to_kobj(dentry->d_parent);
 
 	if (!attr->read)
@@ -65,7 +66,8 @@ read(struct file *file, char __user *userbuf, size_t bytes, loff_t *off)
 static int
 flush_write(struct dentry *dentry, char *buffer, loff_t offset, size_t count)
 {
-	struct bin_attribute *attr = to_bin_attr(dentry);
+	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	struct kobject *kobj = to_kobj(dentry->d_parent);
 
 	if (!attr->write)
@@ -101,9 +103,9 @@ static ssize_t write(struct file *file, const char __user *userbuf,
 
 static int mmap(struct file *file, struct vm_area_struct *vma)
 {
-	struct dentry *dentry = file->f_path.dentry;
-	struct bin_attribute *attr = to_bin_attr(dentry);
-	struct kobject *kobj = to_kobj(dentry->d_parent);
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
+	struct kobject *kobj = to_kobj(file->f_path.dentry->d_parent);
 
 	if (!attr->mmap)
 		return -EINVAL;
@@ -114,7 +116,8 @@ static int mmap(struct file *file, struct vm_area_struct *vma)
 static int open(struct inode * inode, struct file * file)
 {
 	struct kobject *kobj = sysfs_get_kobject(file->f_path.dentry->d_parent);
-	struct bin_attribute * attr = to_bin_attr(file->f_path.dentry);
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	int error = -EINVAL;
 
 	if (!kobj || !attr)
@@ -150,7 +153,8 @@ static int open(struct inode * inode, struct file * file)
 static int release(struct inode * inode, struct file * file)
 {
 	struct kobject * kobj = to_kobj(file->f_path.dentry->d_parent);
-	struct bin_attribute * attr = to_bin_attr(file->f_path.dentry);
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	u8 * buffer = file->private_data;
 
 	kobject_put(kobj);
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index e2ed341..0050de9 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -51,11 +51,8 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
  repeat:
 	parent_sd = sd->s_parent;
 
-	if (sd->s_type & SYSFS_KOBJ_LINK) {
-		struct sysfs_symlink * sl = sd->s_element;
-		kobject_put(sl->target_kobj);
-		kfree(sl);
-	}
+	if (sd->s_type & SYSFS_KOBJ_LINK)
+		kobject_put(sd->s_elem.symlink.target_kobj);
 	if (sd->s_type & SYSFS_COPY_NAME)
 		kfree(sd->s_name);
 	kfree(sd->s_iattr);
@@ -83,8 +80,7 @@ static struct dentry_operations sysfs_dentry_ops = {
 	.d_iput		= sysfs_d_iput,
 };
 
-struct sysfs_dirent *sysfs_new_dirent(const char *name, void *element,
-				      umode_t mode, int type)
+struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 {
 	char *dup_name = NULL;
 	struct sysfs_dirent *sd = NULL;
@@ -108,7 +104,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, void *element,
 	INIT_LIST_HEAD(&sd->s_sibling);
 
 	sd->s_name = name;
-	sd->s_element = element;
 	sd->s_mode = mode;
 	sd->s_type = type;
 
@@ -148,7 +143,7 @@ int sysfs_dirent_exist(struct sysfs_dirent *parent_sd,
 	struct sysfs_dirent * sd;
 
 	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
-		if (sd->s_element) {
+		if (sd->s_type) {
 			if (strcmp(sd->s_name, new))
 				continue;
 			else
@@ -203,9 +198,10 @@ static int create_dir(struct kobject *kobj, struct dentry *parent,
 		goto out_dput;
 
 	error = -ENOMEM;
-	sd = sysfs_new_dirent(name, kobj, mode, SYSFS_DIR);
+	sd = sysfs_new_dirent(name, mode, SYSFS_DIR);
 	if (!sd)
 		goto out_drop;
+	sd->s_elem.dir.kobj = kobj;
 	sysfs_attach_dirent(sd, parent->d_fsdata, dentry);
 
 	error = sysfs_create(dentry, mode, init_dir);
@@ -278,10 +274,10 @@ static int sysfs_attach_attr(struct sysfs_dirent * sd, struct dentry * dentry)
 	int error = 0;
 
         if (sd->s_type & SYSFS_KOBJ_BIN_ATTR) {
-                bin_attr = sd->s_element;
+                bin_attr = sd->s_elem.bin_attr.bin_attr;
                 attr = &bin_attr->attr;
         } else {
-                attr = sd->s_element;
+                attr = sd->s_elem.attr.attr;
                 init = init_file;
         }
 
@@ -386,7 +382,7 @@ static void __sysfs_remove_dir(struct dentry *dentry)
 	mutex_lock(&dentry->d_inode->i_mutex);
 	parent_sd = dentry->d_fsdata;
 	list_for_each_entry_safe(sd, tmp, &parent_sd->s_children, s_sibling) {
-		if (!sd->s_element || !(sd->s_type & SYSFS_NOT_PINNED))
+		if (!sd->s_type || !(sd->s_type & SYSFS_NOT_PINNED))
 			continue;
 		list_del_init(&sd->s_sibling);
 		sysfs_drop_dentry(sd, dentry);
@@ -538,7 +534,7 @@ static int sysfs_dir_open(struct inode *inode, struct file *file)
 	struct sysfs_dirent * sd;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	sd = sysfs_new_dirent("_DIR_", NULL, 0, 0);
+	sd = sysfs_new_dirent("_DIR_", 0, 0);
 	if (sd)
 		sysfs_attach_dirent(sd, parent_sd, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
@@ -605,7 +601,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 
 				next = list_entry(p, struct sysfs_dirent,
 						   s_sibling);
-				if (!next->s_element)
+				if (!next->s_type)
 					continue;
 
 				name = next->s_name;
@@ -653,7 +649,7 @@ static loff_t sysfs_dir_lseek(struct file * file, loff_t offset, int origin)
 				struct sysfs_dirent *next;
 				next = list_entry(p, struct sysfs_dirent,
 						   s_sibling);
-				if (next->s_element)
+				if (next->s_type)
 					n--;
 				p = p->next;
 			}
@@ -720,9 +716,10 @@ struct dentry *sysfs_create_shadow_dir(struct kobject *kobj)
 	if (!shadow)
 		goto nomem;
 
-	sd = sysfs_new_dirent("_SHADOW_", kobj, inode->i_mode, SYSFS_DIR);
+	sd = sysfs_new_dirent("_SHADOW_", inode->i_mode, SYSFS_DIR);
 	if (!sd)
 		goto nomem;
+	sd->s_elem.dir.kobj = kobj;
 	/* point to parent_sd but don't attach to it */
 	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_attach_dirent(sd, NULL, shadow);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index fe7db4e..b52b4da 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -89,7 +89,6 @@ remove_from_collection(struct sysfs_buffer *buffer, struct inode *node)
 static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
-	struct attribute * attr = to_attr(dentry);
 	struct kobject * kobj = to_kobj(dentry->d_parent);
 	struct sysfs_ops * ops = buffer->ops;
 	int ret = 0;
@@ -101,7 +100,7 @@ static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer
 		return -ENOMEM;
 
 	buffer->event = atomic_read(&sd->s_event);
-	count = ops->show(kobj,attr,buffer->page);
+	count = ops->show(kobj, sd->s_elem.attr.attr, buffer->page);
 	BUG_ON(count > (ssize_t)PAGE_SIZE);
 	if (count >= 0) {
 		buffer->needs_read_fill = 0;
@@ -229,11 +228,11 @@ fill_write_buffer(struct sysfs_buffer * buffer, const char __user * buf, size_t
 static int 
 flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t count)
 {
-	struct attribute * attr = to_attr(dentry);
+	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
 	struct kobject * kobj = to_kobj(dentry->d_parent);
 	struct sysfs_ops * ops = buffer->ops;
 
-	return ops->store(kobj,attr,buffer->page,count);
+	return ops->store(kobj, attr_sd->s_elem.attr.attr, buffer->page, count);
 }
 
 
@@ -278,7 +277,8 @@ out:
 static int sysfs_open_file(struct inode *inode, struct file *file)
 {
 	struct kobject *kobj = sysfs_get_kobject(file->f_path.dentry->d_parent);
-	struct attribute * attr = to_attr(file->f_path.dentry);
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	struct attribute *attr = attr_sd->s_elem.attr.attr;
 	struct sysfs_buffer_collection *set;
 	struct sysfs_buffer * buffer;
 	struct sysfs_ops * ops = NULL;
@@ -371,15 +371,15 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 static int sysfs_release(struct inode * inode, struct file * filp)
 {
 	struct kobject * kobj = to_kobj(filp->f_path.dentry->d_parent);
-	struct attribute * attr = to_attr(filp->f_path.dentry);
-	struct module * owner = attr->owner;
+	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
+	struct attribute *attr = attr_sd->s_elem.attr.attr;
 	struct sysfs_buffer * buffer = filp->private_data;
 
 	if (buffer)
 		remove_from_collection(buffer, inode);
 	kobject_put(kobj);
 	/* After this point, attr should not be accessed. */
-	module_put(owner);
+	module_put(attr->owner);
 
 	if (buffer) {
 		if (buffer->page)
@@ -484,11 +484,12 @@ int sysfs_add_file(struct dentry * dir, const struct attribute * attr, int type)
 		goto out_unlock;
 	}
 
-	sd = sysfs_new_dirent(attr->name, (void *)attr, mode, type);
+	sd = sysfs_new_dirent(attr->name, mode, type);
 	if (!sd) {
 		error = -ENOMEM;
 		goto out_unlock;
 	}
+	sd->s_elem.attr.attr = (void *)attr;
 	sysfs_attach_dirent(sd, parent_sd, NULL);
 
  out_unlock:
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 5d4d32b..9085dc4 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -255,7 +255,7 @@ int sysfs_hash_and_remove(struct dentry * dir, const char * name)
 	parent_sd = dir->d_fsdata;
 	mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
 	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
-		if (!sd->s_element)
+		if (!sd->s_type)
 			continue;
 		if (!strcmp(sd->s_name, name)) {
 			list_del_init(&sd->s_sibling);
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index bc613f6..b31256a 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -31,7 +31,6 @@ static struct sysfs_dirent sysfs_root = {
 	.s_count	= ATOMIC_INIT(1),
 	.s_sibling	= LIST_HEAD_INIT(sysfs_root.s_sibling),
 	.s_children	= LIST_HEAD_INIT(sysfs_root.s_children),
-	.s_element	= NULL,
 	.s_type		= SYSFS_ROOT,
 	.s_ino		= 1,
 	.s_iattr	= NULL,
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index c728204..27df635 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -48,30 +48,15 @@ static void fill_object_path(struct kobject * kobj, char * buffer, int length)
 static int sysfs_add_link(struct dentry * parent, const char * name, struct kobject * target)
 {
 	struct sysfs_dirent * parent_sd = parent->d_fsdata;
-	struct sysfs_symlink * sl;
 	struct sysfs_dirent * sd;
-	int error;
 
-	error = -ENOMEM;
-	sl = kzalloc(sizeof(*sl), GFP_KERNEL);
-	if (!sl)
-		goto err_out;
-
-	sl->target_kobj = kobject_get(target);
-
-	sd = sysfs_new_dirent(name, sl, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
+	sd = sysfs_new_dirent(name, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
 	if (!sd)
-		goto err_out;
-	sysfs_attach_dirent(sd, parent_sd, NULL);
+		return -ENOMEM;
 
+	sd->s_elem.symlink.target_kobj = kobject_get(target);
+	sysfs_attach_dirent(sd, parent_sd, NULL);
 	return 0;
-
- err_out:
-	if (sl) {
-		kobject_put(sl->target_kobj);
-		kfree(sl);
-	}
-	return error;
 }
 
 /**
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 9d58b0f..14ca866 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -1,10 +1,33 @@
+struct sysfs_elem_dir {
+	struct kobject		* kobj;
+};
+
+struct sysfs_elem_symlink {
+	struct kobject		* target_kobj;
+};
+
+struct sysfs_elem_attr {
+	struct attribute	* attr;
+};
+
+struct sysfs_elem_bin_attr {
+	struct bin_attribute	* bin_attr;
+};
+
 struct sysfs_dirent {
 	atomic_t		s_count;
 	struct sysfs_dirent	* s_parent;
 	struct list_head	s_sibling;
 	struct list_head	s_children;
 	const char		* s_name;
-	void 			* s_element;
+
+	union {
+		struct sysfs_elem_dir		dir;
+		struct sysfs_elem_symlink	symlink;
+		struct sysfs_elem_attr		attr;
+		struct sysfs_elem_bin_attr	bin_attr;
+	}			s_elem;
+
 	int			s_type;
 	umode_t			s_mode;
 	ino_t			s_ino;
@@ -22,8 +45,8 @@ extern int sysfs_create(struct dentry *, int mode, int (*init)(struct inode *));
 
 extern void release_sysfs_dirent(struct sysfs_dirent * sd);
 extern int sysfs_dirent_exist(struct sysfs_dirent *, const unsigned char *);
-extern struct sysfs_dirent *sysfs_new_dirent(const char *name, void *element,
-					     umode_t mode, int type);
+extern struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode,
+					     int type);
 extern void sysfs_attach_dirent(struct sysfs_dirent *sd,
 				struct sysfs_dirent *parent_sd,
 				struct dentry *dentry);
@@ -46,10 +69,6 @@ extern const struct file_operations bin_fops;
 extern const struct inode_operations sysfs_dir_inode_operations;
 extern const struct inode_operations sysfs_symlink_inode_operations;
 
-struct sysfs_symlink {
-	struct kobject * target_kobj;
-};
-
 struct sysfs_buffer {
 	struct list_head		associates;
 	size_t				count;
@@ -69,19 +88,7 @@ struct sysfs_buffer_collection {
 static inline struct kobject * to_kobj(struct dentry * dentry)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
-	return ((struct kobject *) sd->s_element);
-}
-
-static inline struct attribute * to_attr(struct dentry * dentry)
-{
-	struct sysfs_dirent * sd = dentry->d_fsdata;
-	return ((struct attribute *) sd->s_element);
-}
-
-static inline struct bin_attribute * to_bin_attr(struct dentry * dentry)
-{
-	struct sysfs_dirent * sd = dentry->d_fsdata;
-	return ((struct bin_attribute *) sd->s_element);
+	return sd->s_elem.dir.kobj;
 }
 
 static inline struct kobject *sysfs_get_kobject(struct dentry *dentry)
@@ -91,11 +98,10 @@ static inline struct kobject *sysfs_get_kobject(struct dentry *dentry)
 	spin_lock(&dcache_lock);
 	if (!d_unhashed(dentry)) {
 		struct sysfs_dirent * sd = dentry->d_fsdata;
-		if (sd->s_type & SYSFS_KOBJ_LINK) {
-			struct sysfs_symlink * sl = sd->s_element;
-			kobj = kobject_get(sl->target_kobj);
-		} else
-			kobj = kobject_get(sd->s_element);
+		if (sd->s_type & SYSFS_KOBJ_LINK)
+			kobj = kobject_get(sd->s_elem.symlink.target_kobj);
+		else
+			kobj = kobject_get(sd->s_elem.dir.kobj);
 	}
 	spin_unlock(&dcache_lock);
 
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 14/21] sysfs: implement kobj_sysfs_assoc_lock
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (11 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 12/21] sysfs: add sysfs_dirent->s_name Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 16/21] sysfs: implement bin_buffer Tejun Heo
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

kobj->dentry can go away anytime unless the user controls when the
associated sysfs node is deleted.  This patch implements
kobj_sysfs_assoc_lock which protects kobj->dentry.  This will be used
to maintain kobj based API when converting sysfs to use sysfs_dirent
tree instead of dentry/kobject.

Note that this lock belongs to kobject/driver-model not sysfs.  Once
sysfs is converted to not use kobject in its interface, this can be
removed from sysfs.

This is in preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c   |    8 +++++++-
 fs/sysfs/sysfs.h |    1 +
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 0050de9..2fddf8a 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -14,6 +14,7 @@
 #include "sysfs.h"
 
 DECLARE_RWSEM(sysfs_rename_sem);
+spinlock_t kobj_sysfs_assoc_lock = SPIN_LOCK_UNLOCKED;
 
 static spinlock_t sysfs_ino_lock = SPIN_LOCK_UNLOCKED;
 static DEFINE_IDA(sysfs_ino_ida);
@@ -408,8 +409,13 @@ static void __sysfs_remove_dir(struct dentry *dentry)
 
 void sysfs_remove_dir(struct kobject * kobj)
 {
-	__sysfs_remove_dir(kobj->dentry);
+	struct dentry *d = kobj->dentry;
+
+	spin_lock(&kobj_sysfs_assoc_lock);
 	kobj->dentry = NULL;
+	spin_unlock(&kobj_sysfs_assoc_lock);
+
+	__sysfs_remove_dir(d);
 }
 
 int sysfs_rename_dir(struct kobject * kobj, struct dentry *new_parent,
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 14ca866..e37b6e8 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -61,6 +61,7 @@ extern void sysfs_remove_subdir(struct dentry *);
 extern void sysfs_drop_dentry(struct sysfs_dirent *sd, struct dentry *parent);
 extern int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
 
+extern spinlock_t kobj_sysfs_assoc_lock;
 extern struct rw_semaphore sysfs_rename_sem;
 extern struct super_block * sysfs_sb;
 extern const struct file_operations sysfs_dir_operations;
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 18/21] sysfs: kill attribute file orphaning
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (13 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 16/21] sysfs: implement bin_buffer Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 15/21] sysfs: reimplement symlink using sysfs_dirent tree Tejun Heo
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

Now that sysfs_dirent can be disconnected from kobject on deletion,
there is no need to orphan each attribute files.  All [bin_]attribute
nodes are automatically orphaned when the parent node is deleted.
Kill attribute file orphaning.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/file.c  |   65 ++++++++++-------------------------------------------
 fs/sysfs/inode.c |   25 --------------------
 fs/sysfs/mount.c |    8 ------
 fs/sysfs/sysfs.h |   16 -------------
 4 files changed, 13 insertions(+), 101 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index df0e0c0..a34c2fa 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -51,29 +51,15 @@ static struct sysfs_ops subsys_sysfs_ops = {
 	.store	= subsys_attr_store,
 };
 
-/**
- *	add_to_collection - add buffer to a collection
- *	@buffer:	buffer to be added
- *	@node:		inode of set to add to
- */
-
-static inline void
-add_to_collection(struct sysfs_buffer *buffer, struct inode *node)
-{
-	struct sysfs_buffer_collection *set = node->i_private;
-
-	mutex_lock(&node->i_mutex);
-	list_add(&buffer->associates, &set->associates);
-	mutex_unlock(&node->i_mutex);
-}
-
-static inline void
-remove_from_collection(struct sysfs_buffer *buffer, struct inode *node)
-{
-	mutex_lock(&node->i_mutex);
-	list_del(&buffer->associates);
-	mutex_unlock(&node->i_mutex);
-}
+struct sysfs_buffer {
+	size_t			count;
+	loff_t			pos;
+	char			* page;
+	struct sysfs_ops	* ops;
+	struct semaphore	sem;
+	int			needs_read_fill;
+	int			event;
+};
 
 /**
  *	fill_read_buffer - allocate and fill buffer from object.
@@ -175,10 +161,7 @@ sysfs_read_file(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 
 	down(&buffer->sem);
 	if (buffer->needs_read_fill) {
-		if (buffer->orphaned)
-			retval = -ENODEV;
-		else
-			retval = fill_read_buffer(file->f_path.dentry,buffer);
+		retval = fill_read_buffer(file->f_path.dentry,buffer);
 		if (retval)
 			goto out;
 	}
@@ -276,16 +259,11 @@ sysfs_write_file(struct file *file, const char __user *buf, size_t count, loff_t
 	ssize_t len;
 
 	down(&buffer->sem);
-	if (buffer->orphaned) {
-		len = -ENODEV;
-		goto out;
-	}
 	len = fill_write_buffer(buffer, buf, count);
 	if (len > 0)
 		len = flush_write_buffer(file->f_path.dentry, buffer, len);
 	if (len > 0)
 		*ppos += len;
-out:
 	up(&buffer->sem);
 	return len;
 }
@@ -295,7 +273,6 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct attribute *attr = attr_sd->s_elem.attr.attr;
 	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
-	struct sysfs_buffer_collection *set;
 	struct sysfs_buffer * buffer;
 	struct sysfs_ops * ops = NULL;
 	int error;
@@ -319,26 +296,14 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	else
 		ops = &subsys_sysfs_ops;
 
+	error = -EACCES;
+
 	/* No sysfs operations, either from having no subsystem,
 	 * or the subsystem have no operations.
 	 */
-	error = -EACCES;
 	if (!ops)
 		goto err_mput;
 
-	/* make sure we have a collection to add our buffers to */
-	mutex_lock(&inode->i_mutex);
-	if (!(set = inode->i_private)) {
-		error = -ENOMEM;
-		if (!(set = inode->i_private = kmalloc(sizeof(struct sysfs_buffer_collection), GFP_KERNEL)))
-			goto err_mput;
-		else
-			INIT_LIST_HEAD(&set->associates);
-	}
-	mutex_unlock(&inode->i_mutex);
-
-	error = -EACCES;
-
 	/* File needs write support.
 	 * The inode's perms must say it's ok, 
 	 * and we must have a store method.
@@ -365,11 +330,9 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	if (!buffer)
 		goto err_mput;
 
-	INIT_LIST_HEAD(&buffer->associates);
 	init_MUTEX(&buffer->sem);
 	buffer->needs_read_fill = 1;
 	buffer->ops = ops;
-	add_to_collection(buffer, inode);
 	file->private_data = buffer;
 
 	/* open succeeded, put active references and pin attr_sd */
@@ -388,10 +351,8 @@ static int sysfs_release(struct inode * inode, struct file * filp)
 {
 	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
 	struct attribute *attr = attr_sd->s_elem.attr.attr;
-	struct sysfs_buffer * buffer = filp->private_data;
+	struct sysfs_buffer *buffer = filp->private_data;
 
-	if (buffer)
-		remove_from_collection(buffer, inode);
 	sysfs_put(attr_sd);
 	/* After this point, attr should not be accessed. */
 	module_put(attr->owner);
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index a222064..b72d18a 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -190,24 +190,6 @@ int sysfs_create(struct dentry * dentry, int mode, int (*init)(struct inode *))
 	return error;
 }
 
-static inline void orphan_all_buffers(struct inode *node)
-{
-	struct sysfs_buffer_collection *set;
-	struct sysfs_buffer *buf;
-
-	mutex_lock_nested(&node->i_mutex, I_MUTEX_CHILD);
-	set = node->i_private;
-	if (set) {
-		list_for_each_entry(buf, &set->associates, associates) {
-			down(&buf->sem);
-			buf->orphaned = 1;
-			up(&buf->sem);
-		}
-	}
-	mutex_unlock(&node->i_mutex);
-}
-
-
 /*
  * Unhashes the dentry corresponding to given sysfs_dirent
  * Called with parent inode's i_mutex held.
@@ -215,23 +197,16 @@ static inline void orphan_all_buffers(struct inode *node)
 void sysfs_drop_dentry(struct sysfs_dirent * sd, struct dentry * parent)
 {
 	struct dentry * dentry = sd->s_dentry;
-	struct inode *inode;
 
 	if (dentry) {
 		spin_lock(&dcache_lock);
 		spin_lock(&dentry->d_lock);
 		if (!(d_unhashed(dentry) && dentry->d_inode)) {
-			inode = dentry->d_inode;
-			spin_lock(&inode->i_lock);
-			__iget(inode);
-			spin_unlock(&inode->i_lock);
 			dget_locked(dentry);
 			__d_drop(dentry);
 			spin_unlock(&dentry->d_lock);
 			spin_unlock(&dcache_lock);
 			simple_unlink(parent->d_inode, dentry);
-			orphan_all_buffers(inode);
-			iput(inode);
 		} else {
 			spin_unlock(&dentry->d_lock);
 			spin_unlock(&dcache_lock);
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index b31256a..d1a37e8 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -19,12 +19,9 @@ struct vfsmount *sysfs_mount;
 struct super_block * sysfs_sb = NULL;
 struct kmem_cache *sysfs_dir_cachep;
 
-static void sysfs_clear_inode(struct inode *inode);
-
 static const struct super_operations sysfs_ops = {
 	.statfs		= simple_statfs,
 	.drop_inode	= sysfs_delete_inode,
-	.clear_inode	= sysfs_clear_inode,
 };
 
 static struct sysfs_dirent sysfs_root = {
@@ -36,11 +33,6 @@ static struct sysfs_dirent sysfs_root = {
 	.s_iattr	= NULL,
 };
 
-static void sysfs_clear_inode(struct inode *inode)
-{
-	kfree(inode->i_private);
-}
-
 static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 {
 	struct inode *inode;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 47d5c5a..537c64b 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -87,22 +87,6 @@ extern const struct file_operations bin_fops;
 extern const struct inode_operations sysfs_dir_inode_operations;
 extern const struct inode_operations sysfs_symlink_inode_operations;
 
-struct sysfs_buffer {
-	struct list_head		associates;
-	size_t				count;
-	loff_t				pos;
-	char				* page;
-	struct sysfs_ops		* ops;
-	struct semaphore		sem;
-	int				orphaned;
-	int				needs_read_fill;
-	int				event;
-};
-
-struct sysfs_buffer_collection {
-	struct list_head	associates;
-};
-
 static inline struct sysfs_dirent * sysfs_get(struct sysfs_dirent * sd)
 {
 	if (sd) {
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 19/21] sysfs: kill unnecessary attribute->owner
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (18 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 17/21] sysfs: implement sysfs_dirent active reference and immediate disconnect Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 13:39 ` [PATCH 21/21] sysfs: reimplement syfs_drop_dentry() Tejun Heo
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

sysfs is now completely out of driver/module lifetime game.  After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners.  Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner.  Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

  http://article.gmane.org/gmane.linux.kernel/510293

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/ppc/syslib/mv64x60.c                   |    1 -
 arch/s390/kernel/ipl.c                      |    2 --
 drivers/base/bus.c                          |    2 --
 drivers/base/class.c                        |    2 --
 drivers/base/core.c                         |    4 ----
 drivers/base/firmware_class.c               |    2 +-
 drivers/block/pktcdvd.c                     |    3 +--
 drivers/char/ipmi/ipmi_msghandler.c         |   10 ----------
 drivers/cpufreq/cpufreq_stats.c             |    3 +--
 drivers/cpufreq/cpufreq_userspace.c         |    2 +-
 drivers/cpufreq/freq_table.c                |    1 -
 drivers/firmware/dcdbas.h                   |    3 +--
 drivers/firmware/dell_rbu.c                 |    6 +++---
 drivers/firmware/edd.c                      |    2 +-
 drivers/firmware/efivars.c                  |    6 +++---
 drivers/i2c/chips/eeprom.c                  |    1 -
 drivers/i2c/chips/max6875.c                 |    1 -
 drivers/infiniband/core/sysfs.c             |    1 -
 drivers/input/mouse/psmouse.h               |    1 -
 drivers/macintosh/windfarm_core.c           |    2 --
 drivers/media/video/pvrusb2/pvrusb2-sysfs.c |   14 --------------
 drivers/misc/asus-laptop.c                  |    3 +--
 drivers/net/ibmveth.c                       |    2 +-
 drivers/parisc/pdc_stable.c                 |    4 ++--
 drivers/pci/hotplug/acpiphp_ibm.c           |    1 -
 drivers/pci/pci-sysfs.c                     |    4 ----
 drivers/pci/probe.c                         |    2 --
 drivers/pcmcia/socket_sysfs.c               |    2 +-
 drivers/rapidio/rio-sysfs.c                 |    1 -
 drivers/rtc/rtc-ds1553.c                    |    1 -
 drivers/rtc/rtc-ds1742.c                    |    1 -
 drivers/s390/cio/chp.c                      |    2 --
 drivers/s390/net/qeth_sys.c                 |    2 +-
 drivers/scsi/arcmsr/arcmsr_attr.c           |    3 ---
 drivers/scsi/libsas/sas_expander.c          |    1 -
 drivers/scsi/lpfc/lpfc_attr.c               |    2 --
 drivers/scsi/qla2xxx/qla_attr.c             |    6 ------
 drivers/spi/at25.c                          |    1 -
 drivers/video/aty/radeon_base.c             |    2 --
 drivers/video/backlight/backlight.c         |    2 +-
 drivers/video/backlight/lcd.c               |    2 +-
 drivers/w1/slaves/w1_ds2433.c               |    1 -
 drivers/w1/slaves/w1_therm.c                |    1 -
 drivers/w1/w1.c                             |    2 --
 drivers/zorro/zorro-sysfs.c                 |    1 -
 fs/ecryptfs/main.c                          |    2 --
 fs/ocfs2/cluster/masklog.c                  |    1 -
 fs/partitions/check.c                       |    1 -
 fs/sysfs/bin.c                              |   19 +++++--------------
 fs/sysfs/file.c                             |   21 +++++----------------
 include/linux/sysdev.h                      |    3 +--
 include/linux/sysfs.h                       |    7 +++----
 kernel/module.c                             |    9 +++------
 kernel/params.c                             |    1 -
 net/bridge/br_sysfs_br.c                    |    3 +--
 net/bridge/br_sysfs_if.c                    |    3 +--
 56 files changed, 39 insertions(+), 149 deletions(-)

diff --git a/arch/ppc/syslib/mv64x60.c b/arch/ppc/syslib/mv64x60.c
index a6f8b68..dd451bd 100644
--- a/arch/ppc/syslib/mv64x60.c
+++ b/arch/ppc/syslib/mv64x60.c
@@ -2415,7 +2415,6 @@ static struct bin_attribute mv64xxx_hs_reg_attr = { /* Hotswap register */
 	.attr = {
 		.name = "hs_reg",
 		.mode = S_IRUGO | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size  = VAL_LEN_MAX,
 	.read  = mv64xxx_hs_reg_read,
diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c
index 06833ac..47c71e3 100644
--- a/arch/s390/kernel/ipl.c
+++ b/arch/s390/kernel/ipl.c
@@ -314,7 +314,6 @@ static struct bin_attribute ipl_parameter_attr = {
 	.attr = {
 		.name = "binary_parameter",
 		.mode = S_IRUGO,
-		.owner = THIS_MODULE,
 	},
 	.size = PAGE_SIZE,
 	.read = &ipl_parameter_read,
@@ -338,7 +337,6 @@ static struct bin_attribute ipl_scp_data_attr = {
 	.attr = {
 		.name = "scp_data",
 		.mode = S_IRUGO,
-		.owner = THIS_MODULE,
 	},
 	.size = PAGE_SIZE,
 	.read = &ipl_scp_data_read,
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 1d76e23..b230c21 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -562,7 +562,6 @@ static int add_probe_files(struct bus_type *bus)
 
 	bus->drivers_probe_attr.attr.name = "drivers_probe";
 	bus->drivers_probe_attr.attr.mode = S_IWUSR;
-	bus->drivers_probe_attr.attr.owner = bus->owner;
 	bus->drivers_probe_attr.store = store_drivers_probe;
 	retval = bus_create_file(bus, &bus->drivers_probe_attr);
 	if (retval)
@@ -570,7 +569,6 @@ static int add_probe_files(struct bus_type *bus)
 
 	bus->drivers_autoprobe_attr.attr.name = "drivers_autoprobe";
 	bus->drivers_autoprobe_attr.attr.mode = S_IWUSR | S_IRUGO;
-	bus->drivers_autoprobe_attr.attr.owner = bus->owner;
 	bus->drivers_autoprobe_attr.show = show_drivers_autoprobe;
 	bus->drivers_autoprobe_attr.store = store_drivers_autoprobe;
 	retval = bus_create_file(bus, &bus->drivers_autoprobe_attr);
diff --git a/drivers/base/class.c b/drivers/base/class.c
index 80bbb20..0b42d7f 100644
--- a/drivers/base/class.c
+++ b/drivers/base/class.c
@@ -624,7 +624,6 @@ int class_device_add(struct class_device *class_dev)
 		goto out3;
 	class_dev->uevent_attr.attr.name = "uevent";
 	class_dev->uevent_attr.attr.mode = S_IWUSR;
-	class_dev->uevent_attr.attr.owner = parent_class->owner;
 	class_dev->uevent_attr.store = store_uevent;
 	error = class_device_create_file(class_dev, &class_dev->uevent_attr);
 	if (error)
@@ -639,7 +638,6 @@ int class_device_add(struct class_device *class_dev)
 		}
 		attr->attr.name = "dev";
 		attr->attr.mode = S_IRUGO;
-		attr->attr.owner = parent_class->owner;
 		attr->show = show_dev;
 		error = class_device_create_file(class_dev, attr);
 		if (error) {
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 8aa090d..a13ef32 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -676,8 +676,6 @@ int device_add(struct device *dev)
 
 	dev->uevent_attr.attr.name = "uevent";
 	dev->uevent_attr.attr.mode = S_IRUGO | S_IWUSR;
-	if (dev->driver)
-		dev->uevent_attr.attr.owner = dev->driver->owner;
 	dev->uevent_attr.store = store_uevent;
 	dev->uevent_attr.show = show_uevent;
 	error = device_create_file(dev, &dev->uevent_attr);
@@ -693,8 +691,6 @@ int device_add(struct device *dev)
 		}
 		attr->attr.name = "dev";
 		attr->attr.mode = S_IRUGO;
-		if (dev->driver)
-			attr->attr.owner = dev->driver->owner;
 		attr->show = show_dev;
 		error = device_create_file(dev, attr);
 		if (error) {
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 97ab5bd..930d905 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -271,7 +271,7 @@ out:
 }
 
 static struct bin_attribute firmware_attr_data_tmpl = {
-	.attr = {.name = "data", .mode = 0644, .owner = THIS_MODULE},
+	.attr = {.name = "data", .mode = 0644},
 	.size = 0,
 	.read = firmware_data_read,
 	.write = firmware_data_write,
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index f1b9dd7..ce64e86 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -146,8 +146,7 @@ static void pkt_kobj_release(struct kobject *kobj)
  **********************************************************/
 
 #define DEF_ATTR(_obj,_name,_mode) \
-	static struct attribute _obj = { \
-		.name = _name, .owner = THIS_MODULE, .mode = _mode }
+	static struct attribute _obj = { .name = _name, .mode = _mode }
 
 /**********************************************************
   /sys/class/pktcdvd/pktcdvd[0-7]/
diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index 8e222f2..b5df7e6 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -2171,52 +2171,42 @@ static int create_files(struct bmc_device *bmc)
 	int err;
 
 	bmc->device_id_attr.attr.name = "device_id";
-	bmc->device_id_attr.attr.owner = THIS_MODULE;
 	bmc->device_id_attr.attr.mode = S_IRUGO;
 	bmc->device_id_attr.show = device_id_show;
 
 	bmc->provides_dev_sdrs_attr.attr.name = "provides_device_sdrs";
-	bmc->provides_dev_sdrs_attr.attr.owner = THIS_MODULE;
 	bmc->provides_dev_sdrs_attr.attr.mode = S_IRUGO;
 	bmc->provides_dev_sdrs_attr.show = provides_dev_sdrs_show;
 
 	bmc->revision_attr.attr.name = "revision";
-	bmc->revision_attr.attr.owner = THIS_MODULE;
 	bmc->revision_attr.attr.mode = S_IRUGO;
 	bmc->revision_attr.show = revision_show;
 
 	bmc->firmware_rev_attr.attr.name = "firmware_revision";
-	bmc->firmware_rev_attr.attr.owner = THIS_MODULE;
 	bmc->firmware_rev_attr.attr.mode = S_IRUGO;
 	bmc->firmware_rev_attr.show = firmware_rev_show;
 
 	bmc->version_attr.attr.name = "ipmi_version";
-	bmc->version_attr.attr.owner = THIS_MODULE;
 	bmc->version_attr.attr.mode = S_IRUGO;
 	bmc->version_attr.show = ipmi_version_show;
 
 	bmc->add_dev_support_attr.attr.name = "additional_device_support";
-	bmc->add_dev_support_attr.attr.owner = THIS_MODULE;
 	bmc->add_dev_support_attr.attr.mode = S_IRUGO;
 	bmc->add_dev_support_attr.show = add_dev_support_show;
 
 	bmc->manufacturer_id_attr.attr.name = "manufacturer_id";
-	bmc->manufacturer_id_attr.attr.owner = THIS_MODULE;
 	bmc->manufacturer_id_attr.attr.mode = S_IRUGO;
 	bmc->manufacturer_id_attr.show = manufacturer_id_show;
 
 	bmc->product_id_attr.attr.name = "product_id";
-	bmc->product_id_attr.attr.owner = THIS_MODULE;
 	bmc->product_id_attr.attr.mode = S_IRUGO;
 	bmc->product_id_attr.show = product_id_show;
 
 	bmc->guid_attr.attr.name = "guid";
-	bmc->guid_attr.attr.owner = THIS_MODULE;
 	bmc->guid_attr.attr.mode = S_IRUGO;
 	bmc->guid_attr.show = guid_show;
 
 	bmc->aux_firmware_rev_attr.attr.name = "aux_firmware_revision";
-	bmc->aux_firmware_rev_attr.attr.owner = THIS_MODULE;
 	bmc->aux_firmware_rev_attr.attr.mode = S_IRUGO;
 	bmc->aux_firmware_rev_attr.show = aux_firmware_rev_show;
 
diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index d1c7cac..7c9589a 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -25,8 +25,7 @@ static spinlock_t cpufreq_stats_lock;
 
 #define CPUFREQ_STATDEVICE_ATTR(_name,_mode,_show) \
 static struct freq_attr _attr_##_name = {\
-	.attr = {.name = __stringify(_name), .owner = THIS_MODULE, \
-		.mode = _mode, }, \
+	.attr = {.name = __stringify(_name), .mode = _mode, }, \
 	.show = _show,\
 };
 
diff --git a/drivers/cpufreq/cpufreq_userspace.c b/drivers/cpufreq/cpufreq_userspace.c
index 860345c..a648970 100644
--- a/drivers/cpufreq/cpufreq_userspace.c
+++ b/drivers/cpufreq/cpufreq_userspace.c
@@ -120,7 +120,7 @@ store_speed (struct cpufreq_policy *policy, const char *buf, size_t count)
 
 static struct freq_attr freq_attr_scaling_setspeed =
 {
-	.attr = { .name = "scaling_setspeed", .mode = 0644, .owner = THIS_MODULE },
+	.attr = { .name = "scaling_setspeed", .mode = 0644 },
 	.show = show_speed,
 	.store = store_speed,
 };
diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
index e749092..5409f3a 100644
--- a/drivers/cpufreq/freq_table.c
+++ b/drivers/cpufreq/freq_table.c
@@ -199,7 +199,6 @@ static ssize_t show_available_freqs (struct cpufreq_policy *policy, char *buf)
 struct freq_attr cpufreq_freq_attr_scaling_available_freqs = {
 	.attr = { .name = "scaling_available_frequencies",
 		  .mode = 0444,
-		  .owner=THIS_MODULE
 		},
 	.show = show_available_freqs,
 };
diff --git a/drivers/firmware/dcdbas.h b/drivers/firmware/dcdbas.h
index 58a8518..dcdba0f 100644
--- a/drivers/firmware/dcdbas.h
+++ b/drivers/firmware/dcdbas.h
@@ -67,8 +67,7 @@
 #define DCDBAS_BIN_ATTR_RW(_name) \
 struct bin_attribute bin_attr_##_name = { \
 	.attr =  { .name = __stringify(_name), \
-		   .mode = 0600, \
-		   .owner = THIS_MODULE }, \
+		   .mode = 0600 }, \
 	.read =  _name##_read, \
 	.write = _name##_write, \
 }
diff --git a/drivers/firmware/dell_rbu.c b/drivers/firmware/dell_rbu.c
index fc702e4..f8afecb 100644
--- a/drivers/firmware/dell_rbu.c
+++ b/drivers/firmware/dell_rbu.c
@@ -687,18 +687,18 @@ static ssize_t write_rbu_packet_size(struct kobject *kobj, char *buffer,
 }
 
 static struct bin_attribute rbu_data_attr = {
-	.attr = {.name = "data",.owner = THIS_MODULE,.mode = 0444},
+	.attr = {.name = "data", .mode = 0444},
 	.read = read_rbu_data,
 };
 
 static struct bin_attribute rbu_image_type_attr = {
-	.attr = {.name = "image_type",.owner = THIS_MODULE,.mode = 0644},
+	.attr = {.name = "image_type", .mode = 0644},
 	.read = read_rbu_image_type,
 	.write = write_rbu_image_type,
 };
 
 static struct bin_attribute rbu_packet_size_attr = {
-	.attr = {.name = "packet_size",.owner = THIS_MODULE,.mode = 0644},
+	.attr = {.name = "packet_size", .mode = 0644},
 	.read = read_rbu_packet_size,
 	.write = write_rbu_packet_size,
 };
diff --git a/drivers/firmware/edd.c b/drivers/firmware/edd.c
index d8806e4..1523227 100644
--- a/drivers/firmware/edd.c
+++ b/drivers/firmware/edd.c
@@ -74,7 +74,7 @@ static struct edd_device *edd_devices[EDD_MBR_SIG_MAX];
 
 #define EDD_DEVICE_ATTR(_name,_mode,_show,_test) \
 struct edd_attribute edd_attr_##_name = { 	\
-	.attr = {.name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE },	\
+	.attr = {.name = __stringify(_name), .mode = _mode },	\
 	.show	= _show,				\
 	.test	= _test,				\
 };
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index c6281cc..b84b2b4 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -131,21 +131,21 @@ struct efivar_attribute {
 
 #define EFI_ATTR(_name, _mode, _show, _store) \
 struct subsys_attribute efi_attr_##_name = { \
-	.attr = {.name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE}, \
+	.attr = {.name = __stringify(_name), .mode = _mode}, \
 	.show = _show, \
 	.store = _store, \
 };
 
 #define EFIVAR_ATTR(_name, _mode, _show, _store) \
 struct efivar_attribute efivar_attr_##_name = { \
-	.attr = {.name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE}, \
+	.attr = {.name = __stringify(_name), .mode = _mode}, \
 	.show = _show, \
 	.store = _store, \
 };
 
 #define VAR_SUBSYS_ATTR(_name, _mode, _show, _store) \
 struct subsys_attribute var_subsys_attr_##_name = { \
-	.attr = {.name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE}, \
+	.attr = {.name = __stringify(_name), .mode = _mode}, \
 	.show = _show, \
 	.store = _store, \
 };
diff --git a/drivers/i2c/chips/eeprom.c b/drivers/i2c/chips/eeprom.c
index bfce13c..5990dd5 100644
--- a/drivers/i2c/chips/eeprom.c
+++ b/drivers/i2c/chips/eeprom.c
@@ -143,7 +143,6 @@ static struct bin_attribute eeprom_attr = {
 	.attr = {
 		.name = "eeprom",
 		.mode = S_IRUGO,
-		.owner = THIS_MODULE,
 	},
 	.size = EEPROM_SIZE,
 	.read = eeprom_read,
diff --git a/drivers/i2c/chips/max6875.c b/drivers/i2c/chips/max6875.c
index 76645c1..1405ce5 100644
--- a/drivers/i2c/chips/max6875.c
+++ b/drivers/i2c/chips/max6875.c
@@ -152,7 +152,6 @@ static struct bin_attribute user_eeprom_attr = {
 	.attr = {
 		.name = "eeprom",
 		.mode = S_IRUGO,
-		.owner = THIS_MODULE,
 	},
 	.size = USER_EEPROM_SIZE,
 	.read = max6875_read,
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 08c299e..bf9b992 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -479,7 +479,6 @@ alloc_group_attrs(ssize_t (*show)(struct ib_port *,
 
 		element->attr.attr.name  = element->name;
 		element->attr.attr.mode  = S_IRUGO;
-		element->attr.attr.owner = THIS_MODULE;
 		element->attr.show       = show;
 		element->index		 = i;
 
diff --git a/drivers/input/mouse/psmouse.h b/drivers/input/mouse/psmouse.h
index cf1de95..4a5bddf 100644
--- a/drivers/input/mouse/psmouse.h
+++ b/drivers/input/mouse/psmouse.h
@@ -117,7 +117,6 @@ static struct psmouse_attribute psmouse_attr_##_name = {			\
 		.attr	= {							\
 			.name	= __stringify(_name),				\
 			.mode	= _mode,					\
-			.owner	= THIS_MODULE,					\
 		},								\
 		.show	= psmouse_attr_show_helper,				\
 		.store	= psmouse_attr_set_helper,				\
diff --git a/drivers/macintosh/windfarm_core.c b/drivers/macintosh/windfarm_core.c
index 94c117e..0b8fd46 100644
--- a/drivers/macintosh/windfarm_core.c
+++ b/drivers/macintosh/windfarm_core.c
@@ -213,7 +213,6 @@ int wf_register_control(struct wf_control *new_ct)
 	list_add(&new_ct->link, &wf_controls);
 
 	new_ct->attr.attr.name = new_ct->name;
-	new_ct->attr.attr.owner = THIS_MODULE;
 	new_ct->attr.attr.mode = 0644;
 	new_ct->attr.show = wf_show_control;
 	new_ct->attr.store = wf_store_control;
@@ -323,7 +322,6 @@ int wf_register_sensor(struct wf_sensor *new_sr)
 	list_add(&new_sr->link, &wf_sensors);
 
 	new_sr->attr.attr.name = new_sr->name;
-	new_sr->attr.attr.owner = THIS_MODULE;
 	new_sr->attr.attr.mode = 0444;
 	new_sr->attr.show = wf_show_sensor;
 	new_sr->attr.store = NULL;
diff --git a/drivers/media/video/pvrusb2/pvrusb2-sysfs.c b/drivers/media/video/pvrusb2/pvrusb2-sysfs.c
index a741c55..7ab79ba 100644
--- a/drivers/media/video/pvrusb2/pvrusb2-sysfs.c
+++ b/drivers/media/video/pvrusb2/pvrusb2-sysfs.c
@@ -518,40 +518,32 @@ static void pvr2_sysfs_add_control(struct pvr2_sysfs *sfp,int ctl_id)
 	}
 	sfp->item_last = cip;
 
-	cip->attr_name.attr.owner = THIS_MODULE;
 	cip->attr_name.attr.name = "name";
 	cip->attr_name.attr.mode = S_IRUGO;
 	cip->attr_name.show = fp->show_name;
 
-	cip->attr_type.attr.owner = THIS_MODULE;
 	cip->attr_type.attr.name = "type";
 	cip->attr_type.attr.mode = S_IRUGO;
 	cip->attr_type.show = fp->show_type;
 
-	cip->attr_min.attr.owner = THIS_MODULE;
 	cip->attr_min.attr.name = "min_val";
 	cip->attr_min.attr.mode = S_IRUGO;
 	cip->attr_min.show = fp->show_min;
 
-	cip->attr_max.attr.owner = THIS_MODULE;
 	cip->attr_max.attr.name = "max_val";
 	cip->attr_max.attr.mode = S_IRUGO;
 	cip->attr_max.show = fp->show_max;
 
-	cip->attr_val.attr.owner = THIS_MODULE;
 	cip->attr_val.attr.name = "cur_val";
 	cip->attr_val.attr.mode = S_IRUGO;
 
-	cip->attr_custom.attr.owner = THIS_MODULE;
 	cip->attr_custom.attr.name = "custom_val";
 	cip->attr_custom.attr.mode = S_IRUGO;
 
-	cip->attr_enum.attr.owner = THIS_MODULE;
 	cip->attr_enum.attr.name = "enum_val";
 	cip->attr_enum.attr.mode = S_IRUGO;
 	cip->attr_enum.show = fp->show_enum;
 
-	cip->attr_bits.attr.owner = THIS_MODULE;
 	cip->attr_bits.attr.name = "bit_val";
 	cip->attr_bits.attr.mode = S_IRUGO;
 	cip->attr_bits.show = fp->show_bits;
@@ -616,12 +608,10 @@ static void pvr2_sysfs_add_debugifc(struct pvr2_sysfs *sfp)
 
 	dip = kzalloc(sizeof(*dip),GFP_KERNEL);
 	if (!dip) return;
-	dip->attr_debugcmd.attr.owner = THIS_MODULE;
 	dip->attr_debugcmd.attr.name = "debugcmd";
 	dip->attr_debugcmd.attr.mode = S_IRUGO|S_IWUSR|S_IWGRP;
 	dip->attr_debugcmd.show = debugcmd_show;
 	dip->attr_debugcmd.store = debugcmd_store;
-	dip->attr_debuginfo.attr.owner = THIS_MODULE;
 	dip->attr_debuginfo.attr.name = "debuginfo";
 	dip->attr_debuginfo.attr.mode = S_IRUGO;
 	dip->attr_debuginfo.show = debuginfo_show;
@@ -811,7 +801,6 @@ static void class_dev_create(struct pvr2_sysfs *sfp,
 		return;
 	}
 
-	sfp->attr_v4l_minor_number.attr.owner = THIS_MODULE;
 	sfp->attr_v4l_minor_number.attr.name = "v4l_minor_number";
 	sfp->attr_v4l_minor_number.attr.mode = S_IRUGO;
 	sfp->attr_v4l_minor_number.show = v4l_minor_number_show;
@@ -825,7 +814,6 @@ static void class_dev_create(struct pvr2_sysfs *sfp,
 		sfp->v4l_minor_number_created_ok = !0;
 	}
 
-	sfp->attr_v4l_radio_minor_number.attr.owner = THIS_MODULE;
 	sfp->attr_v4l_radio_minor_number.attr.name = "v4l_radio_minor_number";
 	sfp->attr_v4l_radio_minor_number.attr.mode = S_IRUGO;
 	sfp->attr_v4l_radio_minor_number.show = v4l_radio_minor_number_show;
@@ -839,7 +827,6 @@ static void class_dev_create(struct pvr2_sysfs *sfp,
 		sfp->v4l_radio_minor_number_created_ok = !0;
 	}
 
-	sfp->attr_unit_number.attr.owner = THIS_MODULE;
 	sfp->attr_unit_number.attr.name = "unit_number";
 	sfp->attr_unit_number.attr.mode = S_IRUGO;
 	sfp->attr_unit_number.show = unit_number_show;
@@ -852,7 +839,6 @@ static void class_dev_create(struct pvr2_sysfs *sfp,
 		sfp->unit_number_created_ok = !0;
 	}
 
-	sfp->attr_bus_info.attr.owner = THIS_MODULE;
 	sfp->attr_bus_info.attr.name = "bus_info_str";
 	sfp->attr_bus_info.attr.mode = S_IRUGO;
 	sfp->attr_bus_info.show = bus_info_show;
diff --git a/drivers/misc/asus-laptop.c b/drivers/misc/asus-laptop.c
index 4b23212..fe2f29c 100644
--- a/drivers/misc/asus-laptop.c
+++ b/drivers/misc/asus-laptop.c
@@ -673,8 +673,7 @@ static void asus_hotk_notify(acpi_handle handle, u32 event, void *data)
 	struct device_attribute dev_attr_##_name = {			\
 		.attr = {						\
 			.name = __stringify(_name),			\
-			.mode = 0,					\
-			.owner = THIS_MODULE },				\
+			.mode = 0 },					\
 		.show   = NULL,						\
 		.store  = NULL,						\
 	}
diff --git a/drivers/net/ibmveth.c b/drivers/net/ibmveth.c
index 0573fcf..ebc0ace 100644
--- a/drivers/net/ibmveth.c
+++ b/drivers/net/ibmveth.c
@@ -1307,7 +1307,7 @@ const char * buf, size_t count)
 
 #define ATTR(_name, _mode)      \
         struct attribute veth_##_name##_attr = {               \
-        .name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE \
+        .name = __stringify(_name), .mode = _mode, \
         };
 
 static ATTR(active, 0644);
diff --git a/drivers/parisc/pdc_stable.c b/drivers/parisc/pdc_stable.c
index ea1b7a6..55b87f3 100644
--- a/drivers/parisc/pdc_stable.c
+++ b/drivers/parisc/pdc_stable.c
@@ -121,14 +121,14 @@ struct pdcspath_entry pdcspath_entry_##_name = { \
 
 #define PDCS_ATTR(_name, _mode, _show, _store) \
 struct subsys_attribute pdcs_attr_##_name = { \
-	.attr = {.name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE}, \
+	.attr = {.name = __stringify(_name), .mode = _mode}, \
 	.show = _show, \
 	.store = _store, \
 };
 
 #define PATHS_ATTR(_name, _mode, _show, _store) \
 struct pdcspath_attribute paths_attr_##_name = { \
-	.attr = {.name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE}, \
+	.attr = {.name = __stringify(_name), .mode = _mode}, \
 	.show = _show, \
 	.store = _store, \
 };
diff --git a/drivers/pci/hotplug/acpiphp_ibm.c b/drivers/pci/hotplug/acpiphp_ibm.c
index 7f03881..4d5998a 100644
--- a/drivers/pci/hotplug/acpiphp_ibm.c
+++ b/drivers/pci/hotplug/acpiphp_ibm.c
@@ -117,7 +117,6 @@ static struct notification ibm_note;
 static struct bin_attribute ibm_apci_table_attr = {
 	    .attr = {
 		    .name = "apci_table",
-		    .owner = THIS_MODULE,
 		    .mode = S_IRUGO,
 	    },
 	    .read = ibm_read_apci_table,
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index cd913a2..d803ddf 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -499,7 +499,6 @@ static int pci_create_resource_files(struct pci_dev *pdev)
 			sprintf(res_attr_name, "resource%d", i);
 			res_attr->attr.name = res_attr_name;
 			res_attr->attr.mode = S_IRUSR | S_IWUSR;
-			res_attr->attr.owner = THIS_MODULE;
 			res_attr->size = pci_resource_len(pdev, i);
 			res_attr->mmap = pci_mmap_resource;
 			res_attr->private = &pdev->resource[i];
@@ -582,7 +581,6 @@ static struct bin_attribute pci_config_attr = {
 	.attr =	{
 		.name = "config",
 		.mode = S_IRUGO | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 256,
 	.read = pci_read_config,
@@ -593,7 +591,6 @@ static struct bin_attribute pcie_config_attr = {
 	.attr =	{
 		.name = "config",
 		.mode = S_IRUGO | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 4096,
 	.read = pci_read_config,
@@ -627,7 +624,6 @@ int __must_check pci_create_sysfs_dev_files (struct pci_dev *pdev)
 			rom_attr->size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
 			rom_attr->attr.name = "rom";
 			rom_attr->attr.mode = S_IRUSR;
-			rom_attr->attr.owner = THIS_MODULE;
 			rom_attr->read = pci_read_rom;
 			rom_attr->write = pci_write_rom;
 			retval = sysfs_create_bin_file(&pdev->dev.kobj, rom_attr);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2fe1d69..182805f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -39,7 +39,6 @@ static void pci_create_legacy_files(struct pci_bus *b)
 		b->legacy_io->attr.name = "legacy_io";
 		b->legacy_io->size = 0xffff;
 		b->legacy_io->attr.mode = S_IRUSR | S_IWUSR;
-		b->legacy_io->attr.owner = THIS_MODULE;
 		b->legacy_io->read = pci_read_legacy_io;
 		b->legacy_io->write = pci_write_legacy_io;
 		class_device_create_bin_file(&b->class_dev, b->legacy_io);
@@ -49,7 +48,6 @@ static void pci_create_legacy_files(struct pci_bus *b)
 		b->legacy_mem->attr.name = "legacy_mem";
 		b->legacy_mem->size = 1024*1024;
 		b->legacy_mem->attr.mode = S_IRUSR | S_IWUSR;
-		b->legacy_mem->attr.owner = THIS_MODULE;
 		b->legacy_mem->mmap = pci_mmap_legacy_mem;
 		class_device_create_bin_file(&b->class_dev, b->legacy_mem);
 	}
diff --git a/drivers/pcmcia/socket_sysfs.c b/drivers/pcmcia/socket_sysfs.c
index ea5765c..b7f321d 100644
--- a/drivers/pcmcia/socket_sysfs.c
+++ b/drivers/pcmcia/socket_sysfs.c
@@ -367,7 +367,7 @@ static struct device_attribute *pccard_socket_attributes[] = {
 };
 
 static struct bin_attribute pccard_cis_attr = {
-	.attr = { .name = "cis", .mode = S_IRUGO | S_IWUSR, .owner = THIS_MODULE},
+	.attr = { .name = "cis", .mode = S_IRUGO | S_IWUSR },
 	.size = 0x200,
 	.read = pccard_show_cis,
 	.write = pccard_store_cis,
diff --git a/drivers/rapidio/rio-sysfs.c b/drivers/rapidio/rio-sysfs.c
index eed9143..a3972b9 100644
--- a/drivers/rapidio/rio-sysfs.c
+++ b/drivers/rapidio/rio-sysfs.c
@@ -197,7 +197,6 @@ static struct bin_attribute rio_config_attr = {
 	.attr = {
 		 .name = "config",
 		 .mode = S_IRUGO | S_IWUSR,
-		 .owner = THIS_MODULE,
 		 },
 	.size = 0x200000,
 	.read = rio_read_config,
diff --git a/drivers/rtc/rtc-ds1553.c b/drivers/rtc/rtc-ds1553.c
index e27176c..b3f4c4e 100644
--- a/drivers/rtc/rtc-ds1553.c
+++ b/drivers/rtc/rtc-ds1553.c
@@ -290,7 +290,6 @@ static struct bin_attribute ds1553_nvram_attr = {
 	.attr = {
 		.name = "nvram",
 		.mode = S_IRUGO | S_IWUGO,
-		.owner = THIS_MODULE,
 	},
 	.size = RTC_OFFSET,
 	.read = ds1553_nvram_read,
diff --git a/drivers/rtc/rtc-ds1742.c b/drivers/rtc/rtc-ds1742.c
index d68288b..1638acd 100644
--- a/drivers/rtc/rtc-ds1742.c
+++ b/drivers/rtc/rtc-ds1742.c
@@ -159,7 +159,6 @@ static struct bin_attribute ds1742_nvram_attr = {
 	.attr = {
 		.name = "nvram",
 		.mode = S_IRUGO | S_IWUGO,
-		.owner = THIS_MODULE,
 	},
 	.read = ds1742_nvram_read,
 	.write = ds1742_nvram_write,
diff --git a/drivers/s390/cio/chp.c b/drivers/s390/cio/chp.c
index ac289e6..96a8a72 100644
--- a/drivers/s390/cio/chp.c
+++ b/drivers/s390/cio/chp.c
@@ -165,7 +165,6 @@ static struct bin_attribute chp_measurement_chars_attr = {
 	.attr = {
 		.name = "measurement_chars",
 		.mode = S_IRUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = sizeof(struct cmg_chars),
 	.read = chp_measurement_chars_read,
@@ -217,7 +216,6 @@ static struct bin_attribute chp_measurement_attr = {
 	.attr = {
 		.name = "measurement",
 		.mode = S_IRUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = sizeof(struct cmg_entry),
 	.read = chp_measurement_read,
diff --git a/drivers/s390/net/qeth_sys.c b/drivers/s390/net/qeth_sys.c
index d518419..97f9789 100644
--- a/drivers/s390/net/qeth_sys.c
+++ b/drivers/s390/net/qeth_sys.c
@@ -993,7 +993,7 @@ static struct attribute_group qeth_osn_device_attr_group = {
 
 #define QETH_DEVICE_ATTR(_id,_name,_mode,_show,_store)			     \
 struct device_attribute dev_attr_##_id = {				     \
-	.attr = {.name=__stringify(_name), .mode=_mode, .owner=THIS_MODULE },\
+	.attr = {.name=__stringify(_name), .mode=_mode, },\
 	.show	= _show,						     \
 	.store	= _store,						     \
 };
diff --git a/drivers/scsi/arcmsr/arcmsr_attr.c b/drivers/scsi/arcmsr/arcmsr_attr.c
index 12497da..b08cbac 100644
--- a/drivers/scsi/arcmsr/arcmsr_attr.c
+++ b/drivers/scsi/arcmsr/arcmsr_attr.c
@@ -189,7 +189,6 @@ static struct bin_attribute arcmsr_sysfs_message_read_attr = {
 	.attr = {
 		.name = "mu_read",
 		.mode = S_IRUSR ,
-		.owner = THIS_MODULE,
 	},
 	.size = 1032,
 	.read = arcmsr_sysfs_iop_message_read,
@@ -199,7 +198,6 @@ static struct bin_attribute arcmsr_sysfs_message_write_attr = {
 	.attr = {
 		.name = "mu_write",
 		.mode = S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 1032,
 	.write = arcmsr_sysfs_iop_message_write,
@@ -209,7 +207,6 @@ static struct bin_attribute arcmsr_sysfs_message_clear_attr = {
 	.attr = {
 		.name = "mu_clear",
 		.mode = S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 1,
 	.write = arcmsr_sysfs_iop_message_clear,
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index dc70c18..4bddb24 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -1369,7 +1369,6 @@ static void sas_ex_smp_hook(struct domain_device *dev)
 	memset(bin_attr, 0, sizeof(*bin_attr));
 
 	bin_attr->attr.name = SMP_BIN_ATTR_NAME;
-	bin_attr->attr.owner = THIS_MODULE;
 	bin_attr->attr.mode = 0600;
 
 	bin_attr->size = 0;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index f247e78..c8ce90b 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -1145,7 +1145,6 @@ static struct bin_attribute sysfs_ctlreg_attr = {
 	.attr = {
 		.name = "ctlreg",
 		.mode = S_IRUSR | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 256,
 	.read = sysfs_ctlreg_read,
@@ -1356,7 +1355,6 @@ static struct bin_attribute sysfs_mbox_attr = {
 	.attr = {
 		.name = "mbox",
 		.mode = S_IRUSR | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = sizeof(MAILBOX_t),
 	.read = sysfs_mbox_read,
diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
index 8081b63..9658725 100644
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -73,7 +73,6 @@ static struct bin_attribute sysfs_fw_dump_attr = {
 	.attr = {
 		.name = "fw_dump",
 		.mode = S_IRUSR | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 0,
 	.read = qla2x00_sysfs_read_fw_dump,
@@ -149,7 +148,6 @@ static struct bin_attribute sysfs_nvram_attr = {
 	.attr = {
 		.name = "nvram",
 		.mode = S_IRUSR | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 512,
 	.read = qla2x00_sysfs_read_nvram,
@@ -198,7 +196,6 @@ static struct bin_attribute sysfs_optrom_attr = {
 	.attr = {
 		.name = "optrom",
 		.mode = S_IRUSR | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = OPTROM_SIZE_24XX,
 	.read = qla2x00_sysfs_read_optrom,
@@ -279,7 +276,6 @@ static struct bin_attribute sysfs_optrom_ctl_attr = {
 	.attr = {
 		.name = "optrom_ctl",
 		.mode = S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 0,
 	.write = qla2x00_sysfs_write_optrom_ctl,
@@ -327,7 +323,6 @@ static struct bin_attribute sysfs_vpd_attr = {
 	.attr = {
 		.name = "vpd",
 		.mode = S_IRUSR | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = 0,
 	.read = qla2x00_sysfs_read_vpd,
@@ -375,7 +370,6 @@ static struct bin_attribute sysfs_sfp_attr = {
 	.attr = {
 		.name = "sfp",
 		.mode = S_IRUSR | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = SFP_DEV_SIZE * 2,
 	.read = qla2x00_sysfs_read_sfp,
diff --git a/drivers/spi/at25.c b/drivers/spi/at25.c
index 8efa07e..fde1ded 100644
--- a/drivers/spi/at25.c
+++ b/drivers/spi/at25.c
@@ -314,7 +314,6 @@ static int at25_probe(struct spi_device *spi)
 	 */
 	at25->bin.attr.name = "eeprom";
 	at25->bin.attr.mode = S_IRUSR;
-	at25->bin.attr.owner = THIS_MODULE;
 	at25->bin.read = at25_bin_read;
 
 	at25->bin.size = at25->chip.byte_len;
diff --git a/drivers/video/aty/radeon_base.c b/drivers/video/aty/radeon_base.c
index a4b3fd1..6ec10d0 100644
--- a/drivers/video/aty/radeon_base.c
+++ b/drivers/video/aty/radeon_base.c
@@ -2123,7 +2123,6 @@ static ssize_t radeon_show_edid2(struct kobject *kobj, char *buf, loff_t off, si
 static struct bin_attribute edid1_attr = {
 	.attr   = {
 		.name	= "edid1",
-		.owner	= THIS_MODULE,
 		.mode	= 0444,
 	},
 	.size	= EDID_LENGTH,
@@ -2133,7 +2132,6 @@ static struct bin_attribute edid1_attr = {
 static struct bin_attribute edid2_attr = {
 	.attr   = {
 		.name	= "edid2",
-		.owner	= THIS_MODULE,
 		.mode	= 0444,
 	},
 	.size	= EDID_LENGTH,
diff --git a/drivers/video/backlight/backlight.c b/drivers/video/backlight/backlight.c
index c65e81f..7e06223 100644
--- a/drivers/video/backlight/backlight.c
+++ b/drivers/video/backlight/backlight.c
@@ -172,7 +172,7 @@ static struct class backlight_class = {
 
 #define DECLARE_ATTR(_name,_mode,_show,_store)			\
 {							 	\
-	.attr	= { .name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE },	\
+	.attr	= { .name = __stringify(_name), .mode = _mode }, \
 	.show	= _show,					\
 	.store	= _store,					\
 }
diff --git a/drivers/video/backlight/lcd.c b/drivers/video/backlight/lcd.c
index 6ef8f0a..648b53c 100644
--- a/drivers/video/backlight/lcd.c
+++ b/drivers/video/backlight/lcd.c
@@ -157,7 +157,7 @@ static struct class lcd_class = {
 
 #define DECLARE_ATTR(_name,_mode,_show,_store)			\
 {							 	\
-	.attr	= { .name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE },	\
+	.attr	= { .name = __stringify(_name), .mode = _mode }, \
 	.show	= _show,					\
 	.store	= _store,					\
 }
diff --git a/drivers/w1/slaves/w1_ds2433.c b/drivers/w1/slaves/w1_ds2433.c
index 8ea17a5..4e13aa7 100644
--- a/drivers/w1/slaves/w1_ds2433.c
+++ b/drivers/w1/slaves/w1_ds2433.c
@@ -252,7 +252,6 @@ static struct bin_attribute w1_f23_bin_attr = {
 	.attr = {
 		.name = "eeprom",
 		.mode = S_IRUGO | S_IWUSR,
-		.owner = THIS_MODULE,
 	},
 	.size = W1_EEPROM_SIZE,
 	.read = w1_f23_read_bin,
diff --git a/drivers/w1/slaves/w1_therm.c b/drivers/w1/slaves/w1_therm.c
index 732db47..721261b 100644
--- a/drivers/w1/slaves/w1_therm.c
+++ b/drivers/w1/slaves/w1_therm.c
@@ -48,7 +48,6 @@ static struct bin_attribute w1_therm_bin_attr = {
 	.attr = {
 		.name = "w1_slave",
 		.mode = S_IRUGO,
-		.owner = THIS_MODULE,
 	},
 	.size = W1_SLAVE_DATA_SIZE,
 	.read = w1_therm_read_bin,
diff --git a/drivers/w1/w1.c b/drivers/w1/w1.c
index 63c0724..c1ee648 100644
--- a/drivers/w1/w1.c
+++ b/drivers/w1/w1.c
@@ -128,7 +128,6 @@ static struct bin_attribute w1_slave_attr_bin_id = {
       .attr = {
               .name = "id",
               .mode = S_IRUGO,
-              .owner = THIS_MODULE,
       },
       .size = 8,
       .read = w1_slave_read_id,
@@ -167,7 +166,6 @@ static struct bin_attribute w1_default_attr = {
       .attr = {
               .name = "rw",
               .mode = S_IRUGO | S_IWUSR,
-              .owner = THIS_MODULE,
       },
       .size = PAGE_SIZE,
       .read = w1_default_read,
diff --git a/drivers/zorro/zorro-sysfs.c b/drivers/zorro/zorro-sysfs.c
index 87c29d7..b1bf112 100644
--- a/drivers/zorro/zorro-sysfs.c
+++ b/drivers/zorro/zorro-sysfs.c
@@ -77,7 +77,6 @@ static struct bin_attribute zorro_config_attr = {
 	.attr =	{
 		.name = "config",
 		.mode = S_IRUGO | S_IWUSR,
-		.owner = THIS_MODULE
 	},
 	.size = sizeof(struct ConfigDev),
 	.read = zorro_read_config,
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index fc4a3a2..6669a25 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -842,8 +842,6 @@ static int __init ecryptfs_init(void)
 		goto out;
 	}
 	kset_set_kset_s(&ecryptfs_subsys, fs_subsys);
-	sysfs_attr_version.attr.owner = THIS_MODULE;
-	sysfs_attr_version_str.attr.owner = THIS_MODULE;
 	rc = do_sysfs_registration();
 	if (rc) {
 		printk(KERN_ERR "sysfs registration failed\n");
diff --git a/fs/ocfs2/cluster/masklog.c b/fs/ocfs2/cluster/masklog.c
index 636593b..fd741ce 100644
--- a/fs/ocfs2/cluster/masklog.c
+++ b/fs/ocfs2/cluster/masklog.c
@@ -74,7 +74,6 @@ struct mlog_attribute {
 #define define_mask(_name) {			\
 	.attr = {				\
 		.name = #_name,			\
-		.owner = THIS_MODULE,		\
 		.mode = S_IRUGO | S_IWUSR,	\
 	},					\
 	.mask = ML_##_name,			\
diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 8a7d003..694b369 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -393,7 +393,6 @@ void add_partition(struct gendisk *disk, int part, sector_t start, sector_t len,
 		static struct attribute addpartattr = {
 			.name = "whole_disk",
 			.mode = S_IRUSR | S_IRGRP | S_IROTH,
-			.owner = THIS_MODULE,
 		};
 
 		sysfs_create_file(&p->kobj, &addpartattr);
diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 618b8ae..3c5574a 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -175,25 +175,20 @@ static int open(struct inode * inode, struct file * file)
 	if (!sysfs_get_active(attr_sd))
 		return -ENODEV;
 
-	/* Grab the module reference for this attribute */
-	error = -ENODEV;
-	if (!try_module_get(attr->attr.owner))
-		goto err_sput;
-
 	error = -EACCES;
 	if ((file->f_mode & FMODE_WRITE) && !(attr->write || attr->mmap))
-		goto err_mput;
+		goto err_out;
 	if ((file->f_mode & FMODE_READ) && !(attr->read || attr->mmap))
-		goto err_mput;
+		goto err_out;
 
 	error = -ENOMEM;
 	bb = kzalloc(sizeof(*bb), GFP_KERNEL);
 	if (!bb)
-		goto err_mput;
+		goto err_out;
 
 	bb->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
 	if (!bb->buffer)
-		goto err_mput;
+		goto err_out;
 
 	mutex_init(&bb->mutex);
 	file->private_data = bb;
@@ -203,9 +198,7 @@ static int open(struct inode * inode, struct file * file)
 	sysfs_get(attr_sd);
 	return 0;
 
- err_mput:
-	module_put(attr->attr.owner);
- err_sput:
+ err_out:
 	sysfs_put_active(attr_sd);
 	kfree(bb);
 	return error;
@@ -214,13 +207,11 @@ static int open(struct inode * inode, struct file * file)
 static int release(struct inode * inode, struct file * file)
 {
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
-	struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
 	struct bin_buffer *bb = file->private_data;
 
 	if (bb->mmapped)
 		sysfs_put_active_two(attr_sd);
 	sysfs_put(attr_sd);
-	module_put(attr->attr.owner);
 	kfree(bb->buffer);
 	kfree(bb);
 	return 0;
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index a34c2fa..fa4c4a5 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -271,7 +271,6 @@ sysfs_write_file(struct file *file, const char __user *buf, size_t count, loff_t
 static int sysfs_open_file(struct inode *inode, struct file *file)
 {
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
-	struct attribute *attr = attr_sd->s_elem.attr.attr;
 	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
 	struct sysfs_buffer * buffer;
 	struct sysfs_ops * ops = NULL;
@@ -281,11 +280,6 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	if (!sysfs_get_active_two(attr_sd))
 		return -ENODEV;
 
-	/* Grab the module reference for this attribute */
-	error = -ENODEV;
-	if (!try_module_get(attr->owner))
-		goto err_sput;
-
 	/* if the kobject has no ktype, then we assume that it is a subsystem
 	 * itself, and use ops for it.
 	 */
@@ -302,7 +296,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	 * or the subsystem have no operations.
 	 */
 	if (!ops)
-		goto err_mput;
+		goto err_out;
 
 	/* File needs write support.
 	 * The inode's perms must say it's ok, 
@@ -310,7 +304,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	 */
 	if (file->f_mode & FMODE_WRITE) {
 		if (!(inode->i_mode & S_IWUGO) || !ops->store)
-			goto err_mput;
+			goto err_out;
 	}
 
 	/* File needs read support.
@@ -319,7 +313,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	 */
 	if (file->f_mode & FMODE_READ) {
 		if (!(inode->i_mode & S_IRUGO) || !ops->show)
-			goto err_mput;
+			goto err_out;
 	}
 
 	/* No error? Great, allocate a buffer for the file, and store it
@@ -328,7 +322,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	error = -ENOMEM;
 	buffer = kzalloc(sizeof(struct sysfs_buffer), GFP_KERNEL);
 	if (!buffer)
-		goto err_mput;
+		goto err_out;
 
 	init_MUTEX(&buffer->sem);
 	buffer->needs_read_fill = 1;
@@ -340,9 +334,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	sysfs_get(attr_sd);
 	return 0;
 
- err_mput:
-	module_put(attr->owner);
- err_sput:
+ err_out:
 	sysfs_put_active_two(attr_sd);
 	return error;
 }
@@ -350,12 +342,9 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 static int sysfs_release(struct inode * inode, struct file * filp)
 {
 	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
-	struct attribute *attr = attr_sd->s_elem.attr.attr;
 	struct sysfs_buffer *buffer = filp->private_data;
 
 	sysfs_put(attr_sd);
-	/* After this point, attr should not be accessed. */
-	module_put(attr->owner);
 
 	if (buffer) {
 		if (buffer->page)
diff --git a/include/linux/sysdev.h b/include/linux/sysdev.h
index 389ccf8..6e7c398 100644
--- a/include/linux/sysdev.h
+++ b/include/linux/sysdev.h
@@ -100,8 +100,7 @@ struct sysdev_attribute {
 
 #define _SYSDEV_ATTR(_name,_mode,_show,_store)			\
 {								\
-	.attr = { .name = __stringify(_name), .mode = _mode,	\
-		 .owner = THIS_MODULE },			\
+	.attr = { .name = __stringify(_name), .mode = _mode },	\
 	.show	= _show,					\
 	.store	= _store,					\
 }
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 2f86b08..a92bfcd 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -22,7 +22,6 @@ struct dentry;
 
 struct attribute {
 	const char		* name;
-	struct module 		* owner;
 	mode_t			mode;
 };
 
@@ -39,14 +38,14 @@ struct attribute_group {
  */
 
 #define __ATTR(_name,_mode,_show,_store) { \
-	.attr = {.name = __stringify(_name), .mode = _mode, .owner = THIS_MODULE },	\
+	.attr = {.name = __stringify(_name), .mode = _mode },	\
 	.show	= _show,					\
 	.store	= _store,					\
 }
 
 #define __ATTR_RO(_name) { \
-	.attr	= { .name = __stringify(_name), .mode = 0444, .owner = THIS_MODULE },	\
-	.show	= _name##_show,	\
+	.attr	= { .name = __stringify(_name), .mode = 0444 },	\
+	.show	= _name##_show,					\
 }
 
 #define __ATTR_NULL { .attr = { .name = NULL } }
diff --git a/kernel/module.c b/kernel/module.c
index 9da5af6..d892d3c 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -485,8 +485,7 @@ static void free_modinfo_##field(struct module *mod)                  \
         mod->field = NULL;                                            \
 }                                                                     \
 static struct module_attribute modinfo_##field = {                    \
-	.attr = { .name = __stringify(field), .mode = 0444,           \
-		  .owner = THIS_MODULE },                             \
+	.attr = { .name = __stringify(field), .mode = 0444 },         \
 	.show = show_modinfo_##field,                                 \
 	.setup = setup_modinfo_##field,                               \
 	.test = modinfo_##field##_exists,                             \
@@ -790,7 +789,7 @@ static ssize_t show_refcnt(struct module_attribute *mattr,
 }
 
 static struct module_attribute refcnt = {
-	.attr = { .name = "refcnt", .mode = 0444, .owner = THIS_MODULE },
+	.attr = { .name = "refcnt", .mode = 0444 },
 	.show = show_refcnt,
 };
 
@@ -848,7 +847,7 @@ static ssize_t show_initstate(struct module_attribute *mattr,
 }
 
 static struct module_attribute initstate = {
-	.attr = { .name = "initstate", .mode = 0444, .owner = THIS_MODULE },
+	.attr = { .name = "initstate", .mode = 0444 },
 	.show = show_initstate,
 };
 
@@ -1029,7 +1028,6 @@ static void add_sect_attrs(struct module *mod, unsigned int nsect,
 		sattr->mattr.show = module_sect_show;
 		sattr->mattr.store = NULL;
 		sattr->mattr.attr.name = sattr->name;
-		sattr->mattr.attr.owner = mod;
 		sattr->mattr.attr.mode = S_IRUGO;
 		*(gattr++) = &(sattr++)->mattr.attr;
 	}
@@ -1087,7 +1085,6 @@ int module_add_modinfo_attrs(struct module *mod)
 		if (!attr->test ||
 		    (attr->test && attr->test(mod))) {
 			memcpy(temp_attr, attr, sizeof(*temp_attr));
-			temp_attr->attr.owner = mod;
 			error = sysfs_create_file(&mod->mkobj.kobj,&temp_attr->attr);
 			++temp_attr;
 		}
diff --git a/kernel/params.c b/kernel/params.c
index 1fc4ac7..94fbdce 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -491,7 +491,6 @@ param_sysfs_setup(struct module_kobject *mk,
 			pattr->mattr.show = param_attr_show;
 			pattr->mattr.store = param_attr_store;
 			pattr->mattr.attr.name = (char *)&kp->name[name_skip];
-			pattr->mattr.attr.owner = mk->mod;
 			pattr->mattr.attr.mode = kp->perm;
 			*(gattr++) = &(pattr++)->mattr.attr;
 		}
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 33c6c4a..31ace23 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -383,8 +383,7 @@ static ssize_t brforward_read(struct kobject *kobj, char *buf,
 
 static struct bin_attribute bridge_forward = {
 	.attr = { .name = SYSFS_BRIDGE_FDB,
-		  .mode = S_IRUGO,
-		  .owner = THIS_MODULE, },
+		  .mode = S_IRUGO, },
 	.read = brforward_read,
 };
 
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 2da2292..79db51f 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -29,8 +29,7 @@ struct brport_attribute {
 #define BRPORT_ATTR(_name,_mode,_show,_store)		        \
 struct brport_attribute brport_attr_##_name = { 	        \
 	.attr = {.name = __stringify(_name), 			\
-		 .mode = _mode, 				\
-		 .owner = THIS_MODULE, },			\
+		 .mode = _mode },				\
 	.show	= _show,					\
 	.store	= _store,					\
 };
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 21/21] sysfs: reimplement syfs_drop_dentry()
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (19 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 19/21] sysfs: kill unnecessary attribute->owner Tejun Heo
@ 2007-04-28 13:39 ` Tejun Heo
  2007-04-28 14:28 ` [PATCHSET] sysfs: sysfs rework, take #2 Alan Stern
  2007-05-02 12:38 ` Cornelia Huck
  22 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 13:39 UTC (permalink / raw)
  To: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel, htejun
  Cc: Tejun Heo

There are two races around sysfs_drop_dentry().

### race 1

   http://article.gmane.org/gmane.linux.kernel/516210

   Unable to handle kernel NULL pointer dereference at 000000000000004c RIP:
    [<ffffffff802935b4>] simple_unlink+0x14/0x5c
   ...
   Call Trace:
    [<ffffffff802b31ee>] sysfs_hash_and_remove+0x7c/0xef
    [<ffffffff803a0c1a>] device_del+0x66/0x20a
    [<ffffffff804d2d7e>] netdev_run_todo+0xc6/0x225
    [<ffffffff8800d025>] :dummy:dummy_free_one+0x1c/0x2d
    [<ffffffff8800d0a2>] :dummy:dummy_cleanup_module+0xe/0x23
    [<ffffffff8024ceed>] sys_delete_module+0x1b1/0x1e0
    [<ffffffff803437e7>] __up_write+0x21/0x10e
    [<ffffffff80209bbe>] system_call+0x7e/0x83

   This is race between dcache shrinking triggered by umount and sysfs
   deletion.  It seems to be introduced when dentries for attr and
   symlink nodes are made unpinned.  sd->s_dentry clearing is done
   without synchronization and sysfs_drop_entry() ends up deleting
   already deleted dentry (dentry->inode is NULL).

### race 2

    thread shrinking dcache		thread looking up sysfs entry
   --------------------------------------------------------------------
 1. sysfs dentry for A is chosen as
    victim.
 2. prune_one_dentry() drops the dentry
    and calls dentry_iput().
 3. dentry_iput() unlinks d_alias and
    releases spin locks.
					4. looks up dentry for A which
					   is not in dcache.
					5. new dentry is created and
					   sysfs_lookup() is invoked,
					   which instantiates the dentry
					   and set sd->s_dentry to it.
 6. sysfs_d_iput() is called.
    BUG_ON(sd->s_dentry != dentry)
    triggers and sd->s_dentry is
    cleared.  You're screwed.

Both races are caused by sd->s_dentry going out of sync.  This patch
introduces sysfs_lock and protect sd->s_dentry with it so that...

* sd->s_dentry always points to the dentry which is looked up most
  recently.  This is guaranteed even when dentry_iput() on the
  previous dentry overlaps with the lookup of the latest dentry.

* While sysfs_lock is held, the value contained in sd->s_dentry is
  valid.  If null, no dentry is associated with the sd.  If not null,
  the dentry is accessible and is or used to be associated with the
  sd.  Whether the dentry is alive or not can be checked by locking
  dcache_lock and checking whether the dentry is positive.

This change allows syfs_drop_dentry() to reliably determine the
associated dentry and drop it.  This patch also converts remove_dir()
which used to use a separate drop mechanism to use the same drop path.
With this change, making directories reclaimable is much easier.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 fs/sysfs/dir.c   |   39 +++++++++++++++---------
 fs/sysfs/inode.c |   88 ++++++++++++++++++++++++++++++++++++++++++++----------
 fs/sysfs/sysfs.h |    3 +-
 3 files changed, 98 insertions(+), 32 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 1a45a1d..bc11a26 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -14,6 +14,7 @@
 #include "sysfs.h"
 
 DECLARE_RWSEM(sysfs_rename_sem);
+spinlock_t sysfs_lock = SPIN_LOCK_UNLOCKED;
 spinlock_t kobj_sysfs_assoc_lock = SPIN_LOCK_UNLOCKED;
 
 static spinlock_t sysfs_ino_lock = SPIN_LOCK_UNLOCKED;
@@ -83,8 +84,19 @@ static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
 	struct sysfs_dirent * sd = dentry->d_fsdata;
 
 	if (sd) {
-		BUG_ON(sd->s_dentry != dentry);
-		sd->s_dentry = NULL;
+		/* sd->s_dentry is protected with sysfs_lock.  This
+		 * allows sysfs_drop_dentry() to dereference it.
+		 */
+		spin_lock(&sysfs_lock);
+
+		/* The dentry might have been deleted or another
+		 * lookup could have happened updating sd->s_dentry to
+		 * point the new dentry.  Ignore if it isn't pointing
+		 * to this dentry.
+		 */
+		if (sd->s_dentry == dentry)
+			sd->s_dentry = NULL;
+		spin_unlock(&sysfs_lock);
 		sysfs_put(sd);
 	}
 	iput(inode);
@@ -134,7 +146,12 @@ static void sysfs_attach_dentry(struct sysfs_dirent *sd, struct dentry *dentry)
 {
 	dentry->d_op = &sysfs_dentry_ops;
 	dentry->d_fsdata = sysfs_get(sd);
+
+	/* protect sd->s_dentry against sysfs_d_iput */
+	spin_lock(&sysfs_lock);
 	sd->s_dentry = dentry;
+	spin_unlock(&sysfs_lock);
+
 	d_rehash(dentry);
 }
 
@@ -355,22 +372,19 @@ const struct inode_operations sysfs_dir_inode_operations = {
 
 static void remove_dir(struct dentry * d)
 {
-	struct dentry * parent = dget(d->d_parent);
-	struct sysfs_dirent * sd;
+	struct dentry *parent = d->d_parent;
+	struct sysfs_dirent *sd = d->d_fsdata;
 
 	mutex_lock(&parent->d_inode->i_mutex);
-	d_delete(d);
-	sd = d->d_fsdata;
+
  	list_del_init(&sd->s_sibling);
-	if (d->d_inode)
-		simple_rmdir(parent->d_inode,d);
 
 	pr_debug(" o %s removing done (%d)\n",d->d_name.name,
 		 atomic_read(&d->d_count));
 
 	mutex_unlock(&parent->d_inode->i_mutex);
-	dput(parent);
 
+	sysfs_drop_dentry(sd);
 	sysfs_deactivate(sd);
 	sysfs_put(sd);
 }
@@ -387,7 +401,6 @@ static void __sysfs_remove_dir(struct dentry *dentry)
 	struct sysfs_dirent * parent_sd;
 	struct sysfs_dirent * sd, * tmp;
 
-	dget(dentry);
 	if (!dentry)
 		return;
 
@@ -398,21 +411,17 @@ static void __sysfs_remove_dir(struct dentry *dentry)
 		if (!sd->s_type || !(sd->s_type & SYSFS_NOT_PINNED))
 			continue;
 		list_move(&sd->s_sibling, &removed);
-		sysfs_drop_dentry(sd, dentry);
 	}
 	mutex_unlock(&dentry->d_inode->i_mutex);
 
 	list_for_each_entry_safe(sd, tmp, &removed, s_sibling) {
 		list_del_init(&sd->s_sibling);
+		sysfs_drop_dentry(sd);
 		sysfs_deactivate(sd);
 		sysfs_put(sd);
 	}
 
 	remove_dir(dentry);
-	/**
-	 * Drop reference from dget() on entrance.
-	 */
-	dput(dentry);
 }
 
 /**
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 10aa33f..81460c5 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -190,28 +190,83 @@ int sysfs_create(struct sysfs_dirent *sd, struct dentry *dentry, int mode,
 	return error;
 }
 
-/*
- * Unhashes the dentry corresponding to given sysfs_dirent
- * Called with parent inode's i_mutex held.
+/**
+ *	sysfs_drop_dentry - drop dentry for the specified sysfs_dirent
+ *	@sd: target sysfs_dirent
+ *
+ *	Drop dentry for @sd.  @sd must have been unlinked from its
+ *	parent on entry to this function such that it can't be looked
+ *	up anymore.
+ *
+ *	@sd->s_dentry which is protected with sysfs_lock points to the
+ *	currently associated dentry but we're not holding a reference
+ *	to it and racing with dput().  Grab dcache_lock and verify
+ *	dentry before dropping it.  If @sd->s_dentry is NULL or dput()
+ *	beats us, no need to bother.
  */
-void sysfs_drop_dentry(struct sysfs_dirent * sd, struct dentry * parent)
+void sysfs_drop_dentry(struct sysfs_dirent *sd)
 {
-	struct dentry * dentry = sd->s_dentry;
+	struct dentry *dentry = NULL, *parent = NULL;
+	struct inode *dir;
+	struct timespec curtime;
+
+	/* We're not holding a reference to ->s_dentry dentry but the
+	 * field will stay valid as long as sysfs_lock is held.
+	 */
+	spin_lock(&sysfs_lock);
+	spin_lock(&dcache_lock);
+
+	if (sd->s_dentry && sd->s_dentry->d_inode) {
+		/* get dentry if it's there and dput() didn't kill it yet */
+		dentry = dget_locked(sd->s_dentry);
+		parent = dentry->d_parent;
+	} else if (sd->s_parent->s_dentry->d_inode) {
+		/* We need to update the parent even if dentry for the
+		 * victim itself doesn't exist.
+		 */
+		parent = dget_locked(sd->s_parent->s_dentry);
+	}
 
+	/* drop */
 	if (dentry) {
-		spin_lock(&dcache_lock);
 		spin_lock(&dentry->d_lock);
-		if (!(d_unhashed(dentry) && dentry->d_inode)) {
-			dget_locked(dentry);
-			__d_drop(dentry);
-			spin_unlock(&dentry->d_lock);
-			spin_unlock(&dcache_lock);
-			simple_unlink(parent->d_inode, dentry);
-		} else {
-			spin_unlock(&dentry->d_lock);
-			spin_unlock(&dcache_lock);
+		__d_drop(dentry);
+		spin_unlock(&dentry->d_lock);
+	}
+
+	spin_unlock(&dcache_lock);
+	spin_unlock(&sysfs_lock);
+
+	/* nothing to do if the parent isn't in dcache */
+	if (!parent)
+		return;
+
+	/* adjust nlink and update timestamp */
+	dir = parent->d_inode;
+	mutex_lock(&dir->i_mutex);
+
+	curtime = CURRENT_TIME;
+
+	dir->i_ctime = dir->i_mtime = curtime;
+
+	if (dentry) {
+		dentry->d_inode->i_ctime = curtime;
+		drop_nlink(dentry->d_inode);
+		if (sd->s_type & SYSFS_DIR) {
+			drop_nlink(dentry->d_inode);
+			drop_nlink(dir);
+			/* XXX: unpin if directory, this will go away soon */
+			dput(dentry);
 		}
 	}
+
+	mutex_unlock(&dir->i_mutex);
+
+	/* bye bye */
+	if (dentry)
+		dput(dentry);
+	else
+		dput(parent);
 }
 
 int sysfs_hash_and_remove(struct dentry * dir, const char * name)
@@ -234,7 +289,6 @@ int sysfs_hash_and_remove(struct dentry * dir, const char * name)
 			continue;
 		if (!strcmp(sd->s_name, name)) {
 			list_del_init(&sd->s_sibling);
-			sysfs_drop_dentry(sd, dir);
 			found = 1;
 			break;
 		}
@@ -244,7 +298,9 @@ int sysfs_hash_and_remove(struct dentry * dir, const char * name)
 	if (!found)
 		return -ENOENT;
 
+	sysfs_drop_dentry(sd);
 	sysfs_deactivate(sd);
 	sysfs_put(sd);
+
 	return 0;
 }
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 06a237e..fc6aa86 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -76,9 +76,10 @@ extern struct sysfs_dirent *sysfs_find(struct sysfs_dirent *dir, const char * na
 extern int sysfs_create_subdir(struct kobject *, const char *, struct dentry **);
 extern void sysfs_remove_subdir(struct dentry *);
 
-extern void sysfs_drop_dentry(struct sysfs_dirent *sd, struct dentry *parent);
+extern void sysfs_drop_dentry(struct sysfs_dirent *sd);
 extern int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
 
+extern spinlock_t sysfs_lock;
 extern spinlock_t kobj_sysfs_assoc_lock;
 extern struct rw_semaphore sysfs_rename_sem;
 extern struct super_block * sysfs_sb;
-- 
1.5.0.3



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCHSET] sysfs: sysfs rework, take #2
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (20 preceding siblings ...)
  2007-04-28 13:39 ` [PATCH 21/21] sysfs: reimplement syfs_drop_dentry() Tejun Heo
@ 2007-04-28 14:28 ` Alan Stern
  2007-04-28 16:06   ` Tejun Heo
  2007-05-02 12:38 ` Cornelia Huck
  22 siblings, 1 reply; 30+ messages in thread
From: Alan Stern @ 2007-04-28 14:28 UTC (permalink / raw)
  To: Tejun Heo
  Cc: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, maneesh,
	akpm, linux-kernel

On Sat, 28 Apr 2007, Tejun Heo wrote:

> Subject: [PATCHSET] sysfs: sysfs rework, take #2
> 
> Hello, all.
> 
> This is respin of syswork rework patchset in gregkh patch series.
> Changes are...

...

> All the known regrssions are fixed.  I think I've got all the
> attr->owner but haven't verified with cross compiling yet, just
> allyesconfig on x86-64 and i386.  I'll try to setup some cross compile
> environments tomorrow and follow up if I can find more fallouts.

What have you done about the problem of device nodes getting unregistered
while the corresponding sysfs directory is non-empty, say with child
device directories still present?

Alan Stern


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHSET] sysfs: sysfs rework, take #2
  2007-04-28 14:28 ` [PATCHSET] sysfs: sysfs rework, take #2 Alan Stern
@ 2007-04-28 16:06   ` Tejun Heo
  0 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-04-28 16:06 UTC (permalink / raw)
  To: Alan Stern
  Cc: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, maneesh,
	akpm, linux-kernel

Alan Stern wrote:
> On Sat, 28 Apr 2007, Tejun Heo wrote:
> 
>> Subject: [PATCHSET] sysfs: sysfs rework, take #2
>>
>> Hello, all.
>>
>> This is respin of syswork rework patchset in gregkh patch series.
>> Changes are...
> 
> ...
> 
>> All the known regrssions are fixed.  I think I've got all the
>> attr->owner but haven't verified with cross compiling yet, just
>> allyesconfig on x86-64 and i386.  I'll try to setup some cross compile
>> environments tomorrow and follow up if I can find more fallouts.
> 
> What have you done about the problem of device nodes getting unregistered
> while the corresponding sysfs directory is non-empty, say with child
> device directories still present?

That was a problem because the previous update of sysfs_drop_dentry()
depended on walking from the root of /sys to the associated dentry to
drop it - which is quite nice because all dentry management can be
isolated.  I tried fixing it by 1. deferring drop of dentry till all its
children are gone (last child kills the deferred parent) 2. recursively
dropping all children when the parent is deleted (later drop of the
child becomes noop) but they both can't work with shadow directories, so
I gave up on the approach and just made synchronization around
sd->s_dentry correct.  So, it's not an issue anymore.  Verified to work.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/21] sysfs: implement bin_buffer
  2007-04-28 13:39 ` [PATCH 16/21] sysfs: implement bin_buffer Tejun Heo
@ 2007-05-01 14:25   ` Satyam Sharma
       [not found]     ` <463753DB.2060101@gmail.com>
  0 siblings, 1 reply; 30+ messages in thread
From: Satyam Sharma @ 2007-05-01 14:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: gregkh, dmitry.torokhov, cornelia.huck, oneukum, rpurdie, stern,
	maneesh, akpm, linux-kernel

On 4/28/07, Tejun Heo <htejun@gmail.com> wrote:
> Implement bin_buffer which contains a mutex and pointer to PAGE_SIZE
> buffer to properly synchronize accesses to per-openfile buffer and
> prepare for immediate-kobj-disconnect.
>
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> ---
> [...]
> @@ -135,14 +160,22 @@ static int open(struct inode * inode, struct file * file)
>                 goto Error;
>
>         error = -ENOMEM;
> -       file->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
> -       if (!file->private_data)
> +       bb = kzalloc(sizeof(*bb), GFP_KERNEL);
> +       if (!bb)
>                 goto Error;
>
> +       bb->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);

You could simply do a __get_free_page() here instead of a kmalloc(PAGE_SIZE).

> @@ -155,11 +188,12 @@ static int release(struct inode * inode, struct file * file)
>         struct kobject * kobj = to_kobj(file->f_path.dentry->d_parent);
>         struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
>         struct bin_attribute *attr = attr_sd->s_elem.bin_attr.bin_attr;
> -       u8 * buffer = file->private_data;
> +       struct bin_buffer *bb = file->private_data;
>
>         kobject_put(kobj);
>         module_put(attr->attr.owner);
> -       kfree(buffer);
> +       kfree(bb->buffer);

And so this would become free_page().

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/21] sysfs: implement bin_buffer
       [not found]     ` <463753DB.2060101@gmail.com>
@ 2007-05-01 19:37       ` Satyam Sharma
  2007-05-01 20:04         ` Satyam Sharma
  0 siblings, 1 reply; 30+ messages in thread
From: Satyam Sharma @ 2007-05-01 19:37 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

(We went off-list just now. Adding lkml again.)

On 5/1/07, Tejun Heo <htejun@gmail.com> wrote:
> Satyam Sharma wrote:
> > On 4/28/07, Tejun Heo <htejun@gmail.com> wrote:
> >> Implement bin_buffer which contains a mutex and pointer to PAGE_SIZE
> >> buffer to properly synchronize accesses to per-openfile buffer and
> >> prepare for immediate-kobj-disconnect.
> >>
> >> Signed-off-by: Tejun Heo <htejun@gmail.com>
> >> ---
> >> [...]
> >> @@ -135,14 +160,22 @@ static int open(struct inode * inode, struct
> >> file * file)
> >>                 goto Error;
> >>
> >>         error = -ENOMEM;
> >> -       file->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
> >> -       if (!file->private_data)
> >> +       bb = kzalloc(sizeof(*bb), GFP_KERNEL);
> >> +       if (!bb)
> >>                 goto Error;
> >>
> >> +       bb->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
> >
> > You could simply do a __get_free_page() here instead of a
> > kmalloc(PAGE_SIZE).
>
> Yes, I could but 1. the original code was done using kmalloc

Yes, I saw that, but you could change it just as well. In any case,
this will only make fs/sysfs/bin.c similar to what is being done in
fs/sysfs/file.c. We allocate the buffer page backing the attribute's
data in fill_read_buffer() and fill_write_buffer() using
get_zeroed_page() and not kzalloc().

> 2. I'm not sure __get_free_page() is better than kmalloc(PAGE_SIZE, ), is it?

If you know for sure that you require *exactly* PAGE_SIZE bytes, then
yes, it is.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/21] sysfs: implement bin_buffer
  2007-05-01 19:37       ` Satyam Sharma
@ 2007-05-01 20:04         ` Satyam Sharma
  2007-05-02 11:29           ` Tejun Heo
  0 siblings, 1 reply; 30+ messages in thread
From: Satyam Sharma @ 2007-05-01 20:04 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On 5/2/07, Satyam Sharma <satyam.sharma@gmail.com> wrote:
> (We went off-list just now. Adding lkml again.)
>
> On 5/1/07, Tejun Heo <htejun@gmail.com> wrote:
> > Satyam Sharma wrote:
> > > On 4/28/07, Tejun Heo <htejun@gmail.com> wrote:
> > >> Implement bin_buffer which contains a mutex and pointer to PAGE_SIZE
> > >> buffer to properly synchronize accesses to per-openfile buffer and
> > >> prepare for immediate-kobj-disconnect.
> > >>
> > >> Signed-off-by: Tejun Heo <htejun@gmail.com>
> > >> ---
> > >> [...]
> > >> @@ -135,14 +160,22 @@ static int open(struct inode * inode, struct
> > >> file * file)
> > >>                 goto Error;
> > >>
> > >>         error = -ENOMEM;
> > >> -       file->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
> > >> -       if (!file->private_data)
> > >> +       bb = kzalloc(sizeof(*bb), GFP_KERNEL);
> > >> +       if (!bb)
> > >>                 goto Error;
> > >>
> > >> +       bb->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
> > >
> > > You could simply do a __get_free_page() here instead of a
> > > kmalloc(PAGE_SIZE).
> >
> > Yes, I could but 1. the original code was done using kmalloc
>
> Yes, I saw that, but you could change it just as well. In any case,
> this will only make fs/sysfs/bin.c similar to what is being done in
> fs/sysfs/file.c. We allocate the buffer page backing the attribute's
> data in fill_read_buffer() and fill_write_buffer() using
> get_zeroed_page() and not kzalloc().

Which begs another question -- why do we allocate the buffer page
lazily (only at the time of read(2) and write(2) and not at open(2))
in the case of normal attributes but prefer to do it during open(2)
itself for binary attributes? Note that the page (if allocated) is
freed during release() for both cases.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/21] sysfs: implement bin_buffer
  2007-05-01 20:04         ` Satyam Sharma
@ 2007-05-02 11:29           ` Tejun Heo
  0 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-05-02 11:29 UTC (permalink / raw)
  To: Satyam Sharma; +Cc: linux-kernel

Satyam Sharma wrote:
>> Yes, I saw that, but you could change it just as well. In any case,
>> this will only make fs/sysfs/bin.c similar to what is being done in
>> fs/sysfs/file.c. We allocate the buffer page backing the attribute's
>> data in fill_read_buffer() and fill_write_buffer() using
>> get_zeroed_page() and not kzalloc().
> 
> Which begs another question -- why do we allocate the buffer page
> lazily (only at the time of read(2) and write(2) and not at open(2))
> in the case of normal attributes but prefer to do it during open(2)
> itself for binary attributes? Note that the page (if allocated) is
> freed during release() for both cases.

Well, that also is because that was how bin files did it before the
patch.  I don't object to any of your suggestions but also thinks that
they are better off as separate patches.  So, feel free to submit patches.

-- 
tejun

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHSET] sysfs: sysfs rework, take #2
  2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
                   ` (21 preceding siblings ...)
  2007-04-28 14:28 ` [PATCHSET] sysfs: sysfs rework, take #2 Alan Stern
@ 2007-05-02 12:38 ` Cornelia Huck
  2007-05-02 13:04   ` Tejun Heo
  22 siblings, 1 reply; 30+ messages in thread
From: Cornelia Huck @ 2007-05-02 12:38 UTC (permalink / raw)
  To: Tejun Heo
  Cc: gregkh, dmitry.torokhov, oneukum, rpurdie, stern, maneesh, akpm,
	linux-kernel

On Sat, 28 Apr 2007 22:39:35 +0900,
Tejun Heo <htejun@gmail.com> wrote:

> All the known regrssions are fixed.  I think I've got all the
> attr->owner but haven't verified with cross compiling yet, just
> allyesconfig on x86-64 and i386.  I'll try to setup some cross compile
> environments tomorrow and follow up if I can find more fallouts.

s390 attr->owner seems to be fine afaics.

I've run some stress tests (attach/detach) on s390 and haven't run into
any problems yet.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHSET] sysfs: sysfs rework, take #2
  2007-05-02 12:38 ` Cornelia Huck
@ 2007-05-02 13:04   ` Tejun Heo
  0 siblings, 0 replies; 30+ messages in thread
From: Tejun Heo @ 2007-05-02 13:04 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: gregkh, dmitry.torokhov, oneukum, rpurdie, stern, maneesh, akpm,
	linux-kernel

Cornelia Huck wrote:
> On Sat, 28 Apr 2007 22:39:35 +0900,
> Tejun Heo <htejun@gmail.com> wrote:
> 
>> All the known regrssions are fixed.  I think I've got all the
>> attr->owner but haven't verified with cross compiling yet, just
>> allyesconfig on x86-64 and i386.  I'll try to setup some cross compile
>> environments tomorrow and follow up if I can find more fallouts.
> 
> s390 attr->owner seems to be fine afaics.
> 
> I've run some stress tests (attach/detach) on s390 and haven't run into
> any problems yet.

Great, thanks a lot for testing.  I've long stress tests but the only
oops I've seen (which I'm trying to reproduce without success) is caused
by deallocated platform device when udev for it is generated.  So, I
think the sysfs changes are pretty stable.

-- 
tejun

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2007-05-02 13:05 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-04-28 13:39 [PATCHSET] sysfs: sysfs rework, take #2 Tejun Heo
2007-04-28 13:39 ` [PATCH 01/21] idr: fix obscure bug in allocation path Tejun Heo
2007-04-28 13:39 ` [PATCH 02/21] idr: separate out idr_mark_full() Tejun Heo
2007-04-28 13:39 ` [PATCH 06/21] sysfs: make sysfs_put() ignore NULL sd Tejun Heo
2007-04-28 13:39 ` [PATCH 03/21] ida: implement idr based id allocator Tejun Heo
2007-04-28 13:39 ` [PATCH 04/21] sysfs: move release_sysfs_dirent() to dir.c Tejun Heo
2007-04-28 13:39 ` [PATCH 07/21] sysfs: fix error handling in binattr write() Tejun Heo
2007-04-28 13:39 ` [PATCH 05/21] sysfs: allocate inode number using ida Tejun Heo
2007-04-28 13:39 ` [PATCH 11/21] sysfs: add sysfs_dirent->s_parent Tejun Heo
2007-04-28 13:39 ` [PATCH 08/21] sysfs: flatten cleanup paths in sysfs_add_link() and create_dir() Tejun Heo
2007-04-28 13:39 ` [PATCH 10/21] sysfs: consolidate sysfs_dirent creation functions Tejun Heo
2007-04-28 13:39 ` [PATCH 09/21] sysfs: flatten and fix sysfs_rename_dir() error handling Tejun Heo
2007-04-28 13:39 ` [PATCH 12/21] sysfs: add sysfs_dirent->s_name Tejun Heo
2007-04-28 13:39 ` [PATCH 14/21] sysfs: implement kobj_sysfs_assoc_lock Tejun Heo
2007-04-28 13:39 ` [PATCH 16/21] sysfs: implement bin_buffer Tejun Heo
2007-05-01 14:25   ` Satyam Sharma
     [not found]     ` <463753DB.2060101@gmail.com>
2007-05-01 19:37       ` Satyam Sharma
2007-05-01 20:04         ` Satyam Sharma
2007-05-02 11:29           ` Tejun Heo
2007-04-28 13:39 ` [PATCH 18/21] sysfs: kill attribute file orphaning Tejun Heo
2007-04-28 13:39 ` [PATCH 15/21] sysfs: reimplement symlink using sysfs_dirent tree Tejun Heo
2007-04-28 13:39 ` [PATCH 13/21] sysfs: make sysfs_dirent->s_element a union Tejun Heo
2007-04-28 13:39 ` [PATCH 20/21] sysfs: separate out sysfs_attach_dentry() Tejun Heo
2007-04-28 13:39 ` [PATCH 17/21] sysfs: implement sysfs_dirent active reference and immediate disconnect Tejun Heo
2007-04-28 13:39 ` [PATCH 19/21] sysfs: kill unnecessary attribute->owner Tejun Heo
2007-04-28 13:39 ` [PATCH 21/21] sysfs: reimplement syfs_drop_dentry() Tejun Heo
2007-04-28 14:28 ` [PATCHSET] sysfs: sysfs rework, take #2 Alan Stern
2007-04-28 16:06   ` Tejun Heo
2007-05-02 12:38 ` Cornelia Huck
2007-05-02 13:04   ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox