public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/6] Backing Store for sysfs
@ 2003-10-06  8:59 Maneesh Soni
  2003-10-06  9:00 ` [RFC 1/6] sysfs-kobject.patch Maneesh Soni
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Maneesh Soni @ 2003-10-06  8:59 UTC (permalink / raw)
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma


Hi,

The following patch set(mailed separately) provides a prototype for a backing 
store mechanism for sysfs. Currently sysfs pins all its dentries and inodes in 
memory there by wasting kernel lowmem even when it is not mounted. 

With this patch set we create sysfs dentry whenever it is required like 
other real filesystems and, age and free it as per the dcache rules. We
now save significant amount of Lowmem by avoiding un-necessary pinning. 
The following numbers were on a 2-way system with 6 disks and 2 NICs with 
about 1028 dentries. The numbers are just indicative as they are system
wide collected from /proc and are not exclusively for sysfs.

				2.6.0-test6		With patches.
Right after system is booted
---------------------------
dentry_cache (active)		2343			1315
inode_cache (active)		1058			30
LowFree				875096 KB		875900 KB

After mounting sysfs
-------------------
dentry_cache (active)		2350			1321
inode_cache (active)		1058			31
LowFree				875096 KB		875836 KB

After "find /sys"
-----------------
dentry_cache (active)		2520			2544
inode_cache (active)		1058			1050
LowFree				875032 KB		874748 KB

After un-mounting sysfs
-----------------------
dentry_cache (active)		2363			1319
inode_cache (active)		1058			30
LowFree				875032 KB		875836 KB


The main idea is not create the dentry in sysfs_create_xxx calls but create
the dentry when it is first lookup. We now have lookup() inode_operation, 
open and close file_operations for sysfs directory inodes. 

The backing store is based on the kobjects which are always there in memory.
sysfs lookup is based on hierarchy of kobjects. As the current kobject 
infrastructure donot provide any means to traverse the kobject's children or 
siblings, two-way hierarchy lookup was not possible. For this new fields 
are added to kobject structure. This ended up increasing the size of kobject
from 52 bytes to 108 bytes but saving one dentry and inode per kobject.

The details of the patches are in the following mails. For testing please
apply all the patches as they are splitted just for ease of review.

Please send me comments on the approach, implementation, missed things and 
suggestions to improve them.

Thanks,
Maneesh

-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC 1/6] sysfs-kobject.patch
  2003-10-06  8:59 [RFC 0/6] Backing Store for sysfs Maneesh Soni
@ 2003-10-06  9:00 ` Maneesh Soni
  2003-10-06  9:00   ` [RFC 2/6] sysfs-mount.patch Maneesh Soni
                     ` (2 more replies)
  2003-10-06 16:08 ` [RFC 0/6] Backing Store for sysfs Greg KH
  2003-10-06 18:44 ` Patrick Mochel
  2 siblings, 3 replies; 33+ messages in thread
From: Maneesh Soni @ 2003-10-06  9:00 UTC (permalink / raw)
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma



o This patch adds new fields to struct kobject for providing 
  parent-child-sibling  based hierarchy. Using these links we can traverse 
  the hierarchy in both directions. All these fields are intialised in 
  kobject_init and modified accordingly in kobject_add and kobject_del.

o The new fields attr, and attr_groups link the attributes and attribute
  groups to the kobject. These are linked using the new structures 
  kobject_attr and kobject_attr_group as we cannot link the struct attribute
  directly to the kobject because these are generally embedded in other 
  struct and are referenced with offsets.

o All the lists are protected by the read-write semaphore k_rwsem.

o All the subsystems are linked in the kobj_subsystem_list which is protected
  using spinlock kobj_subsystem_lock.


 include/linux/kobject.h |   33 ++++++++++++++++++++++++++++++
 lib/kobject.c           |   52 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 84 insertions(+), 1 deletion(-)

diff -puN include/linux/kobject.h~sysfs-kobject include/linux/kobject.h
--- linux-2.6.0-test6/include/linux/kobject.h~sysfs-kobject	2003-10-06 11:48:37.000000000 +0530
+++ linux-2.6.0-test6-maneesh/include/linux/kobject.h	2003-10-06 11:48:51.000000000 +0530
@@ -32,6 +32,12 @@ struct kobject {
 	struct kset		* kset;
 	struct kobj_type	* ktype;
 	struct dentry		* dentry;
+ 	struct list_head	k_sibling;
+ 	struct list_head	k_children;
+	struct list_head	attr;
+	struct list_head	attr_group;
+	struct rw_semaphore	k_rwsem;
+	char 			*k_symlink;
 };
 
 extern int kobject_set_name(struct kobject *, const char *, ...)
@@ -57,6 +63,29 @@ extern struct kobject * kobject_get(stru
 extern void kobject_put(struct kobject *);
 
 
+struct kobject_attr {
+	struct list_head 	list;
+	int 			flags;
+	union {
+		const struct attribute	* attr;
+		const struct bin_attribute * bin_attr;
+	} attr_u;
+};
+
+#define KOBJ_TEXT_ATTR		0x0001
+#define KOBJ_BINARY_ATTR	0x0002
+
+static inline const struct attribute * 
+kobject_attr(struct kobject_attr * k_attr) 
+{
+	return ((k_attr->flags & KOBJ_TEXT_ATTR) ? k_attr->attr_u.attr : &k_attr->attr_u.bin_attr->attr);
+}
+
+struct kobject_attr_group {
+	struct list_head 	list;
+	const struct attribute_group	* attr_group;
+};
+
 struct kobj_type {
 	void (*release)(struct kobject *);
 	struct sysfs_ops	* sysfs_ops;
@@ -140,8 +169,12 @@ extern struct kobject * kset_find_obj(st
 struct subsystem {
 	struct kset		kset;
 	struct rw_semaphore	rwsem;
+	struct list_head	next;
 };
 
+extern spinlock_t kobj_subsystem_lock;
+extern struct list_head kobj_subsystem_list;
+
 #define decl_subsys(_name,_type,_hotplug_ops) \
 struct subsystem _name##_subsys = { \
 	.kset = { \
diff -puN lib/kobject.c~sysfs-kobject lib/kobject.c
--- linux-2.6.0-test6/lib/kobject.c~sysfs-kobject	2003-10-06 11:48:44.000000000 +0530
+++ linux-2.6.0-test6-maneesh/lib/kobject.c	2003-10-06 11:51:51.000000000 +0530
@@ -17,6 +17,9 @@
 #include <linux/module.h>
 #include <linux/stat.h>
 
+spinlock_t kobj_subsystem_lock = SPIN_LOCK_UNLOCKED;
+LIST_HEAD(kobj_subsystem_list);
+
 /**
  *	populate_dir - populate directory with attributes.
  *	@kobj:	object we're working on.
@@ -216,6 +219,12 @@ void kobject_init(struct kobject * kobj)
 	atomic_set(&kobj->refcount,1);
 	INIT_LIST_HEAD(&kobj->entry);
 	kobj->kset = kset_get(kobj->kset);
+	kobj->dentry = NULL;
+	init_rwsem(&kobj->k_rwsem);
+ 	INIT_LIST_HEAD(&kobj->k_children);
+	INIT_LIST_HEAD(&kobj->attr);
+	INIT_LIST_HEAD(&kobj->attr_group);
+	kobj->k_symlink = NULL;
 }
 
 
@@ -236,8 +245,12 @@ static void unlink(struct kobject * kobj
 		list_del_init(&kobj->entry);
 		up_write(&kobj->kset->subsys->rwsem);
 	}
-	if (kobj->parent) 
+	if (kobj->parent) {
+		down_write(&kobj->parent->k_rwsem);
+		list_del_init(&kobj->k_sibling);
+		up_write(&kobj->parent->k_rwsem);
 		kobject_put(kobj->parent);
+	}
 	kobject_put(kobj);
 }
 
@@ -273,6 +286,15 @@ int kobject_add(struct kobject * kobj)
 	}
 	kobj->parent = parent;
 
+	if (kobj->parent) {
+		down_write(&parent->k_rwsem);
+		list_add(&kobj->k_sibling, &kobj->parent->k_children);
+		up_write(&parent->k_rwsem);
+	}
+	else {
+		INIT_LIST_HEAD(&kobj->k_sibling);
+	}
+
 	error = create_dir(kobj);
 	if (error)
 		unlink(kobj);
@@ -443,11 +465,32 @@ void kobject_cleanup(struct kobject * ko
 {
 	struct kobj_type * t = get_ktype(kobj);
 	struct kset * s = kobj->kset;
+	struct list_head * tmp = kobj->attr.next;
 
 	pr_debug("kobject %s: cleaning up\n",kobject_name(kobj));
+
+	down_write(&kobj->k_rwsem);
 	if (kobj->k_name != kobj->name)
 		kfree(kobj->k_name);
 	kobj->k_name = NULL;
+ 
+	while (tmp != &kobj->attr) {
+		struct kobject_attr * k_attr;
+		k_attr = list_entry(tmp, struct kobject_attr, list);
+		tmp = tmp->next;
+		list_del(&k_attr->list);
+		kfree(k_attr);
+	}
+	tmp = kobj->attr_group.next;
+	while (tmp != &kobj->attr_group) {
+		struct kobject_attr_group * k_attr_group;
+		k_attr_group = list_entry(tmp, struct kobject_attr_group, list);
+		tmp = tmp->next;
+		list_del(&k_attr_group->list);
+		kfree(k_attr_group);
+	}
+	up_write(&kobj->k_rwsem);
+
 	if (t && t->release)
 		t->release(kobj);
 	if (s)
@@ -557,6 +600,7 @@ void subsystem_init(struct subsystem * s
 {
 	init_rwsem(&s->rwsem);
 	kset_init(&s->kset);
+	INIT_LIST_HEAD(&s->next);
 }
 
 /**
@@ -578,6 +622,9 @@ int subsystem_register(struct subsystem 
 	if (!(error = kset_add(&s->kset))) {
 		if (!s->kset.subsys)
 			s->kset.subsys = s;
+		spin_lock(&kobj_subsystem_lock);
+		list_add(&s->next, &kobj_subsystem_list);
+		spin_unlock(&kobj_subsystem_lock);
 	}
 	return error;
 }
@@ -585,6 +632,9 @@ int subsystem_register(struct subsystem 
 void subsystem_unregister(struct subsystem * s)
 {
 	pr_debug("subsystem %s: unregistering\n",s->kset.kobj.name);
+	spin_lock(&kobj_subsystem_lock);
+	list_del_init(&s->next);
+	spin_unlock(&kobj_subsystem_lock);
 	kset_unregister(&s->kset);
 }
 

_
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC 2/6] sysfs-mount.patch
  2003-10-06  9:00 ` [RFC 1/6] sysfs-kobject.patch Maneesh Soni
@ 2003-10-06  9:00   ` Maneesh Soni
  2003-10-06  9:01     ` [RFC 3/6] sysfs-file.patch Maneesh Soni
  2003-10-06 13:43     ` [RFC 2/6] sysfs-mount.patch viro
  2003-10-06 13:41   ` [RFC 1/6] sysfs-kobject.patch viro
  2003-10-06 16:16   ` Greg KH
  2 siblings, 2 replies; 33+ messages in thread
From: Maneesh Soni @ 2003-10-06  9:00 UTC (permalink / raw)
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma



o This patch provides mount related changes for sysfs backing store. Now, 
  sysfs_init() just registers the sysfs filesystem and do not mount it. 

o The global sysfs_mount indicates whether sysfs is mounted or not. For this
  we initialize it in fs/super.c:do_kern_mount(). It looks like more of a hack.

o We also have new dir_operations and dir_inode_operations structs for sysfs 
  directories including the root directory.

o sysfs inode links are obtained from sysfs_get_link_count().


 fs/super.c       |    5 +++++
 fs/sysfs/mount.c |   26 ++++++--------------------
 init/do_mounts.c |    1 +
 3 files changed, 12 insertions(+), 20 deletions(-)

diff -puN init/do_mounts.c~sysfs-mount init/do_mounts.c
--- linux-2.6.0-test6/init/do_mounts.c~sysfs-mount	2003-10-06 11:54:59.000000000 +0530
+++ linux-2.6.0-test6-maneesh/init/do_mounts.c	2003-10-06 11:55:14.000000000 +0530
@@ -189,6 +189,7 @@ done:
 	sys_umount("/sys", 0);
 out:
 	sys_rmdir("/sys");
+	sysfs_mount = NULL;
 	return res;
 fail:
 	res = 0;
diff -puN fs/super.c~sysfs-mount fs/super.c
--- linux-2.6.0-test6/fs/super.c~sysfs-mount	2003-10-06 11:55:03.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/super.c	2003-10-06 11:55:14.000000000 +0530
@@ -32,6 +32,7 @@
 #include <linux/security.h>
 #include <linux/vfs.h>
 #include <linux/writeback.h>		/* for the emergency remount stuff */
+#include <linux/sysfs.h>
 #include <asm/uaccess.h>
 
 
@@ -692,6 +693,10 @@ do_kern_mount(const char *fstype, int fl
 	mnt->mnt_mountpoint = sb->s_root;
 	mnt->mnt_parent = mnt;
 	up_write(&sb->s_umount);
+
+	if (!strcmp(fstype, "sysfs"))
+		sysfs_mount = mnt;
+
 	put_filesystem(type);
 	return mnt;
 out_sb:
diff -puN fs/sysfs/mount.c~sysfs-mount fs/sysfs/mount.c
--- linux-2.6.0-test6/fs/sysfs/mount.c~sysfs-mount	2003-10-06 11:55:07.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/mount.c	2003-10-06 11:55:14.000000000 +0530
@@ -2,8 +2,6 @@
  * mount.c - operations for initializing and mounting sysfs.
  */
 
-#define DEBUG 
-
 #include <linux/fs.h>
 #include <linux/mount.h>
 #include <linux/pagemap.h>
@@ -14,7 +12,7 @@
 /* Random magic number */
 #define SYSFS_MAGIC 0x62656572
 
-struct vfsmount *sysfs_mount;
+struct vfsmount *sysfs_mount = NULL;
 struct super_block * sysfs_sb = NULL;
 
 static struct super_operations sysfs_ops = {
@@ -35,10 +33,9 @@ static int sysfs_fill_super(struct super
 
 	inode = sysfs_new_inode(S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
 	if (inode) {
-		inode->i_op = &simple_dir_inode_operations;
-		inode->i_fop = &simple_dir_operations;
-		/* directory inodes start off with i_nlink == 2 (for "." entry) */
-		inode->i_nlink++;	
+		inode->i_op = &sysfs_dir_inode_operations;
+		inode->i_fop = &sysfs_dir_operations;
+		inode->i_nlink += sysfs_get_link_count(NULL);	
 	} else {
 		pr_debug("sysfs: could not get root inode\n");
 		return -ENOMEM;
@@ -63,21 +60,10 @@ static struct super_block *sysfs_get_sb(
 static struct file_system_type sysfs_fs_type = {
 	.name		= "sysfs",
 	.get_sb		= sysfs_get_sb,
-	.kill_sb	= kill_litter_super,
+	.kill_sb	= kill_anon_super,
 };
 
 int __init sysfs_init(void)
 {
-	int err;
-
-	err = register_filesystem(&sysfs_fs_type);
-	if (!err) {
-		sysfs_mount = kern_mount(&sysfs_fs_type);
-		if (IS_ERR(sysfs_mount)) {
-			printk(KERN_ERR "sysfs: could not mount!\n");
-			err = PTR_ERR(sysfs_mount);
-			sysfs_mount = NULL;
-		}
-	}
-	return err;
+	return register_filesystem(&sysfs_fs_type);
 }

_
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC 3/6] sysfs-file.patch
  2003-10-06  9:00   ` [RFC 2/6] sysfs-mount.patch Maneesh Soni
@ 2003-10-06  9:01     ` Maneesh Soni
  2003-10-06  9:01       ` [RFC 4/6] sysfs-symlink.patch Maneesh Soni
  2003-10-06 13:43     ` [RFC 2/6] sysfs-mount.patch viro
  1 sibling, 1 reply; 33+ messages in thread
From: Maneesh Soni @ 2003-10-06  9:01 UTC (permalink / raw)
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma




o  This patch modifies the externals sysfs_create_file and 
   sysfs_create_bin_file. Now these don't actually create files but just links
   the attribute to the kobject. There is no change in the definition but
   actual function. These should be actually part of kobject code but as of
   now for not changing the interfaces there are kept as part of sysfs code.
   


 fs/sysfs/bin.c  |   40 +++++++++++++++++-----------------------
 fs/sysfs/file.c |   45 +++++++++++++++++++--------------------------
 2 files changed, 36 insertions(+), 49 deletions(-)

diff -puN fs/sysfs/file.c~sysfs-file fs/sysfs/file.c
--- linux-2.6.0-test6/fs/sysfs/file.c~sysfs-file	2003-10-06 11:58:10.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/file.c	2003-10-06 11:58:50.000000000 +0530
@@ -11,7 +11,7 @@
 
 static struct file_operations sysfs_file_operations;
 
-static int init_file(struct inode * inode)
+int sysfs_init_file(struct inode * inode)
 {
 	inode->i_size = PAGE_SIZE;
 	inode->i_fop = &sysfs_file_operations;
@@ -345,27 +345,6 @@ static struct file_operations sysfs_file
 };
 
 
-int sysfs_add_file(struct dentry * dir, const struct attribute * attr)
-{
-	struct dentry * dentry;
-	int error;
-
-	down(&dir->d_inode->i_sem);
-	dentry = sysfs_get_dentry(dir,attr->name);
-	if (!IS_ERR(dentry)) {
-		error = sysfs_create(dentry,
-				     (attr->mode & S_IALLUGO) | S_IFREG,
-				     init_file);
-		if (!error)
-			dentry->d_fsdata = (void *)attr;
-		dput(dentry);
-	} else
-		error = PTR_ERR(dentry);
-	up(&dir->d_inode->i_sem);
-	return error;
-}
-
-
 /**
  *	sysfs_create_file - create an attribute file for an object.
  *	@kobj:	object we're creating for. 
@@ -374,11 +353,25 @@ int sysfs_add_file(struct dentry * dir, 
 
 int sysfs_create_file(struct kobject * kobj, const struct attribute * attr)
 {
-	if (kobj && attr)
-		return sysfs_add_file(kobj->dentry,attr);
-	return -EINVAL;
-}
+	struct kobject_attr * k_attr;
+	
+	if (!kobj || !attr)
+		return -EINVAL;
 
+	k_attr = kmalloc(sizeof(k_attr), GFP_KERNEL);
+	if (!k_attr)
+		return -ENOMEM;
+	memset(k_attr, 0, sizeof(k_attr));
+	INIT_LIST_HEAD(&k_attr->list);
+	k_attr->flags |= KOBJ_TEXT_ATTR;
+	k_attr->attr_u.attr = attr;
+
+	down_write(&kobj->k_rwsem);
+	list_add(&k_attr->list, &kobj->attr);
+	up_write(&kobj->k_rwsem);
+	
+	return 0;
+}
 
 /**
  * sysfs_update_file - update the modified timestamp on an object attribute.
diff -puN fs/sysfs/bin.c~sysfs-file fs/sysfs/bin.c
--- linux-2.6.0-test6/fs/sysfs/bin.c~sysfs-file	2003-10-06 11:58:13.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/bin.c	2003-10-06 11:58:50.000000000 +0530
@@ -9,6 +9,7 @@
 #include <linux/kobject.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/namei.h>
 
 #include <asm/uaccess.h>
 
@@ -131,7 +132,7 @@ static int release(struct inode * inode,
 	return 0;
 }
 
-static struct file_operations bin_fops = {
+struct file_operations bin_fops = {
 	.read		= read,
 	.write		= write,
 	.llseek		= generic_file_llseek,
@@ -148,31 +149,24 @@ static struct file_operations bin_fops =
 
 int sysfs_create_bin_file(struct kobject * kobj, struct bin_attribute * attr)
 {
-	struct dentry * dentry;
-	struct dentry * parent;
-	int error = 0;
+	struct kobject_attr * k_attr;
 
 	if (!kobj || !attr)
 		return -EINVAL;
-
-	parent = kobj->dentry;
-
-	down(&parent->d_inode->i_sem);
-	dentry = sysfs_get_dentry(parent,attr->attr.name);
-	if (!IS_ERR(dentry)) {
-		dentry->d_fsdata = (void *)attr;
-		error = sysfs_create(dentry,
-				     (attr->attr.mode & S_IALLUGO) | S_IFREG,
-				     NULL);
-		if (!error) {
-			dentry->d_inode->i_size = attr->size;
-			dentry->d_inode->i_fop = &bin_fops;
-		}
-		dput(dentry);
-	} else
-		error = PTR_ERR(dentry);
-	up(&parent->d_inode->i_sem);
-	return error;
+	
+	k_attr = kmalloc(sizeof(k_attr), GFP_KERNEL);
+	if (!k_attr)
+		return -ENOMEM;
+	memset(k_attr, 0, sizeof(k_attr));
+	INIT_LIST_HEAD(&k_attr->list);
+	k_attr->flags |= KOBJ_BINARY_ATTR;
+	k_attr->attr_u.bin_attr = attr;
+
+	down_write(&kobj->k_rwsem);
+	list_add(&k_attr->list, &kobj->attr);
+	up_write(&kobj->k_rwsem);
+	
+	return 0;
 }
 
 

_
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC 4/6] sysfs-symlink.patch
  2003-10-06  9:01     ` [RFC 3/6] sysfs-file.patch Maneesh Soni
@ 2003-10-06  9:01       ` Maneesh Soni
  2003-10-06  9:02         ` [RFC 5/6] sysfs-attr_group.patch Maneesh Soni
  0 siblings, 1 reply; 33+ messages in thread
From: Maneesh Soni @ 2003-10-06  9:01 UTC (permalink / raw)
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma



o This patch creates symlink kobject. We don't create the symlink in sysfs
  when sysfs_create_symlink() is called but just allocates one kobject to act
  as symlink. Again this code should be part of kobject code. But to keep the
  interfaces same it is in sysfs code.


 fs/sysfs/symlink.c |   41 +++++++++++++++++++++++++++++------------
 1 files changed, 29 insertions(+), 12 deletions(-)

diff -puN fs/sysfs/symlink.c~sysfs-symlink fs/sysfs/symlink.c
--- linux-2.6.0-test6/fs/sysfs/symlink.c~sysfs-symlink	2003-10-06 12:08:04.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/symlink.c	2003-10-06 12:14:43.000000000 +0530
@@ -15,7 +15,8 @@ static int init_symlink(struct inode * i
 	return 0;
 }
 
-static int sysfs_symlink(struct inode * dir, struct dentry *dentry, const char * symname)
+int sysfs_symlink(struct inode * dir, struct dentry *dentry, 
+		   const char * symname)
 {
 	int error;
 
@@ -71,13 +72,12 @@ static void fill_object_path(struct kobj
  */
 int sysfs_create_link(struct kobject * kobj, struct kobject * target, char * name)
 {
-	struct dentry * dentry = kobj->dentry;
-	struct dentry * d;
 	int error = 0;
 	int size;
 	int depth;
 	char * path;
 	char * s;
+	struct kobject * link;
 
 	depth = object_depth(kobj);
 	size = object_path_length(target) + depth * 3 - 1;
@@ -96,15 +96,19 @@ int sysfs_create_link(struct kobject * k
 	fill_object_path(target,path,size);
 	pr_debug("%s: path = '%s'\n",__FUNCTION__,path);
 
-	down(&dentry->d_inode->i_sem);
-	d = sysfs_get_dentry(dentry,name);
-	if (!IS_ERR(d))
-		error = sysfs_symlink(dentry->d_inode,d,path);
-	else
-		error = PTR_ERR(d);
-	dput(d);
-	up(&dentry->d_inode->i_sem);
-	kfree(path);
+	link = kmalloc(sizeof(struct kobject), GFP_KERNEL); 
+	if (!link) {
+		kfree(path);
+		return -ENOMEM;
+	}
+	memset(link, 0, sizeof(struct kobject));
+	kobject_init(link);
+	kobject_set_name(link, name);
+	link->k_symlink = path;
+	down_write(&kobj->k_rwsem);
+	list_add(&link->k_sibling, &kobj->k_children);
+	up_write(&kobj->k_rwsem);
+	link->dentry = NULL;
 	return error;
 }
 
@@ -117,6 +121,19 @@ int sysfs_create_link(struct kobject * k
 
 void sysfs_remove_link(struct kobject * kobj, char * name)
 {
+	struct list_head * tmp;
+
+	down_write(&kobj->k_rwsem);
+	tmp = kobj->k_sibling.next;
+	while (tmp != &kobj->k_sibling) {
+		struct kobject * k = list_entry(tmp, struct kobject, k_sibling);
+		tmp = tmp->next;
+		if (!strcmp(kobject_name(k), name) && (k->k_symlink)) {
+			kfree(k->k_symlink);
+			kobject_cleanup(k);
+		}
+	}
+	up_write(&kobj->k_rwsem);
 	sysfs_hash_and_remove(kobj->dentry,name);
 }
 

_
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC 5/6] sysfs-attr_group.patch
  2003-10-06  9:01       ` [RFC 4/6] sysfs-symlink.patch Maneesh Soni
@ 2003-10-06  9:02         ` Maneesh Soni
  2003-10-06  9:03           ` [RFC 6/6] sysfs-dir.patch Maneesh Soni
  0 siblings, 1 reply; 33+ messages in thread
From: Maneesh Soni @ 2003-10-06  9:02 UTC (permalink / raw)
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma




o This patch creates attribute group for the given kobject. Again here we don;t
  create the sysfs directory but just links the struct attribute_group to the
  kobject.


 fs/sysfs/group.c |   46 ++++++++++++++++------------------------------
 1 files changed, 16 insertions(+), 30 deletions(-)

diff -puN fs/sysfs/group.c~sysfs-attr_group fs/sysfs/group.c
--- linux-2.6.0-test6/fs/sysfs/group.c~sysfs-attr_group	2003-10-06 12:00:23.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/group.c	2003-10-06 12:00:35.000000000 +0530
@@ -24,40 +24,26 @@ static void remove_files(struct dentry *
 		sysfs_hash_and_remove(dir,(*attr)->name);
 }
 
-static int create_files(struct dentry * dir,
-			const struct attribute_group * grp)
-{
-	struct attribute *const* attr;
-	int error = 0;
-
-	for (attr = grp->attrs; *attr && !error; attr++) {
-		error = sysfs_add_file(dir,*attr);
-	}
-	if (error)
-		remove_files(dir,grp);
-	return error;
-}
-
-
 int sysfs_create_group(struct kobject * kobj, 
 		       const struct attribute_group * grp)
 {
-	struct dentry * dir;
-	int error;
+	struct kobject_attr_group * k_attr_group;
 
-	if (grp->name) {
-		error = sysfs_create_subdir(kobj,grp->name,&dir);
-		if (error)
-			return error;
-	} else
-		dir = kobj->dentry;
-	dir = dget(dir);
-	if ((error = create_files(dir,grp))) {
-		if (grp->name)
-			sysfs_remove_subdir(dir);
-		dput(dir);
-	}
-	return error;
+	if (!kobj || !grp)
+		return -EINVAL;
+
+	k_attr_group = kmalloc(sizeof(k_attr_group), GFP_KERNEL);
+	if (!k_attr_group)
+		return -ENOMEM;
+	memset(k_attr_group, 0, sizeof(k_attr_group));
+	INIT_LIST_HEAD(&k_attr_group->list);
+	k_attr_group->attr_group = grp;
+
+	down_write(&kobj->k_rwsem);
+	list_add(&k_attr_group->list, &kobj->attr_group);
+	up_write(&kobj->k_rwsem);
+	
+	return 0;
 }
 
 void sysfs_remove_group(struct kobject * kobj, 

_
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC 6/6] sysfs-dir.patch
  2003-10-06  9:02         ` [RFC 5/6] sysfs-attr_group.patch Maneesh Soni
@ 2003-10-06  9:03           ` Maneesh Soni
  0 siblings, 0 replies; 33+ messages in thread
From: Maneesh Soni @ 2003-10-06  9:03 UTC (permalink / raw)
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma



o This is the main part of the sysfs backing store patch set. It provides the
  lookup routine, open and release routines for sysfs directories. As we don't
  create sysfs entires in sysfs_create_xxxx() calls, we create the dentries 
  when we actually look for them in sysfs_lookup and sysfs_open_dir. We parse
  the kobject hierachy for the required object and if found we go ahead and 
  create the sysfs file or directory for it.


 fs/sysfs/dir.c        |  442 ++++++++++++++++++++++++++++++++++++++++++++------
 fs/sysfs/inode.c      |   65 ++++++-
 fs/sysfs/sysfs.h      |   17 +
 include/linux/sysfs.h |    1 
 4 files changed, 475 insertions(+), 50 deletions(-)

diff -puN fs/sysfs/inode.c~sysfs-dir fs/sysfs/inode.c
--- linux-2.6.0-test6/fs/sysfs/inode.c~sysfs-dir	2003-10-06 12:15:07.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/inode.c	2003-10-06 12:15:07.000000000 +0530
@@ -11,6 +11,8 @@
 #include <linux/pagemap.h>
 #include <linux/namei.h>
 #include <linux/backing-dev.h>
+#include "sysfs.h"
+
 extern struct super_block * sysfs_sb;
 
 static struct address_space_operations sysfs_aops = {
@@ -59,10 +61,9 @@ int sysfs_create(struct dentry * dentry,
  Proceed:
 	if (init)
 		error = init(inode);
-	if (!error) {
-		d_instantiate(dentry, inode);
-		dget(dentry); /* Extra count - pin the dentry in core */
-	} else
+	if (!error) 
+		d_add(dentry, inode);
+	else
 		iput(inode);
  Done:
 	return error;
@@ -73,6 +74,22 @@ int sysfs_mknod(struct inode *dir, struc
 	return sysfs_create(dentry, mode, NULL);
 }
 
+struct dentry * sysfs_get_new_dentry(struct dentry * parent, const char * name)
+{
+	struct qstr qstr;
+	struct dentry * dentry;
+	
+	qstr.name = name;
+	qstr.len = strlen(name);
+	qstr.hash = full_name_hash(name,qstr.len);
+	
+	dentry = d_alloc(parent, &qstr);
+	if (!dentry)
+		return ERR_PTR(-ENOMEM);
+
+	return dentry;
+}
+
 struct dentry * sysfs_get_dentry(struct dentry * parent, const char * name)
 {
 	struct qstr qstr;
@@ -87,6 +104,9 @@ void sysfs_hash_and_remove(struct dentry
 {
 	struct dentry * victim;
 
+	if (!dir)
+		return;
+
 	down(&dir->d_inode->i_sem);
 	victim = sysfs_get_dentry(dir,name);
 	if (!IS_ERR(victim)) {
@@ -107,4 +127,41 @@ void sysfs_hash_and_remove(struct dentry
 	up(&dir->d_inode->i_sem);
 }
 
+/* called under down_read(k_rwsem)
+ */
+int sysfs_get_link_count(struct dentry * dentry)
+{
+	int count = 1; /* all directory link count starts with 2 */
+	struct kobject * k, * kobj;
+	struct kobject_attr * k_attr;
+	struct kobject_attr_group *k_attr_grp;
+
+	if (!dentry) {
+		struct subsystem * s;
+		list_for_each_entry(s, &kobj_subsystem_list, next)
+			count++;
+		return count;
+	}
+	kobj = dentry->d_fsdata;
+	if (dentry->d_parent->d_fsdata == kobj) {
+		struct attribute_group * grp;
+		struct attribute **attr;
+
+		list_for_each_entry(k_attr_grp, &kobj->attr_group, list) {
+			if (!strcmp(k_attr_grp->attr_group->name, dentry->d_name.name)) 
+			break;
+		}
+		grp = k_attr_grp->attr_group;
+		for (attr = grp->attrs; *attr ; attr++) 
+			count++;
+		return count;
+	}
+	list_for_each_entry(k, &kobj->k_children, k_sibling)
+		count++;
+	list_for_each_entry(k_attr, &kobj->attr, list)
+		count++;
+	list_for_each_entry(k_attr_grp, &kobj->attr_group, list)
+		count++;
 
+	return count;
+}
diff -puN fs/sysfs/dir.c~sysfs-dir fs/sysfs/dir.c
--- linux-2.6.0-test6/fs/sysfs/dir.c~sysfs-dir	2003-10-06 12:15:07.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/dir.c	2003-10-06 12:15:22.000000000 +0530
@@ -8,47 +8,342 @@
 #include <linux/mount.h>
 #include <linux/module.h>
 #include <linux/kobject.h>
+#include <linux/namei.h>
 #include "sysfs.h"
 
+struct inode_operations sysfs_dir_inode_operations = {
+	.lookup		= sysfs_lookup,
+};
+
+struct file_operations sysfs_dir_operations = {
+	.open		= sysfs_dir_open,
+	.release	= sysfs_dir_close,
+	.llseek		= dcache_dir_lseek,
+	.read		= generic_read_dir,
+	.readdir	= dcache_readdir,
+};
+
 static int init_dir(struct inode * inode)
 {
-	inode->i_op = &simple_dir_inode_operations;
-	inode->i_fop = &simple_dir_operations;
+	inode->i_op = &sysfs_dir_inode_operations;
+	inode->i_fop = &sysfs_dir_operations;
 
-	/* directory inodes start off with i_nlink == 2 (for "." entry) */
-	inode->i_nlink++;
 	return 0;
 }
 
 
-static int create_dir(struct kobject * k, struct dentry * p,
-		      const char * n, struct dentry ** d)
+static struct dentry * __create_dir(struct kobject * k, struct dentry * dentry)
 {
 	int error;
+       
+	error = sysfs_create(dentry, S_IFDIR| S_IRWXU | S_IRUGO | S_IXUGO,
+				 init_dir);
+	if (!error) {
+		dentry->d_fsdata = k;
+		dentry->d_inode->i_nlink += sysfs_get_link_count(dentry);
+		return dentry;
+	}
+	dput(dentry);
+
+	return ERR_PTR(error);;
+}
+
+int kobject_dentry_exist(struct dentry * parent, const char * name)
+{
+	struct list_head * next;
+
+	if (!name)
+		return 1;
+
+	spin_lock(&dcache_lock);
+	next = parent->d_subdirs.next;
+	while (next != &parent->d_subdirs) {
+		struct dentry * d = list_entry(next, struct dentry, d_child);
+		if (!strcmp(d->d_name.name, name)) {
+			dget_locked(d);
+			spin_unlock(&dcache_lock);
+			return 1;
+		}
+		next = next->next;
+	}
+	spin_unlock(&dcache_lock);
 
-	down(&p->d_inode->i_sem);
-	*d = sysfs_get_dentry(p,n);
-	if (!IS_ERR(*d)) {
-		error = sysfs_create(*d,
-					 S_IFDIR| S_IRWXU | S_IRUGO | S_IXUGO,
-					 init_dir);
-		if (!error) {
-			(*d)->d_fsdata = k;
-			p->d_inode->i_nlink++;
-		}
-		dput(*d);
-	} else
-		error = PTR_ERR(*d);
-	up(&p->d_inode->i_sem);
-	return error;
+	return 0;
+	
 }
 
+static inline struct dentry * 
+create_one_kobject_dir(struct kobject *k, struct dentry * dentry)
+{
+	struct dentry * new = NULL;
+	int err = 0;
+
+	if (k->k_symlink) {
+		err = sysfs_symlink(dentry->d_parent->d_inode, dentry, 
+					k->k_symlink);
+		new = dentry;
+		if (err)
+			new = ERR_PTR(err);
+	} else {
+		new =  __create_dir(k, dentry);
+		if (!IS_ERR(new))
+			k->dentry = dentry;
+	}
+
+	return new;
+}
+
+static int create_kobject_dir(struct kobject * kobj, struct dentry * dentry)
+{
+	struct kobject * k;
+	const char * name;
+	struct dentry * p = kobj->dentry;
+	struct dentry * new = NULL;
+	int err = 0;
+
+	list_for_each_entry(k, &kobj->k_children, k_sibling) {
+		name = kobject_name(k);
+		if (dentry) {
+                	if (!strcmp(name, dentry->d_name.name)) {
+				new = create_one_kobject_dir(k, dentry);
+				break;
+			}
+		} else if (!(kobject_dentry_exist(p, name))) {
+				new = sysfs_get_new_dentry(p, name);
+				if (!IS_ERR(new))
+					new = create_one_kobject_dir(k, new);
+				if (IS_ERR(new))
+					break;
+			}
+	}
+	if (!new)
+		return -ENOENT;
+
+	if (IS_ERR(new))
+		err = PTR_ERR(new);
+
+	return err;
+}
+
+static int create_attr_group(struct kobject *kobj, struct dentry * dentry)
+{
+	struct kobject_attr_group * grp;
+	struct dentry * p = kobj->dentry;
+        struct dentry * new = NULL;
+	int err = 0;
+
+	list_for_each_entry(grp, &kobj->attr_group, list) {
+		if (dentry) {
+			const char * name = dentry->d_name.name;
+			if (!strcmp(grp->attr_group->name, name)) {
+				new =  __create_dir(kobj, dentry);
+				break;
+			}
+		} else if (!kobject_dentry_exist(p, grp->attr_group->name)) {
+			new = sysfs_get_new_dentry(p, grp->attr_group->name);
+			if (!IS_ERR(new)) 
+				new = __create_dir(kobj, new);
+			else
+				break;
+		}
+	}	
+	if (!new)
+		return -ENOENT;
+
+	if (IS_ERR(new))
+		err = PTR_ERR(new);
+	return err;
+}
+
+static inline struct dentry *
+create_one_attr_file(struct kobject_attr * k_attr, struct dentry * dentry)
+{
+	const struct attribute * attr = NULL;
+	int (* init) (struct inode *) = NULL;
+	int err = 0;
+
+	attr = kobject_attr(k_attr);
+
+	if (k_attr->flags & KOBJ_TEXT_ATTR) 
+		init = sysfs_init_file;
+	
+	err = sysfs_create(dentry, (attr->mode & S_IALLUGO) | S_IFREG, init);
+	if (err)
+		return ERR_PTR(err);
+
+	if (k_attr->flags & KOBJ_TEXT_ATTR) 
+		dentry->d_fsdata = (void *) attr;
+	else {
+		dentry->d_fsdata = (void *)k_attr->attr_u.bin_attr;
+		dentry->d_inode->i_size = k_attr->attr_u.bin_attr->size;
+		dentry->d_inode->i_fop = &bin_fops;
+	}
+	return dentry;
+}
+
+static int create_attr_file(struct kobject * kobj, struct dentry * dentry)
+{
+	struct kobject_attr * k_attr;
+	struct dentry * p = kobj->dentry;
+        struct dentry *new = NULL;
+	int err = 0;
+
+	list_for_each_entry(k_attr, &kobj->attr, list) {
+		char * name = kobject_attr(k_attr)->name;
+		if (dentry) {
+			if (!strcmp(name, dentry->d_name.name)) {
+				new = create_one_attr_file(k_attr, dentry);
+				break;
+			}	
+		} else if (!kobject_dentry_exist(p, name)) {
+			new = sysfs_get_new_dentry(p, name);
+			if (!IS_ERR(new))
+				new = create_one_attr_file(k_attr, new);
+			if (IS_ERR(new))
+				break;
+		}
+	}
+	if (!new)
+		return -ENOENT;
+
+	if (IS_ERR(new))
+		err = PTR_ERR(new);
+	return err;
+}
+
+static int create_attr_group_files(struct kobject * kobj, 
+					struct dentry * dentry, int all)
+{
+	struct dentry * new = NULL;
+	struct dentry * parent = (all) ? dentry : dentry->d_parent;
+	struct attribute *const* attr;
+	const struct attribute_group * grp = NULL;
+	struct kobject_attr_group * k_attr_grp;
+	struct kobject_attr k_attr;
+	int err = 0;
+
+	list_for_each_entry(k_attr_grp, &kobj->attr_group, list) {
+		if (!strcmp(k_attr_grp->attr_group->name, parent->d_name.name)) 
+			break;
+	}
+
+	grp = k_attr_grp->attr_group;
+	if (!grp)
+		return -ENOENT;
+
+	for (attr = grp->attrs; *attr && !err; attr++) {
+		k_attr.attr_u.attr = *attr,
+		k_attr.flags = KOBJ_TEXT_ATTR;
+		if (!all) {
+			if (!strcmp((*attr)->name, dentry->d_name.name)) {
+				new = create_one_attr_file(&k_attr, dentry);
+				break;
+			}
+		} else if (!kobject_dentry_exist(parent, (*attr)->name)) {
+			new = sysfs_get_new_dentry(parent, (*attr)->name);
+			if (!IS_ERR(new)) 
+				new = create_one_attr_file(&k_attr, new);
+			if (IS_ERR(new))
+				break;
+		}
+	}
+	if (!new)
+		return -ENOENT;
+
+	if (IS_ERR(new))
+		err = PTR_ERR(new);
+	return err;
+}
 
-int sysfs_create_subdir(struct kobject * k, const char * n, struct dentry ** d)
+static int create_subsystem_dir(struct dentry * dentry)
 {
-	return create_dir(k,k->dentry,n,d);
+	struct subsystem * s;
+	struct kobject * k;
+	struct dentry * p = sysfs_mount->mnt_root;
+	struct dentry * new = NULL;
+	const char * name = NULL;
+	int err = 0;
+	
+	spin_lock(&kobj_subsystem_lock);
+	list_for_each_entry(s, &kobj_subsystem_list, next) {
+		k = &s->kset.kobj;
+		if (dentry) {
+			name = dentry->d_name.name;
+			if (!strcmp(kobject_name(k), name)) { 
+				spin_unlock(&kobj_subsystem_lock);
+				new = __create_dir(k, dentry);
+				spin_lock(&kobj_subsystem_lock);
+				if (!IS_ERR(new)) 
+					k->dentry = dentry;
+				break;
+			}
+		} else {
+			name = kobject_name(k);
+			if (!kobject_dentry_exist(p, name) && !(k->parent)) {
+				spin_unlock(&kobj_subsystem_lock);
+				new = sysfs_get_new_dentry(p, kobject_name(k));
+				if (!IS_ERR(new)) {
+					new = __create_dir(k, new);
+					if (!IS_ERR(new))
+						k->dentry = new;
+				}
+				spin_lock(&kobj_subsystem_lock);
+				if (IS_ERR(new))
+					break;
+			}
+		}
+	}
+	spin_unlock(&kobj_subsystem_lock);
+	if (!new)
+		return -ENOENT;
+
+	if (IS_ERR(new))
+		err = PTR_ERR(new);
+	return err;
 }
 
+struct dentry *sysfs_lookup(struct inode *dir, struct dentry *dentry, 
+				struct nameidata *nd)
+{
+	struct dentry * parent = dentry->d_parent;
+	struct kobject *parent_kobj = NULL;
+	int err = 0;
+	
+	if (parent == sysfs_mount->mnt_root) {
+		err = create_subsystem_dir(dentry);
+		goto exit;
+	}
+
+	parent_kobj = kobject_get(parent->d_fsdata);
+	down_read(&parent_kobj->k_rwsem);
+
+	err = create_kobject_dir(parent_kobj, dentry);
+	if (err != -ENOENT)
+		goto exit_unlock;
+
+	err = create_attr_file(parent_kobj, dentry);
+	if (err != -ENOENT)
+		goto exit_unlock;
+
+	err = create_attr_group(parent_kobj, dentry);
+	if (err != -ENOENT)
+		goto exit_unlock;
+
+	if (parent->d_fsdata == parent->d_parent->d_fsdata) 
+		err = create_attr_group_files(parent_kobj, dentry, 0);
+
+exit_unlock:
+	up_read(&parent_kobj->k_rwsem);
+	kobject_put(parent_kobj);
+exit:
+	if (err == -ENOENT) {
+		d_add(dentry, NULL);
+		err = 0;
+	}
+
+	return ERR_PTR(err);
+  }
+
 /**
  *	sysfs_create_dir - create a directory for an object.
  *	@parent:	parent parent object.
@@ -57,24 +352,10 @@ int sysfs_create_subdir(struct kobject *
 
 int sysfs_create_dir(struct kobject * kobj)
 {
-	struct dentry * dentry = NULL;
-	struct dentry * parent;
-	int error = 0;
-
 	if (!kobj)
 		return -EINVAL;
 
-	if (kobj->parent)
-		parent = kobj->parent->dentry;
-	else if (sysfs_mount && sysfs_mount->mnt_sb)
-		parent = sysfs_mount->mnt_sb->s_root;
-	else
-		return -EFAULT;
-
-	error = create_dir(kobj,parent,kobject_name(kobj),&dentry);
-	if (!error)
-		kobj->dentry = dentry;
-	return error;
+	return 0;
 }
 
 
@@ -133,7 +414,6 @@ void sysfs_remove_dir(struct kobject * k
 			 * Unlink and unhash.
 			 */
 			spin_unlock(&dcache_lock);
-			d_delete(d);
 			simple_unlink(dentry->d_inode,d);
 			dput(d);
 			spin_lock(&dcache_lock);
@@ -162,14 +442,86 @@ void sysfs_rename_dir(struct kobject * k
 	if (!kobj->parent)
 		return;
 
-	parent = kobj->parent->dentry;
+	kobject_set_name(kobj,new_name);
+	if (sysfs_mount) {
+		parent = kobj->parent->dentry;
+		down(&parent->d_inode->i_sem);
+		new_dentry = sysfs_get_dentry(parent, new_name);
+		d_move(kobj->dentry, new_dentry);
+		up(&parent->d_inode->i_sem);	
+	}
+}
 
-	down(&parent->d_inode->i_sem);
 
-	new_dentry = sysfs_get_dentry(parent, new_name);
-	d_move(kobj->dentry, new_dentry);
-	kobject_set_name(kobj,new_name);
-	up(&parent->d_inode->i_sem);	
+
+int sysfs_open_dir_entries(struct dentry * parent)
+{
+	struct kobject * kobj;
+	int err = 0;
+	
+	if (parent == sysfs_mount->mnt_root) 
+		return create_subsystem_dir(NULL);
+
+	kobj = kobject_get(parent->d_fsdata);
+	down_read(&kobj->k_rwsem);
+
+	if (parent->d_fsdata == parent->d_parent->d_fsdata) {
+		err = create_attr_group_files(kobj, parent, 1);
+		goto exit_unlock;
+	}
+
+	err = create_kobject_dir(kobj, NULL);
+	if (err && (err != -ENOENT))
+		goto exit_unlock;
+
+	err = create_attr_file(kobj, NULL);
+	if (err && (err != -ENOENT))
+		goto exit_unlock;
+
+	err = create_attr_group(kobj, NULL);
+
+exit_unlock:
+	up_read(&kobj->k_rwsem);
+	kobject_put(kobj);
+	
+	return err;
+}
+
+
+int sysfs_dir_open(struct inode *inode, struct file *file)
+{
+	static struct qstr cursor_name = {.len = 1, .name = "."};
+	int err = 0;
+
+	err = sysfs_open_dir_entries(file->f_dentry);
+	if (err && (err != -ENOENT)) {
+		printk("err opening dir %s %d\n", file->f_dentry->d_name.name, err);
+		return err;
+	}
+
+	file->private_data = d_alloc(file->f_dentry, &cursor_name);
+
+	return file->private_data ? 0 : -ENOMEM;
+}
+
+int sysfs_dir_close(struct inode *inode, struct file *file)
+{
+	struct list_head * tmp;
+	struct dentry * parent = file->f_dentry;
+
+	spin_lock(&dcache_lock);
+	tmp = parent->d_subdirs.next;
+	while (tmp != &parent->d_subdirs) {
+		struct dentry * d = list_entry(tmp, struct dentry, d_child);
+
+		tmp = tmp->next;
+		spin_unlock(&dcache_lock);
+		dput(d);
+		spin_lock(&dcache_lock);
+	}
+	spin_unlock(&dcache_lock);
+	dput(file->private_data);
+	return 0;
 }
 
 EXPORT_SYMBOL(sysfs_create_dir);
diff -puN fs/sysfs/sysfs.h~sysfs-dir fs/sysfs/sysfs.h
--- linux-2.6.0-test6/fs/sysfs/sysfs.h~sysfs-dir	2003-10-06 12:15:07.000000000 +0530
+++ linux-2.6.0-test6-maneesh/fs/sysfs/sysfs.h	2003-10-06 12:15:07.000000000 +0530
@@ -1,3 +1,4 @@
+#include <linux/fs.h>
 
 extern struct vfsmount * sysfs_mount;
 
@@ -5,9 +6,23 @@ extern struct inode * sysfs_new_inode(mo
 extern int sysfs_create(struct dentry *, int mode, int (*init)(struct inode *));
 
 extern struct dentry * sysfs_get_dentry(struct dentry *, const char *);
+extern struct dentry * sysfs_get_new_dentry(struct dentry *, const char *);
 
-extern int sysfs_add_file(struct dentry * dir, const struct attribute * attr);
 extern void sysfs_hash_and_remove(struct dentry * dir, const char * name);
 
 extern int sysfs_create_subdir(struct kobject *, const char *, struct dentry **);
 extern void sysfs_remove_subdir(struct dentry *);
+extern int sysfs_get_link_count(struct dentry * dentry);
+
+extern int sysfs_init_file(struct inode * inode);
+extern int sysfs_symlink(struct inode *, struct dentry *, const char * );
+
+extern struct inode_operations sysfs_dir_inode_operations;
+extern struct file_operations sysfs_dir_operations;
+extern struct dentry_operations sysfs_dentry_operations;
+extern struct file_operations bin_fops;
+
+struct dentry * sysfs_lookup(struct inode *, struct dentry *, struct nameidata *);
+int sysfs_dir_open(struct inode *inode, struct file *file);
+int sysfs_dir_close(struct inode *inode, struct file *file);
+void sysfs_d_iput(struct dentry *, struct inode *);
diff -puN include/linux/sysfs.h~sysfs-dir include/linux/sysfs.h
--- linux-2.6.0-test6/include/linux/sysfs.h~sysfs-dir	2003-10-06 12:15:07.000000000 +0530
+++ linux-2.6.0-test6-maneesh/include/linux/sysfs.h	2003-10-06 12:15:07.000000000 +0530
@@ -57,6 +57,7 @@ sysfs_create_link(struct kobject * kobj,
 extern void
 sysfs_remove_link(struct kobject *, char * name);
 
+extern struct vfsmount * sysfs_mount;
 
 struct attribute_group {
 	char			* name;

_
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 1/6] sysfs-kobject.patch
  2003-10-06  9:00 ` [RFC 1/6] sysfs-kobject.patch Maneesh Soni
  2003-10-06  9:00   ` [RFC 2/6] sysfs-mount.patch Maneesh Soni
@ 2003-10-06 13:41   ` viro
  2003-10-06 16:16   ` Greg KH
  2 siblings, 0 replies; 33+ messages in thread
From: viro @ 2003-10-06 13:41 UTC (permalink / raw)
  To: Maneesh Soni; +Cc: Patrick Mochel, Greg KH, LKML, Dipankar Sarma

On Mon, Oct 06, 2003 at 02:30:03PM +0530, Maneesh Soni wrote:
> @@ -32,6 +32,12 @@ struct kobject {
>  	struct kset		* kset;
>  	struct kobj_type	* ktype;
>  	struct dentry		* dentry;
> + 	struct list_head	k_sibling;
> + 	struct list_head	k_children;
> +	struct list_head	attr;
> +	struct list_head	attr_group;
> +	struct rw_semaphore	k_rwsem;
> +	char 			*k_symlink;
>  };

Too bloated.  I suspect that we will be better off if we simply leave
current scheme for directories and have dentries allocated on demand
for everything else.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 2/6] sysfs-mount.patch
  2003-10-06  9:00   ` [RFC 2/6] sysfs-mount.patch Maneesh Soni
  2003-10-06  9:01     ` [RFC 3/6] sysfs-file.patch Maneesh Soni
@ 2003-10-06 13:43     ` viro
  2003-10-07  7:17       ` Maneesh Soni
  1 sibling, 1 reply; 33+ messages in thread
From: viro @ 2003-10-06 13:43 UTC (permalink / raw)
  To: Maneesh Soni; +Cc: Patrick Mochel, Greg KH, LKML, Dipankar Sarma

On Mon, Oct 06, 2003 at 02:30:30PM +0530, Maneesh Soni wrote:
> @@ -692,6 +693,10 @@ do_kern_mount(const char *fstype, int fl
>  	mnt->mnt_mountpoint = sb->s_root;
>  	mnt->mnt_parent = mnt;
>  	up_write(&sb->s_umount);
> +
> +	if (!strcmp(fstype, "sysfs"))
> +		sysfs_mount = mnt;
> +
>  	put_filesystem(type);
>  	return mnt;
>  out_sb:

That's too ugly for words.  Vetoed.  Sorry, but *that* will not fly.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06  8:59 [RFC 0/6] Backing Store for sysfs Maneesh Soni
  2003-10-06  9:00 ` [RFC 1/6] sysfs-kobject.patch Maneesh Soni
@ 2003-10-06 16:08 ` Greg KH
  2003-10-06 17:31   ` Dipankar Sarma
  2003-10-06 18:44 ` Patrick Mochel
  2 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2003-10-06 16:08 UTC (permalink / raw)
  To: Maneesh Soni; +Cc: Al Viro, Patrick Mochel, LKML, Dipankar Sarma

On Mon, Oct 06, 2003 at 02:29:15PM +0530, Maneesh Soni wrote:
> 
> 				2.6.0-test6		With patches.
> -----------------
> dentry_cache (active)		2520			2544
> inode_cache (active)		1058			1050
> LowFree			875032 KB		874748 KB

So with these patches we actually eat up more LowFree if all sysfs
entries are searched, and make the dentry_cache bigger?  That's not good :(

Remember, every kobject that's created will cause a call to
/sbin/hotplug which will cause udev to walk the sysfs tree to get the
information for that kobject.  So I don't see any savings in these
patches, do you?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 1/6] sysfs-kobject.patch
  2003-10-06  9:00 ` [RFC 1/6] sysfs-kobject.patch Maneesh Soni
  2003-10-06  9:00   ` [RFC 2/6] sysfs-mount.patch Maneesh Soni
  2003-10-06 13:41   ` [RFC 1/6] sysfs-kobject.patch viro
@ 2003-10-06 16:16   ` Greg KH
  2003-10-06 17:41     ` Dipankar Sarma
  2 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2003-10-06 16:16 UTC (permalink / raw)
  To: Maneesh Soni; +Cc: Al Viro, Patrick Mochel, LKML, Dipankar Sarma

On Mon, Oct 06, 2003 at 02:30:03PM +0530, Maneesh Soni wrote:
> diff -puN include/linux/kobject.h~sysfs-kobject include/linux/kobject.h
> --- linux-2.6.0-test6/include/linux/kobject.h~sysfs-kobject	2003-10-06 11:48:37.000000000 +0530
> +++ linux-2.6.0-test6-maneesh/include/linux/kobject.h	2003-10-06 11:48:51.000000000 +0530
> @@ -32,6 +32,12 @@ struct kobject {
>  	struct kset		* kset;
>  	struct kobj_type	* ktype;
>  	struct dentry		* dentry;
> + 	struct list_head	k_sibling;
> + 	struct list_head	k_children;
> +	struct list_head	attr;
> +	struct list_head	attr_group;
> +	struct rw_semaphore	k_rwsem;
> +	char 			*k_symlink;
>  };

Ouch.  Like Al said, this is too bloated.  Remember, not all kobjects
are registered for use in sysfs.  This makes the overhead for such
usages pretty high :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 16:08 ` [RFC 0/6] Backing Store for sysfs Greg KH
@ 2003-10-06 17:31   ` Dipankar Sarma
  2003-10-06 17:38     ` Greg KH
  0 siblings, 1 reply; 33+ messages in thread
From: Dipankar Sarma @ 2003-10-06 17:31 UTC (permalink / raw)
  To: Greg KH; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Mon, Oct 06, 2003 at 09:08:46AM -0700, Greg KH wrote:
> On Mon, Oct 06, 2003 at 02:29:15PM +0530, Maneesh Soni wrote:
> > 
> > 				2.6.0-test6		With patches.
> > -----------------
> > dentry_cache (active)		2520			2544
> > inode_cache (active)		1058			1050
> > LowFree			875032 KB		874748 KB
> 
> So with these patches we actually eat up more LowFree if all sysfs
> entries are searched, and make the dentry_cache bigger?  That's not good :(

My guess is that those 24 dentries are just noise. What we should
do is verify with a large number of devices if the numbers are all
that different after a walk of the sysfs tree.

> 
> Remember, every kobject that's created will cause a call to
> /sbin/hotplug which will cause udev to walk the sysfs tree to get the
> information for that kobject.  So I don't see any savings in these
> patches, do you?

Assuming that unused files/dirs are aged out of dentry and inode cache,
it should benefit. The numbers you should look at are -

--------------------------------------------------------
After mounting sysfs
-------------------
dentry_cache (active)           2350                    1321
inode_cache (active)            1058                    31
LowFree                         875096 KB               875836 KB
--------------------------------------------------------

That saves ~800KB. If you just mount sysfs and use a few files, you
aren't eating up dentries and inodes for every file in sysfs. How often 
do you expect hotplug events to happen in a system ? Some time after a 
hotplug event, dentries/inodes will get aged out and then you should see 
savings. It should greatly benefit in a normal system.

Now if the additional kobjects cause problems with userland hotplug, then 
that needs to be resolved. However that seems to be a different problem 
altogether. Could you please elaborate on that ?

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 17:31   ` Dipankar Sarma
@ 2003-10-06 17:38     ` Greg KH
  2003-10-06 18:01       ` Dipankar Sarma
  0 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2003-10-06 17:38 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Mon, Oct 06, 2003 at 11:01:11PM +0530, Dipankar Sarma wrote:
> On Mon, Oct 06, 2003 at 09:08:46AM -0700, Greg KH wrote:
> > On Mon, Oct 06, 2003 at 02:29:15PM +0530, Maneesh Soni wrote:
> > > 
> > > 				2.6.0-test6		With patches.
> > > -----------------
> > > dentry_cache (active)		2520			2544
> > > inode_cache (active)		1058			1050
> > > LowFree			875032 KB		874748 KB
> > 
> > So with these patches we actually eat up more LowFree if all sysfs
> > entries are searched, and make the dentry_cache bigger?  That's not good :(
> 
> My guess is that those 24 dentries are just noise. What we should
> do is verify with a large number of devices if the numbers are all
> that different after a walk of the sysfs tree.

Ok, a better test would be with a _lot_ of devices.  Care to test with a
lot of scsi debug devices?

> > Remember, every kobject that's created will cause a call to
> > /sbin/hotplug which will cause udev to walk the sysfs tree to get the
> > information for that kobject.  So I don't see any savings in these
> > patches, do you?
> 
> Assuming that unused files/dirs are aged out of dentry and inode cache,
> it should benefit. The numbers you should look at are -
> 
> --------------------------------------------------------
> After mounting sysfs
> -------------------
> dentry_cache (active)           2350                    1321
> inode_cache (active)            1058                    31
> LowFree                         875096 KB               875836 KB
> --------------------------------------------------------
> 
> That saves ~800KB. If you just mount sysfs and use a few files, you
> aren't eating up dentries and inodes for every file in sysfs. How often 
> do you expect hotplug events to happen in a system ?

Every kobject that is created and is associated with a subsystem
generates a hotplug call.  So that's about every kobject that we care
about here :)

> Some time after a hotplug event, dentries/inodes will get aged out and
> then you should see savings. It should greatly benefit in a normal
> system.

Can you show this happening?

> Now if the additional kobjects cause problems with userland hotplug, then 
> that needs to be resolved. However that seems to be a different problem 
> altogether. Could you please elaborate on that ?

No, I don't think the additional ones you have added will cause
problems, but can you verify this?  Just log all hotplug events
happening in your system (point /proc/sys/kernel/hotplug to a simple
logging program).

But again, I don't think the added overhead you have added to a kobject
is acceptable for not much gain for the normal case (systems without a
zillion devices.)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 1/6] sysfs-kobject.patch
  2003-10-06 16:16   ` Greg KH
@ 2003-10-06 17:41     ` Dipankar Sarma
  2003-10-06 17:44       ` Greg KH
  0 siblings, 1 reply; 33+ messages in thread
From: Dipankar Sarma @ 2003-10-06 17:41 UTC (permalink / raw)
  To: Greg KH; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Mon, Oct 06, 2003 at 09:16:40AM -0700, Greg KH wrote:
> On Mon, Oct 06, 2003 at 02:30:03PM +0530, Maneesh Soni wrote:
> > diff -puN include/linux/kobject.h~sysfs-kobject include/linux/kobject.h
> > --- linux-2.6.0-test6/include/linux/kobject.h~sysfs-kobject	2003-10-06 11:48:37.000000000 +0530
> > +++ linux-2.6.0-test6-maneesh/include/linux/kobject.h	2003-10-06 11:48:51.000000000 +0530
> > @@ -32,6 +32,12 @@ struct kobject {
> >  	struct kset		* kset;
> >  	struct kobj_type	* ktype;
> >  	struct dentry		* dentry;
> > + 	struct list_head	k_sibling;
> > + 	struct list_head	k_children;
> > +	struct list_head	attr;
> > +	struct list_head	attr_group;
> > +	struct rw_semaphore	k_rwsem;
> > +	char 			*k_symlink;
> >  };
> 
> Ouch.  Like Al said, this is too bloated.  Remember, not all kobjects

That is not what LowFree numbers after mounting sysfs says. Sure
you add some 48 bytes to kobject, but you are no longer pinning
256-byte(??) dentries and possibly bigger inodes for kobjects.
Doesn't that count ?

> are registered for use in sysfs.  This makes the overhead for such
> usages pretty high :(

The only way to confirm this is with numbers. Maneesh posted some
numbers that clearly show that memory usage is lower even with
the added fields in kobjects. Now, the question is are there
scenerios where kobject size increase in overhead. It will be nice
to have some numbers to demonstrate that.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 1/6] sysfs-kobject.patch
  2003-10-06 17:41     ` Dipankar Sarma
@ 2003-10-06 17:44       ` Greg KH
  0 siblings, 0 replies; 33+ messages in thread
From: Greg KH @ 2003-10-06 17:44 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Mon, Oct 06, 2003 at 11:11:08PM +0530, Dipankar Sarma wrote:
> On Mon, Oct 06, 2003 at 09:16:40AM -0700, Greg KH wrote:
> > On Mon, Oct 06, 2003 at 02:30:03PM +0530, Maneesh Soni wrote:
> > > diff -puN include/linux/kobject.h~sysfs-kobject include/linux/kobject.h
> > > --- linux-2.6.0-test6/include/linux/kobject.h~sysfs-kobject	2003-10-06 11:48:37.000000000 +0530
> > > +++ linux-2.6.0-test6-maneesh/include/linux/kobject.h	2003-10-06 11:48:51.000000000 +0530
> > > @@ -32,6 +32,12 @@ struct kobject {
> > >  	struct kset		* kset;
> > >  	struct kobj_type	* ktype;
> > >  	struct dentry		* dentry;
> > > + 	struct list_head	k_sibling;
> > > + 	struct list_head	k_children;
> > > +	struct list_head	attr;
> > > +	struct list_head	attr_group;
> > > +	struct rw_semaphore	k_rwsem;
> > > +	char 			*k_symlink;
> > >  };
> > 
> > Ouch.  Like Al said, this is too bloated.  Remember, not all kobjects
> 
> That is not what LowFree numbers after mounting sysfs says. Sure
> you add some 48 bytes to kobject, but you are no longer pinning
> 256-byte(??) dentries and possibly bigger inodes for kobjects.
> Doesn't that count ?

For kobjects that are not ever registered with sysfs?  Those kobjects
never create dentries, so yes, this is unneeded.  For an example of this
kind of usage, see the drivers/usb/serial/usb-serial.c file.  I know the
scsi people are also looking into using kobjects in this manner for some
of their internal structures.  

I would also like to use kobjects this way for all USB urbs, but they
are too heavy in their existing footprint for this to work out.  And by
adding this extra overhead, there's no way I could advocate using
kobjects for them in the future :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 17:38     ` Greg KH
@ 2003-10-06 18:01       ` Dipankar Sarma
  2003-10-06 18:09         ` Greg KH
  0 siblings, 1 reply; 33+ messages in thread
From: Dipankar Sarma @ 2003-10-06 18:01 UTC (permalink / raw)
  To: Greg KH; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Mon, Oct 06, 2003 at 10:38:58AM -0700, Greg KH wrote:
> On Mon, Oct 06, 2003 at 11:01:11PM +0530, Dipankar Sarma wrote:
> > My guess is that those 24 dentries are just noise. What we should
> > do is verify with a large number of devices if the numbers are all
> > that different after a walk of the sysfs tree.
> 
> Ok, a better test would be with a _lot_ of devices.  Care to test with a
> lot of scsi debug devices?

Sure. At the same time, as Maneesh pointed out, this is just an
RFC. The backing store design probably needs quite some work first.

> > --------------------------------------------------------
> > After mounting sysfs
> > -------------------
> > dentry_cache (active)           2350                    1321
> > inode_cache (active)            1058                    31
> > LowFree                         875096 KB               875836 KB
> > --------------------------------------------------------
> > 
> > That saves ~800KB. If you just mount sysfs and use a few files, you
> > aren't eating up dentries and inodes for every file in sysfs. How often 
> > do you expect hotplug events to happen in a system ?
> 
> Every kobject that is created and is associated with a subsystem
> generates a hotplug call.  So that's about every kobject that we care
> about here :)

That would not happen in a normal running system often, right ? So,
I don't see the point looking at mem usage after hotplug events.

> 
> > Some time after a hotplug event, dentries/inodes will get aged out and
> > then you should see savings. It should greatly benefit in a normal
> > system.
> 
> Can you show this happening?

It should be easy to demonstrate. That is how dentries/inodes
work for on-disk filesystems. If Maneesh's patch didn't work that
way, then the whole point is lost. I hope that is not the case.

> 
> > Now if the additional kobjects cause problems with userland hotplug, then 
> > that needs to be resolved. However that seems to be a different problem 
> > altogether. Could you please elaborate on that ?
> 
> No, I don't think the additional ones you have added will cause
> problems, but can you verify this?  Just log all hotplug events
> happening in your system (point /proc/sys/kernel/hotplug to a simple
> logging program).
> 
> But again, I don't think the added overhead you have added to a kobject
> is acceptable for not much gain for the normal case (systems without a
> zillion devices.)

IIRC, Maneesh test machine is a 2-way P4 xeon with six scsi disks and savings
are of about 800KB. That is as normal a case as it gets, I think.
It only gets better as you have more devices in your system.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 18:01       ` Dipankar Sarma
@ 2003-10-06 18:09         ` Greg KH
  2003-10-06 18:31           ` Dipankar Sarma
  0 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2003-10-06 18:09 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Mon, Oct 06, 2003 at 11:31:19PM +0530, Dipankar Sarma wrote:
> On Mon, Oct 06, 2003 at 10:38:58AM -0700, Greg KH wrote:
> > On Mon, Oct 06, 2003 at 11:01:11PM +0530, Dipankar Sarma wrote:
> > > --------------------------------------------------------
> > > After mounting sysfs
> > > -------------------
> > > dentry_cache (active)           2350                    1321
> > > inode_cache (active)            1058                    31
> > > LowFree                         875096 KB               875836 KB
> > > --------------------------------------------------------
> > > 
> > > That saves ~800KB. If you just mount sysfs and use a few files, you
> > > aren't eating up dentries and inodes for every file in sysfs. How often 
> > > do you expect hotplug events to happen in a system ?
> > 
> > Every kobject that is created and is associated with a subsystem
> > generates a hotplug call.  So that's about every kobject that we care
> > about here :)
> 
> That would not happen in a normal running system often, right ? So,
> I don't see the point looking at mem usage after hotplug events.

No.  My main point is that for every hotplug event (which is caused by a
kobject being created or destroyed), udev will run and look at the sysfs
entry for the kobject (by using libsysfs which reads in all of the
kobject information, including attributes).  This is a normal event, so
we have to care about what happens after running 'find' on the sysfs
tree as that is basically what will always happen.

Does that make more sense?  We can't just look at what happens with this
patch without actually accessing all of the sysfs tree, as that will be
the "normal" case.

> > > Some time after a hotplug event, dentries/inodes will get aged out and
> > > then you should see savings. It should greatly benefit in a normal
> > > system.
> > 
> > Can you show this happening?
> 
> It should be easy to demonstrate. That is how dentries/inodes
> work for on-disk filesystems. If Maneesh's patch didn't work that
> way, then the whole point is lost. I hope that is not the case.

Me too.  It's just that the free memory numbers didn't show much gain
with this patch on his system.  That worries me.

> > > Now if the additional kobjects cause problems with userland hotplug, then 
> > > that needs to be resolved. However that seems to be a different problem 
> > > altogether. Could you please elaborate on that ?
> > 
> > No, I don't think the additional ones you have added will cause
> > problems, but can you verify this?  Just log all hotplug events
> > happening in your system (point /proc/sys/kernel/hotplug to a simple
> > logging program).
> > 
> > But again, I don't think the added overhead you have added to a kobject
> > is acceptable for not much gain for the normal case (systems without a
> > zillion devices.)
> 
> IIRC, Maneesh test machine is a 2-way P4 xeon with six scsi disks and savings
> are of about 800KB. That is as normal a case as it gets, I think.
> It only gets better as you have more devices in your system.

800Kb after running find?  I don't see that :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 18:09         ` Greg KH
@ 2003-10-06 18:31           ` Dipankar Sarma
  2003-10-06 18:34             ` Greg KH
  0 siblings, 1 reply; 33+ messages in thread
From: Dipankar Sarma @ 2003-10-06 18:31 UTC (permalink / raw)
  To: Greg KH; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Mon, Oct 06, 2003 at 11:09:07AM -0700, Greg KH wrote:
> On Mon, Oct 06, 2003 at 11:31:19PM +0530, Dipankar Sarma wrote:
> No.  My main point is that for every hotplug event (which is caused by a
> kobject being created or destroyed), udev will run and look at the sysfs
> entry for the kobject (by using libsysfs which reads in all of the
> kobject information, including attributes).  This is a normal event, so
> we have to care about what happens after running 'find' on the sysfs
> tree as that is basically what will always happen.
> 
> Does that make more sense?  We can't just look at what happens with this
> patch without actually accessing all of the sysfs tree, as that will be
> the "normal" case.

That sounds odd. So, udev essentially results in a frequent and continuous
"find /sys" ? That doesn't sound good. You are unnecessarily adding
pressure on vfs (dcache specially). We will discuss this offline then
and see what needs to be done.

> > > Can you show this happening?
> > 
> > It should be easy to demonstrate. That is how dentries/inodes
> > work for on-disk filesystems. If Maneesh's patch didn't work that
> > way, then the whole point is lost. I hope that is not the case.
> 
> Me too.  It's just that the free memory numbers didn't show much gain
> with this patch on his system.  That worries me.

Well, Maneesh didn't post numbers after letting the system age out
sysfs dentries/inodes. Maneesh can you post some such numbers ?


> > > But again, I don't think the added overhead you have added to a kobject
> > > is acceptable for not much gain for the normal case (systems without a
> > > zillion devices.)
> > 
> > IIRC, Maneesh test machine is a 2-way P4 xeon with six scsi disks and savings
> > are of about 800KB. That is as normal a case as it gets, I think.
> > It only gets better as you have more devices in your system.
> 
> 800Kb after running find?  I don't see that :)

No, those numbers were for just mounting sysfs. More numbers tomorrow.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 18:31           ` Dipankar Sarma
@ 2003-10-06 18:34             ` Greg KH
  2003-10-07  9:08               ` Andreas Jellinghaus
  0 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2003-10-06 18:34 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Maneesh Soni, Al Viro, Patrick Mochel, LKML

On Tue, Oct 07, 2003 at 12:01:32AM +0530, Dipankar Sarma wrote:
> On Mon, Oct 06, 2003 at 11:09:07AM -0700, Greg KH wrote:
> > On Mon, Oct 06, 2003 at 11:31:19PM +0530, Dipankar Sarma wrote:
> > No.  My main point is that for every hotplug event (which is caused by a
> > kobject being created or destroyed), udev will run and look at the sysfs
> > entry for the kobject (by using libsysfs which reads in all of the
> > kobject information, including attributes).  This is a normal event, so
> > we have to care about what happens after running 'find' on the sysfs
> > tree as that is basically what will always happen.
> > 
> > Does that make more sense?  We can't just look at what happens with this
> > patch without actually accessing all of the sysfs tree, as that will be
> > the "normal" case.
> 
> That sounds odd. So, udev essentially results in a frequent and continuous
> "find /sys" ? That doesn't sound good. You are unnecessarily adding
> pressure on vfs (dcache specially). We will discuss this offline then
> and see what needs to be done.

No, not a 'find', we look up the kobject that was added, and its
attributes.  Doing a 'find' will emulate this for your tests, that's
all.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06  8:59 [RFC 0/6] Backing Store for sysfs Maneesh Soni
  2003-10-06  9:00 ` [RFC 1/6] sysfs-kobject.patch Maneesh Soni
  2003-10-06 16:08 ` [RFC 0/6] Backing Store for sysfs Greg KH
@ 2003-10-06 18:44 ` Patrick Mochel
  2003-10-06 19:27   ` Dipankar Sarma
  2 siblings, 1 reply; 33+ messages in thread
From: Patrick Mochel @ 2003-10-06 18:44 UTC (permalink / raw)
  To: Maneesh Soni; +Cc: Al Viro, Greg KH, LKML, Dipankar Sarma


> The following patch set(mailed separately) provides a prototype for a backing 
> store mechanism for sysfs. Currently sysfs pins all its dentries and inodes in 
> memory there by wasting kernel lowmem even when it is not mounted. 
> 
> With this patch set we create sysfs dentry whenever it is required like 
> other real filesystems and, age and free it as per the dcache rules. We
> now save significant amount of Lowmem by avoiding un-necessary pinning. 
> The following numbers were on a 2-way system with 6 disks and 2 NICs with 
> about 1028 dentries. The numbers are just indicative as they are system
> wide collected from /proc and are not exclusively for sysfs.

No thanks. 

First off, I'm not philosophically opposed to the concept of reducing 
sysfs and kobject memory usage. I think it can be gracefully done, though 
I don't think this is quite the solution, and I don't have one myself.. 

Now, you would really only problems when you have a large number of
devices and a limited amount of a Lowmem. I.e. it's only a problem on
large systems with 32-bit processors. And, the traditional arguments
against this situation is to a) use and promote 64-bit platforms and b)
that if you have that many devices, you (or your customers) can surely
afford enough memory to make the sysfs footprint a non-issue.

Concerning the patch, I really don't like it. I look at the kobject and 
sysfs code with the assumption in my mind that the objects are already too 
large and the code more complex than it should be. Adding to this is not 
the right approach, just as a general rule of thumb. 

Also, I don't think that increasing the co-dependency bewteen the kobject
and sysfs hierarchies is the right thing to do. They each have one pointer
back to the corresponding location in the other tree, which is about as
lightweight as you can get. Adding more only increases bloat for kobjects 
that are not represented in sysfs, and increases the total overhead of the 
entire system. 

As I said before, I don't know the right solution, but the directions to 
look in are related to attribute groups. Attributes definitely consume the 
most amount of memory (as opposed to the kobject hierachy), so delaying 
their creation would help, hopefully without making the interface too 
awkward. 

You can also use the assumption that an attribute group exists for all the 
kobjects in a kset, and that a kobject knows what kset it belongs to. And
that eventually, all attributes should be added as part of an attribute 
group..


	Pat




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 18:44 ` Patrick Mochel
@ 2003-10-06 19:27   ` Dipankar Sarma
  2003-10-06 19:30     ` viro
  2003-10-06 19:33     ` Patrick Mochel
  0 siblings, 2 replies; 33+ messages in thread
From: Dipankar Sarma @ 2003-10-06 19:27 UTC (permalink / raw)
  To: Patrick Mochel; +Cc: Maneesh Soni, Al Viro, Greg KH, LKML

On Mon, Oct 06, 2003 at 11:44:14AM -0700, Patrick Mochel wrote:
> First off, I'm not philosophically opposed to the concept of reducing 
> sysfs and kobject memory usage. I think it can be gracefully done, though 
> I don't think this is quite the solution, and I don't have one myself.. 

Let's look at it this way - unless you find a way to save sizeof(struct dentry)
+ sizeof(struct inode) in every kobject/attr etc., there is no
way you can beat ageing of dentries/inodes.

> Now, you would really only problems when you have a large number of
> devices and a limited amount of a Lowmem. I.e. it's only a problem on
> large systems with 32-bit processors. And, the traditional arguments
> against this situation is to a) use and promote 64-bit platforms and b)
> that if you have that many devices, you (or your customers) can surely
> afford enough memory to make the sysfs footprint a non-issue.

That is not a very realistic argument, ia32 customers will likely
run systems with large number of disks and will have lowmem problem.
Besides that think about the added complexity of lookups due to
all those pinned dentries forever residing in dentry hash table.

> 
> Concerning the patch, I really don't like it. I look at the kobject and 
> sysfs code with the assumption in my mind that the objects are already too 
> large and the code more complex than it should be. Adding to this is not 
> the right approach, just as a general rule of thumb. 
> 
> Also, I don't think that increasing the co-dependency bewteen the kobject
> and sysfs hierarchies is the right thing to do. They each have one pointer
> back to the corresponding location in the other tree, which is about as
> lightweight as you can get. Adding more only increases bloat for kobjects 
> that are not represented in sysfs, and increases the total overhead of the 
> entire system. 

I don't see how you can claim that the total overhead of the entire
system is high. See Christian's numbers. The point here is pretty
straightforward -

sysfs currently uses dentries to represent filesystem hierarchy.
We want to create the dentries on the fly and age them out.
So, we can no longer use dentries to represent filesystem hierarchy.
Now, *something* has to represent the actual filesystem
hierarchy, so that dentries/inodes can be created on a lookup
miss based on that. So, what do you do here ? kobject and
its associates already represent most of the information necessary
for a backing store. Maneesh just added a little more to complete
what is equivalent of a on-disk filesystem. This allows vfs to
create dentries and inodes on the fly and age later on. Granted
that there are probably ugliness in the design to be sorted out
and it may have taken kobjects in a slightly different direction
than earlier, but it is not that odd when you look at it
from the VFS point of view.

> 
> You can also use the assumption that an attribute group exists for all the 
> kobjects in a kset, and that a kobject knows what kset it belongs to. And
> that eventually, all attributes should be added as part of an attribute 
> group..

As I said before, no matter how much you save on kobjects and attrs,
I can't see how you can account for ageing of dentries and inodes.
Please look at it from the VFS angle and see if there is a better
way to represent kobjects/attrs in order to create dentries/inodes
on demand and age later.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 19:27   ` Dipankar Sarma
@ 2003-10-06 19:30     ` viro
  2003-10-06 20:01       ` Dipankar Sarma
  2003-10-07  4:47       ` Maneesh Soni
  2003-10-06 19:33     ` Patrick Mochel
  1 sibling, 2 replies; 33+ messages in thread
From: viro @ 2003-10-06 19:30 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Patrick Mochel, Maneesh Soni, Greg KH, LKML

On Tue, Oct 07, 2003 at 12:57:13AM +0530, Dipankar Sarma wrote:
 
> Let's look at it this way - unless you find a way to save sizeof(struct dentry)
> + sizeof(struct inode) in every kobject/attr etc., there is no
> way you can beat ageing of dentries/inodes.
 
> sysfs currently uses dentries to represent filesystem hierarchy.
> We want to create the dentries on the fly and age them out.
> So, we can no longer use dentries to represent filesystem hierarchy.
> Now, *something* has to represent the actual filesystem
> hierarchy, so that dentries/inodes can be created on a lookup
> miss based on that. So, what do you do here ? kobject and
> its associates already represent most of the information necessary
> for a backing store. Maneesh just added a little more to complete
> what is equivalent of a on-disk filesystem. This allows vfs to
> create dentries and inodes on the fly and age later on. Granted
> that there are probably ugliness in the design to be sorted out
> and it may have taken kobjects in a slightly different direction
> than earlier, but it is not that odd when you look at it
> from the VFS point of view.

Rot.  First of all, *not* *all* *kobjects* *are* *in* *sysfs*.  And these
are pure loss in your case.

What's more important, for leaves of the sysfs tree your overhead is also
a loss - we don't need to pin dentry down for them even with current sysfs
design.   And that can be done with minimal code changes and no data changes
at all.  Your patch will have to be more attractive than that.  What's the
expected ratio of directories to non-directories in sysfs?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 19:27   ` Dipankar Sarma
  2003-10-06 19:30     ` viro
@ 2003-10-06 19:33     ` Patrick Mochel
  2003-10-06 20:26       ` Dipankar Sarma
  1 sibling, 1 reply; 33+ messages in thread
From: Patrick Mochel @ 2003-10-06 19:33 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Maneesh Soni, Al Viro, Greg KH, LKML


> That is not a very realistic argument, ia32 customers will likely
> run systems with large number of disks and will have lowmem problem.

It's not a realistic requirement for me to solve your customer problems. 
:) I've been involved in this argument before, and the arguments have been 
the same, pretty much along party lines of IBM vs. Everyone else. I'm not 
here to point fingers, but you must heed the fact that we've been here 
before. 

> Besides that think about the added complexity of lookups due to
> all those pinned dentries forever residing in dentry hash table.

Well, along with more memory and more devices, I would expect your 
customers to also be paying for the fastest processors. :) 

> I don't see how you can claim that the total overhead of the entire
> system is high. See Christian's numbers. The point here is pretty
> straightforward -

Uh, let's recap: 

" This ended up increasing the size of kobject from 52 bytes to 108 bytes
but saving one dentry and inode per kobject. " 

Under one usage model, you've saved memory, however under the case where 
kobjects are used and not represnted in sysfs, you've more than doubled 
the overhead, and you've increased the total overhead in the worst case - 
when all dentries are looked up and pinned. 

> sysfs currently uses dentries to represent filesystem hierarchy.
> We want to create the dentries on the fly and age them out.
> So, we can no longer use dentries to represent filesystem hierarchy.
> Now, *something* has to represent the actual filesystem
> hierarchy, so that dentries/inodes can be created on a lookup
> miss based on that. So, what do you do here ? kobject and
> its associates already represent most of the information necessary
> for a backing store. 

I understand what you're trying to do, and I say it's the wrong approach. 
You're overloading kobjects in a manner unintended, and in a way that is 
not welcome. I do not have an alternative solution, but my last email gave 
some hints of where to look. Don't get bitter because I disagree. 

> > You can also use the assumption that an attribute group exists for all the 
> > kobjects in a kset, and that a kobject knows what kset it belongs to. And
> > that eventually, all attributes should be added as part of an attribute 
> > group..
> 
> As I said before, no matter how much you save on kobjects and attrs,
> I can't see how you can account for ageing of dentries and inodes.
> Please look at it from the VFS angle and see if there is a better
> way to represent kobjects/attrs in order to create dentries/inodes
> on demand and age later.

That's what I told you, only reversed - try again. The patch posted in 
unacceptable, though I'm willing to look at alternatives. I don't have or 
see a problem with the current situation, so your arguments are going to 
have to be a bit stronger. 

Thanks,


	Pat



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 19:30     ` viro
@ 2003-10-06 20:01       ` Dipankar Sarma
  2003-10-06 20:34         ` viro
  2003-10-07  4:47       ` Maneesh Soni
  1 sibling, 1 reply; 33+ messages in thread
From: Dipankar Sarma @ 2003-10-06 20:01 UTC (permalink / raw)
  To: viro; +Cc: Patrick Mochel, Maneesh Soni, Greg KH, LKML

On Mon, Oct 06, 2003 at 08:30:50PM +0100, viro@parcelfarce.linux.theplanet.co.uk wrote:
> On Tue, Oct 07, 2003 at 12:57:13AM +0530, Dipankar Sarma wrote:
> > sysfs currently uses dentries to represent filesystem hierarchy.
> > We want to create the dentries on the fly and age them out.
> > So, we can no longer use dentries to represent filesystem hierarchy.
> > Now, *something* has to represent the actual filesystem
> > hierarchy, so that dentries/inodes can be created on a lookup
> > miss based on that. So, what do you do here ? kobject and
> > its associates already represent most of the information necessary
> > for a backing store. Maneesh just added a little more to complete
> > what is equivalent of a on-disk filesystem. This allows vfs to
> > create dentries and inodes on the fly and age later on. Granted
> > that there are probably ugliness in the design to be sorted out
> > and it may have taken kobjects in a slightly different direction
> > than earlier, but it is not that odd when you look at it
> > from the VFS point of view.
> 
> Rot.  First of all, *not* *all* *kobjects* *are* *in* *sysfs*.  And these
> are pure loss in your case.

gregkh pointed out this as well and that is why I said that Maneesh's
patch may have taken kobjects in a different direction than what
it was intended earlier. I don't disagree with this and it may
very well be that dentry ageing will have to be done differently.

> 
> What's more important, for leaves of the sysfs tree your overhead is also
> a loss - we don't need to pin dentry down for them even with current sysfs
> design.   And that can be done with minimal code changes and no data changes
> at all.  Your patch will have to be more attractive than that.  What's the
> expected ratio of directories to non-directories in sysfs?

ISTR, a large number of files in sysfs are attributes which are leaves.
So, keeping a kobject tree partially connected using dentries as backing 
store as opposed to having everything connected might just be enough.
It will be looked into.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 19:33     ` Patrick Mochel
@ 2003-10-06 20:26       ` Dipankar Sarma
  2003-10-06 20:29         ` Patrick Mochel
  0 siblings, 1 reply; 33+ messages in thread
From: Dipankar Sarma @ 2003-10-06 20:26 UTC (permalink / raw)
  To: Patrick Mochel; +Cc: Maneesh Soni, Al Viro, Greg KH, LKML

On Mon, Oct 06, 2003 at 12:33:19PM -0700, Patrick Mochel wrote:
> It's not a realistic requirement for me to solve your customer problems. 
> :) I've been involved in this argument before, and the arguments have been 
> the same, pretty much along party lines of IBM vs. Everyone else. I'm not 
> here to point fingers, but you must heed the fact that we've been here 
> before. 

Well, I didn't mention the c-word, Pat, you did :-) I would much
rather help figure out the best possible way to implement dentry/inode
ageing in sysfs.

> > Besides that think about the added complexity of lookups due to
> > all those pinned dentries forever residing in dentry hash table.
> 
> Well, along with more memory and more devices, I would expect your 
> customers to also be paying for the fastest processors. :) 

Again, more than customers, it is a question of DTRT.

> > sysfs currently uses dentries to represent filesystem hierarchy.
> > We want to create the dentries on the fly and age them out.
> > So, we can no longer use dentries to represent filesystem hierarchy.
> > Now, *something* has to represent the actual filesystem
> > hierarchy, so that dentries/inodes can be created on a lookup
> > miss based on that. So, what do you do here ? kobject and
> > its associates already represent most of the information necessary
> > for a backing store. 
> 
> I understand what you're trying to do, and I say it's the wrong approach. 
> You're overloading kobjects in a manner unintended, and in a way that is 
> not welcome. I do not have an alternative solution, but my last email gave 
> some hints of where to look. Don't get bitter because I disagree. 

The overloading kobject argument is much better. Gregkh has also
indicated that non-sysfs kobjects will increase. That definitely
puts things in a different perspective. Fair enough.

> > > You can also use the assumption that an attribute group exists for all the 
> > > kobjects in a kset, and that a kobject knows what kset it belongs to. And
> > > that eventually, all attributes should be added as part of an attribute 
> > > group..
> > 
> > As I said before, no matter how much you save on kobjects and attrs,
> > I can't see how you can account for ageing of dentries and inodes.
> > Please look at it from the VFS angle and see if there is a better
> > way to represent kobjects/attrs in order to create dentries/inodes
> > on demand and age later.
> 
> That's what I told you, only reversed - try again. The patch posted in 
> unacceptable, though I'm willing to look at alternatives. I don't have or 

Viro's suggestion of pinning the non-leaf dentries only seems like
a very good first alternative to try out.

> see a problem with the current situation, so your arguments are going to 
> have to be a bit stronger. 

By not pinning dentries, you save several hundreds of KBs of lowmem
in a common case low-end system with six disks, much reduced number of dentries
in the hash table and huge savings in large systems. I would hope that
is a good argument. Granted you don't like Maneesh's patch as it is now,
but those things will change as more feedbacks come in.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 20:26       ` Dipankar Sarma
@ 2003-10-06 20:29         ` Patrick Mochel
  2003-10-07  4:31           ` Maneesh Soni
  0 siblings, 1 reply; 33+ messages in thread
From: Patrick Mochel @ 2003-10-06 20:29 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Maneesh Soni, Al Viro, Greg KH, LKML


> > That's what I told you, only reversed - try again. The patch posted in 
> > unacceptable, though I'm willing to look at alternatives. I don't have or 
> 
> Viro's suggestion of pinning the non-leaf dentries only seems like
> a very good first alternative to try out.

Uh, that's about the same thing I suggested, though probably not as 
concisely: 

"As I said before, I don't know the right solution, but the directions to 
look in are related to attribute groups. Attributes definitely consume the 
most amount of memory (as opposed to the kobject hierachy), so delaying 
their creation would help, hopefully without making the interface too 
awkward. 

You can also use the assumption that an attribute group exists for all the 
kobjects in a kset, and that a kobject knows what kset it belongs to. And
that eventually, all attributes should be added as part of an attribute 
group.."

Attributes are the leaf entries, and they don't need to always exist. But, 
you have easy access to them via the attribute groups of the ksets the 
kobjects belong to. 

> > see a problem with the current situation, so your arguments are going to 
> > have to be a bit stronger. 
> 
> By not pinning dentries, you save several hundreds of KBs of lowmem
> in a common case low-end system with six disks, much reduced number of dentries
> in the hash table and huge savings in large systems. I would hope that
> is a good argument. Granted you don't like Maneesh's patch as it is now,
> but those things will change as more feedbacks come in.

A low-end system has six disks? I don't think so. Maybe a low-end server,
but of the dozen or so computers that I own, not one has six disks. Call 
me a techno-wimp, but I think your perspective is still a bit skewed. 

I understand your argument, but I still fail to see evidence that it's
really a problem. Perhaps you could characterize it a bit more and
convince us that sysfs overhead is really taking up a significant
percentage of the total overhead while running a set of common
applications with lots of open files (which should also be putting 
pressure on the same caches..) 

Thanks,


	Pat


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 20:01       ` Dipankar Sarma
@ 2003-10-06 20:34         ` viro
  0 siblings, 0 replies; 33+ messages in thread
From: viro @ 2003-10-06 20:34 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Patrick Mochel, Maneesh Soni, Greg KH, LKML

On Tue, Oct 07, 2003 at 01:31:10AM +0530, Dipankar Sarma wrote:
> > What's more important, for leaves of the sysfs tree your overhead is also
> > a loss - we don't need to pin dentry down for them even with current sysfs
> > design.   And that can be done with minimal code changes and no data changes
> > at all.  Your patch will have to be more attractive than that.  What's the
> > expected ratio of directories to non-directories in sysfs?
> 
> ISTR, a large number of files in sysfs are attributes which are leaves.
> So, keeping a kobject tree partially connected using dentries as backing 
> store as opposed to having everything connected might just be enough.
> It will be looked into.

Note that main reason why sysfs uses ramfs model is that it gets good
interaction with VFS locking for free - it just uses ->i_sem of associated
inodes for tree protection and that gives us all we need.  Very nice,
but it means that we need these associated inodes.  And since operations
are done deep in tree, we don't want to walk all the way from root, bringing
them in-core.

However, having them for all nodes is an overkill - if we keep them only
for non-leaves, we get all the benefits of ramfs approach with less overhead.
Indeed, even if argument of sysfs operation is a leaf node (and I'm not sure
that we actually have such beasts), we can always take the parent node and
be done with that.

All we need is
	a) ->lookup() that would look for an attribute (all directories are
in cache, so if there's no attribute with such name and ->lookup() had been
called, we'd need to return negative anyway).
	b) sysfs code slightly modified in several places - mostly,
sysfs_get_dentry() callers.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 20:29         ` Patrick Mochel
@ 2003-10-07  4:31           ` Maneesh Soni
  2003-10-07  5:25             ` Nick Piggin
  0 siblings, 1 reply; 33+ messages in thread
From: Maneesh Soni @ 2003-10-07  4:31 UTC (permalink / raw)
  To: Patrick Mochel; +Cc: Dipankar Sarma, Al Viro, Greg KH, LKML

On Mon, Oct 06, 2003 at 01:29:20PM -0700, Patrick Mochel wrote:
> 
> Uh, that's about the same thing I suggested, though probably not as 
> concisely: 
> 
> "As I said before, I don't know the right solution, but the directions to 
> look in are related to attribute groups. Attributes definitely consume the 
> most amount of memory (as opposed to the kobject hierachy), so delaying 
> their creation would help, hopefully without making the interface too 
> awkward. 

Ok.. attributes do consume maximum in sysfs. In the system I mentioned
leaf dentries are about 65% of the total.

> You can also use the assumption that an attribute group exists for all the 
> kobjects in a kset, and that a kobject knows what kset it belongs to. And

That's not correct... kobject corresponding to /sys/block/hda/queue 
doesnot know which kset it belongs to and what are its attributes. Same
for /sys/block/hda/queue/iosched.

> that eventually, all attributes should be added as part of an attribute 
> group.."
> 
> Attributes are the leaf entries, and they don't need to always exist. But, 
> you have easy access to them via the attribute groups of the ksets the 
> kobjects belong to. 
> 

Having backing store just for leaf dentries should be fine. But there is 
_no_ easy access for attributes. For this also I see some data change required 
as of now. The reasons are 
 - not all kobjects belong to a kset. For example, /sys/block/hda/queue
 - not all ksets have attribute groups
  
I don't see any generic rule for finding attributes or attribute group
of a kobject. Such random-ness forced me to add new fields to kobject. The
sysfs picture doesnot show the kset-kobject relationship. For example
kobject corresponding /sys/devices/system does not belong to devices_subsystem.
and it is not in the devices_subsys->list. There was no other way except to
build new hierarchy info in the kobject. 

What are people's opinion about the way I have linked attributes and
attributes_group to the kobject. I could not link "struct attribute" and
"struct attriubte_group" directly to kobject because these are generally 
statically alocated and many kobjects will have the same attribute structure.
and are asigned to multiple kobjects 

Thanks
Maneesh
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 19:30     ` viro
  2003-10-06 20:01       ` Dipankar Sarma
@ 2003-10-07  4:47       ` Maneesh Soni
  1 sibling, 0 replies; 33+ messages in thread
From: Maneesh Soni @ 2003-10-07  4:47 UTC (permalink / raw)
  To: viro; +Cc: Dipankar Sarma, Patrick Mochel, Greg KH, LKML

On Mon, Oct 06, 2003 at 08:30:50PM +0100, viro@parcelfarce.linux.theplanet.co.uk wrote:
> What's more important, for leaves of the sysfs tree your overhead is also
> a loss - we don't need to pin dentry down for them even with current sysfs
> design.   And that can be done with minimal code changes and no data changes
> at all.  Your patch will have to be more attractive than that.  What's the
> expected ratio of directories to non-directories in sysfs?

Current sysfs / kobject design _require_ that dentries for the leaves to be
present all the times. There is simply no generic way to find attributes
of a kobject. As of now it uses dentry->d_fsdata to reach to the attribute.

In my system leaves are around 65% of the total.

Thanks
Maneesh

-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-07  4:31           ` Maneesh Soni
@ 2003-10-07  5:25             ` Nick Piggin
  2003-10-07  7:17               ` Maneesh Soni
  0 siblings, 1 reply; 33+ messages in thread
From: Nick Piggin @ 2003-10-07  5:25 UTC (permalink / raw)
  To: maneesh; +Cc: Patrick Mochel, Dipankar Sarma, Al Viro, Greg KH, LKML



Maneesh Soni wrote:

>On Mon, Oct 06, 2003 at 01:29:20PM -0700, Patrick Mochel wrote:
>
>>Uh, that's about the same thing I suggested, though probably not as 
>>concisely: 
>>
>>"As I said before, I don't know the right solution, but the directions to 
>>look in are related to attribute groups. Attributes definitely consume the 
>>most amount of memory (as opposed to the kobject hierachy), so delaying 
>>their creation would help, hopefully without making the interface too 
>>awkward. 
>>
>
>Ok.. attributes do consume maximum in sysfs. In the system I mentioned
>leaf dentries are about 65% of the total.
>
>
>>You can also use the assumption that an attribute group exists for all the 
>>kobjects in a kset, and that a kobject knows what kset it belongs to. And
>>
>
>That's not correct... kobject corresponding to /sys/block/hda/queue 
>doesnot know which kset it belongs to and what are its attributes. Same
>for /sys/block/hda/queue/iosched.
>
>
>>that eventually, all attributes should be added as part of an attribute 
>>group.."
>>
>>Attributes are the leaf entries, and they don't need to always exist. But, 
>>you have easy access to them via the attribute groups of the ksets the 
>>kobjects belong to. 
>>
>>
>
>Having backing store just for leaf dentries should be fine. But there is 
>_no_ easy access for attributes. For this also I see some data change required 
>as of now. The reasons are 
> - not all kobjects belong to a kset. For example, /sys/block/hda/queue
> - not all ksets have attribute groups
>  
>

queue and iosched might not be good examples as they are somewhat broken
wrt the block device scheme. Possibly they will be put in their own kset,
with /sys/block/hda/queue symlinked to them.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 2/6] sysfs-mount.patch
  2003-10-06 13:43     ` [RFC 2/6] sysfs-mount.patch viro
@ 2003-10-07  7:17       ` Maneesh Soni
  0 siblings, 0 replies; 33+ messages in thread
From: Maneesh Soni @ 2003-10-07  7:17 UTC (permalink / raw)
  To: viro; +Cc: Patrick Mochel, Greg KH, LKML, Dipankar Sarma

On Mon, Oct 06, 2003 at 02:43:20PM +0100, viro@parcelfarce.linux.theplanet.co.uk wrote:
> On Mon, Oct 06, 2003 at 02:30:30PM +0530, Maneesh Soni wrote:
> > @@ -692,6 +693,10 @@ do_kern_mount(const char *fstype, int fl
> >  	mnt->mnt_mountpoint = sb->s_root;
> >  	mnt->mnt_parent = mnt;
> >  	up_write(&sb->s_umount);
> > +
> > +	if (!strcmp(fstype, "sysfs"))
> > +		sysfs_mount = mnt;
> > +
> >  	put_filesystem(type);
> >  	return mnt;
> >  out_sb:
> 
> That's too ugly for words.  Vetoed.  Sorry, but *that* will not fly.

Yeah.. that was not meant to fly.. and I didnot pushed the patches on runway
and to any of the Air Traffic Controllers..(tree maintainers) :-).. 

I will get rid of sysfs_mount.

Maneesh
-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-07  5:25             ` Nick Piggin
@ 2003-10-07  7:17               ` Maneesh Soni
  0 siblings, 0 replies; 33+ messages in thread
From: Maneesh Soni @ 2003-10-07  7:17 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Patrick Mochel, Dipankar Sarma, Al Viro, Greg KH, LKML

On Tue, Oct 07, 2003 at 03:25:58PM +1000, Nick Piggin wrote:
> 
> 
> >
> >Having backing store just for leaf dentries should be fine. But there is 
> >_no_ easy access for attributes. For this also I see some data change 
> >required as of now. The reasons are 
> >- not all kobjects belong to a kset. For example, /sys/block/hda/queue
> >- not all ksets have attribute groups
> > 
> >
> 
> queue and iosched might not be good examples as they are somewhat broken
> wrt the block device scheme. Possibly they will be put in their own kset,
> with /sys/block/hda/queue symlinked to them.
> 

Well here is more crap then...

kobjects corresponding to /sys/class/tty/* and /sys/class/net/* have the
same kset (i.e class_obj) but totally different attributes. There is no
way to find the attributes given a kobject belonging to lets say 
/sys/class/net/eth0 except through the hierarchy maintained in pinned sysfs 
dentries.


-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC 0/6] Backing Store for sysfs
  2003-10-06 18:34             ` Greg KH
@ 2003-10-07  9:08               ` Andreas Jellinghaus
  0 siblings, 0 replies; 33+ messages in thread
From: Andreas Jellinghaus @ 2003-10-07  9:08 UTC (permalink / raw)
  To: linux-kernel

> No, not a 'find', we look up the kobject that was added, and its
> attributes.  Doing a 'find' will emulate this for your tests, that's
> all.

But coldplugging will more or less do a "find /sys" to get a list of
all existing devices and add those to /dev. So expect a "find /sys" 
to be run at least once in the early boot process. Coldplugging is
not yet implemented, but it's possible to simulate it with a bit
of shell scripting.

Andreas


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2003-10-07  9:08 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-10-06  8:59 [RFC 0/6] Backing Store for sysfs Maneesh Soni
2003-10-06  9:00 ` [RFC 1/6] sysfs-kobject.patch Maneesh Soni
2003-10-06  9:00   ` [RFC 2/6] sysfs-mount.patch Maneesh Soni
2003-10-06  9:01     ` [RFC 3/6] sysfs-file.patch Maneesh Soni
2003-10-06  9:01       ` [RFC 4/6] sysfs-symlink.patch Maneesh Soni
2003-10-06  9:02         ` [RFC 5/6] sysfs-attr_group.patch Maneesh Soni
2003-10-06  9:03           ` [RFC 6/6] sysfs-dir.patch Maneesh Soni
2003-10-06 13:43     ` [RFC 2/6] sysfs-mount.patch viro
2003-10-07  7:17       ` Maneesh Soni
2003-10-06 13:41   ` [RFC 1/6] sysfs-kobject.patch viro
2003-10-06 16:16   ` Greg KH
2003-10-06 17:41     ` Dipankar Sarma
2003-10-06 17:44       ` Greg KH
2003-10-06 16:08 ` [RFC 0/6] Backing Store for sysfs Greg KH
2003-10-06 17:31   ` Dipankar Sarma
2003-10-06 17:38     ` Greg KH
2003-10-06 18:01       ` Dipankar Sarma
2003-10-06 18:09         ` Greg KH
2003-10-06 18:31           ` Dipankar Sarma
2003-10-06 18:34             ` Greg KH
2003-10-07  9:08               ` Andreas Jellinghaus
2003-10-06 18:44 ` Patrick Mochel
2003-10-06 19:27   ` Dipankar Sarma
2003-10-06 19:30     ` viro
2003-10-06 20:01       ` Dipankar Sarma
2003-10-06 20:34         ` viro
2003-10-07  4:47       ` Maneesh Soni
2003-10-06 19:33     ` Patrick Mochel
2003-10-06 20:26       ` Dipankar Sarma
2003-10-06 20:29         ` Patrick Mochel
2003-10-07  4:31           ` Maneesh Soni
2003-10-07  5:25             ` Nick Piggin
2003-10-07  7:17               ` Maneesh Soni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox