public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET] sysfs: implement sysfs_remove()
@ 2013-09-18 21:15 Tejun Heo
  2013-09-18 21:15 ` [PATCH 1/4] sysfs: remove sysfs_addrm_cxt->parent_sd Tejun Heo
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Tejun Heo @ 2013-09-18 21:15 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm

Hello,

Currently, there are multiple variants of internal sysfs removal
functions and the directory removal behavior is a bit weird in that
while it does remove the files contained immediately in the directory
it wouldn't recurse into its subdirectories, even the group ones which
belong to the same kobject.

This patchset restructures sysfs removal path so that there is single
internal removal function which is fully recursive.  This will be the
only removal interface in the planned new interface based on
sysfs_dirents instead of kobjects.

This patchset contains the following four patches.

 0001-sysfs-remove-sysfs_addrm_cxt-parent_sd.patch
 0002-kobject-grab-an-extra-reference-on-kobject-sd-to-all.patch
 0003-sysfs-make-__sysfs_remove_dir-recursive.patch
 0004-sysfs-introduce-__-sysfs_remove.patch

The patches are on top of

  linus#master c2d95729e3 ("Merge branch 'akpm' (patches from Andrew Morton)")
+ [1] [PATCHSET] sysfs: disentangle kobject namespace handling from sysfs

and available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-sysfs-recursive-rm

 fs/sysfs/dir.c     |  161 ++++++++++++++++++++++++++++++++++++-----------------
 fs/sysfs/file.c    |    4 -
 fs/sysfs/group.c   |    4 -
 fs/sysfs/inode.c   |    4 -
 fs/sysfs/symlink.c |    6 -
 fs/sysfs/sysfs.h   |   14 ++--
 lib/kobject.c      |   12 +++
 7 files changed, 139 insertions(+), 66 deletions(-)

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1560372

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] sysfs: remove sysfs_addrm_cxt->parent_sd
  2013-09-18 21:15 [PATCHSET] sysfs: implement sysfs_remove() Tejun Heo
@ 2013-09-18 21:15 ` Tejun Heo
  2013-09-18 21:15 ` [PATCH 2/4] kobject: grab an extra reference on kobject->sd to allow duplicate deletes Tejun Heo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-09-18 21:15 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, Tejun Heo

sysfs_addrm_start/finish() enclose sysfs_dirent additions and
deletions and sysfs_addrm_cxt is used to record information necessary
to finish the operations.  Currently, sysfs_addrm_start() takes
@parent_sd, records it in sysfs_addrm_cxt, and assumes that all
operations in the block are performed under that @parent_sd.

This assumption has been fine until now but we want to make some
operations behave recursively and, while having @parent_sd recorded in
sysfs_addrm_cxt doesn't necessarily prevents that, it becomes
confusing.

This patch removes sysfs_addrm_cxt->parent_sd and makes
sysfs_add_one() take an explicit @parent_sd parameter.  Note that
sysfs_remove_one() doesn't need the extra argument as its parent is
always known from the target @sd.

While at it, add __acquires/releases() notations to
sysfs_addrm_start/finish() respectively.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c     | 52 +++++++++++++++++++++++++++-------------------------
 fs/sysfs/file.c    |  4 ++--
 fs/sysfs/inode.c   |  2 +-
 fs/sysfs/symlink.c |  6 +++---
 fs/sysfs/sysfs.h   | 10 +++++-----
 5 files changed, 38 insertions(+), 36 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index d23e66d..6718689 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -406,22 +406,19 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 /**
  *	sysfs_addrm_start - prepare for sysfs_dirent add/remove
  *	@acxt: pointer to sysfs_addrm_cxt to be used
- *	@parent_sd: parent sysfs_dirent
  *
- *	This function is called when the caller is about to add or
- *	remove sysfs_dirent under @parent_sd.  This function acquires
- *	sysfs_mutex.  @acxt is used to keep and pass context to
- *	other addrm functions.
+ *	This function is called when the caller is about to add or remove
+ *	sysfs_dirent.  This function acquires sysfs_mutex.  @acxt is used
+ *	to keep and pass context to other addrm functions.
  *
  *	LOCKING:
  *	Kernel thread context (may sleep).  sysfs_mutex is locked on
  *	return.
  */
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd)
+void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt)
+	__acquires(sysfs_mutex)
 {
 	memset(acxt, 0, sizeof(*acxt));
-	acxt->parent_sd = parent_sd;
 
 	mutex_lock(&sysfs_mutex);
 }
@@ -430,10 +427,11 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
  *	__sysfs_add_one - add sysfs_dirent to parent without warning
  *	@acxt: addrm context to use
  *	@sd: sysfs_dirent to be added
+ *	@parent_sd: the parent sysfs_dirent to add @sd to
  *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
- *	nlink of parent inode if @sd is a directory and link into the
- *	children list of the parent.
+ *	Get @parent_sd and set @sd->s_parent to it and increment nlink of
+ *	the parent inode if @sd is a directory and link into the children
+ *	list of the parent.
  *
  *	This function should be called between calls to
  *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
@@ -446,20 +444,21 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
  *	0 on success, -EEXIST if entry with the given name already
  *	exists.
  */
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
+		    struct sysfs_dirent *parent_sd)
 {
 	struct sysfs_inode_attrs *ps_iattr;
 	int ret;
 
 	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
-	sd->s_parent = sysfs_get(acxt->parent_sd);
+	sd->s_parent = sysfs_get(parent_sd);
 
 	ret = sysfs_link_sibling(sd);
 	if (ret)
 		return ret;
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = parent_sd->s_iattr;
 	if (ps_iattr) {
 		struct iattr *ps_iattrs = &ps_iattr->ia_iattr;
 		ps_iattrs->ia_ctime = ps_iattrs->ia_mtime = CURRENT_TIME;
@@ -493,10 +492,11 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  *	sysfs_add_one - add sysfs_dirent to parent
  *	@acxt: addrm context to use
  *	@sd: sysfs_dirent to be added
+ *	@parent_sd: the parent sysfs_dirent to add @sd to
  *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
- *	nlink of parent inode if @sd is a directory and link into the
- *	children list of the parent.
+ *	Get @parent_sd and set @sd->s_parent to it and increment nlink of
+ *	the parent inode if @sd is a directory and link into the children
+ *	list of the parent.
  *
  *	This function should be called between calls to
  *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
@@ -509,17 +509,18 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  *	0 on success, -EEXIST if entry with the given name already
  *	exists.
  */
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
+		  struct sysfs_dirent *parent_sd)
 {
 	int ret;
 
-	ret = __sysfs_add_one(acxt, sd);
+	ret = __sysfs_add_one(acxt, sd, parent_sd);
 	if (ret == -EEXIST) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
 		     (path == NULL) ? sd->s_name
-				    : (sysfs_pathname(acxt->parent_sd, path),
+				    : (sysfs_pathname(parent_sd, path),
 				       strlcat(path, "/", PATH_MAX),
 				       strlcat(path, sd->s_name, PATH_MAX),
 				       path));
@@ -553,7 +554,7 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 	sysfs_unlink_sibling(sd);
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = sd->s_parent->s_iattr;
 	if (ps_iattr) {
 		struct iattr *ps_iattrs = &ps_iattr->ia_iattr;
 		ps_iattrs->ia_ctime = ps_iattrs->ia_mtime = CURRENT_TIME;
@@ -576,6 +577,7 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
  *	sysfs_mutex is released.
  */
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
+	__releases(sysfs_mutex)
 {
 	/* release resources acquired by sysfs_addrm_start() */
 	mutex_unlock(&sysfs_mutex);
@@ -678,8 +680,8 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 	sd->s_dir.kobj = kobj;
 
 	/* link in */
-	sysfs_addrm_start(&acxt, parent_sd);
-	rc = sysfs_add_one(&acxt, sd);
+	sysfs_addrm_start(&acxt);
+	rc = sysfs_add_one(&acxt, sd, parent_sd);
 	sysfs_addrm_finish(&acxt);
 
 	if (rc == 0)
@@ -772,7 +774,7 @@ static void remove_dir(struct sysfs_dirent *sd)
 {
 	struct sysfs_addrm_cxt acxt;
 
-	sysfs_addrm_start(&acxt, sd->s_parent);
+	sysfs_addrm_start(&acxt);
 	sysfs_remove_one(&acxt, sd);
 	sysfs_addrm_finish(&acxt);
 }
@@ -792,7 +794,7 @@ static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 		return;
 
 	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
-	sysfs_addrm_start(&acxt, dir_sd);
+	sysfs_addrm_start(&acxt);
 	pos = rb_first(&dir_sd->s_dir.children);
 	while (pos) {
 		struct sysfs_dirent *sd = to_sysfs_dirent(pos);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 4697019..1656a79 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -502,8 +502,8 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 	sd->s_attr.attr = (void *)attr;
 	sysfs_dirent_init_lockdep(sd);
 
-	sysfs_addrm_start(&acxt, dir_sd);
-	rc = sysfs_add_one(&acxt, sd);
+	sysfs_addrm_start(&acxt);
+	rc = sysfs_add_one(&acxt, sd, dir_sd);
 	sysfs_addrm_finish(&acxt);
 
 	if (rc)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 07193d7..364c887 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -326,7 +326,7 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
 		return -ENOENT;
 	}
 
-	sysfs_addrm_start(&acxt, dir_sd);
+	sysfs_addrm_start(&acxt);
 
 	sd = sysfs_find_dirent(dir_sd, name, ns);
 	if (sd)
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 88c8bc5..22ea2f5 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -53,11 +53,11 @@ static int sysfs_do_create_link_sd(struct sysfs_dirent *parent_sd,
 	sd->s_symlink.target_sd = target_sd;
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
-	sysfs_addrm_start(&acxt, parent_sd);
+	sysfs_addrm_start(&acxt);
 	if (warn)
-		error = sysfs_add_one(&acxt, sd);
+		error = sysfs_add_one(&acxt, sd, parent_sd);
 	else
-		error = __sysfs_add_one(&acxt, sd);
+		error = __sysfs_add_one(&acxt, sd, parent_sd);
 	sysfs_addrm_finish(&acxt);
 
 	if (error)
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index ee44fde..4d11544 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -120,7 +120,6 @@ do {								\
  * Context structure to be used while adding/removing nodes.
  */
 struct sysfs_addrm_cxt {
-	struct sysfs_dirent	*parent_sd;
 	struct sysfs_dirent	*removed;
 };
 
@@ -154,10 +153,11 @@ extern const struct inode_operations sysfs_dir_inode_operations;
 struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd);
 struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
 void sysfs_put_active(struct sysfs_dirent *sd);
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd);
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
+void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt);
+int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
+		    struct sysfs_dirent *parent_sd);
+int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
+		  struct sysfs_dirent *parent_sd);
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] kobject: grab an extra reference on kobject->sd to allow duplicate deletes
  2013-09-18 21:15 [PATCHSET] sysfs: implement sysfs_remove() Tejun Heo
  2013-09-18 21:15 ` [PATCH 1/4] sysfs: remove sysfs_addrm_cxt->parent_sd Tejun Heo
@ 2013-09-18 21:15 ` Tejun Heo
  2013-09-18 21:15 ` [PATCH 3/4] sysfs: make __sysfs_remove_dir() recursive Tejun Heo
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-09-18 21:15 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, Tejun Heo

sysfs currently has a rather weird behavior regarding removals.  A
directory removal would delete all files directly under it but
wouldn't recurse into subdirectories, which, while a bit inconsistent,
seems to make sense at the first glance as each directory is
supposedly associated with a kobject and each kobject can take care of
the directory deletion; however, this doesn't really hold as we have
groups which can be directories without a kobject associated with it
and require explicit deletions.

We're in the process of separating out sysfs from kboject / driver
core and want a consistent behavior.  A removal should delete either
only the specified node or everything under it.  I think it is helpful
to support recursive atomic removal and later patches will implement
it.

Such change means that a sysfs_dirent associated with kobject may be
deleted before the kobject itself is removed if one of its ancestor
gets removed before it.  As sysfs_remove_dir() puts the base ref, we
may end up with dangling pointer on descendants.  This can be solved
by holding an extra reference on the sd from kobject.

Acquire an extra reference on the associated sysfs_dirent on directory
creation and put it after removal.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c |  7 ++++++-
 lib/kobject.c  | 12 ++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 6718689..105a7e2 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -549,7 +549,12 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
 	struct sysfs_inode_attrs *ps_iattr;
 
-	BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);
+	/*
+	 * Removal can be called multiple times on the same node.  Only the
+	 * first invocation is effective and puts the base ref.
+	 */
+	if (sd->s_flags & SYSFS_FLAG_REMOVED)
+		return;
 
 	sysfs_unlink_sibling(sd);
 
diff --git a/lib/kobject.c b/lib/kobject.c
index e769ee3..aa42d7d 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -75,6 +75,13 @@ static int create_dir(struct kobject *kobj)
 		if (error)
 			sysfs_remove_dir(kobj);
 	}
+
+	/*
+	 * @kobj->sd may be deleted by an ancestor going away.  Hold an
+	 * extra reference so that it stays until @kobj is gone.
+	 */
+	sysfs_get(kobj->sd);
+
 	return error;
 }
 
@@ -531,10 +538,15 @@ out:
  */
 void kobject_del(struct kobject *kobj)
 {
+	struct sysfs_dirent *sd;
+
 	if (!kobj)
 		return;
 
+	sd = kobj->sd;
 	sysfs_remove_dir(kobj);
+	sysfs_put(sd);
+
 	kobj->state_in_sysfs = 0;
 	kobj_kset_leave(kobj);
 	kobject_put(kobj->parent);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] sysfs: make __sysfs_remove_dir() recursive
  2013-09-18 21:15 [PATCHSET] sysfs: implement sysfs_remove() Tejun Heo
  2013-09-18 21:15 ` [PATCH 1/4] sysfs: remove sysfs_addrm_cxt->parent_sd Tejun Heo
  2013-09-18 21:15 ` [PATCH 2/4] kobject: grab an extra reference on kobject->sd to allow duplicate deletes Tejun Heo
@ 2013-09-18 21:15 ` Tejun Heo
  2013-09-18 21:15 ` [PATCH 4/4] sysfs: introduce [__]sysfs_remove() Tejun Heo
  2013-09-19 10:48 ` [PATCHSET] sysfs: implement sysfs_remove() Eric W. Biederman
  4 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-09-18 21:15 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, Tejun Heo

Currently, sysfs directory removal is inconsistent in that it would
remove any files directly under it but wouldn't recurse into
directories.  Thanks to group subdirectories, this doesn't even match
with kobject boundaries.  sysfs is in the process of being separated
out so that it can be used by multiple subsystems and we want to have
a consistent behavior - either removal of a sysfs_dirent should remove
every descendant entries or none instead of something inbetween.

This patch implements proper recursive removal in
__sysfs_remove_dir().  The function now walks its subtree in a
post-order walk to remove all descendants.

This is a behavior change but kobject / driver layer, which currently
is the only consumer, has already been updated to handle duplicate
removal attempts, so nothing should be broken after this change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 64 insertions(+), 11 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 105a7e2..0cdfd81 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -789,27 +789,81 @@ void sysfs_remove_subdir(struct sysfs_dirent *sd)
 	remove_dir(sd);
 }
 
+static struct sysfs_dirent *sysfs_leftmost_descendant(struct sysfs_dirent *pos)
+{
+	struct sysfs_dirent *last;
+
+	while (true) {
+		struct rb_node *rbn;
+
+		last = pos;
+
+		if (sysfs_type(pos) != SYSFS_DIR)
+			break;
+
+		rbn = rb_first(&pos->s_dir.children);
+		if (!rbn)
+			break;
+
+		pos = to_sysfs_dirent(rbn);
+	}
+
+	return last;
+}
+
+/**
+ * sysfs_next_descendant_post - find the next descendant for post-order walk
+ * @pos: the current position (%NULL to initiate traversal)
+ * @root: sysfs_dirent whose descendants to walk
+ *
+ * Find the next descendant to visit for post-order traversal of @root's
+ * descendants.  @root is included in the iteration and the last node to be
+ * visited.
+ */
+static struct sysfs_dirent *sysfs_next_descendant_post(struct sysfs_dirent *pos,
+						       struct sysfs_dirent *root)
+{
+	struct rb_node *rbn;
+
+	lockdep_assert_held(&sysfs_mutex);
+
+	/* if first iteration, visit leftmost descendant which may be root */
+	if (!pos)
+		return sysfs_leftmost_descendant(root);
+
+	/* if we visited @root, we're done */
+	if (pos == root)
+		return NULL;
+
+	/* if there's an unvisited sibling, visit its leftmost descendant */
+	rbn = rb_next(&pos->s_rb);
+	if (rbn)
+		return sysfs_leftmost_descendant(to_sysfs_dirent(rbn));
+
+	/* no sibling left, visit parent */
+	return pos->s_parent;
+}
 
 static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 {
 	struct sysfs_addrm_cxt acxt;
-	struct rb_node *pos;
+	struct sysfs_dirent *pos, *next;
 
 	if (!dir_sd)
 		return;
 
 	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
 	sysfs_addrm_start(&acxt);
-	pos = rb_first(&dir_sd->s_dir.children);
-	while (pos) {
-		struct sysfs_dirent *sd = to_sysfs_dirent(pos);
-		pos = rb_next(pos);
-		if (sysfs_type(sd) != SYSFS_DIR)
-			sysfs_remove_one(&acxt, sd);
-	}
-	sysfs_addrm_finish(&acxt);
 
-	remove_dir(dir_sd);
+	next = NULL;
+	do {
+		pos = next;
+		next = sysfs_next_descendant_post(pos, dir_sd);
+		if (pos)
+			sysfs_remove_one(&acxt, pos);
+	} while (next);
+
+	sysfs_addrm_finish(&acxt);
 }
 
 /**
@@ -820,7 +874,6 @@ static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
  *	the directory before we remove the directory, and we've inlined
  *	what used to be sysfs_rmdir() below, instead of calling separately.
  */
-
 void sysfs_remove_dir(struct kobject *kobj)
 {
 	struct sysfs_dirent *sd = kobj->sd;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] sysfs: introduce [__]sysfs_remove()
  2013-09-18 21:15 [PATCHSET] sysfs: implement sysfs_remove() Tejun Heo
                   ` (2 preceding siblings ...)
  2013-09-18 21:15 ` [PATCH 3/4] sysfs: make __sysfs_remove_dir() recursive Tejun Heo
@ 2013-09-18 21:15 ` Tejun Heo
  2013-09-19 10:48 ` [PATCHSET] sysfs: implement sysfs_remove() Eric W. Biederman
  4 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-09-18 21:15 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, Tejun Heo

Given a sysfs_dirent, there is no reason to have multiple versions of
removal functions.  A function which removes the specified
sysfs_dirent and its descendants is enough.

This patch intorduces [__}sysfs_remove() which replaces all internal
variations of removal functions.  This will be the only removal
function in the planned new sysfs_dirent based interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c   | 47 ++++++++++++++++++++++++-----------------------
 fs/sysfs/group.c |  4 ++--
 fs/sysfs/inode.c |  2 +-
 fs/sysfs/sysfs.h |  4 ++--
 4 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 0cdfd81..b518afd 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -545,7 +545,8 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
  *	LOCKING:
  *	Determined by sysfs_addrm_start().
  */
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+static void sysfs_remove_one(struct sysfs_addrm_cxt *acxt,
+			     struct sysfs_dirent *sd)
 {
 	struct sysfs_inode_attrs *ps_iattr;
 
@@ -775,20 +776,6 @@ const struct inode_operations sysfs_dir_inode_operations = {
 	.setxattr	= sysfs_setxattr,
 };
 
-static void remove_dir(struct sysfs_dirent *sd)
-{
-	struct sysfs_addrm_cxt acxt;
-
-	sysfs_addrm_start(&acxt);
-	sysfs_remove_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-}
-
-void sysfs_remove_subdir(struct sysfs_dirent *sd)
-{
-	remove_dir(sd);
-}
-
 static struct sysfs_dirent *sysfs_leftmost_descendant(struct sysfs_dirent *pos)
 {
 	struct sysfs_dirent *last;
@@ -844,25 +831,36 @@ static struct sysfs_dirent *sysfs_next_descendant_post(struct sysfs_dirent *pos,
 	return pos->s_parent;
 }
 
-static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
+void __sysfs_remove(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *pos, *next;
 
-	if (!dir_sd)
+	if (!sd)
 		return;
 
-	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
-	sysfs_addrm_start(&acxt);
+	pr_debug("sysfs %s: removing\n", sd->s_name);
 
 	next = NULL;
 	do {
 		pos = next;
-		next = sysfs_next_descendant_post(pos, dir_sd);
+		next = sysfs_next_descendant_post(pos, sd);
 		if (pos)
-			sysfs_remove_one(&acxt, pos);
+			sysfs_remove_one(acxt, pos);
 	} while (next);
+}
 
+/**
+ * sysfs_remove - remove a sysfs_dirent recursively
+ * @sd: the sysfs_dirent to remove
+ *
+ * Remove @sd along with all its subdirectories and files.
+ */
+void sysfs_remove(struct sysfs_dirent *sd)
+{
+	struct sysfs_addrm_cxt acxt;
+
+	sysfs_addrm_start(&acxt);
+	__sysfs_remove(&acxt, sd);
 	sysfs_addrm_finish(&acxt);
 }
 
@@ -882,7 +880,10 @@ void sysfs_remove_dir(struct kobject *kobj)
 	kobj->sd = NULL;
 	spin_unlock(&sysfs_assoc_lock);
 
-	__sysfs_remove_dir(sd);
+	if (sd) {
+		WARN_ON_ONCE(sysfs_type(sd) != SYSFS_DIR);
+		sysfs_remove(sd);
+	}
 }
 
 int sysfs_rename(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent_sd,
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 2dae55c..1898a10 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -111,7 +111,7 @@ static int internal_create_group(struct kobject *kobj, int update,
 	error = create_files(sd, kobj, grp, update);
 	if (error) {
 		if (grp->name)
-			sysfs_remove_subdir(sd);
+			sysfs_remove(sd);
 	}
 	sysfs_put(sd);
 	return error;
@@ -219,7 +219,7 @@ void sysfs_remove_group(struct kobject *kobj,
 
 	remove_files(sd, kobj, grp);
 	if (grp->name)
-		sysfs_remove_subdir(sd);
+		sysfs_remove(sd);
 
 	sysfs_put(sd);
 }
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 364c887..63f755e 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -330,7 +330,7 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
 
 	sd = sysfs_find_dirent(dir_sd, name, ns);
 	if (sd)
-		sysfs_remove_one(&acxt, sd);
+		__sysfs_remove(&acxt, sd);
 
 	sysfs_addrm_finish(&acxt);
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 4d11544..4b1d825 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -158,7 +158,8 @@ int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		    struct sysfs_dirent *parent_sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		  struct sysfs_dirent *parent_sd);
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
+void __sysfs_remove(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
+void sysfs_remove(struct sysfs_dirent *sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 
 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
@@ -170,7 +171,6 @@ void release_sysfs_dirent(struct sysfs_dirent *sd);
 
 int sysfs_create_subdir(struct kobject *kobj, const char *name,
 			struct sysfs_dirent **p_sd);
-void sysfs_remove_subdir(struct sysfs_dirent *sd);
 
 int sysfs_rename(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent_sd,
 		 const char *new_name, const void *new_ns);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCHSET] sysfs: implement sysfs_remove()
  2013-09-18 21:15 [PATCHSET] sysfs: implement sysfs_remove() Tejun Heo
                   ` (3 preceding siblings ...)
  2013-09-18 21:15 ` [PATCH 4/4] sysfs: introduce [__]sysfs_remove() Tejun Heo
@ 2013-09-19 10:48 ` Eric W. Biederman
  2013-09-19 12:38   ` Tejun Heo
  4 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2013-09-19 10:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: gregkh, kay, linux-kernel

Tejun Heo <tj@kernel.org> writes:

> Hello,
>
> Currently, there are multiple variants of internal sysfs removal
> functions and the directory removal behavior is a bit weird in that
> while it does remove the files contained immediately in the directory
> it wouldn't recurse into its subdirectories, even the group ones which
> belong to the same kobject.

There are two very big problems with this direction.
1) It violates the principle of least surprise.  In particular it messes
   up the mental model of people like Al Viro.  Which can easily lead to
   code breaking during routine maintenance because of unexpecte
   semantics.

2) Recursive removal is not safe.  There are very weird and somewhat
   pathological cases where sysfs directories are removed out of order
   aka parent before sibling, and (if my memory holds) recursive removal
   takes this from a little bit ugly to actually breaking things.

For long term maintenance and simplicity I believe we will be in much
better shape if we take directory removal in the opposite direction, and
fix the small number of issues with the users and don't support any kind
of recursive removal.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHSET] sysfs: implement sysfs_remove()
  2013-09-19 10:48 ` [PATCHSET] sysfs: implement sysfs_remove() Eric W. Biederman
@ 2013-09-19 12:38   ` Tejun Heo
  2013-09-19 17:03     ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2013-09-19 12:38 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: gregkh, kay, linux-kernel

Hey, Eric.

On Thu, Sep 19, 2013 at 05:48:52AM -0500, Eric W. Biederman wrote:
> There are two very big problems with this direction.
> 1) It violates the principle of least surprise.  In particular it messes
>    up the mental model of people like Al Viro.  Which can easily lead to
>    code breaking during routine maintenance because of unexpecte
>    semantics.

The interface is already and inherently different from the usual file
removal operation.  If we were to follow the rm semantics, we would
have to fail sysfs_remove_dir() if the node has any child.  The
current behavior is actually more surprising because it creates
dangling child directories - it is neither "rmdir" or "rm -rf".  I
don't think enforcing "rmdir" semantics is useful for sysfs so the
next best thing is to go for "rm -rf".

> 2) Recursive removal is not safe.  There are very weird and somewhat
>    pathological cases where sysfs directories are removed out of order
>    aka parent before sibling, and (if my memory holds) recursive removal
>    takes this from a little bit ugly to actually breaking things.

The case is already handled in the patch series for kobject users.
kobjects just hold an extra ref on sysfs_dirent until it gets
released.  Note that this is something peculiar about kobject based
interface because it has fundamental distinction between "a
sysfs_dirent which has a kobject associated" and the rest, and wants
to treat the former specially.  That is all good and dandy for
kobjects but that logic should stay with kobject.

Also note that the current implementation kinda conflates the
distinction between kobject-associated nodes and others with the
distinction between directories and files.  This used to be okay
because only kobjects could create directories; however, with groups,
this no longer is the case and groups require explicit deletions
although they are not associated with kobjects, which leads
essentially unnecessary boilerplates - just count how many
remove_group calls we have in the tree - and naturally some instances
missing those calls and dangling and leaving those dirents without
anyway to reach them - they don't have kobjects associated with them.

> For long term maintenance and simplicity I believe we will be in much
> better shape if we take directory removal in the opposite direction, and
> fix the small number of issues with the users and don't support any kind
> of recursive removal.

If you're suggesting that we should do "rmdir" semantics, you're
suggesting adding a *LOT* of boilerplate code to *all* users of sysfs,
which is extremely tedious and error-prone in a way which will be
difficult to track down.  We need to make it easier to use, not the
other way around.

I think all this mostly comes from the way we conjoined kobject and
sysfs and then getting confused about the distinction between kobjects
and directories.  The only special thing we need for kobject handling
is allowing parent kobjects to be removed before its descendants as
kobjects lifetime rules are weird and doesn't enforce any ordering
according to its hierarchical organization, but if you put that one
aside, we actually do want recursive removal.  You never want to
require the destruction paths tedious.  That always leads to
wide-spread bugs.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHSET] sysfs: implement sysfs_remove()
  2013-09-19 12:38   ` Tejun Heo
@ 2013-09-19 17:03     ` Eric W. Biederman
  2013-09-26 23:44       ` Greg KH
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2013-09-19 17:03 UTC (permalink / raw)
  To: Tejun Heo; +Cc: gregkh, kay, linux-kernel


I am running from memory right now.  But the short version is.

Al Viro has complained about the sysfs removal antics of sysfs, and
I have seen Al get confused and "fix" filesystems that depart too far
from normal filesystem semantics.

I have gone down this path both ways and "rm -rf" semantics are horrible
and cause real bugs in the kernel at the boundaries between devices.
"rm -rf" semantics are also horrible because no sanity checks can be
performed.

Normal "unlink/rmdir" semantics are absolutely achievable including
not allowing children to be removed before their parents with just
a few bug fixes to the kernel.  I ran out of energy before I could
track down and make those bug fixes which is why things are left
in the current state.  And now we don't need any more boiler plate
to get there, the current primary interfaces to sysfs remember all of
the filenames.

The fact we actually need to allow parents to be deleted before their
children today to support pci is absoltuely broken.  It is a simple
matter of code bugs.  The device tree semantics are tree semantics not
random order semantics.

I will aim to take a second look when I can spend a little more time
and give you more concrete reasons (other than the old NAK from Viro)
about why recursive sysfs directory removal can cause real bugs.  It is
just subtle enough I can't remember the set of the problems in detail
and a quick look at the code is not enough to remind me.  But I have run
into real issues with even the limited recursive remvoval that sysfs
does today.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHSET] sysfs: implement sysfs_remove()
  2013-09-19 17:03     ` Eric W. Biederman
@ 2013-09-26 23:44       ` Greg KH
  2013-09-27 13:49         ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Greg KH @ 2013-09-26 23:44 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Tejun Heo, kay, linux-kernel

On Thu, Sep 19, 2013 at 12:03:08PM -0500, Eric W. Biederman wrote:
> 
> I am running from memory right now.  But the short version is.
> 
> Al Viro has complained about the sysfs removal antics of sysfs, and
> I have seen Al get confused and "fix" filesystems that depart too far
> from normal filesystem semantics.
> 
> I have gone down this path both ways and "rm -rf" semantics are horrible
> and cause real bugs in the kernel at the boundaries between devices.
> "rm -rf" semantics are also horrible because no sanity checks can be
> performed.

I seem to remember some issues here as well, probably with scsi devices,
that kept us from doing this in this manner.  Can you test this on
removing some scsi devices and see if everything still works properly
with this patchset applied?

I'm really hesitant to apply this series, as it does change how sysfs
works in this area, why do you need these changes?

> I will aim to take a second look when I can spend a little more time
> and give you more concrete reasons (other than the old NAK from Viro)
> about why recursive sysfs directory removal can cause real bugs.  It is
> just subtle enough I can't remember the set of the problems in detail
> and a quick look at the code is not enough to remind me.  But I have run
> into real issues with even the limited recursive remvoval that sysfs
> does today.

What real issues did you run into?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHSET] sysfs: implement sysfs_remove()
  2013-09-26 23:44       ` Greg KH
@ 2013-09-27 13:49         ` Tejun Heo
  0 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-09-27 13:49 UTC (permalink / raw)
  To: Greg KH; +Cc: Eric W. Biederman, kay, linux-kernel

Hey,

On Thu, Sep 26, 2013 at 04:44:33PM -0700, Greg KH wrote:
> I seem to remember some issues here as well, probably with scsi devices,
> that kept us from doing this in this manner.  Can you test this on
> removing some scsi devices and see if everything still works properly
> with this patchset applied?

>From kobject side, it shouldn't really change as kobj holds an extra
ref on the associated sysfs_dirent.  While sysfs_dirents are
recursively unlinked, they'll still be there until the associated kobj
is released.  And, yeap, SCSI works fine.

> I'm really hesitant to apply this series, as it does change how sysfs
> works in this area, why do you need these changes?

Because the interface we now have is broken and I don't want spread
that to new users.  Our current semantics is somewhere between "rmdir"
and "rm -rf".  We remove files immediately below a directory but don't
recurse.

I think this was designed this way because of the tight coupling with
kobj.  As a directory always used to be represented by a kobj and a
kobj's lifetime is managed separately, chaining directory removal to
kobj lifetime was good enough; however, please note that this is no
longer true with the sysfs group, so now sysfs is in a weird place
where it doesn't require attribute removals right below the kobj
directory but does require group removals.  Even that is inconsistent
as group can be embedded in the parent kobj directory, and I'm
relatively sure we're already leaking sysfs_dirents by forgetting
group removals in some paths.

It's a broken interface and even for the existing users we should be
able to remove most of group group removal invocation afterwards,
which is a silly and error-prone requirement.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-09-27 13:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-18 21:15 [PATCHSET] sysfs: implement sysfs_remove() Tejun Heo
2013-09-18 21:15 ` [PATCH 1/4] sysfs: remove sysfs_addrm_cxt->parent_sd Tejun Heo
2013-09-18 21:15 ` [PATCH 2/4] kobject: grab an extra reference on kobject->sd to allow duplicate deletes Tejun Heo
2013-09-18 21:15 ` [PATCH 3/4] sysfs: make __sysfs_remove_dir() recursive Tejun Heo
2013-09-18 21:15 ` [PATCH 4/4] sysfs: introduce [__]sysfs_remove() Tejun Heo
2013-09-19 10:48 ` [PATCHSET] sysfs: implement sysfs_remove() Eric W. Biederman
2013-09-19 12:38   ` Tejun Heo
2013-09-19 17:03     ` Eric W. Biederman
2013-09-26 23:44       ` Greg KH
2013-09-27 13:49         ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox