linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] configfs: Pin configfs subsystems separately from new config_items.
@ 2008-06-17 22:41 Joel Becker
  2008-06-18 12:31 ` [Ocfs2-devel] " Louis Rilling
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Becker @ 2008-06-17 22:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: cluster-devel, Louis.Rilling, ocfs2-devel

configfs_mkdir() creates a new item by calling its parent's
->make_item/group() functions.  Once that object is created,
configfs_mkdir() calls try_module_get() on the new item's module.  If it
succeeds, the module owning the new item cannot be unloaded, and
configfs is safe to reference the item.

If the item and the subsystem it belongs to are part of the same module,
the subsystem is also pinned.  This is the common case.

However, if the subsystem is made up of multiple modules, this may not
pin the subsystem.  Thus, it would be possible to unload the toplevel
subsystem module while there is still a child item.  Thus, we now
try_module_get() the subsystem's module.  This only really affects
children of the toplevel subsystem group.  Deeper children already have
their parents pinned.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
---
 fs/configfs/dir.c |   42 +++++++++++++++++++++++++++++++++---------
 1 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index a48dc7d..52eed83 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -1008,7 +1008,7 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 	struct configfs_subsystem *subsys;
 	struct configfs_dirent *sd;
 	struct config_item_type *type;
-	struct module *owner = NULL;
+	struct module *subsys_owner = NULL, *new_item_owner = NULL;
 	char *name;
 
 	if (dentry->d_parent == configfs_sb->s_root) {
@@ -1035,10 +1035,25 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 		goto out_put;
 	}
 
+	/*
+	 * The subsystem may belong to a different module than the item
+	 * being created.  We don't want to safely pin the new item but
+	 * fail to pin the subsystem it sits under.
+	 */
+	if (!subsys->su_group.cg_item.ci_type) {
+		ret = -EINVAL;
+		goto out_put;
+	}
+	subsys_owner = subsys->su_group.cg_item.ci_type->ct_owner;
+	if (!try_module_get(subsys_owner)) {
+		ret = -EINVAL;
+		goto out_put;
+	}
+
 	name = kmalloc(dentry->d_name.len + 1, GFP_KERNEL);
 	if (!name) {
 		ret = -ENOMEM;
-		goto out_put;
+		goto out_subsys_put;
 	}
 
 	snprintf(name, dentry->d_name.len + 1, "%s", dentry->d_name.name);
@@ -1066,7 +1081,7 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 		 * There are no extra references to clean up.
 		 */
 		ret = -ENOMEM;
-		goto out_put;
+		goto out_subsys_put;
 	}
 
 	/*
@@ -1080,8 +1095,8 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 		goto out_unlink;
 	}
 
-	owner = type->ct_owner;
-	if (!try_module_get(owner)) {
+	new_item_owner = type->ct_owner;
+	if (!try_module_get(new_item_owner)) {
 		ret = -EINVAL;
 		goto out_unlink;
 	}
@@ -1113,9 +1128,13 @@ out_unlink:
 		mutex_unlock(&subsys->su_mutex);
 
 		if (module_got)
-			module_put(owner);
+			module_put(new_item_owner);
 	}
 
+out_subsys_put:
+	if (ret)
+		module_put(subsys_owner);
+
 out_put:
 	/*
 	 * link_obj()/link_group() took a reference from child->parent,
@@ -1134,7 +1153,7 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
 	struct config_item *item;
 	struct configfs_subsystem *subsys;
 	struct configfs_dirent *sd;
-	struct module *owner = NULL;
+	struct module *subsys_owner = NULL, *dead_item_owner = NULL;
 	int ret;
 
 	if (dentry->d_parent == configfs_sb->s_root)
@@ -1161,6 +1180,10 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
 		return -EINVAL;
 	}
 
+	/* configfs_mkdir() shouldn't have allowed this */
+	BUG_ON(!subsys->su_group.cg_item.ci_type);
+	subsys_owner = subsys->su_group.cg_item.ci_type->ct_owner;
+
 	ret = configfs_detach_prep(dentry);
 	if (ret) {
 		configfs_detach_rollback(dentry);
@@ -1175,7 +1198,7 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
 	config_item_put(parent_item);
 
 	if (item->ci_type)
-		owner = item->ci_type->ct_owner;
+		dead_item_owner = item->ci_type->ct_owner;
 
 	if (sd->s_type & CONFIGFS_USET_DIR) {
 		configfs_detach_group(item);
@@ -1197,7 +1220,8 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
 	/* Drop our reference from above */
 	config_item_put(item);
 
-	module_put(owner);
+	module_put(dead_item_owner);
+	module_put(subsys_owner);
 
 	return 0;
 }
-- 
1.5.5.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-17 22:41 [RFC] configfs: Pin configfs subsystems separately from new config_items Joel Becker
@ 2008-06-18 12:31 ` Louis Rilling
  2008-06-18 16:12   ` Joel Becker
  0 siblings, 1 reply; 13+ messages in thread
From: Louis Rilling @ 2008-06-18 12:31 UTC (permalink / raw)
  To: Joel Becker; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

[-- Attachment #1: Type: text/plain, Size: 1753 bytes --]

On Tue, Jun 17, 2008 at 03:41:00PM -0700, Joel Becker wrote:
> configfs_mkdir() creates a new item by calling its parent's
> ->make_item/group() functions.  Once that object is created,
> configfs_mkdir() calls try_module_get() on the new item's module.  If it
> succeeds, the module owning the new item cannot be unloaded, and
> configfs is safe to reference the item.
> 
> If the item and the subsystem it belongs to are part of the same module,
> the subsystem is also pinned.  This is the common case.
> 
> However, if the subsystem is made up of multiple modules, this may not
> pin the subsystem.  Thus, it would be possible to unload the toplevel
> subsystem module while there is still a child item.  Thus, we now
> try_module_get() the subsystem's module.  This only really affects
> children of the toplevel subsystem group.  Deeper children already have
> their parents pinned.

Looks good to me.

What about new item module pinning versus a concurrent sys_delete_module() in a
preemptible kernel? AFAICS new_item pinning is just done too late to protect
anybody against sys_delete_module(). Shouldn't we remove new item module pinning
and let the subsystem do it?

	process 1: 				process 2:
	confifs_mkdir()
	  item = make_item()

	--- preemption schedule ---
						sys_delete_module()
						  ok
	--- end of preemption   ---

	  new_item_owner = item->ci_type.ct_owner
Possible access to freed memory if type statically allocated!
	  try_module_get(new_item_owner)
Access to freed memory of the module metadata!

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-18 12:31 ` [Ocfs2-devel] " Louis Rilling
@ 2008-06-18 16:12   ` Joel Becker
  2008-06-18 16:51     ` Louis Rilling
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Becker @ 2008-06-18 16:12 UTC (permalink / raw)
  To: Louis Rilling; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

On Wed, Jun 18, 2008 at 02:31:34PM +0200, Louis Rilling wrote:
> On Tue, Jun 17, 2008 at 03:41:00PM -0700, Joel Becker wrote:
> > However, if the subsystem is made up of multiple modules, this may not
> > pin the subsystem.  Thus, it would be possible to unload the toplevel
> > subsystem module while there is still a child item.  Thus, we now
> > try_module_get() the subsystem's module.  This only really affects
> > children of the toplevel subsystem group.  Deeper children already have
> > their parents pinned.
> 
> Looks good to me.
> 
> What about new item module pinning versus a concurrent sys_delete_module() in a
> preemptible kernel? AFAICS new_item pinning is just done too late to protect
> anybody against sys_delete_module(). Shouldn't we remove new item module pinning
> and let the subsystem do it?

	Good catch.  Preempt doesn't matter - it could just be another
CPU.
	Certainly if there are multiple modules the make_item() will
have to pin the submodule before calling it.  But the common case isn't
multiple modules.  What we absolutely don't want is making the common
case have to pin itself.

Joel

-- 

Life's Little Instruction Book #30

	"Never buy a house without a fireplace."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-18 16:12   ` Joel Becker
@ 2008-06-18 16:51     ` Louis Rilling
  2008-06-18 20:07       ` Joel Becker
  0 siblings, 1 reply; 13+ messages in thread
From: Louis Rilling @ 2008-06-18 16:51 UTC (permalink / raw)
  To: Joel Becker; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

[-- Attachment #1: Type: text/plain, Size: 2684 bytes --]

On Wed, Jun 18, 2008 at 09:12:15AM -0700, Joel Becker wrote:
> On Wed, Jun 18, 2008 at 02:31:34PM +0200, Louis Rilling wrote:
> > On Tue, Jun 17, 2008 at 03:41:00PM -0700, Joel Becker wrote:
> > > However, if the subsystem is made up of multiple modules, this may not
> > > pin the subsystem.  Thus, it would be possible to unload the toplevel
> > > subsystem module while there is still a child item.  Thus, we now
> > > try_module_get() the subsystem's module.  This only really affects
> > > children of the toplevel subsystem group.  Deeper children already have
> > > their parents pinned.
> > 
> > Looks good to me.
> > 
> > What about new item module pinning versus a concurrent sys_delete_module() in a
> > preemptible kernel? AFAICS new_item pinning is just done too late to protect
> > anybody against sys_delete_module(). Shouldn't we remove new item module pinning
> > and let the subsystem do it?
> 
> 	Good catch.  Preempt doesn't matter - it could just be another
> CPU.
> 	Certainly if there are multiple modules the make_item() will
> have to pin the submodule before calling it.  But the common case isn't
> multiple modules.  What we absolutely don't want is making the common
> case have to pin itself.

I suspect the common case to not need to pin the new item: if we assume that the
parent is already pinned, it will remain pinned until the new item is dropped.
	I think that the following rules ensure that the parent's module is
always pinned:

1/ The subsystem root is pinned before creating any other subdir under a default
   group of its module (ok with your patch).

2/ A subsystem root having default groups in other modules pins these modules
   before register() and until unregister() (ok with in-tree subsystems I think,
   thanks to automatic module dependency tracking). This should mostly be done
   implicitly thanks to the module dependency tracking based on used symbols.

3/ Same as 2 but for a new group between allocation and release(). Here again,
   module dependency tracking should do the job most of the time.

4/ make_item()/make_group() pins the module of the new item/group if it differs
   from the current one, and at least until drop_item() (must check in-tree
   subsystems, but I suspect that module dependency tracking already does the
   job).

If you agree with these rules, which seem to be just reasonable, we should be
able to get rid of the new item's module pinning inside configfs.

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-18 16:51     ` Louis Rilling
@ 2008-06-18 20:07       ` Joel Becker
  2008-06-19 11:13         ` [Ocfs2-devel] " Louis Rilling
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Becker @ 2008-06-18 20:07 UTC (permalink / raw)
  To: Louis Rilling; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

On Wed, Jun 18, 2008 at 06:51:01PM +0200, Louis Rilling wrote:
> On Wed, Jun 18, 2008 at 09:12:15AM -0700, Joel Becker wrote:
> I suspect the common case to not need to pin the new item: if we assume that the
> parent is already pinned, it will remain pinned until the new item is dropped.

	We still want to pin the new item.  I'll discuss below.

> 4/ make_item()/make_group() pins the module of the new item/group if it differs
>    from the current one, and at least until drop_item() (must check in-tree
>    subsystems, but I suspect that module dependency tracking already does the
>    job).

	This is a silly rule, though.  Why "pin if it is different" when
"always pin" is just as effective and much easier to understand?  You
never have to ask "but was the item's module pinned?" when tracking a
problem.
	In the end, though, it doesn't matter.  We need a reference on
the item because we refer to it after calling client_drop_item().
Specifically, we call config_item_put().  If we don't have a reference
and trust make_item()/drop_item() to handle that for us, the module can
go away between the call of client_drop_item() and config_item_put() in
configfs_rmdir().
	In the end, we are holding a reference to the module while we
are accessing it.  It's overkill for the common case (single module was
safe before when we pinned item->owner, and it is safe if we only
pin subsys->owner), but it provides the best proper safety if we allow
multi-module subsystems.

Joel

-- 

"Also, all of life's big problems include the words 'indictment' or
	'inoperable.' Everything else is small stuff."
	- Alton Brown

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-18 20:07       ` Joel Becker
@ 2008-06-19 11:13         ` Louis Rilling
  2008-06-19 22:07           ` Joel Becker
  0 siblings, 1 reply; 13+ messages in thread
From: Louis Rilling @ 2008-06-19 11:13 UTC (permalink / raw)
  To: linux-fsdevel, cluster-devel, ocfs2-devel

[-- Attachment #1: Type: text/plain, Size: 3125 bytes --]

On Wed, Jun 18, 2008 at 01:07:13PM -0700, Joel Becker wrote:
> On Wed, Jun 18, 2008 at 06:51:01PM +0200, Louis Rilling wrote:
> > On Wed, Jun 18, 2008 at 09:12:15AM -0700, Joel Becker wrote:
> > I suspect the common case to not need to pin the new item: if we assume that the
> > parent is already pinned, it will remain pinned until the new item is dropped.
> 
> 	We still want to pin the new item.  I'll discuss below.
> 
> > 4/ make_item()/make_group() pins the module of the new item/group if it differs
> >    from the current one, and at least until drop_item() (must check in-tree
> >    subsystems, but I suspect that module dependency tracking already does the
> >    job).
> 
> 	This is a silly rule, though.  Why "pin if it is different" when
> "always pin" is just as effective and much easier to understand?  You
> never have to ask "but was the item's module pinned?" when tracking a
> problem.

Not so silly, if you consider that this relieves in-tree users from having to
add try_module_get() in their code. Only special users (like me) who create
items implemented by other modules, without explicitly depending on symbols of
these modules or keeping references after ->drop_item(), have to deal with
such module pinning.

> 	In the end, though, it doesn't matter.  We need a reference on
> the item because we refer to it after calling client_drop_item().
> Specifically, we call config_item_put().  If we don't have a reference
> and trust make_item()/drop_item() to handle that for us, the module can
> go away between the call of client_drop_item() and config_item_put() in
> configfs_rmdir().

Ok, rule 4/ should say "until ->release()", whatever the means (as we discussed
earlier), the common case being that ->release() is called inside parent's
->drop_item() (see right below). This is needed anyway, because otherwise
failing to pin the module in mkdir() cannot recover safely by calling
client_drop_item().
	And I think that we can also get rid of the last config_item_put() (or
put it before client_drop_item()), because after client_drop_item() rmdir() does
not need the item anymore, and client_drop_item() is supposed to call
config_item_put() (in parent's drop_item() or directly). IOW, when entering
rmdir() configfs already holds the item's ref that was given by parent's
->make_item(), and rmdir() drops that ref in client_drop_item(). No need to hold
the extra one grabbed by configfs_get_config_item().

> 	In the end, we are holding a reference to the module while we
> are accessing it.  It's overkill for the common case (single module was
> safe before when we pinned item->owner, and it is safe if we only
> pin subsys->owner), but it provides the best proper safety if we allow
> multi-module subsystems.

As I said above, the way it is done currently, pinning the new item's module
does not provide any safety in multi-module subsystems.

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-19 11:13         ` [Ocfs2-devel] " Louis Rilling
@ 2008-06-19 22:07           ` Joel Becker
  2008-06-20 12:46             ` Louis Rilling
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Becker @ 2008-06-19 22:07 UTC (permalink / raw)
  To: Louis Rilling; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

On Thu, Jun 19, 2008 at 01:13:57PM +0200, Louis Rilling wrote:
> On Wed, Jun 18, 2008 at 01:07:13PM -0700, Joel Becker wrote:
> > > 4/ make_item()/make_group() pins the module of the new item/group if it differs
> > >    from the current one, and at least until drop_item() (must check in-tree
> > >    subsystems, but I suspect that module dependency tracking already does the
> > >    job).
> > 
> > 	This is a silly rule, though.  Why "pin if it is different" when
> > "always pin" is just as effective and much easier to understand?  You
> > never have to ask "but was the item's module pinned?" when tracking a
> > problem.
> 
> Not so silly, if you consider that this relieves in-tree users from having to
> add try_module_get() in their code. Only special users (like me) who create
> items implemented by other modules, without explicitly depending on symbols of
> these modules or keeping references after ->drop_item(), have to deal with
> such module pinning.

	With my rule ("always pin"), single-module users don't have to
try_module_get() at all.  Just like today.  That's kind of my point.

> 	And I think that we can also get rid of the last config_item_put() (or
> put it before client_drop_item()), because after client_drop_item() rmdir() does
> not need the item anymore, and client_drop_item() is supposed to call
> config_item_put() (in parent's drop_item() or directly). IOW, when entering
> rmdir() configfs already holds the item's ref that was given by parent's
> ->make_item(), and rmdir() drops that ref in client_drop_item(). No need to hold
> the extra one grabbed by configfs_get_config_item().

	We could, but it's a much cleaner read to hold a reference for
the duration of the function.  And since we hold a module reference
anyway, I like simpler and clearer.

> > 	In the end, we are holding a reference to the module while we
> > are accessing it.  It's overkill for the common case (single module was
> > safe before when we pinned item->owner, and it is safe if we only
> > pin subsys->owner), but it provides the best proper safety if we allow
> > multi-module subsystems.
> 
> As I said above, the way it is done currently, pinning the new item's module
> does not provide any safety in multi-module subsystems.

	We provide safety for ourselves.  We can't provide safety for
the subsystem - we don't know how it is put together.  Once again, the
module reference is just configfs saying "I know that I have a reference
and that I'm safe to access this."

Joel

-- 

To spot the expert, pick the one who predicts the job will take the
longest and cost the most.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-19 22:07           ` Joel Becker
@ 2008-06-20 12:46             ` Louis Rilling
  2008-06-20 22:36               ` Joel Becker
  0 siblings, 1 reply; 13+ messages in thread
From: Louis Rilling @ 2008-06-20 12:46 UTC (permalink / raw)
  To: Joel.Becker; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

[-- Attachment #1: Type: text/plain, Size: 4503 bytes --]

On Thu, Jun 19, 2008 at 03:07:39PM -0700, Joel Becker wrote:
> On Thu, Jun 19, 2008 at 01:13:57PM +0200, Louis Rilling wrote:
> > On Wed, Jun 18, 2008 at 01:07:13PM -0700, Joel Becker wrote:
> > > > 4/ make_item()/make_group() pins the module of the new item/group if it differs
> > > >    from the current one, and at least until drop_item() (must check in-tree
> > > >    subsystems, but I suspect that module dependency tracking already does the
> > > >    job).
> > > 
> > > 	This is a silly rule, though.  Why "pin if it is different" when
> > > "always pin" is just as effective and much easier to understand?  You
> > > never have to ask "but was the item's module pinned?" when tracking a
> > > problem.
> > 
> > Not so silly, if you consider that this relieves in-tree users from having to
> > add try_module_get() in their code. Only special users (like me) who create
> > items implemented by other modules, without explicitly depending on symbols of
> > these modules or keeping references after ->drop_item(), have to deal with
> > such module pinning.
> 
> 	With my rule ("always pin"), single-module users don't have to
> try_module_get() at all.  Just like today.  That's kind of my point.

With my rules, neither single-module users, nor in-tree multi-module users have
to try_module_get(). And, as I try to convince you of, configfs does not have
to try_module_get() on new items.

What I suggest is a *simplification* of configfs, and does not require to change
existing subsystems. I'm convinced that this does not lower the safety of
configfs, but instead this makes it clear that subsystems have to take care of
module pinning, as they always had to.

> 
> > 	And I think that we can also get rid of the last config_item_put() (or
> > put it before client_drop_item()), because after client_drop_item() rmdir() does
> > not need the item anymore, and client_drop_item() is supposed to call
> > config_item_put() (in parent's drop_item() or directly). IOW, when entering
> > rmdir() configfs already holds the item's ref that was given by parent's
> > ->make_item(), and rmdir() drops that ref in client_drop_item(). No need to hold
> > the extra one grabbed by configfs_get_config_item().
> 
> 	We could, but it's a much cleaner read to hold a reference for
> the duration of the function.  And since we hold a module reference
> anyway, I like simpler and clearer.

But keeping this last config_item_put() prevents the simplification that I'm
defending. And I think that the simplification is worth moving this
config_item_put() before client_drop_item().

> 
> > > 	In the end, we are holding a reference to the module while we
> > > are accessing it.  It's overkill for the common case (single module was
> > > safe before when we pinned item->owner, and it is safe if we only
> > > pin subsys->owner), but it provides the best proper safety if we allow
> > > multi-module subsystems.
> > 
> > As I said above, the way it is done currently, pinning the new item's module
> > does not provide any safety in multi-module subsystems.
> 
> 	We provide safety for ourselves.  We can't provide safety for
> the subsystem - we don't know how it is put together.  Once again, the
> module reference is just configfs saying "I know that I have a reference
> and that I'm safe to access this."

No, we do not provide safety for ourselves: remember the scenario I described
earlier in this thread [ Actually I'm able to show you a similar scenario with a
non-preemptible kernel on a single-processor system, because attach_group() may
block and schedule in a memory allocation. ]. This is the reason why I suggest
to get rid of the last config_item_put() above, and simply remove this new
item's module pinning which, again, does not provide any safety, even for
configfs alone.
	Note that I agree with your principle of providing safety for configfs
only, and let modules handle their specific module uses. I'm really keeping this
in mind, and also keeping in mind not to disturb subsystems with new rules. It
just happens that the rules that I'm proposing are already respected by in-tree
subsystems, and are even already respected by my own multi-modules use case.
	So, what do you think is really bad in my suggestion?

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-20 12:46             ` Louis Rilling
@ 2008-06-20 22:36               ` Joel Becker
  2008-06-23 15:44                 ` Louis Rilling
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Becker @ 2008-06-20 22:36 UTC (permalink / raw)
  To: Louis Rilling; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

On Fri, Jun 20, 2008 at 02:46:44PM +0200, Louis Rilling wrote:
> But keeping this last config_item_put() prevents the simplification that I'm
> defending. And I think that the simplification is worth moving this
> config_item_put() before client_drop_item().

	I guess I'm not seeing what's simpler.  Four lines of
try_module_get() aren't very complex, really.  Conversely, having some
functions that *don't* do config_item_get_item() is weird.  Not invalid,
just harder to read.
	Let's put it on the shelf.  What I have with this patch is
really no different in effective behavior.

Joel

-- 

"When ideas fail, words come in very handy." 
         - Goethe

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-20 22:36               ` Joel Becker
@ 2008-06-23 15:44                 ` Louis Rilling
  2008-06-23 19:10                   ` Joel Becker
  0 siblings, 1 reply; 13+ messages in thread
From: Louis Rilling @ 2008-06-23 15:44 UTC (permalink / raw)
  To: Joel Becker; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

[-- Attachment #1: Type: text/plain, Size: 2226 bytes --]

On Fri, Jun 20, 2008 at 03:36:14PM -0700, Joel Becker wrote:
> On Fri, Jun 20, 2008 at 02:46:44PM +0200, Louis Rilling wrote:
> > But keeping this last config_item_put() prevents the simplification that I'm
> > defending. And I think that the simplification is worth moving this
> > config_item_put() before client_drop_item().
> 
> 	I guess I'm not seeing what's simpler.  Four lines of
> try_module_get() aren't very complex, really.  Conversely, having some
> functions that *don't* do config_item_get_item() is weird.  Not invalid,
> just harder to read.

config_item_get_item() followed by config_item_put(), with a comment on why this
is correct, was my suggestion, not avoiding to call config_item_get_item().

> 	Let's put it on the shelf.  What I have with this patch is
> really no different in effective behavior.

I'm completely ok with your patch. I just thought that this was the occasion to
cleanup module pinning stuff.

Would you consider the last following argument?

Consider a very simple subsystem, implemented by a single module, that does
absolutely no config_item_get(). Currently, this module must provide a release()
operation for all user-created config_items, because the last reference is
dropped by configfs, right after dropping the item. With my simplification, the
module effectively drops the last reference in ->drop_item(), and could rely on
that to make the code simpler like below:

make_item()
	new_item = kmalloc();
	config_item_init_type_long_name();
	return new_item;

drop_item(item)
	config_item_put(item);
	kfree(item);

No need for a release() operation in that simple case, and one may find such
code more readable thanks to its symmetry.

Looking at some in-tree code, netconsole would be a candidate for such code
simplification. Indeed, netconsole does not look like needing config_item_get(),
because the target_list_lock already provides the necessary protection.

That said, thanks for the discussion, and for not forgetting the idea :)

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-23 15:44                 ` Louis Rilling
@ 2008-06-23 19:10                   ` Joel Becker
  2008-06-24  5:04                     ` Louis Rilling
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Becker @ 2008-06-23 19:10 UTC (permalink / raw)
  To: Louis Rilling; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

On Mon, Jun 23, 2008 at 05:44:57PM +0200, Louis Rilling wrote:
> make_item()
> 	new_item = kmalloc();
> 	config_item_init_type_long_name();
> 	return new_item;
> 
> drop_item(item)
> 	config_item_put(item);
> 	kfree(item);

	This is never, ever safe.  Consider that someone has an
attribute file open - it has a reference to the item.  You can still
rmdir() the item - doing stuff to the attribute after drop_item() will
just get ignored.  But you can't free it in drop_item().

Joel

-- 

"Vote early and vote often." 
        - Al Capone

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-23 19:10                   ` Joel Becker
@ 2008-06-24  5:04                     ` Louis Rilling
  2008-06-24 17:03                       ` Joel Becker
  0 siblings, 1 reply; 13+ messages in thread
From: Louis Rilling @ 2008-06-24  5:04 UTC (permalink / raw)
  To: Joel.Becker; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

On Mon, Jun 23, 2008 at 12:10:28PM -0700, Joel Becker wrote:
> On Mon, Jun 23, 2008 at 05:44:57PM +0200, Louis Rilling wrote:
> > make_item()
> > 	new_item = kmalloc();
> > 	config_item_init_type_long_name();
> > 	return new_item;
> > 
> > drop_item(item)
> > 	config_item_put(item);
> > 	kfree(item);
> 
> 	This is never, ever safe.  Consider that someone has an
> attribute file open - it has a reference to the item.  You can still
> rmdir() the item - doing stuff to the attribute after drop_item() will
> just get ignored.  But you can't free it in drop_item().

Yup, I realized it this night (prevented me from sleeping by the way).
The previous arguments remain valid though. I hope that you won't discard
them because of this buggy one :)

While looking at file.c, I found this in configfs_release():

struct config_item * item = to_item(filp->f_path.dentry->d_parent);
[...]
if (item)
	config_item_put(item);

It looks strange:
1/ either item may be NULL, and there is a probable memory leak because of the
reference grabbed in configfs_open_file();
2/ or item may never be NULL and this check is just useless and (at least for
me) misleading.

If I understand correctly, option 2 is correct because
a/ even if .dentry gets unhashed and .dentry->d_parent gets unhashed as well,
VFS ensures that filp->f_path.dentry->d_parent is unchanged unless .dentry was
renamed, which is not permitted by configfs, and, I guess, will never be
permitted for a configfs attribute;
b/ once dentry->d_fsdata points to a configfs_dirent, it never changes
and remains valid for the rest of dentry's life (until dentry_iput());
c/ configfs_dirent->s_element never changes once it is set.

Am I wrong somewhere?

Thanks

Louis

-- 
Dr Louis Rilling			Kerlabs - IRISA
Skype: louis.rilling			Campus Universitaire de Beaulieu
Phone: (+33|0) 2 99 84 71 52		Avenue du General Leclerc
Fax: (+33|0) 2 99 84 71 71		35042 Rennes CEDEX - France
http://www.kerlabs.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] configfs: Pin configfs subsystems separately from new config_items.
  2008-06-24  5:04                     ` Louis Rilling
@ 2008-06-24 17:03                       ` Joel Becker
  0 siblings, 0 replies; 13+ messages in thread
From: Joel Becker @ 2008-06-24 17:03 UTC (permalink / raw)
  To: Louis Rilling; +Cc: linux-fsdevel, cluster-devel, ocfs2-devel

On Tue, Jun 24, 2008 at 07:04:23AM +0200, Louis Rilling wrote:
> Yup, I realized it this night (prevented me from sleeping by the way).
> The previous arguments remain valid though. I hope that you won't discard
> them because of this buggy one :)

	I'm quite happy with the current code, and I figure to leave it
(taking a module_get() on item->owner).  Thanks for thinking it out with
me.

> struct config_item * item = to_item(filp->f_path.dentry->d_parent);
> [...]
> if (item)
> 	config_item_put(item);
> 
> It looks strange:
> 1/ either item may be NULL, and there is a probable memory leak because of the
> reference grabbed in configfs_open_file();
> 2/ or item may never be NULL and this check is just useless and (at least for
> me) misleading.
> 
> If I understand correctly, option 2 is correct because

	I think you are right about (2), but I'm not changing it.  That
code comes from sysfs and I try not to change it unless its wrong.
sysfs has actually changed their handling over time.  What I try to do
is keep track of such changes and sometimes bring them over if
warranted.  I haven't had a chance to look at the new way sysfs does it.

Joel

-- 

	f/8 and be there.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-06-24 17:03 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-17 22:41 [RFC] configfs: Pin configfs subsystems separately from new config_items Joel Becker
2008-06-18 12:31 ` [Ocfs2-devel] " Louis Rilling
2008-06-18 16:12   ` Joel Becker
2008-06-18 16:51     ` Louis Rilling
2008-06-18 20:07       ` Joel Becker
2008-06-19 11:13         ` [Ocfs2-devel] " Louis Rilling
2008-06-19 22:07           ` Joel Becker
2008-06-20 12:46             ` Louis Rilling
2008-06-20 22:36               ` Joel Becker
2008-06-23 15:44                 ` Louis Rilling
2008-06-23 19:10                   ` Joel Becker
2008-06-24  5:04                     ` Louis Rilling
2008-06-24 17:03                       ` Joel Becker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).