* [Cluster-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items.
@ 2008-06-17 22:41 Joel Becker
[not found] ` <20080618123134.GC30804@localhost>
0 siblings, 1 reply; 7+ messages in thread
From: Joel Becker @ 2008-06-17 22:41 UTC (permalink / raw)
To: cluster-devel.redhat.com
configfs_mkdir() creates a new item by calling its parent's
->make_item/group() functions. Once that object is created,
configfs_mkdir() calls try_module_get() on the new item's module. If it
succeeds, the module owning the new item cannot be unloaded, and
configfs is safe to reference the item.
If the item and the subsystem it belongs to are part of the same module,
the subsystem is also pinned. This is the common case.
However, if the subsystem is made up of multiple modules, this may not
pin the subsystem. Thus, it would be possible to unload the toplevel
subsystem module while there is still a child item. Thus, we now
try_module_get() the subsystem's module. This only really affects
children of the toplevel subsystem group. Deeper children already have
their parents pinned.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
---
fs/configfs/dir.c | 42 +++++++++++++++++++++++++++++++++---------
1 files changed, 33 insertions(+), 9 deletions(-)
diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index a48dc7d..52eed83 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -1008,7 +1008,7 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
struct configfs_subsystem *subsys;
struct configfs_dirent *sd;
struct config_item_type *type;
- struct module *owner = NULL;
+ struct module *subsys_owner = NULL, *new_item_owner = NULL;
char *name;
if (dentry->d_parent == configfs_sb->s_root) {
@@ -1035,10 +1035,25 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
goto out_put;
}
+ /*
+ * The subsystem may belong to a different module than the item
+ * being created. We don't want to safely pin the new item but
+ * fail to pin the subsystem it sits under.
+ */
+ if (!subsys->su_group.cg_item.ci_type) {
+ ret = -EINVAL;
+ goto out_put;
+ }
+ subsys_owner = subsys->su_group.cg_item.ci_type->ct_owner;
+ if (!try_module_get(subsys_owner)) {
+ ret = -EINVAL;
+ goto out_put;
+ }
+
name = kmalloc(dentry->d_name.len + 1, GFP_KERNEL);
if (!name) {
ret = -ENOMEM;
- goto out_put;
+ goto out_subsys_put;
}
snprintf(name, dentry->d_name.len + 1, "%s", dentry->d_name.name);
@@ -1066,7 +1081,7 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
* There are no extra references to clean up.
*/
ret = -ENOMEM;
- goto out_put;
+ goto out_subsys_put;
}
/*
@@ -1080,8 +1095,8 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
goto out_unlink;
}
- owner = type->ct_owner;
- if (!try_module_get(owner)) {
+ new_item_owner = type->ct_owner;
+ if (!try_module_get(new_item_owner)) {
ret = -EINVAL;
goto out_unlink;
}
@@ -1113,9 +1128,13 @@ out_unlink:
mutex_unlock(&subsys->su_mutex);
if (module_got)
- module_put(owner);
+ module_put(new_item_owner);
}
+out_subsys_put:
+ if (ret)
+ module_put(subsys_owner);
+
out_put:
/*
* link_obj()/link_group() took a reference from child->parent,
@@ -1134,7 +1153,7 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
struct config_item *item;
struct configfs_subsystem *subsys;
struct configfs_dirent *sd;
- struct module *owner = NULL;
+ struct module *subsys_owner = NULL, *dead_item_owner = NULL;
int ret;
if (dentry->d_parent == configfs_sb->s_root)
@@ -1161,6 +1180,10 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
return -EINVAL;
}
+ /* configfs_mkdir() shouldn't have allowed this */
+ BUG_ON(!subsys->su_group.cg_item.ci_type);
+ subsys_owner = subsys->su_group.cg_item.ci_type->ct_owner;
+
ret = configfs_detach_prep(dentry);
if (ret) {
configfs_detach_rollback(dentry);
@@ -1175,7 +1198,7 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
config_item_put(parent_item);
if (item->ci_type)
- owner = item->ci_type->ct_owner;
+ dead_item_owner = item->ci_type->ct_owner;
if (sd->s_type & CONFIGFS_USET_DIR) {
configfs_detach_group(item);
@@ -1197,7 +1220,8 @@ static int configfs_rmdir(struct inode *dir, struct dentry *dentry)
/* Drop our reference from above */
config_item_put(item);
- module_put(owner);
+ module_put(dead_item_owner);
+ module_put(subsys_owner);
return 0;
}
--
1.5.5.4
^ permalink raw reply related [flat|nested] 7+ messages in thread[parent not found: <20080618123134.GC30804@localhost>]
* [Cluster-devel] Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items. [not found] ` <20080618123134.GC30804@localhost> @ 2008-06-18 16:12 ` Joel Becker [not found] ` <20080618165101.GI30804@localhost> 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2008-06-18 16:12 UTC (permalink / raw) To: cluster-devel.redhat.com On Wed, Jun 18, 2008 at 02:31:34PM +0200, Louis Rilling wrote: > On Tue, Jun 17, 2008 at 03:41:00PM -0700, Joel Becker wrote: > > However, if the subsystem is made up of multiple modules, this may not > > pin the subsystem. Thus, it would be possible to unload the toplevel > > subsystem module while there is still a child item. Thus, we now > > try_module_get() the subsystem's module. This only really affects > > children of the toplevel subsystem group. Deeper children already have > > their parents pinned. > > Looks good to me. > > What about new item module pinning versus a concurrent sys_delete_module() in a > preemptible kernel? AFAICS new_item pinning is just done too late to protect > anybody against sys_delete_module(). Shouldn't we remove new item module pinning > and let the subsystem do it? Good catch. Preempt doesn't matter - it could just be another CPU. Certainly if there are multiple modules the make_item() will have to pin the submodule before calling it. But the common case isn't multiple modules. What we absolutely don't want is making the common case have to pin itself. Joel -- Life's Little Instruction Book #30 "Never buy a house without a fireplace." Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20080618165101.GI30804@localhost>]
* [Cluster-devel] Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items. [not found] ` <20080618165101.GI30804@localhost> @ 2008-06-18 20:07 ` Joel Becker [not found] ` <20080619111357.GM30804@localhost> 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2008-06-18 20:07 UTC (permalink / raw) To: cluster-devel.redhat.com On Wed, Jun 18, 2008 at 06:51:01PM +0200, Louis Rilling wrote: > On Wed, Jun 18, 2008 at 09:12:15AM -0700, Joel Becker wrote: > I suspect the common case to not need to pin the new item: if we assume that the > parent is already pinned, it will remain pinned until the new item is dropped. We still want to pin the new item. I'll discuss below. > 4/ make_item()/make_group() pins the module of the new item/group if it differs > from the current one, and at least until drop_item() (must check in-tree > subsystems, but I suspect that module dependency tracking already does the > job). This is a silly rule, though. Why "pin if it is different" when "always pin" is just as effective and much easier to understand? You never have to ask "but was the item's module pinned?" when tracking a problem. In the end, though, it doesn't matter. We need a reference on the item because we refer to it after calling client_drop_item(). Specifically, we call config_item_put(). If we don't have a reference and trust make_item()/drop_item() to handle that for us, the module can go away between the call of client_drop_item() and config_item_put() in configfs_rmdir(). In the end, we are holding a reference to the module while we are accessing it. It's overkill for the common case (single module was safe before when we pinned item->owner, and it is safe if we only pin subsys->owner), but it provides the best proper safety if we allow multi-module subsystems. Joel -- "Also, all of life's big problems include the words 'indictment' or 'inoperable.' Everything else is small stuff." - Alton Brown Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20080619111357.GM30804@localhost>]
* [Cluster-devel] Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items. [not found] ` <20080619111357.GM30804@localhost> @ 2008-06-19 22:07 ` Joel Becker [not found] ` <20080620124644.GQ30804@localhost> 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2008-06-19 22:07 UTC (permalink / raw) To: cluster-devel.redhat.com On Thu, Jun 19, 2008 at 01:13:57PM +0200, Louis Rilling wrote: > On Wed, Jun 18, 2008 at 01:07:13PM -0700, Joel Becker wrote: > > > 4/ make_item()/make_group() pins the module of the new item/group if it differs > > > from the current one, and at least until drop_item() (must check in-tree > > > subsystems, but I suspect that module dependency tracking already does the > > > job). > > > > This is a silly rule, though. Why "pin if it is different" when > > "always pin" is just as effective and much easier to understand? You > > never have to ask "but was the item's module pinned?" when tracking a > > problem. > > Not so silly, if you consider that this relieves in-tree users from having to > add try_module_get() in their code. Only special users (like me) who create > items implemented by other modules, without explicitly depending on symbols of > these modules or keeping references after ->drop_item(), have to deal with > such module pinning. With my rule ("always pin"), single-module users don't have to try_module_get() at all. Just like today. That's kind of my point. > And I think that we can also get rid of the last config_item_put() (or > put it before client_drop_item()), because after client_drop_item() rmdir() does > not need the item anymore, and client_drop_item() is supposed to call > config_item_put() (in parent's drop_item() or directly). IOW, when entering > rmdir() configfs already holds the item's ref that was given by parent's > ->make_item(), and rmdir() drops that ref in client_drop_item(). No need to hold > the extra one grabbed by configfs_get_config_item(). We could, but it's a much cleaner read to hold a reference for the duration of the function. And since we hold a module reference anyway, I like simpler and clearer. > > In the end, we are holding a reference to the module while we > > are accessing it. It's overkill for the common case (single module was > > safe before when we pinned item->owner, and it is safe if we only > > pin subsys->owner), but it provides the best proper safety if we allow > > multi-module subsystems. > > As I said above, the way it is done currently, pinning the new item's module > does not provide any safety in multi-module subsystems. We provide safety for ourselves. We can't provide safety for the subsystem - we don't know how it is put together. Once again, the module reference is just configfs saying "I know that I have a reference and that I'm safe to access this." Joel -- To spot the expert, pick the one who predicts the job will take the longest and cost the most. Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20080620124644.GQ30804@localhost>]
* [Cluster-devel] Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items. [not found] ` <20080620124644.GQ30804@localhost> @ 2008-06-20 22:36 ` Joel Becker [not found] ` <20080623154457.GW30804@localhost> 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2008-06-20 22:36 UTC (permalink / raw) To: cluster-devel.redhat.com On Fri, Jun 20, 2008 at 02:46:44PM +0200, Louis Rilling wrote: > But keeping this last config_item_put() prevents the simplification that I'm > defending. And I think that the simplification is worth moving this > config_item_put() before client_drop_item(). I guess I'm not seeing what's simpler. Four lines of try_module_get() aren't very complex, really. Conversely, having some functions that *don't* do config_item_get_item() is weird. Not invalid, just harder to read. Let's put it on the shelf. What I have with this patch is really no different in effective behavior. Joel -- "When ideas fail, words come in very handy." - Goethe Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20080623154457.GW30804@localhost>]
* [Cluster-devel] Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items. [not found] ` <20080623154457.GW30804@localhost> @ 2008-06-23 19:10 ` Joel Becker [not found] ` <20080624050423.GA4193@localdomain> 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2008-06-23 19:10 UTC (permalink / raw) To: cluster-devel.redhat.com On Mon, Jun 23, 2008 at 05:44:57PM +0200, Louis Rilling wrote: > make_item() > new_item = kmalloc(); > config_item_init_type_long_name(); > return new_item; > > drop_item(item) > config_item_put(item); > kfree(item); This is never, ever safe. Consider that someone has an attribute file open - it has a reference to the item. You can still rmdir() the item - doing stuff to the attribute after drop_item() will just get ignored. But you can't free it in drop_item(). Joel -- "Vote early and vote often." - Al Capone Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20080624050423.GA4193@localdomain>]
* [Cluster-devel] Re: [Ocfs2-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items. [not found] ` <20080624050423.GA4193@localdomain> @ 2008-06-24 17:03 ` Joel Becker 0 siblings, 0 replies; 7+ messages in thread From: Joel Becker @ 2008-06-24 17:03 UTC (permalink / raw) To: cluster-devel.redhat.com On Tue, Jun 24, 2008 at 07:04:23AM +0200, Louis Rilling wrote: > Yup, I realized it this night (prevented me from sleeping by the way). > The previous arguments remain valid though. I hope that you won't discard > them because of this buggy one :) I'm quite happy with the current code, and I figure to leave it (taking a module_get() on item->owner). Thanks for thinking it out with me. > struct config_item * item = to_item(filp->f_path.dentry->d_parent); > [...] > if (item) > config_item_put(item); > > It looks strange: > 1/ either item may be NULL, and there is a probable memory leak because of the > reference grabbed in configfs_open_file(); > 2/ or item may never be NULL and this check is just useless and (at least for > me) misleading. > > If I understand correctly, option 2 is correct because I think you are right about (2), but I'm not changing it. That code comes from sysfs and I try not to change it unless its wrong. sysfs has actually changed their handling over time. What I try to do is keep track of such changes and sometimes bring them over if warranted. I haven't had a chance to look at the new way sysfs does it. Joel -- f/8 and be there. Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-06-24 17:03 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-17 22:41 [Cluster-devel] [RFC] configfs: Pin configfs subsystems separately from new config_items Joel Becker
[not found] ` <20080618123134.GC30804@localhost>
2008-06-18 16:12 ` [Cluster-devel] Re: [Ocfs2-devel] " Joel Becker
[not found] ` <20080618165101.GI30804@localhost>
2008-06-18 20:07 ` Joel Becker
[not found] ` <20080619111357.GM30804@localhost>
2008-06-19 22:07 ` Joel Becker
[not found] ` <20080620124644.GQ30804@localhost>
2008-06-20 22:36 ` Joel Becker
[not found] ` <20080623154457.GW30804@localhost>
2008-06-23 19:10 ` Joel Becker
[not found] ` <20080624050423.GA4193@localdomain>
2008-06-24 17:03 ` Joel Becker
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).