[Ocfs2-devel] [RFC] configfs: module reference counting rules

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Joel Becker <Joel.Becker@oracle.com>
To: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: ocfs2-devel@oss.oracle.com, linux-kernel@vger.kernel.org
Subject: [Ocfs2-devel] [RFC] configfs: module reference counting rules
Date: Sat, 14 Jun 2008 01:47:01 -0700	[thread overview]
Message-ID: <20080614084701.GA9657@mail.oracle.com> (raw)
In-Reply-To: <20080613222744.GB4153@localdomain>

On Sat, Jun 14, 2008 at 12:27:44AM +0200, Louis Rilling wrote:
> 1/ taking the module reference is only needed if mkdir() is called under
> a subsystem root or one of its default group, right? Of course this is a
> bit complex to check for this condition, and it does not hurt to take
> module references in all cases of mkdir().

	Correct, but it's much cleaner to always take the module ref.

> 2/ this module reference counting makes unregister_subsystem() win
> against mkdir(), but only if unregister_subsystem() is called in
> module_cleanup(), because otherwise try_module_get() would succeed,
> right? If so, this means that after having called register_subsystem(),
> a module_init() cannot cleanly fail. Perhaps this should be documented
> in that case.

	Correct again, mkdir can race a failing module_init().  This is
the same as register_filesystem().  A module that needed to protect
against this could have a mutex they release right before module_init()
succeeds.  They'd check it in make_item().  But most everyone can safely
make configfs_register_subsystem() the last thing in their
module_init().

> 3/ to make unregister_subsystem() win against mkdir(), mkdir() should
> try_module_get() on the subsystem's owner, not on the new item owner (as
> is done by the current code), right? Maybe there is a bug here...

	Nope.  You can build a subsystem out of multiple modules if you
like.  The module that owns the newly created object needs to be pinned,
and that's new_item->type->owner.  If a subsystem lives within one
module, then subsys->type->owner == new_item->type->owner, and it
doesn't matter.

> > 	So for the simple case, we provide plenty of protection.  If
> > someone wants to do something fancier, they have to provide their own
> > protection, but they would anyway.  And we can't know their complex
> > lifecycle, so we can't really help anyway.
> 
> Actually, I'm developing a framework based on configfs, with which many
> modules can be linked together through mkdir() and symlink() operations.
> So I'm already managing such reference holding, but the fact that
> configfs does not hold a reference until the last config_item_put()
> imposes limitations (with which I can live though) to the framework:

	Think about it this way: the try_module_get() isn't to protect
the client module, it is to protect configfs.  It makes sure that if
someone calls a VFS operation on a configfs inode, configfs can follow
the config_*_operations safely.  Once a config_item is removed from the
filesystem view, configfs is done with it.
	Look at it the other way around.  A config_item is not a
structure owned by configfs.  It is a part of the larger object, which
is owned by the module that created it.  The config_item portion just
lets configfs display it to userspace.

> 	For instance, A is a group of the framework, and mkdir A/B creates an
> object implemented by module "mod_b". Before calling mod_b's
> constructor, A's make_item() has to grab a reference on mod_b, and fail
> if not successful. Then this reference cannot be dropped before B's last
> reference is dropped, which can happen a long time after rmdir A/B. So
> A's drop_item() cannot drop mod_b's reference, and A has to provide
> the release() config_item_operation for B, which will call B's
> destructor and finally drop mod_b's reference.

	Wow, so A->make_item() does something like:

    submod = lookup_which_mod();
    if (!strcmp(submod, "mod_a"))
        new_thing = mod_a->alloc();
    else if (!strcmp(submod, "mod_b"))
        new_thing = mod_b->alloc();

    return &new_thing->config_item;

That's pretty complicated, I agree.  But certainly doable.
	Why can't mod_b provide a ->release() that does
module_put(self)?  Or are you trying to hide that detail from the person
who is implementing mod_b?  Even better, use the chained release scheme
that is used by bio_endio().  Have mod_b control its own
config_item_operations.  In make_item, copy off the type and operations,
then put in your own.  In drop, put them back.

struct my_item_type {
	struct config_item_type *original_type;
	struct config_item_type type;
	struct config_item_operations ops;
};

make_item()
{
	mod_b = alloc_mod_b();
	my_type = kzalloc(struct my_item_type);
	my_type->original_type = mod_b->item->ci_type;
	my_type->ops = *my_type->original_type->ct_item_ops;
	my_type->ops.release = my_item_release();
	my_type->type = *my_type->original_type;
	my_type->type.ops = &my_type->ops;
	mod_b->item->ci_type = &my_type->type;

	return &mod_b->item;
}

my_item_release(struct config_item *item)
{
	my_type = to_my_type(item->type);
	item->ci_type = my_type->original_type;
	kfree(my_type);
	item->ci_type->ct_ops->release(item);
}

Joel

-- 

"Ninety feet between bases is perhaps as close as man has ever come
 to perfection."
	- Red Smith

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127

WARNING: multiple messages have this Message-ID (diff)

From: Joel Becker <Joel.Becker@oracle.com>
To: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: ocfs2-devel@oss.oracle.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC] configfs: module reference counting rules
Date: Sat, 14 Jun 2008 01:47:01 -0700	[thread overview]
Message-ID: <20080614084701.GA9657@mail.oracle.com> (raw)
In-Reply-To: <20080613222744.GB4153@localdomain>

On Sat, Jun 14, 2008 at 12:27:44AM +0200, Louis Rilling wrote:
> 1/ taking the module reference is only needed if mkdir() is called under
> a subsystem root or one of its default group, right? Of course this is a
> bit complex to check for this condition, and it does not hurt to take
> module references in all cases of mkdir().

	Correct, but it's much cleaner to always take the module ref.

> 2/ this module reference counting makes unregister_subsystem() win
> against mkdir(), but only if unregister_subsystem() is called in
> module_cleanup(), because otherwise try_module_get() would succeed,
> right? If so, this means that after having called register_subsystem(),
> a module_init() cannot cleanly fail. Perhaps this should be documented
> in that case.

	Correct again, mkdir can race a failing module_init().  This is
the same as register_filesystem().  A module that needed to protect
against this could have a mutex they release right before module_init()
succeeds.  They'd check it in make_item().  But most everyone can safely
make configfs_register_subsystem() the last thing in their
module_init().

> 3/ to make unregister_subsystem() win against mkdir(), mkdir() should
> try_module_get() on the subsystem's owner, not on the new item owner (as
> is done by the current code), right? Maybe there is a bug here...

	Nope.  You can build a subsystem out of multiple modules if you
like.  The module that owns the newly created object needs to be pinned,
and that's new_item->type->owner.  If a subsystem lives within one
module, then subsys->type->owner == new_item->type->owner, and it
doesn't matter.

> > 	So for the simple case, we provide plenty of protection.  If
> > someone wants to do something fancier, they have to provide their own
> > protection, but they would anyway.  And we can't know their complex
> > lifecycle, so we can't really help anyway.
> 
> Actually, I'm developing a framework based on configfs, with which many
> modules can be linked together through mkdir() and symlink() operations.
> So I'm already managing such reference holding, but the fact that
> configfs does not hold a reference until the last config_item_put()
> imposes limitations (with which I can live though) to the framework:

	Think about it this way: the try_module_get() isn't to protect
the client module, it is to protect configfs.  It makes sure that if
someone calls a VFS operation on a configfs inode, configfs can follow
the config_*_operations safely.  Once a config_item is removed from the
filesystem view, configfs is done with it.
	Look at it the other way around.  A config_item is not a
structure owned by configfs.  It is a part of the larger object, which
is owned by the module that created it.  The config_item portion just
lets configfs display it to userspace.

> 	For instance, A is a group of the framework, and mkdir A/B creates an
> object implemented by module "mod_b". Before calling mod_b's
> constructor, A's make_item() has to grab a reference on mod_b, and fail
> if not successful. Then this reference cannot be dropped before B's last
> reference is dropped, which can happen a long time after rmdir A/B. So
> A's drop_item() cannot drop mod_b's reference, and A has to provide
> the release() config_item_operation for B, which will call B's
> destructor and finally drop mod_b's reference.

	Wow, so A->make_item() does something like:

    submod = lookup_which_mod();
    if (!strcmp(submod, "mod_a"))
        new_thing = mod_a->alloc();
    else if (!strcmp(submod, "mod_b"))
        new_thing = mod_b->alloc();

    return &new_thing->config_item;

That's pretty complicated, I agree.  But certainly doable.
	Why can't mod_b provide a ->release() that does
module_put(self)?  Or are you trying to hide that detail from the person
who is implementing mod_b?  Even better, use the chained release scheme
that is used by bio_endio().  Have mod_b control its own
config_item_operations.  In make_item, copy off the type and operations,
then put in your own.  In drop, put them back.

struct my_item_type {
	struct config_item_type *original_type;
	struct config_item_type type;
	struct config_item_operations ops;
};

make_item()
{
	mod_b = alloc_mod_b();
	my_type = kzalloc(struct my_item_type);
	my_type->original_type = mod_b->item->ci_type;
	my_type->ops = *my_type->original_type->ct_item_ops;
	my_type->ops.release = my_item_release();
	my_type->type = *my_type->original_type;
	my_type->type.ops = &my_type->ops;
	mod_b->item->ci_type = &my_type->type;

	return &mod_b->item;
}

my_item_release(struct config_item *item)
{
	my_type = to_my_type(item->type);
	item->ci_type = my_type->original_type;
	kfree(my_type);
	item->ci_type->ct_ops->release(item);
}

Joel

-- 

"Ninety feet between bases is perhaps as close as man has ever come
 to perfection."
	- Red Smith

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

next prev parent reply	other threads:[~2008-06-14  8:47 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-12 23:54 [Ocfs2-devel] [RFC] configfs: module reference counting rules Louis Rilling
2008-06-12 23:54 ` Louis Rilling
2008-06-13  3:33 ` [Ocfs2-devel] " Joel Becker
2008-06-13  3:33   ` Joel Becker
2008-06-13  9:51   ` [Ocfs2-devel] " Louis Rilling
2008-06-13  9:51     ` Louis Rilling
2008-06-13 20:26     ` [Ocfs2-devel] " Joel Becker
2008-06-13 20:26       ` Joel Becker
2008-06-13 22:27       ` [Ocfs2-devel] " Louis Rilling
2008-06-13 22:27         ` Louis Rilling
2008-06-14  8:47         ` Joel Becker [this message]
2008-06-14  8:47           ` Joel Becker
2008-06-16 12:39           ` [Ocfs2-devel] " Louis Rilling
2008-06-16 12:39             ` Louis Rilling
2008-06-16 18:06             ` [Ocfs2-devel] " Joel Becker
2008-06-16 18:06               ` Joel Becker
2008-06-16 18:14               ` [Ocfs2-devel] " Louis Rilling
2008-06-16 18:14                 ` Louis Rilling
2008-06-16 19:36                 ` [Ocfs2-devel] " Joel Becker
2008-06-16 19:36                   ` Joel Becker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080614084701.GA9657@mail.oracle.com \
    --to=joel.becker@oracle.com \
    --cc=Louis.Rilling@kerlabs.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.