From: Ming Lei <ming.lei@redhat.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Petr Mladek <pmladek@suse.com>,
linux-kernel@vger.kernel.org,
Luis Chamberlain <mcgrof@kernel.org>
Subject: Re: [PATCH V2 2/2] kobject: wait until kobject is cleaned up before freeing module
Date: Tue, 7 Dec 2021 09:50:03 +0800 [thread overview]
Message-ID: <Ya69y7hPo52V0kRy@T590> (raw)
In-Reply-To: <Ya3EGLbhNWrpTqX+@kroah.com>
On Mon, Dec 06, 2021 at 09:04:40AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Dec 06, 2021 at 10:13:53AM +0800, Ming Lei wrote:
> > On Fri, Dec 03, 2021 at 04:07:39PM +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Nov 29, 2021 at 11:45:09AM +0800, Ming Lei wrote:
> > > > kobject_put() may become asynchronously because of
> > > > CONFIG_DEBUG_KOBJECT_RELEASE, so once kobject_put() returns, the caller may
> > > > expect the kobject is released after the last refcnt is dropped, however
> > > > CONFIG_DEBUG_KOBJECT_RELEASE just schedules one delayed work function
> > > > for cleaning up the kobject.
> > >
> > > The caller should NOT expect the kobject to be released. That's the
> > > whole point of dynamic reference counted objects, you never "know" when
> > > the last object is released. This option just makes it obvious so that
> > > you know when to fix up code that has this assumption.
> >
> > Yes, so CONFIG_DEBUG_KOBJECT_RELEASE needs to be fixed.
>
> What is broken with it today? It is there for you to find problems in
> your kernel code that uses kobjects. What oops/crash/whatever is it
> causing that you feel it should not be causing?
>
> A module's kobject is "owned" by the module core, not the module code
No, this patch is nothing to do with module's kobject, we are talking
about any kobjects allocated/released from one driver built as module.
> that is being unloaded, so I don't see the problem here. More details
> please.
If CONFIG_DEBUG_KOBJECT_RELEASE is enabled, kobject_release() will
schedule a (random time)delay work to run kobject_cleanup(), and the delay
work may be run after the module which allocates/frees the kobject is
unloaded.
kobject_cleanup():
struct kobj_type *t = get_ktype(kobj);
...
if (t && t->release) {
pr_debug("kobject: '%s' (%p): calling ktype release\n",
kobject_name(kobj), kobj);
t->release(kobj);
}
Both kobj_type and ->release are allocated in the module data/text section,
so kernel panic is triggered when 't && t->release' is run from the
delay work context.
>
> > > > Inside the cleanup handler, kobj->ktype and kobj->ktype->release are
> > > > required.
> > >
> > > Yes. Is that a problem?
> >
> > Of course for CONFIG_DEBUG_KOBJECT_RELEASE, which delays to call
> > ->release after random time, when the module for storing ->ktype and
> > ->ktype->release has been unloaded.
> >
> > As I mentioned, the issue can be triggered 100% by 'modprobe -r
> > kset-example' when CONFIG_DEBUG_KOBJECT_RELEASE is enabled if the
> > 1st patch is applied.
>
> Is there any "real" kernel code that this causes problems on?
>
> Again, this is for debugging, yes, this tiny example will crash that
> way, but that is fine, as we can obviously see that the kernel code here
> is correct.
Nothing is wrong with kset-example, the issue is just that foo_ktype and
foo_release are allocated in code/data section of the module 'kset-example'.
There are ~150 such uses:
[linux]$ git grep -n "static struct kobj_type" ./ | grep "{" | wc
153 923 11676
Most of the code can be built as module, so all should have such problem,
that is why I think it as one generic issue, not kset-example specific.
Here kset-example is referred just for showing the issue easily.
>
> And if you really want to ensure that it works properly, let's wait on
> release before allowing that module to be unloaded. But again, module
Then all modules which uses kobject need such change.
> unload is NOT a normal operation and is not what this debugging option
> was created to help out with.
But CONFIG_MODULE and CONFIG_DEBUG_KOBJECT_RELEASE can be enabled at the
same time.
>
> Again, the confusion between kobjects (which protect data) and module
> references (which protect code) is getting mixed up here.
>
> > > > It is supposed that no activity is on kobject itself any more since
> > > > module_exit() is started, so it is reasonable for the kobject user or
> > > > driver to expect that kobject can be really released in the last run of
> > > > kobject_put() in module_exit() code path. Otherwise, it can be thought as
> > > > one driver's bug since the module is going away.
> > >
> > > Why is module_exit() somehow special here? What is so odd about that?
> >
> > After module_exit() is done, the module will be unloaded, then any code
> > or data stored in the module can't be referred.
> >
> > >
> > > > When the ->ktype and ->ktype->release are allocated as module static
> > > > variable, it can cause trouble because the delayed cleanup handler may
> > > > be run after the module is unloaded.
> > >
> > > Why is ktype and release part of module code?
> >
> > Lots of driver defines ktype and ktype->release in its module static
> > variable.
>
> They do? Where?
>
> > > What module kobject is causing this problem?
> >
> > Any modules which defines its ktype and ktype->release in its module
> > static variable, which is pretty common.
>
> What non-example code does this? Let's fix that.
>
> > > > Fixes the issue by flushing scheduled kobject cleanup work before
> > > > freeing module.
> > >
> > > Why are modules special here?
> > >
> > > And if you enable this option, and then start unloading kernel modules,
> > > yes, things can go wrong, but that's not what this kernel option is for
> > > at all.
> > >
> > > This feels like a hack for not a real problem.
> >
> > I think it is caused by CONFIG_DEBUG_KOBJECT_RELEASE, that is why this
> > patch is posted. Otherwise I'd suggest to remove
> > CONFIG_DEBUG_KOBJECT_RELEASE, which supposes to not panic kernel since
> > there isn't anything wrong from driver side.
>
> Perhaps just put a nice warning in that debug option that says "beware
> of unloading modules with this option enabled."
>
> Or better yet, forbid it if that option is enabled :)
You mean disabling CONFIG_DEBUG_KOBJECT_RELEASE if CONFIG_MODULE is
enabled?
Thanks,
Ming
next prev parent reply other threads:[~2021-12-07 1:50 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-29 3:45 [PATCH V2 0/2] kobject: avoid to cleanup kobject after module is unloaded Ming Lei
2021-11-29 3:45 ` [PATCH V2 1/2] kobject: don't delay to cleanup module kobject Ming Lei
2021-12-03 15:08 ` Greg Kroah-Hartman
2021-12-07 13:58 ` kernel test robot
2021-12-07 13:58 ` kernel test robot
2021-11-29 3:45 ` [PATCH V2 2/2] kobject: wait until kobject is cleaned up before freeing module Ming Lei
2021-12-03 15:07 ` Greg Kroah-Hartman
2021-12-06 2:13 ` Ming Lei
2021-12-06 8:04 ` Greg Kroah-Hartman
2021-12-07 1:50 ` Ming Lei [this message]
2021-12-07 10:32 ` Petr Mladek
2021-12-07 12:51 ` Ming Lei
2021-12-07 14:02 ` Petr Mladek
2022-07-07 4:54 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Ya69y7hPo52V0kRy@T590 \
--to=ming.lei@redhat.com \
--cc=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=pmladek@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.