public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Jinjie Ruan <ruanjinjie@huawei.com>,
	linux-kernel@vger.kernel.org,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	mcgrof@kernel.org, Liu Shixin <liushixin2@huawei.com>
Subject: Re: [PATCH] static_call: Handle module init failure correctly in static_call_del_module()
Date: Thu, 05 Sep 2024 11:44:00 +0200	[thread overview]
Message-ID: <87a5gm5tb3.ffs@tglx> (raw)
In-Reply-To: <50551f21-6e90-3556-7a3d-8b81a042f99c@huawei.com>

On Thu, Sep 05 2024 at 11:34, Jinjie Ruan wrote:
> On 2024/9/4 16:51, Thomas Gleixner wrote:
>> Can you spot the problem?
>
> It seems that it is not the memory leak source,

That's possible, but the code in there is definitely incorrect when
pci_register_driver() fails. Simply because module->exit() is not
invoked when module->init() fails.

> but with the Link patch, the problem solved and the memory leak missed.

Sure, but the change log of this patch is not really helpful at all and
you sent me a gazillion of leak dumps, but failed to explain what the
actual failure scenario is.

So without a coherent description how to reproduce this issue there is
not much I can do than looking at the code. The amdgpu init() function
was an obvious candidate, but there seems to be some other way to cause
that problem.

Now you at least provided the information that the missing cleanup in
the init() function is not the problem. So the obvious place to look is
in the module core code whether there is a failure path _after_
module->init() returned success.

do_init_module()
        ret = do_one_initcall(mod->init);
        ...
	ret = module_enable_rodata_ro(mod, true);
	if (ret)
		goto fail_mutex_unlock;

and that error path does _not_ invoke module->exit(), which is obviously
not correct. Luis?

I assume that's not the problem when looking at the change log of that:

 https://lore.kernel.org/all/20221112114602.1268989-4-liushixin2@huawei.com/

> Following the rules stated in the comment for kobject_init_and_add():
>  If this function returns an error, kobject_put() must be called to
>  properly clean up the memory associated with the object.
> 
> kobject_put() is more appropriate for error handling after kobject_init().

Why? The rules are exactly the same, no?

> And we can use this function to solve this problem.

Which function and which problem?

> For the cache created early, the related sysfs_slab_add() is called in
> slab_sysfs_init(). Skip free these kmem_cache since they are important
> for system. Keep them working without sysfs.

Let me try to decode this. I assume that sysfs_slab_add() fails after
kobject_init_and_add() or kobject_init_and_add) itself fails.

So the problem is that there are two ways how this can be invoked:

   1) from slab_sysfs_init() for the kmem_cache instances which have
      been allocated during early boot before sysfs was available.

   2) from kmem_cache_create() after early boot

So what Liu tries to avoid is to invoke kobject_put(), because
kobject_put() would not only free the name (which is what the leak
complains about), but also invoke the release callback, which would 
try to destroy the kmem_cache itself.

That's a problem for the mm people to solve, but that does not make my
observations about the amdgpu init() function and the error path in
do_init_module() moot :)

Thanks,

        tglx



  reply	other threads:[~2024-09-05  9:44 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-15  8:21 [PATCH] static_call: Fix a wild-memory-access bug in static_call_del_module() Jinjie Ruan
2023-09-15  9:01 ` Peter Zijlstra
2024-09-02 12:01 ` Jinjie Ruan
2024-09-02 12:06   ` Christophe Leroy
2024-09-02 12:07     ` Jinjie Ruan
2024-09-02 12:14       ` Christophe Leroy
2024-09-03 20:01   ` Thomas Gleixner
2024-09-03 22:58     ` [PATCH] static_call: Handle module init failure correctly " Thomas Gleixner
2024-09-04  3:32       ` Jinjie Ruan
2024-09-04  7:08         ` Thomas Gleixner
2024-09-04  8:00           ` Thomas Gleixner
2024-09-04  8:17             ` Peter Zijlstra
2024-09-04  9:08               ` [PATCH] static_call: Replace pointless WARN_ON() in static_call_module_notify() Thomas Gleixner
2024-09-06 14:41                 ` [tip: locking/urgent] " tip-bot2 for Thomas Gleixner
2024-09-04  8:03           ` [PATCH] static_call: Handle module init failure correctly in static_call_del_module() Jinjie Ruan
2024-09-04  8:51             ` Thomas Gleixner
2024-09-04  9:50               ` Jinjie Ruan
2024-09-05  3:34               ` Jinjie Ruan
2024-09-05  9:44                 ` Thomas Gleixner [this message]
2024-09-06 23:24                   ` Luis Chamberlain
2024-09-19  9:53                     ` Luis Chamberlain
2024-09-24  7:22                       ` Mike Rapoport
2024-11-08  8:12                         ` Christophe Leroy
2024-11-08 15:49                           ` Luis Chamberlain
2024-11-08 16:09                             ` Daniel Gomez
2024-11-08 16:22                               ` Daniel Gomez
2024-09-04  9:09             ` [PATCH v2] " Thomas Gleixner
2024-09-04  9:19               ` Jinjie Ruan
2024-09-06 14:41               ` [tip: locking/urgent] " tip-bot2 for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a5gm5tb3.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=christophe.leroy@csgroup.eu \
    --cc=jpoimboe@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liushixin2@huawei.com \
    --cc=mcgrof@kernel.org \
    --cc=peterz@infradead.org \
    --cc=ruanjinjie@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox