linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Cristian Marussi <cristian.marussi@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Cristian Marussi <cristian.marussi@arm.com>,
	Alice Ryhl <aliceryhl@google.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	linux-arm-kernel@lists.infradead.org, arm-scmi@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [Bug report] Memory leak in scmi_device_create
Date: Thu, 6 Mar 2025 18:43:45 +0000	[thread overview]
Message-ID: <Z8ns4T6pas9DEy8B@pluto> (raw)
In-Reply-To: <Z8nK3uFkspy61yjP@arm.com>

On Thu, Mar 06, 2025 at 04:18:38PM +0000, Catalin Marinas wrote:
> On Thu, Mar 06, 2025 at 03:47:27PM +0000, Cristian Marussi wrote:
> > On Thu, Mar 06, 2025 at 02:36:16PM +0000, Catalin Marinas wrote:
> > > This loop in scmi_device_create() looks strange:
> > > 
> > > 	list_for_each_entry(rdev, phead, node) {
> > > 		struct scmi_device *sdev;
> > > 
> > > 		sdev = __scmi_device_create(np, parent,
> > > 					    rdev->id_table->protocol_id,
> > > 					    rdev->id_table->name);
> > > 		/* Report errors and carry on... */
> > > 		if (sdev)
> > > 			scmi_dev = sdev;
> > > 		else
> > > 			pr_err("(%s) Failed to create device for protocol 0x%x (%s)\n",
> > > 			       of_node_full_name(parent->of_node),
> > > 			       rdev->id_table->protocol_id,
> > > 			       rdev->id_table->name);
> > > 	}
> > > 
> > > We can override scmi_dev a few times in the loop and lose the previous
> > > sdev allocations. Is this intended?
> > 
> > Yes...it is weird..but by design I would say :P ...
> > 
> > ...because this is called to instantiate one single device OR instantiate at
> > once all the multiple devices needed for a protocol: in this latter case it
> > returns just one of the created devices to signal success or NULL if all the
> > devices' creation failed....we dont need to keep the allocated devices references
> > anyway here since on success those devices are now referenced and kept on the
> > SCMI bus, so they can be searched/scanned/destroyed from there.
> 
> Not sure why the pointer isn't found, device_add() should link it with
> the parent. Unless something else fails, the parent is freed and the
> linked devices unreachable. I'm not familiar at all with this code, I
> just saw kmemleak and thought of replying.
> 
> The loop is still weird, scmi_chan_setup() seems to use the pointer to
> scmi_device for something more meaningful than a pass/fail check. Also
> the overall result is based only on what the last __scmi_device_create()
> return value was, irrespective of the previous iterations of the loop.
> You do have a pr_err() but no early bailing out of the loop on failure.
> I'm curious if there are any SCMI errors in the Alice's kernel log.
> 

Yes, the weirdness comes from the fact such function is used alternatively
to create a single named device (and make some use of it, like in
scmi_chan_setup) OR to create a bunch of devices for the same protocol
when no specific device is asked for (name==NULL)...anyway the case at
hand that kmemleak complains about does NOT pass through that weird loop...

...good news is, I was able to reproduce a similar report consistently
with a load/unload/load sequence....the culprit is that when looking for
a device to destroy on unload, the SCMI bus uses device_find_child()
and that bumps the device refcnt implicitly...as a result when the device
is destroyed the refcnt is NEVER found as zero and so NO device_release
is ever called...this results in dev->p private_data to be never released
and that is what kmemleak spotted (at the start of teh chain):


unreferenced object 0xffff00000f583800 (size 512):
      comm "insmod", pid 227, jiffies 4294912190
      hex dump (first 32 bytes):
        00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
        ff ff ff ff ff ff ff ff 60 36 1d 8a 00 80 ff ff  ........`6......
      backtrace (crc 114e2eed):
        kmemleak_alloc+0xbc/0xd8
        __kmalloc_cache_noprof+0x2dc/0x398
        device_add+0x954/0x12d0
        device_register+0x28/0x40
        __scmi_device_create.part.0+0x1bc/0x380
        scmi_device_create+0x2d0/0x390
        scmi_create_protocol_devices+0x74/0xf8
        scmi_device_request_notifier+0x1f8/0x2a8
        notifier_call_chain+0x110/0x3b0
        blocking_notifier_call_chain+0x70/0xb0
        scmi_driver_register+0x350/0x7f0
        0xffff80000a3b3038
        do_one_initcall+0x12c/0x730
        do_init_module+0x1dc/0x640
        load_module+0x4b20/0x5b70
        init_module_from_file+0xec/0x158
    
    $ ./scripts/faddr2line ./vmlinux device_add+0x954/0x12d0
    device_add+0x954/0x12d0:
    kmalloc_noprof at include/linux/slab.h:901
    (inlined by) kzalloc_noprof at include/linux/slab.h:1037
    (inlined by) device_private_init at drivers/base/core.c:3510
    (inlined by) device_add at drivers/base/core.c:3561

I am posting a fix.

Thanks for the report and the help.
Any feedback and testing is much welcomed :D

Cristian


      reply	other threads:[~2025-03-06 18:47 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-05 11:59 [Bug report] Memory leak in scmi_device_create Alice Ryhl
2025-03-05 17:10 ` Cristian Marussi
2025-03-06 11:09   ` Alice Ryhl
2025-03-06 12:50     ` Cristian Marussi
2025-03-06 13:25       ` Alice Ryhl
2025-03-06 14:36     ` Catalin Marinas
2025-03-06 15:47       ` Cristian Marussi
2025-03-06 16:18         ` Catalin Marinas
2025-03-06 18:43           ` Cristian Marussi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z8ns4T6pas9DEy8B@pluto \
    --to=cristian.marussi@arm.com \
    --cc=aliceryhl@google.com \
    --cc=arm-scmi@vger.kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sudeep.holla@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).