From: Cristian Marussi <cristian.marussi@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Cristian Marussi <cristian.marussi@arm.com>,
Alice Ryhl <aliceryhl@google.com>,
Sudeep Holla <sudeep.holla@arm.com>,
linux-arm-kernel@lists.infradead.org, arm-scmi@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [Bug report] Memory leak in scmi_device_create
Date: Thu, 6 Mar 2025 18:43:45 +0000 [thread overview]
Message-ID: <Z8ns4T6pas9DEy8B@pluto> (raw)
In-Reply-To: <Z8nK3uFkspy61yjP@arm.com>
On Thu, Mar 06, 2025 at 04:18:38PM +0000, Catalin Marinas wrote:
> On Thu, Mar 06, 2025 at 03:47:27PM +0000, Cristian Marussi wrote:
> > On Thu, Mar 06, 2025 at 02:36:16PM +0000, Catalin Marinas wrote:
> > > This loop in scmi_device_create() looks strange:
> > >
> > > list_for_each_entry(rdev, phead, node) {
> > > struct scmi_device *sdev;
> > >
> > > sdev = __scmi_device_create(np, parent,
> > > rdev->id_table->protocol_id,
> > > rdev->id_table->name);
> > > /* Report errors and carry on... */
> > > if (sdev)
> > > scmi_dev = sdev;
> > > else
> > > pr_err("(%s) Failed to create device for protocol 0x%x (%s)\n",
> > > of_node_full_name(parent->of_node),
> > > rdev->id_table->protocol_id,
> > > rdev->id_table->name);
> > > }
> > >
> > > We can override scmi_dev a few times in the loop and lose the previous
> > > sdev allocations. Is this intended?
> >
> > Yes...it is weird..but by design I would say :P ...
> >
> > ...because this is called to instantiate one single device OR instantiate at
> > once all the multiple devices needed for a protocol: in this latter case it
> > returns just one of the created devices to signal success or NULL if all the
> > devices' creation failed....we dont need to keep the allocated devices references
> > anyway here since on success those devices are now referenced and kept on the
> > SCMI bus, so they can be searched/scanned/destroyed from there.
>
> Not sure why the pointer isn't found, device_add() should link it with
> the parent. Unless something else fails, the parent is freed and the
> linked devices unreachable. I'm not familiar at all with this code, I
> just saw kmemleak and thought of replying.
>
> The loop is still weird, scmi_chan_setup() seems to use the pointer to
> scmi_device for something more meaningful than a pass/fail check. Also
> the overall result is based only on what the last __scmi_device_create()
> return value was, irrespective of the previous iterations of the loop.
> You do have a pr_err() but no early bailing out of the loop on failure.
> I'm curious if there are any SCMI errors in the Alice's kernel log.
>
Yes, the weirdness comes from the fact such function is used alternatively
to create a single named device (and make some use of it, like in
scmi_chan_setup) OR to create a bunch of devices for the same protocol
when no specific device is asked for (name==NULL)...anyway the case at
hand that kmemleak complains about does NOT pass through that weird loop...
...good news is, I was able to reproduce a similar report consistently
with a load/unload/load sequence....the culprit is that when looking for
a device to destroy on unload, the SCMI bus uses device_find_child()
and that bumps the device refcnt implicitly...as a result when the device
is destroyed the refcnt is NEVER found as zero and so NO device_release
is ever called...this results in dev->p private_data to be never released
and that is what kmemleak spotted (at the start of teh chain):
unreferenced object 0xffff00000f583800 (size 512):
comm "insmod", pid 227, jiffies 4294912190
hex dump (first 32 bytes):
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
ff ff ff ff ff ff ff ff 60 36 1d 8a 00 80 ff ff ........`6......
backtrace (crc 114e2eed):
kmemleak_alloc+0xbc/0xd8
__kmalloc_cache_noprof+0x2dc/0x398
device_add+0x954/0x12d0
device_register+0x28/0x40
__scmi_device_create.part.0+0x1bc/0x380
scmi_device_create+0x2d0/0x390
scmi_create_protocol_devices+0x74/0xf8
scmi_device_request_notifier+0x1f8/0x2a8
notifier_call_chain+0x110/0x3b0
blocking_notifier_call_chain+0x70/0xb0
scmi_driver_register+0x350/0x7f0
0xffff80000a3b3038
do_one_initcall+0x12c/0x730
do_init_module+0x1dc/0x640
load_module+0x4b20/0x5b70
init_module_from_file+0xec/0x158
$ ./scripts/faddr2line ./vmlinux device_add+0x954/0x12d0
device_add+0x954/0x12d0:
kmalloc_noprof at include/linux/slab.h:901
(inlined by) kzalloc_noprof at include/linux/slab.h:1037
(inlined by) device_private_init at drivers/base/core.c:3510
(inlined by) device_add at drivers/base/core.c:3561
I am posting a fix.
Thanks for the report and the help.
Any feedback and testing is much welcomed :D
Cristian
prev parent reply other threads:[~2025-03-06 18:47 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-05 11:59 [Bug report] Memory leak in scmi_device_create Alice Ryhl
2025-03-05 17:10 ` Cristian Marussi
2025-03-06 11:09 ` Alice Ryhl
2025-03-06 12:50 ` Cristian Marussi
2025-03-06 13:25 ` Alice Ryhl
2025-03-06 14:36 ` Catalin Marinas
2025-03-06 15:47 ` Cristian Marussi
2025-03-06 16:18 ` Catalin Marinas
2025-03-06 18:43 ` Cristian Marussi [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z8ns4T6pas9DEy8B@pluto \
--to=cristian.marussi@arm.com \
--cc=aliceryhl@google.com \
--cc=arm-scmi@vger.kernel.org \
--cc=catalin.marinas@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=sudeep.holla@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).