From: Greg KH <gregkh@linuxfoundation.org>
To: Deepak Sharma <deepak.sharma.472935@gmail.com>
Cc: linux-kernel@vger.kernel.org,
linux-kernel-mentees@lists.linux.dev,
syzbot+6c905ab800f20cf4086c@syzkaller.appspotmail.com
Subject: Re: [PATCH] drivers: core: Fix synchronization of removal of device with rpm work
Date: Fri, 17 Oct 2025 09:41:21 +0200
Message-ID: <2025101714-fiction-reprocess-9368@gregkh>
In-Reply-To: <20250917030955.41708-1-deepak.sharma.472935@gmail.com>
On Wed, Sep 17, 2025 at 08:39:55AM +0530, Deepak Sharma wrote:
> Syzbot reports a use-after-free at `rpm_suspend`, while the free
> occurs at the `usb_disconnect`
>
> All line numbers references will be for commit ID
> d69eb204c255c35abd9e8cb621484e8074c75eaa

Which is 6.17-rc5?

Please always include the full commit information when referencing git
ids.  This would be:
	d69eb204c255 ("Merge tag 'net-6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")

Which is an odd point in our tree :)

> This points to a possible synchronization issue. In `usb_disconnect`
> there's a call to `pm_runtime_barrier` but it does nothing more than
> acting as a sort of "flush" (while cancelling what's the pending
> rpm actions not started yet). There does not seem to be any increase
> in device usage count either in this stacktrace after this stacktrace

How is syzbot triggering any of this?  How is it disconnecting a device?
Is this through the gadget API or something else?

> Then we have an eventual call to `device_del`, which further leads
> to a call to `device_pm_remove`. No code synchronizing in any way
> so far with the PM system after that `pm_runtime_barrier`
>
> Let's say now that the timer expiration queued work for `rpm_suspend`
> executed in this period of absent synchronization. We can create few
> interesting situations here, I will address one
>
> Let's say that we unlock the `dev->power.lock` at `rpm_suspend`
> work at `drivers/base/power/runtime.c:723` and then the code
> `device_pm_remove` proceeds as normal clearing up the device.
> Any further calls are not going to cancel the tasks we have pending
> and since the lock has been given up, we will proceed, and end up
> deleting the device too, which will lead to a use-after-free
> as observed.
>
> So at the device removal, we could add a `pm_runtime_forbid`,
> followed by a `pm_runtime_barrier`. This leads to the completion of
> any pending work and forbids any other new work to be added.
>
> Once we return, we can do `device_pm_remove`. `pm_runtime_forbid`
> does not seem to influence the behavior of `device_pm_remove`
> (tho it does lead to a call to `pm_runtime_get_noresume()` which
> touches the device usage count, but it would still work the same)
>
> Reported-by: syzbot+6c905ab800f20cf4086c@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=6c905ab800f20cf4086c
> Signed-off-by: Deepak Sharma <deepak.sharma.472935@gmail.com>
> ---
> drivers/base/core.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index d22d6b23e758..616fd02d18ed 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3876,7 +3876,13 @@ void device_del(struct device *dev)
> device_remove_file(dev, &dev_attr_uevent);
> device_remove_attrs(dev);
> bus_remove_device(dev);
> + /* We need to forbid and then proceed with a barrier here,
> + * so that any pending work is flushed
> + */
Trailing whitespace which checkpatch should have caught :(
Also odd comment style.
And you don't document what type of barrier or what type of pending work
you are flushing.
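
For illustration only, here is roughly what the usual multi-line comment
style looks like for this hunk.  The wording of the comment is a guess at
what the patch intends, quiesce_runtime_pm() is a made-up wrapper, and the
struct and the two pm_runtime_*() stubs are user-space stand-ins so the
fragment compiles on its own.  None of this is the real driver core code.

```c
/* Hypothetical stand-ins so this compiles outside the kernel tree. */
struct device {
	int forbid_calls;	/* times the stub pm_runtime_forbid() ran */
	int barrier_calls;	/* times the stub pm_runtime_barrier() ran */
};

static void pm_runtime_forbid(struct device *dev)
{
	dev->forbid_calls++;
}

static void pm_runtime_barrier(struct device *dev)
{
	dev->barrier_calls++;
}

/* Made-up wrapper, only here to give the comment somewhere to live. */
static void quiesce_runtime_pm(struct device *dev)
{
	/*
	 * Keep the device from runtime-suspending, then cancel any
	 * runtime PM request still queued for it and wait for a
	 * suspend or resume already in progress to finish, so that
	 * no runtime PM work can touch the device once it is removed
	 * from the PM core.
	 */
	pm_runtime_forbid(dev);
	pm_runtime_barrier(dev);
}
```

The point is the layout (opening marker on its own line, every line
starting with an asterisk, no trailing whitespace) and that the comment
names which work is being waited for.
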
> + pm_runtime_forbid(dev);
> + pm_runtime_barrier(dev);
> device_pm_remove(dev);
> + pm_runtime_allow(dev);
Why are you allowing this to happen again? The device is going away, it
should be stopped by now as per the bus removal.
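
To make the question concrete, here is a toy user-space model of the
counting involved.  It assumes pm_runtime_forbid() takes one usage-count
reference the first time it is called and pm_runtime_allow() drops it
again.  All names (toy_dev, toy_forbid, toy_allow) are hypothetical, and
the real functions do more than this (locking, resume and idle handling).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; the fields loosely mirror dev->power, nothing more. */
struct toy_dev {
	int usage_count;	/* models the runtime PM usage counter */
	bool runtime_auto;	/* true while runtime PM is "allowed" */
};

/* Assumed behaviour: the first forbid pins the device with one reference. */
static void toy_forbid(struct toy_dev *d)
{
	if (!d->runtime_auto)
		return;		/* already forbidden, nothing to do */
	d->runtime_auto = false;
	d->usage_count++;
}

/*
 * Assumed behaviour: allow drops the reference taken by forbid.
 * Returns true when the counter reaches zero, which is the point
 * where the device becomes a candidate for suspend again.
 */
static bool toy_allow(struct toy_dev *d)
{
	if (d->runtime_auto)
		return false;	/* nothing was forbidden */
	d->runtime_auto = true;
	assert(d->usage_count > 0);
	return --d->usage_count == 0;
}
```

In this model the trailing allow does nothing except rebalance the
counter.
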
This all feels very fragile.
thanks,
greg k-h