From: Conor Dooley <conor@kernel.org>
To: sashiko-reviews@lists.linux.dev
Cc: MidG971 <midgy971@gmail.com>,
conor+dt@kernel.org, robh@kernel.org, devicetree@vger.kernel.org,
dri-devel@lists.freedesktop.org
Subject: Re: [RFC PATCH v4 6/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568
Date: Mon, 15 Jun 2026 17:48:48 +0100 [thread overview]
Message-ID: <20260615-platter-washroom-043ea869adb1@spud> (raw)
In-Reply-To: <20260613071113.C60751F00A3D@smtp.kernel.org>
[-- Attachment #1: Type: text/plain, Size: 6723 bytes --]
On Sat, Jun 13, 2026 at 07:11:13AM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
>
> Pre-existing issues:
> - [Critical] Double free of `rjob->tasks` leading to memory corruption triggerable from userspace.
> - [Critical] Unconditional dereference of a NULL `job->domain` triggering a kernel panic on early job submission failure.
> - [High] Permanent PM runtime reference count leak when `iommu_attach_group()` fails.
> - [High] Zero-initialized clock IDs cause the framework to fetch the same clock repeatedly, violating hardware constraints.
> - [Medium] Premature DRM device registration exposes an incomplete state to userspace.
Why is the bot complaining about drivers here?
> --
>
> commit d0d76b931ac773e3e176353cdb3ac9c590c43dc3
> Author: Midgy BALON <midgy971@gmail.com>
>
> dt-bindings: npu: rockchip, rk3588-rknn-core: Add RK3568
>
> The RK3568 carries a single core of the same NVDLA-derived NPU IP as the
> RK3588. Add its compatible.
>
> > diff --git a/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
> > index caca2a4903cd1..e0b948ac47d45 100644
> > --- a/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
> > +++ b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
> [ ... ]
>
> drivers/accel/rocket/rocket_device.c:rocket_device_init() {
> ...
> err = drm_dev_register(ddev, 0);
> ...
> }
>
> drivers/accel/rocket/rocket_drv.c:rocket_probe() {
> ...
> rdev->cores[core].dev = &pdev->dev;
> ...
> }
>
> [Severity: Medium]
> This is a pre-existing issue, but does this premature registration expose an
> incomplete state to userspace?
>
> drm_dev_register() is called in rocket_device_init() before rocket_probe()
> assigns rdev->cores[0].dev.
>
> As a result, the device node is briefly visible to userspace while its core
> structures remain partially uninitialized.
>
> If userspace attempts to open the DRM device during this window, rocket_open()
> passes the NULL rdev->cores[0].dev to rocket_iommu_domain_create().
>
> While the IOMMU framework safely catches the NULL device and returns -ENODEV,
> this still causes a spurious open() failure for userspace applications relying
> on udev events.
>
>
> drivers/accel/rocket/rocket_job.c:rocket_job_run() {
> ...
> ret = pm_runtime_get_sync(core->dev);
> if (ret < 0)
> return fence;
> ...
> ret = iommu_attach_group(job->domain->domain, core->iommu_group);
> if (ret < 0)
> return fence;
> ...
> }
>
> [Severity: High]
> This is a pre-existing issue, but does this cause a permanent PM runtime
> reference count leak when iommu_attach_group() fails?
>
> In rocket_job_run(), pm_runtime_get_sync() increments the PM usage count to wake
> up the NPU.
>
> If the subsequent iommu_attach_group() fails, the driver returns the allocated
> fence and exits early without calling pm_runtime_put().
>
> Because the driver exits before assigning core->in_flight_job, the subsequent
> timeout handler rocket_reset() will also skip the manual
> pm_runtime_put_noidle().
>
> This causes a permanent PM runtime reference count leak, preventing the NPU
> hardware from ever entering runtime suspend.
>
>
> drivers/accel/rocket/rocket_core.c:rocket_core_init() {
> ...
> err = devm_clk_bulk_get(dev, ARRAY_SIZE(core->clks), core->clks);
> ...
> }
>
> [Severity: High]
> This is a pre-existing issue, but do these zero-initialized clock IDs cause the
> framework to fetch the same clock repeatedly?
>
> In rocket_core_init(), the driver declares an array of clk_bulk_data but never
> initializes the .id fields, leaving them implicitly NULL.
>
> When devm_clk_bulk_get() is called, the clock framework handles the NULL IDs by
> returning the first clock defined in the device tree for the device on all four
> iterations.
>
> Consequently, rocket_device_runtime_resume() enables the exact same clock four
> times, while the remaining required clocks are left disabled.
>
> This violates physical hardware state requirements and causes unclocked NPU
> register accesses.
>
>
> drivers/accel/rocket/rocket_job.c:rocket_copy_tasks() {
> ...
> fail:
> kvfree(rjob->tasks);
> return ret;
> }
>
> drivers/accel/rocket/rocket_job.c:rocket_job_cleanup() {
> ...
> kvfree(job->tasks);
> ...
> }
>
> [Severity: Critical]
> This is a pre-existing issue, but does this lead to a double free of rjob->tasks
> that can be triggered from userspace?
>
> In rocket_copy_tasks(), if copy_from_user() fails, kvfree(rjob->tasks) is
> called, but the pointer rjob->tasks is not cleared.
>
> The function returns an error to rocket_ioctl_submit_job(), which jumps to
> out_cleanup_job and invokes rocket_job_put().
>
> When the reference count drops, rocket_job_cleanup() is executed,
> unconditionally calling kvfree(job->tasks) again on the already freed memory
> block.
>
> Since the IOCTL input is fully controlled by userspace, an unprivileged attacker
> can trigger a double free by submitting an invalid task pointer.
>
>
> drivers/accel/rocket/rocket_job.c:rocket_ioctl_submit_job() {
> ...
> rjob = kzalloc_obj(*rjob);
> ...
> ret = rocket_copy_tasks(dev, file, job, rjob);
> if (ret)
> goto out_cleanup_job;
> ...
> }
>
> drivers/accel/rocket/rocket_drv.c:rocket_iommu_domain_put() {
> ...
> kref_put(&domain->kref, rocket_iommu_domain_destroy);
> ...
> }
>
> [Severity: Critical]
> This is a pre-existing issue, but does this unconditionally dereference a NULL
> job->domain, triggering a kernel panic on early job submission failure?
>
> In rocket_ioctl_submit_job(), rjob is allocated via kzalloc_obj(), leaving
> rjob->domain initialized to NULL.
>
> If a failure occurs before the domain is assigned, such as an invalid task
> pointer in rocket_copy_tasks() or a bad BO handle in drm_gem_objects_lookup(),
> the driver jumps to the cleanup path.
>
> The cleanup handler, rocket_job_cleanup(), unconditionally passes the NULL
> domain to rocket_iommu_domain_put(), which invokes kref_put(&domain->kref, ...).
>
> Computing &domain->kref yields an offset address, and refcount_dec_and_test()
> immediately dereferences this invalid address, resulting in a kernel panic that
> can be triggered by unprivileged userspace.
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260613070116.438906-1-midgy971@gmail.com?part=6
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
next prev parent reply other threads:[~2026-06-15 16:48 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-13 7:01 [RFC PATCH v4 0/9] accel: rocket: Add RK3568 NPU support MidG971
2026-06-13 7:01 ` [RFC PATCH v4 1/9] accel: rocket: Introduce per-SoC rocket_soc_data MidG971
2026-06-13 7:10 ` sashiko-bot
2026-06-13 7:01 ` [RFC PATCH v4 2/9] accel: rocket: Derive DMA width and core count from match data MidG971
2026-06-13 7:09 ` sashiko-bot
2026-06-13 7:01 ` [RFC PATCH v4 3/9] accel: rocket: Add RK3568 SoC support MidG971
2026-06-13 7:11 ` sashiko-bot
2026-06-13 7:01 ` [RFC PATCH v4 4/9] accel: rocket: Reset the NPU before detaching the IOMMU on timeout MidG971
2026-06-13 7:15 ` sashiko-bot
2026-06-13 7:01 ` [RFC PATCH v4 5/9] accel: rocket: Keep the IOMMU domain attached across jobs MidG971
2026-06-13 7:12 ` sashiko-bot
2026-06-13 7:01 ` [RFC PATCH v4 6/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568 MidG971
2026-06-13 7:11 ` sashiko-bot
2026-06-15 16:48 ` Conor Dooley [this message]
2026-06-15 16:49 ` Conor Dooley
2026-06-13 7:01 ` [RFC PATCH v4 7/9] arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU MidG971
2026-06-13 7:09 ` sashiko-bot
2026-06-13 8:18 ` Jonas Karlman
2026-06-13 7:01 ` [RFC PATCH v4 8/9] arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU MidG971
2026-06-13 7:40 ` Jonas Karlman
2026-06-13 7:01 ` [RFC PATCH v4 9/9] pmdomain: rockchip: Add a regulator to the RK3568 NPU power domain MidG971
2026-06-13 7:21 ` sashiko-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260615-platter-washroom-043ea869adb1@spud \
--to=conor@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=midgy971@gmail.com \
--cc=robh@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox