Re: [PATCH v3 1/2] accel/rocket: Fix error path handling in rocket_job_run()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: ZhaoJinming <zhaojinming@uniontech.com>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v3 1/2] accel/rocket: Fix error path handling in rocket_job_run()
Date: Tue, 09 Jun 2026 09:50:57 +0000	[thread overview]
Message-ID: <20260609095057.877A21F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260609093346.380396-1-zhaojinming@uniontech.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] Calling `drm_sched_job_cleanup()` after `drm_sched_job_arm()` violates the DRM scheduler API contract.
- [High] The runtime suspend callback fails to synchronize the shared IRQ before powering down.
- [High] Potential NULL pointer dereference in `iommu_attach_group()` if the device lacks an IOMMU group.
--

commit a91812825621124e6ae3a091adc6205cbeefed80
Author: ZhaoJinming <zhaojinming@uniontech.com>

accel/rocket: Fix error path handling in rocket_job_run()

This commit fixes multiple issues in the error paths of rocket_job_run(). It
resolves a dma_fence reference leak, correctly balances the runtime PM
usage counter on failure by switching to pm_runtime_resume_and_get(), and
returns ERR_PTR(ret) instead of an unsignaled fence to avoid DRM scheduler
warnings.

> diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
> index 2f1861f960ccb..cf9f61682dd8b 100644
> --- a/drivers/accel/rocket/rocket_job.c
> +++ b/drivers/accel/rocket/rocket_job.c
> @@ -310,13 +310,22 @@ static struct dma_fence *rocket_job_run(struct drm_sched_job *sched_job)

[ ... ]

>  	ret = iommu_attach_group(job->domain->domain, core->iommu_group);
> -	if (ret < 0)
> -		return fence;
> +	if (ret < 0) {
> +		pm_runtime_put(core->dev);
> +		dma_fence_put(job->done_fence);
> +		job->done_fence = NULL;
> +		dma_fence_put(fence);
> +		return ERR_PTR(ret);
> +	}

[Severity: High]
This is a pre-existing issue, but can core->iommu_group be NULL here?

If the device does not have an IOMMU group (e.g., IOMMU is disabled or not
present in the device tree), iommu_group_get() returns NULL. Calling
iommu_attach_group() unconditionally with a NULL group parameter will cause a
kernel crash when it attempts to dereference the pointer.

Should there be a check for a valid iommu_group before attaching?

[ ... ]

I also noticed a couple of pre-existing issues in this driver:

[Severity: High]
This isn't a bug introduced by this patch, but is it safe to call
drm_sched_job_cleanup() after drm_sched_job_arm() has already been called?

In drivers/accel/rocket/rocket_job.c:rocket_job_push(), if
rocket_acquire_object_fences() fails:

	drm_sched_job_arm(&job->base);
	...
	ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count,
					   &job->base, false);
	if (ret)
		goto err_unlock;

The error path returns to rocket_ioctl_submit_job(), which then calls:

out_cleanup_job:
	if (ret)
		drm_sched_job_cleanup(&rjob->base);

The DRM scheduler API contract states that drm_sched_job_arm() is a point of
no return that initializes fences and sequence numbers. Calling
drm_sched_job_cleanup() after this breaks the dma_fence sequence numbers,
potentially causing any context waiting on the aborted sequence number to hang
indefinitely.

[Severity: High]
This is a pre-existing issue, but does the runtime suspend callback need to
synchronize the shared IRQ before powering down?

In drivers/accel/rocket/rocket_drv.c:rocket_device_runtime_suspend():

	clk_bulk_disable_unprepare(ARRAY_SIZE(rdev->cores[core].clks),
				   rdev->cores[core].clks);

The Rocket driver registers its IRQ handler with IRQF_SHARED. According to
the power management subsystem rules, drivers using shared IRQs must call
synchronize_irq() in their runtime suspend callback before disabling clocks.

Without this, could the hardware be powered off while the IRQ handler is
mid-flight and reading registers, potentially leading to system hangs or
spurious wakeups?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609093346.380396-1-zhaojinming@uniontech.com?part=1

     prev parent reply	other threads:[~2026-06-09  9:51 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08  8:38 [PATCH] accel/rocket: Fix dma_fence reference leak on error paths in rocket_job_run() ZhaoJinming
2026-06-08  8:50 ` sashiko-bot
2026-06-08 10:00   ` [PATCH v2] accel/rocket: Fix fence leak, runtime PM leak and WARN in rocket_job_run() error paths ZhaoJinming
2026-06-08 10:38     ` sashiko-bot
2026-06-09  9:33       ` [PATCH v3 1/2] accel/rocket: Fix error path handling in rocket_job_run() ZhaoJinming
2026-06-09  9:33         ` [PATCH v3 2/2] accel/rocket: Fix iommu_group leak and unsafe IRQ register access ZhaoJinming
2026-06-09  9:47           ` sashiko-bot
2026-06-09  9:50         ` sashiko-bot [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260609095057.877A21F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=zhaojinming@uniontech.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.