From: sashiko-bot@kernel.org
To: "Sven Püschel" <s.pueschel@pengutronix.de>
Cc: linux-media@vger.kernel.org
Subject: Re: [PATCH 15/17] media: rockchip: rga: schedule jobs to multiple cores
Date: Fri, 05 Jun 2026 22:25:49 +0000 [thread overview]
Message-ID: <20260605222549.80D521F00898@smtp.kernel.org> (raw)
In-Reply-To: <20260606-spu-rga3multicore-v1-15-3ec2b15675f7@pengutronix.de>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] Race condition in device_run() error path and unhandled early return permanently deadlock the mem2mem queue.
- [Medium] Data race on core->curr pointer when clearing job assignments.
Pre-existing issues:
- [High] Out-of-bounds array write on unbind/rebind of components because rga->num_cores is never reset.
- [High] Missing pm_runtime_get_if_active() before hardware access in shared IRQ handler.
- [High] Missing synchronize_irq() in runtime suspend callback for shared IRQ.
--
commit f03021dff03be4350cbe7b0ab4ce4a9a54a7602f
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: schedule jobs to multiple cores
This commit modifies the Rockchip RGA driver to schedule jobs across
multiple cores to better utilize hardware. A dedicated spinlock is added
to protect the selection of the next free core from race conditions.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index 9cebb461b3fd2..f00b7f99f2521 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
> @@ -38,15 +38,31 @@ static void device_run(void *prv)
> {
> struct rga_ctx *ctx = prv;
> struct rockchip_rga *rga = ctx->rga;
> - struct rga_core *core = rga->cores[0];
> + struct rga_core *core = NULL;
> struct vb2_v4l2_buffer *src, *dst;
> unsigned long flags;
> int ret;
> + unsigned int i;
> +
> + spin_lock_irqsave(&rga->cores_lock, flags);
> + for (i = 0; i < rga->num_cores; i++) {
> + if (!rga->cores[i]->curr) {
> + core = rga->cores[i];
> + core->curr = ctx;
> + break;
> + }
> + }
> + spin_unlock_irqrestore(&rga->cores_lock, flags);
> +
> + WARN_ONCE(!core, "No free core although max parallel jobs matches the core count!\n");
> + if (!core)
> + return;
>
> ret = pm_runtime_resume_and_get(core->dev);
> if (ret < 0) {
> v4l2_m2m_buf_done_and_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx,
> VB2_BUF_STATE_ERROR);
> + core->curr = NULL;
[Severity: High]
Could this error path introduce a race condition that permanently deadlocks the
mem2mem queue?
v4l2_m2m_buf_done_and_job_finish() is called before core->curr is cleared,
which asynchronously schedules the next job via a workqueue.
If the worker executes device_run() on another CPU before the current CPU marks
the core as free, the worker will see the core as busy. If no other core is
free, device_run() will trigger the WARN_ONCE and return early without
notifying the v4l2 mem2mem framework:
if (!core)
return;
Because the new job is already marked as TRANS_RUNNING, returning early without
signaling job completion permanently hangs the job queue, and can cause a
D-state deadlock if userspace attempts to close the device file descriptor.
[Severity: Medium]
Additionally, could there be a data race when clearing core->curr here?
While cores_lock is acquired when scanning and assigning a core earlier in
device_run(), it is not held when releasing the core in this error path.
Concurrent unprotected reads and writes to core->curr across CPUs could lead to
undefined behavior.
> return;
> }
[ ... ]
[Severity: Medium]
This isn't a bug introduced by this patch, but does modifying core->curr in
rga_isr() without holding cores_lock introduce a data race?
Since cores_lock was introduced to protect against concurrent selections,
shouldn't it also be held when releasing the core in the interrupt handler?
drivers/media/platform/rockchip/rga/rga.c:rga_isr() {
...
core->curr = NULL;
...
}
[Severity: High]
This isn't a bug introduced by this patch, but does the unbind/rebind path
cause an out-of-bounds array write in rga->cores[]?
During component bind, rga_core_bind() appends to the rga->cores array:
drivers/media/platform/rockchip/rga/rga.c:rga_core_bind() {
...
rga->cores[rga->num_cores++] = core;
...
}
However, the master unbind callback and the component system do not reset
rga->num_cores to 0:
drivers/media/platform/rockchip/rga/rga.c:rga_unbind() {
...
component_unbind_all(dev, NULL);
}
The struct rockchip_rga structure is not freed when a component is unbound. If
a component is unbound and subsequently re-bound, rga_core_bind() will continue
incrementing rga->num_cores past the initially allocated length, corrupting
memory.
[Severity: High]
This isn't a bug introduced by this patch, but is the driver missing a
pm_runtime_get_if_active() check before accessing hardware in the shared IRQ
handler?
The driver registers rga_isr() with IRQF_SHARED if there is no internal IOMMU.
When an interrupt is triggered by another device on the shared line, rga_isr()
blindly calls handle_irq(), which reads hardware registers:
drivers/media/platform/rockchip/rga/rga.c:rga_isr() {
...
if (rga->hw->handle_irq(core)) {
...
}
If the RGA core is runtime-suspended and its clocks are disabled, accessing its
registers will cause an external abort / SError.
[Severity: High]
This isn't a bug introduced by this patch, but is there a missing
synchronize_irq() in the runtime suspend callback for shared IRQs?
In rga_runtime_suspend(), the driver disables clocks unconditionally:
drivers/media/platform/rockchip/rga/rga.c:rga_runtime_suspend() {
...
clk_bulk_disable_unprepare(core->num_clks, core->clks);
...
}
Because the driver can use a shared IRQ, an interrupt handler could be
mid-flight on another CPU when the clocks are disabled, causing an external
abort when the handler accesses the now-unclocked hardware.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=15
next prev parent reply other threads:[~2026-06-05 22:25 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
2026-06-05 22:06 ` [PATCH 01/17] media: rockchip: rga: zero cmdbuf in shared code Sven Püschel
2026-06-05 22:20 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 02/17] media: rockchip: rga: add comment about pixel alignment for YUV formats Sven Püschel
2026-06-05 22:06 ` [PATCH 03/17] media: rockchip: rga: move early return into if condition in vidioc_enum_fmt Sven Püschel
2026-06-05 22:06 ` [PATCH 04/17] media: rockchip: rga: removed unused regmap member Sven Püschel
2026-06-05 22:06 ` [PATCH 05/17] media: v4l2-mem2mem: support running multiple jobs in parallel Sven Püschel
2026-06-05 22:18 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 06/17] media: rockchip: rga: move power handling to device_run Sven Püschel
2026-06-05 22:22 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 07/17] media: rockchip: rga: adjust get_version to return the version Sven Püschel
2026-06-05 22:06 ` [PATCH 08/17] media: rockchip: rga: add rga_core structure Sven Püschel
2026-06-05 22:22 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 09/17] media: rockchip: rga: use components to manage multiple cores Sven Püschel
2026-06-05 22:24 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 10/17] media: rockchip: rga: move rockchip_rga allocation to master probe Sven Püschel
2026-06-05 22:23 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 11/17] media: rockchip: rga: move video device to the master Sven Püschel
2026-06-05 22:21 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 12/17] media: rockchip: rga: move core initialization from bind to probe Sven Püschel
2026-06-05 22:20 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 13/17] media: rockchip: rga: bind all cores to the master Sven Püschel
2026-06-05 22:23 ` sashiko-bot
2026-06-05 22:07 ` [PATCH 14/17] media: rockchip: rga: put all cores into first core iommu domain Sven Püschel
2026-06-05 22:23 ` sashiko-bot
2026-06-05 22:07 ` [PATCH 15/17] media: rockchip: rga: schedule jobs to multiple cores Sven Püschel
2026-06-05 22:25 ` sashiko-bot [this message]
2026-06-05 22:07 ` [PATCH 16/17] arm64: dts: rockchip: add rga3 dt nodes to rk3588 Sven Püschel
2026-06-05 22:07 ` [PATCH 17/17] iommu/rockchip: disable fetch dte time limit Sven Püschel
2026-06-05 22:26 ` sashiko-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260605222549.80D521F00898@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=linux-media@vger.kernel.org \
--cc=s.pueschel@pengutronix.de \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox