* [PATCH 01/17] media: rockchip: rga: zero cmdbuf in shared code
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:20 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 02/17] media: rockchip: rga: add comment about pixel alignment for YUV formats Sven Püschel
` (15 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Zero the command buffer (cmdbuf) in the shared code instead of the
individual RGA2/RGA3 implementations. Besides centralizing the memset
operation this also uses the cmdbuf_size member for the memset size,
which is also used as the size for the actual allocation.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga-hw.c | 2 --
drivers/media/platform/rockchip/rga/rga.c | 1 +
drivers/media/platform/rockchip/rga/rga.h | 3 +++
drivers/media/platform/rockchip/rga/rga3-hw.c | 2 --
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga-hw.c b/drivers/media/platform/rockchip/rga/rga-hw.c
index be1bc8ddbd03b..4d7b0a03820a1 100644
--- a/drivers/media/platform/rockchip/rga/rga-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga-hw.c
@@ -443,8 +443,6 @@ static void rga_cmd_set(struct rga_ctx *ctx,
static void rga_hw_setup_cmdbuf(struct rga_ctx *ctx)
{
- memset(ctx->cmdbuf_virt, 0, RGA_CMDBUF_SIZE);
-
rga_cmd_set_mode(ctx);
rga_cmd_set_trans_info(ctx);
}
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index b3cb6bf8eb863..bd0afd33affe4 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -41,6 +41,7 @@ static void device_run(void *prv)
spin_lock_irqsave(&rga->ctrl_lock, flags);
if (ctx->cmdbuf_dirty) {
ctx->cmdbuf_dirty = false;
+ memset(ctx->cmdbuf_virt, 0, rga->hw->cmdbuf_size);
rga->hw->setup_cmdbuf(ctx);
}
spin_unlock_irqrestore(&rga->ctrl_lock, flags);
diff --git a/drivers/media/platform/rockchip/rga/rga.h b/drivers/media/platform/rockchip/rga/rga.h
index bd431534d0d39..2b4f5694375a4 100644
--- a/drivers/media/platform/rockchip/rga/rga.h
+++ b/drivers/media/platform/rockchip/rga/rga.h
@@ -152,6 +152,9 @@ struct rga_hw {
u8 stride_alignment;
u8 features;
+ /*
+ * Requires that the cmdbuf is already zeroed.
+ */
void (*setup_cmdbuf)(struct rga_ctx *ctx);
void (*start)(struct rockchip_rga *rga,
struct rga_vb_buffer *src, struct rga_vb_buffer *dst);
diff --git a/drivers/media/platform/rockchip/rga/rga3-hw.c b/drivers/media/platform/rockchip/rga/rga3-hw.c
index ca1c268303dd4..72741e1faccff 100644
--- a/drivers/media/platform/rockchip/rga/rga3-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga3-hw.c
@@ -261,8 +261,6 @@ static void rga3_cmd_set_wr_format(struct rga_ctx *ctx)
static void rga3_hw_setup_cmdbuf(struct rga_ctx *ctx)
{
- memset(ctx->cmdbuf_virt, 0, RGA3_CMDBUF_SIZE);
-
rga3_cmd_set_win0_format(ctx);
rga3_cmd_set_trans_info(ctx);
rga3_cmd_set_wr_format(ctx);
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 01/17] media: rockchip: rga: zero cmdbuf in shared code
2026-06-05 22:06 ` [PATCH 01/17] media: rockchip: rga: zero cmdbuf in shared code Sven Püschel
@ 2026-06-05 22:20 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:20 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Concurrent lockless modification of f->crop and ctx->cmdbuf_dirty in vidioc_s_selection races with device_run.
--
commit d338e595cb9fb0b46d00f7fc398fe05c4adbac84
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: zero cmdbuf in shared code
This commit centralizes the zeroing of the command buffer into the shared rga
code instead of individual implementations. It also uses the cmdbuf_size member
for the memset operation.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index b3cb6bf8eb863..bd0afd33affe4 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
> @@ -41,6 +41,7 @@ static void device_run(void *prv)
> spin_lock_irqsave(&rga->ctrl_lock, flags);
> if (ctx->cmdbuf_dirty) {
> ctx->cmdbuf_dirty = false;
> + memset(ctx->cmdbuf_virt, 0, rga->hw->cmdbuf_size);
> rga->hw->setup_cmdbuf(ctx);
[Severity: High]
This is a pre-existing issue, but can the lockless modification of f->crop
and ctx->cmdbuf_dirty in vidioc_s_selection race with this execution of
device_run?
In vidioc_s_selection, the crop dimensions and dirty flag are updated without
acquiring the rga->ctrl_lock spinlock:
drivers/media/platform/rockchip/rga/rga.c:vidioc_s_selection() {
...
f->crop = s->r;
ctx->cmdbuf_dirty = true;
...
}
Because vidioc_s_selection runs with only rga->mutex held and allows
modifications while streaming, it can execute concurrently with the v4l2_m2m
worker thread running device_run.
Since vidioc_s_selection does not acquire rga->ctrl_lock, hardware
configuration functions like rga_cmd_set_dst_info can read a torn f->crop
state during rga->hw->setup_cmdbuf:
drivers/media/platform/rockchip/rga/rga-hw.c:rga_cmd_set_dst_info() {
...
dst_h = ctx->out.crop.height;
dst_w = ctx->out.crop.width;
dst_x = ctx->out.crop.left;
dst_y = ctx->out.crop.top;
...
corner_addrs = rga_get_corner_addrs(&ctx->out, addrs, dst_x, dst_y,
dst_w, dst_h);
...
}
If a torn read mixes an old, large left and top offset with a new, large
width and height, it bypasses the validation done in vidioc_s_selection.
Does this cause rga_get_corner_addrs to calculate out-of-bounds DMA base
addresses, resulting in the RGA hardware performing out-of-bounds DMA writes
past the allocated video buffers?
> }
> spin_unlock_irqrestore(&rga->ctrl_lock, flags);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 02/17] media: rockchip: rga: add comment about pixel alignment for YUV formats
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
2026-06-05 22:06 ` [PATCH 01/17] media: rockchip: rga: zero cmdbuf in shared code Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:06 ` [PATCH 03/17] media: rockchip: rga: move early return into if condition in vidioc_enum_fmt Sven Püschel
` (14 subsequent siblings)
16 siblings, 0 replies; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Add a comment to clarify the use of fixed step_height values for all YUV
formats. While the commit introducing the change already explains the
reasoning, add an explicit comment to improve the visibility of the
reasoning.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index bd0afd33affe4..efe5541078214 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -414,6 +414,16 @@ static int vidioc_try_fmt(struct file *file, void *priv, struct v4l2_format *f)
.step_height = 1,
};
+ /*
+ * Technically 4:2:2 YUV formats don't need a step_height of 2.
+ * But for the RGA3 this is explicitly documented in section 5.6.3
+ * of the RK3588 TRM Part 2.
+ * And the RGA2 vendor driver also checks that the height (and width)
+ * is aligned to 2 when a YUV format is used.
+ *
+ * Therefore be safe and always align width and height to 2
+ * when a YUV format is used.
+ */
if (v4l2_is_format_yuv(v4l2_format_info(pix_fmt->pixelformat))) {
frmsize.step_width = 2;
frmsize.step_height = 2;
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH 03/17] media: rockchip: rga: move early return into if condition in vidioc_enum_fmt
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
2026-06-05 22:06 ` [PATCH 01/17] media: rockchip: rga: zero cmdbuf in shared code Sven Püschel
2026-06-05 22:06 ` [PATCH 02/17] media: rockchip: rga: add comment about pixel alignment for YUV formats Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:06 ` [PATCH 04/17] media: rockchip: rga: removed unused regmap member Sven Püschel
` (13 subsequent siblings)
16 siblings, 0 replies; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Instead of a doing an early return when we don't have a capture device,
merge the condition with the following if condition. This improves
readability, as the condition now explicitly contains a check for a
capture device instead of returning when we don't have a capture device.
Also use the V4L2_TYPE_IS_CAPTURE helper and improve the comment.
The early return if was copied from the vivid drivers
vivid_enum_fmt_vid function.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index efe5541078214..8c03422d669cf 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -372,12 +372,14 @@ static int vidioc_enum_fmt(struct file *file, void *priv, struct v4l2_fmtdesc *f
if (ret != 0)
return ret;
- if (f->type != V4L2_BUF_TYPE_VIDEO_CAPTURE &&
- f->type != V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
- return 0;
-
- /* allow changing the quantization and xfer func for YUV formats */
- if (v4l2_is_format_yuv(v4l2_format_info(f->pixelformat)))
+ /*
+ * Allow changing the quantization and ycbcr_enc func for YUV formats
+ * on the capture side for RGB -> YUV conversions.
+ *
+ * These flags are only relevant for capture devices.
+ */
+ if (V4L2_TYPE_IS_CAPTURE(f->type) &&
+ v4l2_is_format_yuv(v4l2_format_info(f->pixelformat)))
f->flags |= V4L2_FMT_FLAG_CSC_QUANTIZATION |
V4L2_FMT_FLAG_CSC_YCBCR_ENC;
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH 04/17] media: rockchip: rga: removed unused regmap member
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (2 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 03/17] media: rockchip: rga: move early return into if condition in vidioc_enum_fmt Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:06 ` [PATCH 05/17] media: v4l2-mem2mem: support running multiple jobs in parallel Sven Püschel
` (12 subsequent siblings)
16 siblings, 0 replies; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
The grf member variable is never accessed or written by the RGA driver.
Therefore drop it from the rockchip_rga struct.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.h b/drivers/media/platform/rockchip/rga/rga.h
index 2b4f5694375a4..0e62337f8dd38 100644
--- a/drivers/media/platform/rockchip/rga/rga.h
+++ b/drivers/media/platform/rockchip/rga/rga.h
@@ -71,7 +71,6 @@ struct rockchip_rga {
struct video_device *vfd;
struct device *dev;
- struct regmap *grf;
void __iomem *regs;
struct clk_bulk_data *clks;
int num_clks;
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH 05/17] media: v4l2-mem2mem: support running multiple jobs in parallel
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (3 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 04/17] media: rockchip: rga: removed unused regmap member Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:18 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 06/17] media: rockchip: rga: move power handling to device_run Sven Püschel
` (11 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Add support for running multiple jobs in parallel for SoCs containing
multiple identical devices. An example is the Rockchip RK3588 SoC,
which contains two identical RGA3 devices. Therefore it is desirable to
have the kernel schedule the work across all available devices and only
expose one video device to the userspace.
Previously the curr_ctx member of a v4l2_m2m_dev was used to track the
currently running context. But the currently running context will always
be at the top of the job_queue. As the TRANS_RUNNING flag can be used to
check if the queue head is already running, the curr_ctx member can be
completely dropped
To avoid queueing too many parallel jobs, the
v4l2_m2m_set_max_parallel_jobs method is added. It allows a driver
to set the number of parallel jobs and avoids calling device_run when
the given number of jobs is already running. This is set to 1 by default
to prevent parallel job runs. Drivers with the need and support for
scheduling jobs can adjust this value accordingly.
Note that this change doesn't allow a context to be used multiple times
in parallel. So a single stream won't be able to utilize multiple devices
at once, but N streams can utilize up to N devices. This is caused by the
fact that a context is not added multiple times to the job_list and also
holds the job_flags to distinguish if it's currently running.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/v4l2-core/v4l2-mem2mem.c | 89 ++++++++++++++++++++++------------
include/media/v4l2-mem2mem.h | 3 ++
2 files changed, 62 insertions(+), 30 deletions(-)
diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
index a65cbb124cfe0..14ac9c85803d1 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -84,16 +84,15 @@ static const char * const m2m_entity_name[] = {
* v4l2_m2m_unregister_media_controller().
* @intf_devnode: &struct media_intf devnode pointer with the interface
* with controls the M2M device.
- * @curr_ctx: currently running instance
* @job_queue: instances queued to run
* @job_spinlock: protects job_queue
* @job_work: worker to run queued jobs.
* @job_queue_flags: flags of the queue status, %QUEUE_PAUSED.
+ * @max_parallel_jobs: max job_queue instances number marked as running
* @m2m_ops: driver callbacks
* @kref: device reference count
*/
struct v4l2_m2m_dev {
- struct v4l2_m2m_ctx *curr_ctx;
#ifdef CONFIG_MEDIA_CONTROLLER
struct media_entity *source;
struct media_pad source_pad;
@@ -108,6 +107,7 @@ struct v4l2_m2m_dev {
spinlock_t job_spinlock;
struct work_struct job_work;
unsigned long job_queue_flags;
+ u32 max_parallel_jobs;
const struct v4l2_m2m_ops *m2m_ops;
@@ -123,6 +123,12 @@ static struct v4l2_m2m_queue_ctx *get_queue_ctx(struct v4l2_m2m_ctx *m2m_ctx,
return &m2m_ctx->cap_q_ctx;
}
+void v4l2_m2m_set_max_parallel_jobs(struct v4l2_m2m_dev *m2m_dev,
+ u32 max_parallel_jobs)
+{
+ m2m_dev->max_parallel_jobs = max_parallel_jobs;
+}
+
struct vb2_queue *v4l2_m2m_get_vq(struct v4l2_m2m_ctx *m2m_ctx,
enum v4l2_buf_type type)
{
@@ -229,14 +235,22 @@ EXPORT_SYMBOL_GPL(v4l2_m2m_buf_remove_by_idx);
void *v4l2_m2m_get_curr_priv(struct v4l2_m2m_dev *m2m_dev)
{
unsigned long flags;
- void *ret = NULL;
+ struct v4l2_m2m_ctx *first_ctx;
spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
- if (m2m_dev->curr_ctx)
- ret = m2m_dev->curr_ctx->priv;
+ if (list_empty(&m2m_dev->job_queue)) {
+ spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+ return NULL;
+ }
+
+ first_ctx = list_first_entry(&m2m_dev->job_queue,
+ struct v4l2_m2m_ctx, queue);
spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
- return ret;
+ if (first_ctx->job_flags & TRANS_RUNNING)
+ return first_ctx->priv;
+ else
+ return NULL;
}
EXPORT_SYMBOL(v4l2_m2m_get_curr_priv);
@@ -252,13 +266,11 @@ EXPORT_SYMBOL(v4l2_m2m_get_curr_priv);
static void v4l2_m2m_try_run(struct v4l2_m2m_dev *m2m_dev)
{
unsigned long flags;
+ struct v4l2_m2m_ctx *ctx;
+ struct v4l2_m2m_ctx *chosen_ctx = NULL;
+ u32 running_jobs = 0;
spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
- if (NULL != m2m_dev->curr_ctx) {
- spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
- dprintk("Another instance is running, won't run now\n");
- return;
- }
if (list_empty(&m2m_dev->job_queue)) {
spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
@@ -272,13 +284,30 @@ static void v4l2_m2m_try_run(struct v4l2_m2m_dev *m2m_dev)
return;
}
- m2m_dev->curr_ctx = list_first_entry(&m2m_dev->job_queue,
- struct v4l2_m2m_ctx, queue);
- m2m_dev->curr_ctx->job_flags |= TRANS_RUNNING;
+ list_for_each_entry(ctx, &m2m_dev->job_queue, queue) {
+ if (!(ctx->job_flags & TRANS_RUNNING)) {
+ chosen_ctx = ctx;
+ break;
+ }
+
+ running_jobs++;
+ }
+ if (running_jobs >= m2m_dev->max_parallel_jobs) {
+ spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+ dprintk("Maximum number of parallel jobs reached\n");
+ return;
+ }
+ if (!chosen_ctx) {
+ spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+ dprintk("All jobs already running\n");
+ return;
+ }
+
+ chosen_ctx->job_flags |= TRANS_RUNNING;
spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
- dprintk("Running job on m2m_ctx: %p\n", m2m_dev->curr_ctx);
- m2m_dev->m2m_ops->device_run(m2m_dev->curr_ctx->priv);
+ dprintk("Running job on m2m_ctx: %p\n", chosen_ctx);
+ m2m_dev->m2m_ops->device_run(chosen_ctx->priv);
}
/*
@@ -469,15 +498,14 @@ static void v4l2_m2m_schedule_next_job(struct v4l2_m2m_dev *m2m_dev,
static bool _v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
struct v4l2_m2m_ctx *m2m_ctx)
{
- if (!m2m_dev->curr_ctx || m2m_dev->curr_ctx != m2m_ctx) {
+ if (!m2m_ctx || !(m2m_ctx->job_flags & TRANS_RUNNING)) {
dprintk("Called by an instance not currently running\n");
return false;
}
- list_del(&m2m_dev->curr_ctx->queue);
- m2m_dev->curr_ctx->job_flags &= ~(TRANS_QUEUED | TRANS_RUNNING);
- wake_up(&m2m_dev->curr_ctx->finished);
- m2m_dev->curr_ctx = NULL;
+ list_del(&m2m_ctx->queue);
+ m2m_ctx->job_flags &= ~(TRANS_QUEUED | TRANS_RUNNING);
+ wake_up(&m2m_ctx->finished);
return true;
}
@@ -544,16 +572,19 @@ EXPORT_SYMBOL(v4l2_m2m_buf_done_and_job_finish);
void v4l2_m2m_suspend(struct v4l2_m2m_dev *m2m_dev)
{
unsigned long flags;
- struct v4l2_m2m_ctx *curr_ctx;
+ struct v4l2_m2m_ctx *ctx;
+ struct v4l2_m2m_ctx *ctx_safe;
spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
m2m_dev->job_queue_flags |= QUEUE_PAUSED;
- curr_ctx = m2m_dev->curr_ctx;
spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
- if (curr_ctx)
- wait_event(curr_ctx->finished,
- !(curr_ctx->job_flags & TRANS_RUNNING));
+ list_for_each_entry_safe(ctx, ctx_safe, &m2m_dev->job_queue, queue) {
+ if (!(ctx->job_flags & TRANS_RUNNING))
+ break;
+
+ wait_event(ctx->finished, !(ctx->job_flags & TRANS_RUNNING));
+ }
}
EXPORT_SYMBOL(v4l2_m2m_suspend);
@@ -896,10 +927,8 @@ int v4l2_m2m_streamoff(struct file *file, struct v4l2_m2m_ctx *m2m_ctx,
q_ctx->num_rdy = 0;
spin_unlock_irqrestore(&q_ctx->rdy_spinlock, flags);
- if (m2m_dev->curr_ctx == m2m_ctx) {
- m2m_dev->curr_ctx = NULL;
+ if (m2m_ctx->job_flags & TRANS_RUNNING)
wake_up(&m2m_ctx->finished);
- }
spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags_job);
return 0;
@@ -1194,12 +1223,12 @@ struct v4l2_m2m_dev *v4l2_m2m_init(const struct v4l2_m2m_ops *m2m_ops)
if (!m2m_dev)
return ERR_PTR(-ENOMEM);
- m2m_dev->curr_ctx = NULL;
m2m_dev->m2m_ops = m2m_ops;
INIT_LIST_HEAD(&m2m_dev->job_queue);
spin_lock_init(&m2m_dev->job_spinlock);
INIT_WORK(&m2m_dev->job_work, v4l2_m2m_device_run_work);
kref_init(&m2m_dev->kref);
+ m2m_dev->max_parallel_jobs = 1;
return m2m_dev;
}
diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
index 31de25d792b98..e6177d0eaf637 100644
--- a/include/media/v4l2-mem2mem.h
+++ b/include/media/v4l2-mem2mem.h
@@ -594,6 +594,9 @@ static inline void v4l2_m2m_set_dst_buffered(struct v4l2_m2m_ctx *m2m_ctx,
m2m_ctx->cap_q_ctx.buffered = buffered;
}
+void v4l2_m2m_set_max_parallel_jobs(struct v4l2_m2m_dev *m2m_dev,
+ u32 max_parallel_jobs);
+
/**
* v4l2_m2m_ctx_release() - release m2m context
*
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 05/17] media: v4l2-mem2mem: support running multiple jobs in parallel
2026-06-05 22:06 ` [PATCH 05/17] media: v4l2-mem2mem: support running multiple jobs in parallel Sven Püschel
@ 2026-06-05 22:18 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:18 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Lockless list traversal of `job_queue` in `v4l2_m2m_suspend` leads to list corruption and kernel panic.
- [High] Unlocked dereference of `first_ctx` fields in `v4l2_m2m_get_curr_priv` leads to a Use-After-Free.
- [Medium] `v4l2_m2m_get_curr_priv` breaks interrupt mapping for drivers utilizing parallel jobs.
--
commit b3a2fe91efda298e9f5953581aac80b6a8ccdefb
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: v4l2-mem2mem: support running multiple jobs in parallel
Add support for running multiple jobs in parallel for SoCs containing
multiple identical devices. An example is the Rockchip RK3588 SoC,
which contains two identical RGA3 devices.
> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
> index a65cbb124cfe0..14ac9c85803d1 100644
> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
[ ... ]
> @@ -229,14 +235,22 @@ EXPORT_SYMBOL_GPL(v4l2_m2m_buf_remove_by_idx);
> void *v4l2_m2m_get_curr_priv(struct v4l2_m2m_dev *m2m_dev)
> {
> unsigned long flags;
> - void *ret = NULL;
> + struct v4l2_m2m_ctx *first_ctx;
>
> spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> - if (m2m_dev->curr_ctx)
> - ret = m2m_dev->curr_ctx->priv;
> + if (list_empty(&m2m_dev->job_queue)) {
> + spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> + return NULL;
> + }
> +
> + first_ctx = list_first_entry(&m2m_dev->job_queue,
> + struct v4l2_m2m_ctx, queue);
> spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>
> - return ret;
> + if (first_ctx->job_flags & TRANS_RUNNING)
> + return first_ctx->priv;
[Severity: High]
Does this introduce a use-after-free?
The job_spinlock is dropped right before reading first_ctx->job_flags
and first_ctx->priv. If the job is concurrently canceled, it could be
removed from the list and its context freed before these reads occur.
[Severity: Medium]
Will this return the correct context when multiple jobs are running in
parallel?
If max_parallel_jobs > 1, multiple contexts could be marked with
TRANS_RUNNING. Because this function indiscriminately returns the priv
pointer of the first element in the queue, a driver servicing an interrupt
for a newer concurrent job might mistakenly receive the priv of the oldest
running job instead.
> + else
> + return NULL;
> }
> EXPORT_SYMBOL(v4l2_m2m_get_curr_priv);
[ ... ]
> @@ -544,16 +572,19 @@ EXPORT_SYMBOL(v4l2_m2m_buf_done_and_job_finish);
> void v4l2_m2m_suspend(struct v4l2_m2m_dev *m2m_dev)
> {
> unsigned long flags;
> - struct v4l2_m2m_ctx *curr_ctx;
> + struct v4l2_m2m_ctx *ctx;
> + struct v4l2_m2m_ctx *ctx_safe;
>
> spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> m2m_dev->job_queue_flags |= QUEUE_PAUSED;
> - curr_ctx = m2m_dev->curr_ctx;
> spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>
> - if (curr_ctx)
> - wait_event(curr_ctx->finished,
> - !(curr_ctx->job_flags & TRANS_RUNNING));
> + list_for_each_entry_safe(ctx, ctx_safe, &m2m_dev->job_queue, queue) {
> + if (!(ctx->job_flags & TRANS_RUNNING))
> + break;
> +
> + wait_event(ctx->finished, !(ctx->job_flags & TRANS_RUNNING));
[Severity: High]
Could this result in list corruption?
This loop traverses m2m_dev->job_queue without holding the
m2m_dev->job_spinlock. When wait_event() blocks, a concurrent job
completion could call _v4l2_m2m_job_finish() and remove ctx_safe from the
list via list_del().
When the loop resumes, dereferencing the poisoned ctx_safe pointer would
likely cause a crash.
> + }
> }
> EXPORT_SYMBOL(v4l2_m2m_suspend);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=5
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 06/17] media: rockchip: rga: move power handling to device_run
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (4 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 05/17] media: v4l2-mem2mem: support running multiple jobs in parallel Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:22 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 07/17] media: rockchip: rga: adjust get_version to return the version Sven Püschel
` (10 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Move the power handling to the device_run function in preparation for
enabling multiple cores. This allows to power the only the necessary cores
instead of powering all available cores.
As the decision on which core the given job is executed will be done in
device_run, we can only power to correct core there.
To avoid unpowering the core in a streaming state switch to autosuspend.
This avoids powering down the core when the next frame is scheduled in
the next 50ms. The timeout maps to a framerate of 20fps, which should be
pretty uncommon in a normal video stream.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga-buf.c | 12 ------------
drivers/media/platform/rockchip/rga/rga.c | 11 +++++++++++
2 files changed, 11 insertions(+), 12 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga-buf.c b/drivers/media/platform/rockchip/rga/rga-buf.c
index c0ea6003336bf..3f7c3c68e0cb8 100644
--- a/drivers/media/platform/rockchip/rga/rga-buf.c
+++ b/drivers/media/platform/rockchip/rga/rga-buf.c
@@ -242,14 +242,6 @@ static int rga_buf_prepare_streaming(struct vb2_queue *q)
static int rga_buf_start_streaming(struct vb2_queue *q, unsigned int count)
{
struct rga_ctx *ctx = vb2_get_drv_priv(q);
- struct rockchip_rga *rga = ctx->rga;
- int ret;
-
- ret = pm_runtime_resume_and_get(rga->dev);
- if (ret < 0) {
- rga_buf_return_buffers(q, VB2_BUF_STATE_QUEUED);
- return ret;
- }
if (V4L2_TYPE_IS_OUTPUT(q->type))
ctx->osequence = 0;
@@ -261,11 +253,7 @@ static int rga_buf_start_streaming(struct vb2_queue *q, unsigned int count)
static void rga_buf_stop_streaming(struct vb2_queue *q)
{
- struct rga_ctx *ctx = vb2_get_drv_priv(q);
- struct rockchip_rga *rga = ctx->rga;
-
rga_buf_return_buffers(q, VB2_BUF_STATE_ERROR);
- pm_runtime_put(rga->dev);
}
const struct vb2_ops rga_qops = {
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 8c03422d669cf..0eff558d7f133 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -37,6 +37,14 @@ static void device_run(void *prv)
struct rockchip_rga *rga = ctx->rga;
struct vb2_v4l2_buffer *src, *dst;
unsigned long flags;
+ int ret;
+
+ ret = pm_runtime_resume_and_get(rga->dev);
+ if (ret < 0) {
+ v4l2_m2m_buf_done_and_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx,
+ VB2_BUF_STATE_ERROR);
+ return;
+ }
spin_lock_irqsave(&rga->ctrl_lock, flags);
if (ctx->cmdbuf_dirty) {
@@ -81,6 +89,8 @@ static irqreturn_t rga_isr(int irq, void *prv)
v4l2_m2m_buf_done(src, VB2_BUF_STATE_DONE);
v4l2_m2m_buf_done(dst, VB2_BUF_STATE_DONE);
v4l2_m2m_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx);
+
+ pm_runtime_put_autosuspend(rga->dev);
}
return IRQ_HANDLED;
@@ -797,6 +807,7 @@ static int rga_probe(struct platform_device *pdev)
if (ret)
return dev_err_probe(&pdev->dev, ret, "Unable to parse OF data\n");
+ pm_runtime_set_autosuspend_delay(rga->dev, 50);
pm_runtime_enable(rga->dev);
rga->regs = devm_platform_ioremap_resource(pdev, 0);
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 06/17] media: rockchip: rga: move power handling to device_run
2026-06-05 22:06 ` [PATCH 06/17] media: rockchip: rga: move power handling to device_run Sven Püschel
@ 2026-06-05 22:22 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:22 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
New issues:
- [Medium] Autosuspend is configured but never enabled, causing `pm_runtime_put_autosuspend()` to act like an immediate `pm_runtime_put()`.
Pre-existing issues:
- [Critical] The `rga` structure is freed by devres on device removal but accessed via V4L2 file ops later.
- [High] Missing hardware state validation in shared interrupt handler allows register access on powered-off device.
- [High] Missing IRQ synchronization during runtime suspend allows hardware power-down while IRQ handler is mid-flight.
- [High] Asynchronous pm_runtime_put() is dangerously canceled by an immediate pm_runtime_disable() during module removal, leaking power.
- [High] Missing return after WARN_ON(!ctx) leads to immediate NULL pointer dereference.
- [High] `dma_sync_single_for_device()` is illegally called on coherent memory.
--
commit ed159a0b6bc2e424a0e4c721a5d6723c23fc3c3e
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: move power handling to device_run
Move the power handling to the device_run function in preparation for
enabling multiple cores. Switch to autosuspend to avoid powering down
the core when the next frame is scheduled in the next 50ms.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index 8c03422d669cf..0eff558d7f133 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
[ ... ]
> @@ -81,6 +89,8 @@ static irqreturn_t rga_isr(int irq, void *prv)
> v4l2_m2m_buf_done(src, VB2_BUF_STATE_DONE);
> v4l2_m2m_buf_done(dst, VB2_BUF_STATE_DONE);
> v4l2_m2m_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx);
> +
> + pm_runtime_put_autosuspend(rga->dev);
[Severity: Medium]
Is the idle timer being reset properly here?
It appears we are missing a call to pm_runtime_mark_last_busy() before the
put operation. Without it, the idle timer might not be reset, defeating the
intended autosuspend optimization.
[Severity: High]
This is a pre-existing issue, but I noticed the shared interrupt handler
might execute hardware operations blindly. Does rga_isr() need to verify
the device is powered on before touching hardware?
Since the device can use shared IRQs, rga_isr() can be invoked
asynchronously if another device on the same line triggers an interrupt,
even when the RGA device is suspended. Should we use
pm_runtime_get_if_active() to ensure we don't read from powered-down
registers?
[Severity: High]
This is also a pre-existing issue, but looking at rga_isr(), if ctx is NULL,
does the code fall through to a NULL pointer dereference?
rga_isr() checks for a NULL context:
WARN_ON(!ctx);
core->curr = NULL;
src = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
Should there be a return statement after the WARN_ON(!ctx) to prevent
dereferencing ctx in the subsequent calls?
> }
>
> return IRQ_HANDLED;
> @@ -797,6 +807,7 @@ static int rga_probe(struct platform_device *pdev)
> if (ret)
> return dev_err_probe(&pdev->dev, ret, "Unable to parse OF data\n");
>
> + pm_runtime_set_autosuspend_delay(rga->dev, 50);
> pm_runtime_enable(rga->dev);
[Severity: Medium]
Are we missing a call to pm_runtime_use_autosuspend() here?
Without enabling the autosuspend flag, the PM core might ignore the 50ms
delay, causing pm_runtime_put_autosuspend() to fall back to an immediate
suspend behavior.
[ ... ]
I also noticed a few other pre-existing issues in the driver while reviewing
this patch:
[Severity: High]
This isn't a bug introduced by this patch, but does the driver need to
synchronize IRQs during runtime suspend?
In rga_runtime_suspend(), the driver disables clocks using
clk_bulk_disable_unprepare(). Because the driver uses shared IRQs, could
the hardware be powered down while an executing rga_isr() is mid-flight on
another CPU? Should synchronize_irq() be called before shutting down the
hardware?
[Severity: High]
This is a pre-existing issue, but does device removal safely cancel pending
suspend operations?
During module removal, rga_core_remove() calls pm_runtime_disable().
Because the driver uses pm_runtime_put_autosuspend(), an asynchronous
suspend might be pending. Could this immediate disable cancel the pending
work without executing it, leaving the hardware active and leaking power?
Would using pm_runtime_put_sync() ensure the suspend completes before
removal?
[Severity: Critical]
This is a pre-existing issue, but is there a potential use-after-free
related to the rga structure lifecycle?
The rga structure is allocated using devm_kzalloc() during probe, meaning
it is freed automatically on device removal. However, if a user process
holds the video device open, the V4L2 file descriptor remains valid. When
the process later closes the file, rga_release() is called and accesses the
freed memory, for example by acquiring the mutex:
mutex_lock(&rga->mutex);
Should the allocation be tied to the v4l2_device or video_device lifecycle
instead of devres?
[Severity: High]
This is a pre-existing issue, but I noticed a potential DMA API violation in
rga_hw_start().
The command buffer is allocated as write-combined coherent memory using
dma_alloc_attrs() with DMA_ATTR_WRITE_COMBINE. Later, the driver attempts
to sync this buffer to the device:
dma_sync_single_for_device(..., ctx->cmdbuf_phy, ...);
Since the sync functions are intended for streaming DMA mappings and not
coherent allocations, could this corrupt internal state or trigger DMA API
debug warnings?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=6
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 07/17] media: rockchip: rga: adjust get_version to return the version
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (5 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 06/17] media: rockchip: rga: move power handling to device_run Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:06 ` [PATCH 08/17] media: rockchip: rga: add rga_core structure Sven Püschel
` (9 subsequent siblings)
16 siblings, 0 replies; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Adjust get_version to return the version instead of directly updating it
in the rockchip_rga structure. This is done in preparation for a
multi-core support to check that cores with the same compatible share the
same version.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga-hw.c | 10 +++++++---
drivers/media/platform/rockchip/rga/rga.c | 2 +-
drivers/media/platform/rockchip/rga/rga.h | 2 +-
drivers/media/platform/rockchip/rga/rga3-hw.c | 8 +++++---
4 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga-hw.c b/drivers/media/platform/rockchip/rga/rga-hw.c
index 4d7b0a03820a1..190104f3b2954 100644
--- a/drivers/media/platform/rockchip/rga/rga-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga-hw.c
@@ -474,10 +474,14 @@ static bool rga_handle_irq(struct rockchip_rga *rga)
return intr & RGA_INT_COMMAND_FINISHED;
}
-static void rga_get_version(struct rockchip_rga *rga)
+static struct rockchip_rga_version rga_get_version(struct rockchip_rga *rga)
{
- rga->version.major = (rga_read(rga, RGA_VERSION_INFO) >> 24) & 0xFF;
- rga->version.minor = (rga_read(rga, RGA_VERSION_INFO) >> 20) & 0x0F;
+ u32 version = rga_read(rga, RGA_VERSION_INFO);
+
+ return (struct rockchip_rga_version) {
+ .major = (version >> 24) & 0xFF,
+ .minor = (version >> 20) & 0x0F,
+ };
}
static struct rga_fmt formats[] = {
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 0eff558d7f133..b8edd3596c919 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -864,7 +864,7 @@ static int rga_probe(struct platform_device *pdev)
if (ret < 0)
goto rel_m2m;
- rga->hw->get_version(rga);
+ rga->version = rga->hw->get_version(rga);
v4l2_info(&rga->v4l2_dev, "HW Version: 0x%02x.%02x\n",
rga->version.major, rga->version.minor);
diff --git a/drivers/media/platform/rockchip/rga/rga.h b/drivers/media/platform/rockchip/rga/rga.h
index 0e62337f8dd38..0e854cdf739f4 100644
--- a/drivers/media/platform/rockchip/rga/rga.h
+++ b/drivers/media/platform/rockchip/rga/rga.h
@@ -158,7 +158,7 @@ struct rga_hw {
void (*start)(struct rockchip_rga *rga,
struct rga_vb_buffer *src, struct rga_vb_buffer *dst);
bool (*handle_irq)(struct rockchip_rga *rga);
- void (*get_version)(struct rockchip_rga *rga);
+ struct rockchip_rga_version (*get_version)(struct rockchip_rga *rga);
void *(*adjust_and_map_format)(struct rga_ctx *ctx,
struct v4l2_pix_format_mplane *format,
bool is_output);
diff --git a/drivers/media/platform/rockchip/rga/rga3-hw.c b/drivers/media/platform/rockchip/rga/rga3-hw.c
index 72741e1faccff..3469523a5ecad 100644
--- a/drivers/media/platform/rockchip/rga/rga3-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga3-hw.c
@@ -299,12 +299,14 @@ static bool rga3_handle_irq(struct rockchip_rga *rga)
return FIELD_GET(RGA3_INT_FRM_DONE, intr);
}
-static void rga3_get_version(struct rockchip_rga *rga)
+static struct rockchip_rga_version rga3_get_version(struct rockchip_rga *rga)
{
u32 version = rga_read(rga, RGA3_VERSION_NUM);
- rga->version.major = FIELD_GET(RGA3_VERSION_NUM_MAJOR, version);
- rga->version.minor = FIELD_GET(RGA3_VERSION_NUM_MINOR, version);
+ return (struct rockchip_rga_version) {
+ .major = FIELD_GET(RGA3_VERSION_NUM_MAJOR, version),
+ .minor = FIELD_GET(RGA3_VERSION_NUM_MINOR, version),
+ };
}
static struct rga3_fmt rga3_formats[] = {
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH 08/17] media: rockchip: rga: add rga_core structure
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (6 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 07/17] media: rockchip: rga: adjust get_version to return the version Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:22 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 09/17] media: rockchip: rga: use components to manage multiple cores Sven Püschel
` (8 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Add a rga_core structure to separate the core specific data from the
m2m device. This is done in preparation for multi-core support, where
multiple identical cores are exposed as a single m2m device to the
user-space.
Allocation related calls are explicitly done on the first core, as the
scheduling decisions will be made on demand after the buffers have been
allocated and filled.
In preparation of storing the rockchip_rga struct on a dedicated master
platform device, the rga_core struct is allocated on it's own and only a
pointer is saved in the rockchip_rga struct.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga-buf.c | 4 +-
drivers/media/platform/rockchip/rga/rga-hw.c | 32 ++++----
drivers/media/platform/rockchip/rga/rga.c | 104 ++++++++++++++------------
drivers/media/platform/rockchip/rga/rga.h | 39 +++++-----
drivers/media/platform/rockchip/rga/rga3-hw.c | 24 +++---
5 files changed, 108 insertions(+), 95 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga-buf.c b/drivers/media/platform/rockchip/rga/rga-buf.c
index 3f7c3c68e0cb8..47a8d5a4500a3 100644
--- a/drivers/media/platform/rockchip/rga/rga-buf.c
+++ b/drivers/media/platform/rockchip/rga/rga-buf.c
@@ -93,7 +93,7 @@ static int rga_buf_init(struct vb2_buffer *vb)
n_desc = DIV_ROUND_UP(size, PAGE_SIZE);
rbuf->n_desc = n_desc;
- rbuf->dma_desc = dma_alloc_coherent(rga->dev,
+ rbuf->dma_desc = dma_alloc_coherent(rga->cores[0]->dev,
rbuf->n_desc * sizeof(*rbuf->dma_desc),
&rbuf->dma_desc_pa, GFP_KERNEL);
if (!rbuf->dma_desc)
@@ -191,7 +191,7 @@ static void rga_buf_cleanup(struct vb2_buffer *vb)
if (!rga_has_internal_iommu(rga))
return;
- dma_free_coherent(rga->dev, rbuf->n_desc * sizeof(*rbuf->dma_desc),
+ dma_free_coherent(rga->cores[0]->dev, rbuf->n_desc * sizeof(*rbuf->dma_desc),
rbuf->dma_desc, rbuf->dma_desc_pa);
}
diff --git a/drivers/media/platform/rockchip/rga/rga-hw.c b/drivers/media/platform/rockchip/rga/rga-hw.c
index 190104f3b2954..9a5da4e1716ca 100644
--- a/drivers/media/platform/rockchip/rga/rga-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga-hw.c
@@ -417,10 +417,10 @@ static void rga_cmd_set_mode(struct rga_ctx *ctx)
dest[(RGA_MODE_CTRL - RGA_MODE_BASE_REG) >> 2] = mode.val;
}
-static void rga_cmd_set(struct rga_ctx *ctx,
+static void rga_cmd_set(struct rga_core *core,
struct rga_vb_buffer *src, struct rga_vb_buffer *dst)
{
- struct rockchip_rga *rga = ctx->rga;
+ struct rga_ctx *ctx = core->curr;
rga_cmd_set_src_addr(ctx, src->dma_desc_pa);
/*
@@ -434,10 +434,10 @@ static void rga_cmd_set(struct rga_ctx *ctx,
rga_cmd_set_src_info(ctx, &src->dma_addrs);
rga_cmd_set_dst_info(ctx, &dst->dma_addrs);
- rga_write(rga, RGA_CMD_BASE, ctx->cmdbuf_phy);
+ rga_write(core, RGA_CMD_BASE, ctx->cmdbuf_phy);
/* sync CMD buf for RGA */
- dma_sync_single_for_device(rga->dev, ctx->cmdbuf_phy,
+ dma_sync_single_for_device(core->rga->cores[0]->dev, ctx->cmdbuf_phy,
PAGE_SIZE, DMA_BIDIRECTIONAL);
}
@@ -447,36 +447,34 @@ static void rga_hw_setup_cmdbuf(struct rga_ctx *ctx)
rga_cmd_set_trans_info(ctx);
}
-static void rga_hw_start(struct rockchip_rga *rga,
+static void rga_hw_start(struct rga_core *core,
struct rga_vb_buffer *src, struct rga_vb_buffer *dst)
{
- struct rga_ctx *ctx = rga->curr;
-
- rga_cmd_set(ctx, src, dst);
+ rga_cmd_set(core, src, dst);
- rga_write(rga, RGA_SYS_CTRL, 0x00);
+ rga_write(core, RGA_SYS_CTRL, 0x00);
- rga_write(rga, RGA_SYS_CTRL, 0x22);
+ rga_write(core, RGA_SYS_CTRL, 0x22);
- rga_write(rga, RGA_INT, 0x600);
+ rga_write(core, RGA_INT, 0x600);
- rga_write(rga, RGA_CMD_CTRL, 0x1);
+ rga_write(core, RGA_CMD_CTRL, 0x1);
}
-static bool rga_handle_irq(struct rockchip_rga *rga)
+static bool rga_handle_irq(struct rga_core *core)
{
int intr;
- intr = rga_read(rga, RGA_INT) & 0xf;
+ intr = rga_read(core, RGA_INT) & 0xf;
- rga_mod(rga, RGA_INT, intr << 4, 0xf << 4);
+ rga_mod(core, RGA_INT, intr << 4, 0xf << 4);
return intr & RGA_INT_COMMAND_FINISHED;
}
-static struct rockchip_rga_version rga_get_version(struct rockchip_rga *rga)
+static struct rockchip_rga_version rga_get_version(struct rga_core *core)
{
- u32 version = rga_read(rga, RGA_VERSION_INFO);
+ u32 version = rga_read(core, RGA_VERSION_INFO);
return (struct rockchip_rga_version) {
.major = (version >> 24) & 0xFF,
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index b8edd3596c919..15d095a1d1973 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -11,6 +11,7 @@
#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/of.h>
+#include <linux/of_platform.h>
#include <linux/pm_runtime.h>
#include <linux/reset.h>
#include <linux/sched.h>
@@ -35,11 +36,12 @@ static void device_run(void *prv)
{
struct rga_ctx *ctx = prv;
struct rockchip_rga *rga = ctx->rga;
+ struct rga_core *core = rga->cores[0];
struct vb2_v4l2_buffer *src, *dst;
unsigned long flags;
int ret;
- ret = pm_runtime_resume_and_get(rga->dev);
+ ret = pm_runtime_resume_and_get(core->dev);
if (ret < 0) {
v4l2_m2m_buf_done_and_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx,
VB2_BUF_STATE_ERROR);
@@ -54,27 +56,28 @@ static void device_run(void *prv)
}
spin_unlock_irqrestore(&rga->ctrl_lock, flags);
- rga->curr = ctx;
+ core->curr = ctx;
src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
src->sequence = ctx->osequence++;
dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
- rga->hw->start(rga, vb_to_rga(src), vb_to_rga(dst));
+ rga->hw->start(core, vb_to_rga(src), vb_to_rga(dst));
}
static irqreturn_t rga_isr(int irq, void *prv)
{
- struct rockchip_rga *rga = prv;
+ struct rga_core *core = prv;
+ struct rockchip_rga *rga = core->rga;
- if (rga->hw->handle_irq(rga)) {
+ if (rga->hw->handle_irq(core)) {
struct vb2_v4l2_buffer *src, *dst;
- struct rga_ctx *ctx = rga->curr;
+ struct rga_ctx *ctx = core->curr;
WARN_ON(!ctx);
- rga->curr = NULL;
+ core->curr = NULL;
src = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
dst = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
@@ -90,7 +93,7 @@ static irqreturn_t rga_isr(int irq, void *prv)
v4l2_m2m_buf_done(dst, VB2_BUF_STATE_DONE);
v4l2_m2m_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx);
- pm_runtime_put_autosuspend(rga->dev);
+ pm_runtime_put_autosuspend(core->dev);
}
return IRQ_HANDLED;
@@ -118,7 +121,7 @@ queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq)
src_vq->buf_struct_size = sizeof(struct rga_vb_buffer);
src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
src_vq->lock = &ctx->rga->mutex;
- src_vq->dev = ctx->rga->v4l2_dev.dev;
+ src_vq->dev = ctx->rga->cores[0]->dev;
ret = vb2_queue_init(src_vq);
if (ret)
@@ -136,7 +139,7 @@ queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq)
dst_vq->buf_struct_size = sizeof(struct rga_vb_buffer);
dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
dst_vq->lock = &ctx->rga->mutex;
- dst_vq->dev = ctx->rga->v4l2_dev.dev;
+ dst_vq->dev = ctx->rga->cores[0]->dev;
return vb2_queue_init(dst_vq);
}
@@ -275,7 +278,7 @@ static int rga_open(struct file *file)
return -ENOMEM;
/* Create CMD buffer */
- ctx->cmdbuf_virt = dma_alloc_attrs(rga->dev, rga->hw->cmdbuf_size,
+ ctx->cmdbuf_virt = dma_alloc_attrs(rga->cores[0]->dev, rga->hw->cmdbuf_size,
&ctx->cmdbuf_phy, GFP_KERNEL,
DMA_ATTR_WRITE_COMBINE);
if (!ctx->cmdbuf_virt) {
@@ -322,7 +325,7 @@ static int rga_open(struct file *file)
unlock_mutex:
mutex_unlock(&rga->mutex);
rel_cmdbuf:
- dma_free_attrs(rga->dev, rga->hw->cmdbuf_size, ctx->cmdbuf_virt,
+ dma_free_attrs(rga->cores[0]->dev, rga->hw->cmdbuf_size, ctx->cmdbuf_virt,
ctx->cmdbuf_phy, DMA_ATTR_WRITE_COMBINE);
rel_ctx:
kfree(ctx);
@@ -342,7 +345,7 @@ static int rga_release(struct file *file)
v4l2_fh_del(&ctx->fh, file);
v4l2_fh_exit(&ctx->fh);
- dma_free_attrs(rga->dev, rga->hw->cmdbuf_size, ctx->cmdbuf_virt,
+ dma_free_attrs(rga->cores[0]->dev, rga->hw->cmdbuf_size, ctx->cmdbuf_virt,
ctx->cmdbuf_phy, DMA_ATTR_WRITE_COMBINE);
kfree(ctx);
@@ -689,26 +692,26 @@ static const struct video_device rga_videodev = {
.device_caps = V4L2_CAP_VIDEO_M2M_MPLANE | V4L2_CAP_STREAMING,
};
-static int rga_parse_dt(struct rockchip_rga *rga)
+static int rga_parse_dt(struct rga_core *core)
{
struct reset_control *core_rst, *axi_rst, *ahb_rst;
int ret;
- core_rst = devm_reset_control_get(rga->dev, "core");
+ core_rst = devm_reset_control_get(core->dev, "core");
if (IS_ERR(core_rst)) {
- dev_err(rga->dev, "failed to get core reset controller\n");
+ dev_err(core->dev, "failed to get core reset controller\n");
return PTR_ERR(core_rst);
}
- axi_rst = devm_reset_control_get(rga->dev, "axi");
+ axi_rst = devm_reset_control_get(core->dev, "axi");
if (IS_ERR(axi_rst)) {
- dev_err(rga->dev, "failed to get axi reset controller\n");
+ dev_err(core->dev, "failed to get axi reset controller\n");
return PTR_ERR(axi_rst);
}
- ahb_rst = devm_reset_control_get(rga->dev, "ahb");
+ ahb_rst = devm_reset_control_get(core->dev, "ahb");
if (IS_ERR(ahb_rst)) {
- dev_err(rga->dev, "failed to get ahb reset controller\n");
+ dev_err(core->dev, "failed to get ahb reset controller\n");
return PTR_ERR(ahb_rst);
}
@@ -724,12 +727,12 @@ static int rga_parse_dt(struct rockchip_rga *rga)
udelay(1);
reset_control_deassert(ahb_rst);
- ret = devm_clk_bulk_get_all(rga->dev, &rga->clks);
+ ret = devm_clk_bulk_get_all(core->dev, &core->clks);
if (ret < 0) {
- dev_err(rga->dev, "failed to get clocks\n");
+ dev_err(core->dev, "failed to get clocks\n");
return ret;
}
- rga->num_clks = ret;
+ core->num_clks = ret;
return 0;
}
@@ -780,6 +783,7 @@ static int rga_disable_multicore(struct device *dev)
static int rga_probe(struct platform_device *pdev)
{
struct rockchip_rga *rga;
+ struct rga_core *core;
struct video_device *vfd;
int ret = 0;
int irq;
@@ -791,7 +795,7 @@ static int rga_probe(struct platform_device *pdev)
if (ret)
return ret;
- rga = devm_kzalloc(&pdev->dev, sizeof(*rga), GFP_KERNEL);
+ rga = devm_kzalloc(&pdev->dev, sizeof(*rga) + 1 * sizeof(*rga->cores), GFP_KERNEL);
if (!rga)
return -ENOMEM;
@@ -799,20 +803,25 @@ static int rga_probe(struct platform_device *pdev)
if (!rga->hw)
return dev_err_probe(&pdev->dev, -ENODEV, "failed to get match data\n");
- rga->dev = &pdev->dev;
spin_lock_init(&rga->ctrl_lock);
mutex_init(&rga->mutex);
- ret = rga_parse_dt(rga);
+ core = devm_kzalloc(&pdev->dev, sizeof(*core), GFP_KERNEL);
+ core->rga = rga;
+ core->dev = &pdev->dev;
+
+ rga->cores[0] = core;
+
+ ret = rga_parse_dt(core);
if (ret)
return dev_err_probe(&pdev->dev, ret, "Unable to parse OF data\n");
- pm_runtime_set_autosuspend_delay(rga->dev, 50);
- pm_runtime_enable(rga->dev);
+ pm_runtime_set_autosuspend_delay(core->dev, 50);
+ pm_runtime_enable(core->dev);
- rga->regs = devm_platform_ioremap_resource(pdev, 0);
- if (IS_ERR(rga->regs)) {
- ret = PTR_ERR(rga->regs);
+ core->regs = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(core->regs)) {
+ ret = PTR_ERR(core->regs);
goto err_put_clk;
}
@@ -822,17 +831,17 @@ static int rga_probe(struct platform_device *pdev)
goto err_put_clk;
}
- ret = devm_request_irq(rga->dev, irq, rga_isr,
+ ret = devm_request_irq(core->dev, irq, rga_isr,
rga_has_internal_iommu(rga) ? 0 : IRQF_SHARED,
- dev_name(rga->dev), rga);
+ dev_name(core->dev), core);
if (ret < 0) {
- dev_err(rga->dev, "failed to request irq\n");
+ dev_err(core->dev, "failed to request irq\n");
goto err_put_clk;
}
- ret = dma_set_mask_and_coherent(rga->dev, DMA_BIT_MASK(32));
+ ret = dma_set_mask_and_coherent(core->dev, DMA_BIT_MASK(32));
if (ret) {
- dev_err(rga->dev, "32-bit DMA not supported");
+ dev_err(core->dev, "32-bit DMA not supported");
goto err_put_clk;
}
@@ -852,7 +861,7 @@ static int rga_probe(struct platform_device *pdev)
video_set_drvdata(vfd, rga);
rga->vfd = vfd;
- platform_set_drvdata(pdev, rga);
+ platform_set_drvdata(pdev, core);
rga->m2m_dev = v4l2_m2m_init(&rga_m2m_ops);
if (IS_ERR(rga->m2m_dev)) {
v4l2_err(&rga->v4l2_dev, "Failed to init mem2mem device\n");
@@ -860,16 +869,16 @@ static int rga_probe(struct platform_device *pdev)
goto rel_vdev;
}
- ret = pm_runtime_resume_and_get(rga->dev);
+ ret = pm_runtime_resume_and_get(core->dev);
if (ret < 0)
goto rel_m2m;
- rga->version = rga->hw->get_version(rga);
+ rga->version = rga->hw->get_version(core);
v4l2_info(&rga->v4l2_dev, "HW Version: 0x%02x.%02x\n",
rga->version.major, rga->version.minor);
- pm_runtime_put(rga->dev);
+ pm_runtime_put(core->dev);
ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
if (ret) {
@@ -889,14 +898,15 @@ static int rga_probe(struct platform_device *pdev)
unreg_v4l2_dev:
v4l2_device_unregister(&rga->v4l2_dev);
err_put_clk:
- pm_runtime_disable(rga->dev);
+ pm_runtime_disable(core->dev);
return ret;
}
static void rga_remove(struct platform_device *pdev)
{
- struct rockchip_rga *rga = platform_get_drvdata(pdev);
+ struct rga_core *core = platform_get_drvdata(pdev);
+ struct rockchip_rga *rga = core->rga;
v4l2_info(&rga->v4l2_dev, "Removing\n");
@@ -904,23 +914,23 @@ static void rga_remove(struct platform_device *pdev)
video_unregister_device(rga->vfd);
v4l2_device_unregister(&rga->v4l2_dev);
- pm_runtime_disable(rga->dev);
+ pm_runtime_disable(core->dev);
}
static int __maybe_unused rga_runtime_suspend(struct device *dev)
{
- struct rockchip_rga *rga = dev_get_drvdata(dev);
+ struct rga_core *core = dev_get_drvdata(dev);
- clk_bulk_disable_unprepare(rga->num_clks, rga->clks);
+ clk_bulk_disable_unprepare(core->num_clks, core->clks);
return 0;
}
static int __maybe_unused rga_runtime_resume(struct device *dev)
{
- struct rockchip_rga *rga = dev_get_drvdata(dev);
+ struct rga_core *core = dev_get_drvdata(dev);
- return clk_bulk_prepare_enable(rga->num_clks, rga->clks);
+ return clk_bulk_prepare_enable(core->num_clks, core->clks);
}
static const struct dev_pm_ops rga_pm = {
diff --git a/drivers/media/platform/rockchip/rga/rga.h b/drivers/media/platform/rockchip/rga/rga.h
index 0e854cdf739f4..fcf1ef7d2029f 100644
--- a/drivers/media/platform/rockchip/rga/rga.h
+++ b/drivers/media/platform/rockchip/rga/rga.h
@@ -14,7 +14,6 @@
#include <media/v4l2-device.h>
#define RGA_NAME "rockchip-rga"
-
#define DEFAULT_WIDTH 100
#define DEFAULT_HEIGHT 100
@@ -36,6 +35,16 @@ struct rockchip_rga_version {
u32 minor;
};
+struct rga_core {
+ struct device *dev;
+ void __iomem *regs;
+ struct clk_bulk_data *clks;
+ int num_clks;
+
+ struct rockchip_rga *rga;
+ struct rga_ctx *curr;
+};
+
struct rga_ctx {
struct v4l2_fh fh;
struct rockchip_rga *rga;
@@ -70,10 +79,6 @@ struct rockchip_rga {
struct v4l2_m2m_dev *m2m_dev;
struct video_device *vfd;
- struct device *dev;
- void __iomem *regs;
- struct clk_bulk_data *clks;
- int num_clks;
struct rockchip_rga_version version;
/* vfd lock */
@@ -81,9 +86,9 @@ struct rockchip_rga {
/* ctrl parm lock */
spinlock_t ctrl_lock;
- struct rga_ctx *curr;
-
const struct rga_hw *hw;
+
+ struct rga_core *cores[];
};
struct rga_addrs {
@@ -119,22 +124,22 @@ int rga_check_scaling(const struct rga_hw *hw, const struct v4l2_rect *crop_in,
extern const struct vb2_ops rga_qops;
/* RGA Hardware */
-static inline void rga_write(struct rockchip_rga *rga, u32 reg, u32 value)
+static inline void rga_write(struct rga_core *core, u32 reg, u32 value)
{
- writel(value, rga->regs + reg);
+ writel(value, core->regs + reg);
};
-static inline u32 rga_read(struct rockchip_rga *rga, u32 reg)
+static inline u32 rga_read(struct rga_core *core, u32 reg)
{
- return readl(rga->regs + reg);
+ return readl(core->regs + reg);
};
-static inline void rga_mod(struct rockchip_rga *rga, u32 reg, u32 val, u32 mask)
+static inline void rga_mod(struct rga_core *core, u32 reg, u32 val, u32 mask)
{
- u32 temp = rga_read(rga, reg) & ~(mask);
+ u32 temp = rga_read(core, reg) & ~(mask);
temp |= val & mask;
- rga_write(rga, reg, temp);
+ rga_write(core, reg, temp);
};
#define RGA_FEATURE_FLIP BIT(0)
@@ -155,10 +160,10 @@ struct rga_hw {
* Requires that the cmdbuf is already zeroed.
*/
void (*setup_cmdbuf)(struct rga_ctx *ctx);
- void (*start)(struct rockchip_rga *rga,
+ void (*start)(struct rga_core *core,
struct rga_vb_buffer *src, struct rga_vb_buffer *dst);
- bool (*handle_irq)(struct rockchip_rga *rga);
- struct rockchip_rga_version (*get_version)(struct rockchip_rga *rga);
+ bool (*handle_irq)(struct rga_core *core);
+ struct rockchip_rga_version (*get_version)(struct rga_core *core);
void *(*adjust_and_map_format)(struct rga_ctx *ctx,
struct v4l2_pix_format_mplane *format,
bool is_output);
diff --git a/drivers/media/platform/rockchip/rga/rga3-hw.c b/drivers/media/platform/rockchip/rga/rga3-hw.c
index 3469523a5ecad..f7e4bc8c6ff21 100644
--- a/drivers/media/platform/rockchip/rga/rga3-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga3-hw.c
@@ -266,42 +266,42 @@ static void rga3_hw_setup_cmdbuf(struct rga_ctx *ctx)
rga3_cmd_set_wr_format(ctx);
}
-static void rga3_hw_start(struct rockchip_rga *rga,
+static void rga3_hw_start(struct rga_core *core,
struct rga_vb_buffer *src, struct rga_vb_buffer *dst)
{
- struct rga_ctx *ctx = rga->curr;
+ struct rga_ctx *ctx = core->curr;
rga3_cmd_set_win0_addr(ctx, &src->dma_addrs);
rga3_cmd_set_wr_addr(ctx, &dst->dma_addrs);
- rga_write(rga, RGA3_CMD_ADDR, ctx->cmdbuf_phy);
+ rga_write(core, RGA3_CMD_ADDR, ctx->cmdbuf_phy);
/* sync CMD buf for RGA */
- dma_sync_single_for_device(rga->dev, ctx->cmdbuf_phy,
+ dma_sync_single_for_device(core->rga->cores[0]->dev, ctx->cmdbuf_phy,
PAGE_SIZE, DMA_BIDIRECTIONAL);
/* set to master mode and start the conversion */
- rga_write(rga, RGA3_SYS_CTRL,
+ rga_write(core, RGA3_SYS_CTRL,
FIELD_PREP(RGA3_CMD_MODE, RGA3_CMD_MODE_MASTER));
- rga_write(rga, RGA3_INT_EN, FIELD_PREP(RGA3_INT_FRM_DONE, 1));
- rga_write(rga, RGA3_CMD_CTRL,
+ rga_write(core, RGA3_INT_EN, FIELD_PREP(RGA3_INT_FRM_DONE, 1));
+ rga_write(core, RGA3_CMD_CTRL,
FIELD_PREP(RGA3_CMD_LINE_START_PULSE, 1));
}
-static bool rga3_handle_irq(struct rockchip_rga *rga)
+static bool rga3_handle_irq(struct rga_core *core)
{
u32 intr;
- intr = rga_read(rga, RGA3_INT_RAW);
+ intr = rga_read(core, RGA3_INT_RAW);
/* clear all interrupts */
- rga_write(rga, RGA3_INT_CLR, intr);
+ rga_write(core, RGA3_INT_CLR, intr);
return FIELD_GET(RGA3_INT_FRM_DONE, intr);
}
-static struct rockchip_rga_version rga3_get_version(struct rockchip_rga *rga)
+static struct rockchip_rga_version rga3_get_version(struct rga_core *core)
{
- u32 version = rga_read(rga, RGA3_VERSION_NUM);
+ u32 version = rga_read(core, RGA3_VERSION_NUM);
return (struct rockchip_rga_version) {
.major = FIELD_GET(RGA3_VERSION_NUM_MAJOR, version),
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 08/17] media: rockchip: rga: add rga_core structure
2026-06-05 22:06 ` [PATCH 08/17] media: rockchip: rga: add rga_core structure Sven Püschel
@ 2026-06-05 22:22 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:22 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] Missing NULL pointer check for devm_kzalloc allocation
Pre-existing issues:
- [High] Shared IRQ handler accesses suspended hardware and dereferences NULL context
- [High] Missing synchronize_irq() in runtime suspend with shared interrupts
- [High] Race condition between pm_runtime_enable and platform_set_drvdata
- [High] Use-After-Free during module removal due to incorrect teardown ordering
--
commit 1e1c3cc644587ad2255c5349765b3893e811a0e9
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: add rga_core structure
Add a rga_core structure to separate the core specific data from the
m2m device. This is done in preparation for multi-core support.
> diff --git a/drivers/media/platform/rockchip/rga/rga-hw.c b/drivers/media/platform/rockchip/rga/rga-hw.c
> --- a/drivers/media/platform/rockchip/rga/rga-hw.c
> +++ b/drivers/media/platform/rockchip/rga/rga-hw.c
[ ... ]
> @@ -447,36 +447,34 @@ static void rga_hw_setup_cmdbuf(struct rga_ctx *ctx)
> -static bool rga_handle_irq(struct rockchip_rga *rga)
> +static bool rga_handle_irq(struct rga_core *core)
> {
> int intr;
>
> - intr = rga_read(rga, RGA_INT) & 0xf;
> + intr = rga_read(core, RGA_INT) & 0xf;
>
> - rga_mod(rga, RGA_INT, intr << 4, 0xf << 4);
> + rga_mod(core, RGA_INT, intr << 4, 0xf << 4);
>
> return intr & RGA_INT_COMMAND_FINISHED;
> }
[Severity: High]
This is a pre-existing issue, but does this handle spurious interrupts
correctly when the hardware is suspended?
If an interrupt arrives on the shared IRQ line while the device is suspended,
the clock-gated hardware typically returns 0xffffffff for rga_read(). The
bitwise AND with RGA_INT_COMMAND_FINISHED will incorrectly evaluate to true.
The handler would then proceed in rga_isr():
struct rga_ctx *ctx = core->curr;
WARN_ON(!ctx);
core->curr = NULL;
src = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
If no job is active, core->curr is NULL, and dereferencing ctx->fh.m2m_ctx
will cause a kernel panic. Should we verify the PM state before hardware
access in the ISR using pm_runtime_get_if_active()?
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
[ ... ]
> @@ -799,20 +803,25 @@ static int rga_probe(struct platform_device *pdev)
> if (!rga->hw)
> return dev_err_probe(&pdev->dev, -ENODEV, "failed to get match data\n");
>
> - rga->dev = &pdev->dev;
> spin_lock_init(&rga->ctrl_lock);
> mutex_init(&rga->mutex);
>
> - ret = rga_parse_dt(rga);
> + core = devm_kzalloc(&pdev->dev, sizeof(*core), GFP_KERNEL);
> + core->rga = rga;
> + core->dev = &pdev->dev;
[Severity: High]
Could this lead to a NULL pointer dereference? If devm_kzalloc() fails and
returns NULL, assigning core->rga = rga in the next line will immediately
crash. A check like "if (!core) return -ENOMEM;" appears to be missing here.
> +
> + rga->cores[0] = core;
> +
> + ret = rga_parse_dt(core);
> if (ret)
> return dev_err_probe(&pdev->dev, ret, "Unable to parse OF data\n");
>
> - pm_runtime_set_autosuspend_delay(rga->dev, 50);
> - pm_runtime_enable(rga->dev);
> + pm_runtime_set_autosuspend_delay(core->dev, 50);
> + pm_runtime_enable(core->dev);
>
> - rga->regs = devm_platform_ioremap_resource(pdev, 0);
> - if (IS_ERR(rga->regs)) {
> - ret = PTR_ERR(rga->regs);
> + core->regs = devm_platform_ioremap_resource(pdev, 0);
> + if (IS_ERR(core->regs)) {
> + ret = PTR_ERR(core->regs);
> goto err_put_clk;
> }
[ ... ]
> @@ -852,7 +861,7 @@ static int rga_probe(struct platform_device *pdev)
> video_set_drvdata(vfd, rga);
> rga->vfd = vfd;
>
> - platform_set_drvdata(pdev, rga);
> + platform_set_drvdata(pdev, core);
[Severity: High]
This is a pre-existing issue, but does this create a race condition window
between pm_runtime_enable() and platform_set_drvdata()?
If a PM transition occurs right after pm_runtime_enable(core->dev), the
runtime PM callbacks like rga_runtime_suspend() might execute before
platform_set_drvdata(pdev, core) is called.
In rga_runtime_suspend(), dev_get_drvdata() would return NULL, leading to a
crash when attempting to access core->num_clks.
[ ... ]
> @@ -904,23 +914,23 @@ static void rga_remove(struct platform_device *pdev)
> video_unregister_device(rga->vfd);
> v4l2_device_unregister(&rga->v4l2_dev);
>
> - pm_runtime_disable(rga->dev);
> + pm_runtime_disable(core->dev);
> }
[Severity: High]
This is a pre-existing issue, but could this teardown sequence result in a
use-after-free vulnerability?
Calling v4l2_m2m_release(rga->m2m_dev) (which occurs just before the quoted
lines) before video_unregister_device() leaves the video node accessible to
userspace after the m2m_dev structure has been freed. Concurrent V4L2 ioctls
from userspace might then interact with freed memory.
> static int __maybe_unused rga_runtime_suspend(struct device *dev)
> {
> - struct rockchip_rga *rga = dev_get_drvdata(dev);
> + struct rga_core *core = dev_get_drvdata(dev);
>
> - clk_bulk_disable_unprepare(rga->num_clks, rga->clks);
> + clk_bulk_disable_unprepare(core->num_clks, core->clks);
>
> return 0;
> }
[Severity: High]
This is a pre-existing issue, but are we missing a call to synchronize_irq()
before powering down the hardware?
Since the driver can request a shared interrupt line (IRQF_SHARED), disabling
the clocks without waiting for mid-flight IRQ handlers to finish might result
in bus errors or crashes if another device triggers the shared IRQ while
suspending.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=8
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 09/17] media: rockchip: rga: use components to manage multiple cores
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (7 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 08/17] media: rockchip: rga: add rga_core structure Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:24 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 10/17] media: rockchip: rga: move rockchip_rga allocation to master probe Sven Püschel
` (7 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Use component helpers to manage multiple cores and aggregate them into a
central master device. This gives us a dedicated master device and
ensures that all cores (components) are properly set up before creating
the video device.
This commit only sets up a basic component device. Instead of the
rga_disable_multicore function only the first core is added to the
master device. To avoid the secondary core creating an additional video
device the whole core probe implementation is moved to the bind method,
which is only called when the core is bound to a master device.
The implementation is based on the etnaviv gpu driver, which also groups
multiple gpu cores under a single etnaviv master device.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 256 +++++++++++++++++++++++-------
1 file changed, 202 insertions(+), 54 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 15d095a1d1973..178f45b8da940 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -5,6 +5,7 @@
*/
#include <linux/clk.h>
+#include <linux/component.h>
#include <linux/debugfs.h>
#include <linux/delay.h>
#include <linux/fs.h>
@@ -737,51 +738,9 @@ static int rga_parse_dt(struct rga_core *core)
return 0;
}
-/*
- * Some SoCs, like RK3588 have multiple identical RGA3 cores, but the
- * kernel is currently missing support for multi-core handling. Exposing
- * separate devices for each core to userspace is bad, since that does
- * not allow scheduling tasks properly (and creates ABI). With this workaround
- * the driver will only probe for the first core and early exit for the other
- * cores. Once the driver gains multi-core support, the same technique
- * for detecting the main core can be used to cluster all cores together.
- */
-static int rga_disable_multicore(struct device *dev)
-{
- struct device_node *node = NULL;
- const char *compatible;
- bool is_main_core;
- int ret;
-
- /* Intentionally ignores the fallback strings */
- ret = of_property_read_string(dev->of_node, "compatible", &compatible);
- if (ret)
- return ret;
-
- /* The first compatible and available node found is considered the main core */
- do {
- node = of_find_compatible_node(node, NULL, compatible);
- if (of_device_is_available(node))
- break;
- } while (node);
-
- if (!node)
- return -EINVAL;
-
- is_main_core = (dev->of_node == node);
-
- of_node_put(node);
-
- if (!is_main_core) {
- dev_info(dev, "missing multi-core support, ignoring this instance\n");
- return -ENODEV;
- }
-
- return 0;
-}
-
-static int rga_probe(struct platform_device *pdev)
+static int rga_core_bind(struct device *dev, struct device *master, void *data)
{
+ struct platform_device *pdev = to_platform_device(dev);
struct rockchip_rga *rga;
struct rga_core *core;
struct video_device *vfd;
@@ -791,10 +750,6 @@ static int rga_probe(struct platform_device *pdev)
if (!pdev->dev.of_node)
return -ENODEV;
- ret = rga_disable_multicore(&pdev->dev);
- if (ret)
- return ret;
-
rga = devm_kzalloc(&pdev->dev, sizeof(*rga) + 1 * sizeof(*rga->cores), GFP_KERNEL);
if (!rga)
return -ENOMEM;
@@ -903,9 +858,10 @@ static int rga_probe(struct platform_device *pdev)
return ret;
}
-static void rga_remove(struct platform_device *pdev)
+static void rga_core_unbind(struct device *dev, struct device *master,
+ void *data)
{
- struct rga_core *core = platform_get_drvdata(pdev);
+ struct rga_core *core = dev_get_drvdata(dev);
struct rockchip_rga *rga = core->rga;
v4l2_info(&rga->v4l2_dev, "Removing\n");
@@ -917,6 +873,29 @@ static void rga_remove(struct platform_device *pdev)
pm_runtime_disable(core->dev);
}
+static const struct component_ops rga_core_ops = {
+ .bind = rga_core_bind,
+ .unbind = rga_core_unbind,
+};
+
+static int rga_core_probe(struct platform_device *pdev)
+{
+ int ret = 0;
+
+ ret = component_add(&pdev->dev, &rga_core_ops);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "failed to register component: %d", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void rga_core_remove(struct platform_device *pdev)
+{
+ component_del(&pdev->dev, &rga_core_ops);
+}
+
static int __maybe_unused rga_runtime_suspend(struct device *dev)
{
struct rga_core *core = dev_get_drvdata(dev);
@@ -933,7 +912,7 @@ static int __maybe_unused rga_runtime_resume(struct device *dev)
return clk_bulk_prepare_enable(core->num_clks, core->clks);
}
-static const struct dev_pm_ops rga_pm = {
+static const struct dev_pm_ops rga_core_pm = {
SET_RUNTIME_PM_OPS(rga_runtime_suspend,
rga_runtime_resume, NULL)
};
@@ -956,17 +935,186 @@ static const struct of_device_id rockchip_rga_match[] = {
MODULE_DEVICE_TABLE(of, rockchip_rga_match);
+static struct platform_driver rga_core_pdrv = {
+ .probe = rga_core_probe,
+ .remove = rga_core_remove,
+ .driver = {
+ .name = RGA_NAME "-core",
+ .pm = &rga_core_pm,
+ .of_match_table = rockchip_rga_match,
+ },
+};
+
+static int rga_bind(struct device *dev)
+{
+ int ret;
+
+ ret = component_bind_all(dev, NULL);
+ if (ret) {
+ dev_err(dev, "component bind failed\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static void rga_unbind(struct device *dev)
+{
+ component_unbind_all(dev, NULL);
+}
+
+struct component_master_ops rga_master_ops = {
+ .bind = rga_bind,
+ .unbind = rga_unbind,
+};
+
+static int rga_probe(struct platform_device *pdev)
+{
+ const struct of_device_id *match_desc = pdev->dev.platform_data;
+ struct device *dev = &pdev->dev;
+ struct component_match *match = NULL;
+ struct device_node *core_node;
+
+ if (!match_desc)
+ return dev_err_probe(dev, -ENODEV, "missing platform data\n");
+
+ for_each_compatible_node(core_node, NULL, match_desc->compatible) {
+ if (!of_device_is_available(core_node))
+ continue;
+
+ of_node_get(core_node);
+ component_match_add_release(dev, &match, component_release_of,
+ component_compare_of, core_node);
+
+ /*
+ * As multi core is not implemented yet,
+ * break out of the loop to only have one core per rockchip_rga struct.
+ * Also put the node, which otherwise would've been done by the loop iteration.
+ */
+ of_node_put(core_node);
+ break;
+ }
+
+ if (!match)
+ return dev_err_probe(
+ dev, -ENODEV,
+ "no matching available component devices found\n");
+
+ return component_master_add_with_match(dev, &rga_master_ops, match);
+}
+
+static void rga_remove(struct platform_device *pdev)
+{
+ component_master_del(&pdev->dev, &rga_master_ops);
+}
+
static struct platform_driver rga_pdrv = {
.probe = rga_probe,
.remove = rga_remove,
.driver = {
.name = RGA_NAME,
- .pm = &rga_pm,
- .of_match_table = rockchip_rga_match,
},
};
-module_platform_driver(rga_pdrv);
+static bool rga_of_has_available_node(const char *compat)
+{
+ struct device_node *node;
+
+ for_each_compatible_node(node, NULL, compat) {
+ if (of_device_is_available(node)) {
+ of_node_put(node);
+ return true;
+ }
+ }
+
+ return false;
+}
+
+static int rga_create_platform_device(struct platform_device **ppdev,
+ const struct of_device_id *match)
+{
+ struct platform_device *pdev;
+ int ret;
+
+ pdev = platform_device_alloc(match->compatible, PLATFORM_DEVID_NONE);
+ if (!pdev)
+ return -ENOMEM;
+
+ ret = platform_device_add_data(pdev, match, sizeof(*match));
+ if (ret)
+ goto free_platform_device;
+
+ ret = platform_device_add(pdev);
+ if (ret)
+ goto free_platform_device;
+
+ ret = device_driver_attach(&rga_pdrv.driver, &pdev->dev);
+ if (ret)
+ goto del_platform_device;
+
+ *ppdev = pdev;
+
+ return 0;
+
+del_platform_device:
+ platform_device_del(pdev);
+free_platform_device:
+ platform_device_put(pdev);
+ return ret;
+}
+
+static struct platform_device *master_pdevs[ARRAY_SIZE(rockchip_rga_match) - 1];
+
+static int __init rga_init(void)
+{
+ int ret;
+ unsigned int i;
+
+ ret = platform_driver_register(&rga_core_pdrv);
+ if (ret != 0)
+ return ret;
+
+ ret = platform_driver_register(&rga_pdrv);
+ if (ret != 0)
+ goto unregister_core_driver;
+
+ for (i = 0; i < ARRAY_SIZE(master_pdevs); i++) {
+ if (!rga_of_has_available_node(
+ rockchip_rga_match[i].compatible))
+ continue;
+
+ ret = rga_create_platform_device(&master_pdevs[i],
+ &rockchip_rga_match[i]);
+ if (ret)
+ goto unregister_platform_devices;
+ }
+
+ return 0;
+
+unregister_platform_devices:
+ for (i = 0; i < ARRAY_SIZE(master_pdevs); i++) {
+ platform_device_unregister(master_pdevs[i]);
+ master_pdevs[i] = NULL;
+ }
+ platform_driver_unregister(&rga_pdrv);
+unregister_core_driver:
+ platform_driver_unregister(&rga_core_pdrv);
+ return ret;
+}
+module_init(rga_init);
+
+static void __exit rga_exit(void)
+{
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(master_pdevs); i++) {
+ platform_device_unregister(master_pdevs[i]);
+ master_pdevs[i] = NULL;
+ }
+ platform_driver_unregister(&rga_pdrv);
+ platform_driver_unregister(&rga_core_pdrv);
+}
+module_exit(rga_exit);
MODULE_AUTHOR("Jacob Chen <jacob-chen@iotwrt.com>");
MODULE_DESCRIPTION("Rockchip Raster 2d Graphic Acceleration Unit");
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 09/17] media: rockchip: rga: use components to manage multiple cores
2026-06-05 22:06 ` [PATCH 09/17] media: rockchip: rga: use components to manage multiple cores Sven Püschel
@ 2026-06-05 22:24 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:24 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [Critical] v4l2 `m2m_dev` is freed while the shared interrupt handler remains active, causing a use-after-free.
Pre-existing issues:
- [High] Missing `synchronize_irq()` in runtime suspend and unsafe hardware access in shared IRQ handler.
- [High] Out-of-order `pm_runtime_enable` and `platform_set_drvdata` causes a NULL pointer dereference in `rga_runtime_resume`.
--
commit 56522d85947daeaf3327877b7e9e7555910847ab
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: use components to manage multiple cores
Use component helpers to manage multiple cores and aggregate them into a
central master device. This gives us a dedicated master device and
ensures that all cores (components) are properly set up before creating
the video device.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index 15d095a1d1973..178f45b8da940 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
[ ... ]
> +static int rga_core_probe(struct platform_device *pdev)
> +{
> + int ret = 0;
> +
> + ret = component_add(&pdev->dev, &rga_core_ops);
> + if (ret < 0) {
> + dev_err(&pdev->dev, "failed to register component: %d", ret);
> + return ret;
> + }
> +
> + return 0;
> +}
[Severity: High]
This is a pre-existing issue, but can the initialization order leave the
driver vulnerable to sysfs-triggered PM transitions?
During probe, pm_runtime_enable() is called before the driver data is set:
drivers/media/platform/rockchip/rga/rga.c:
pm_runtime_set_autosuspend_delay(core->dev, 50);
pm_runtime_enable(core->dev);
...
platform_set_drvdata(pdev, core);
If a concurrent PM state transition to 'on' occurs (e.g., via a sysfs write)
before platform_set_drvdata() runs, the PM core will trigger
rga_runtime_resume().
Will this result in a NULL pointer dereference when dev_get_drvdata() is
called?
drivers/media/platform/rockchip/rga/rga.c:rga_runtime_resume() {
struct rga_core *core = dev_get_drvdata(dev);
return clk_bulk_prepare_enable(core->num_clks, core->clks);
}
[ ... ]
> static int __maybe_unused rga_runtime_suspend(struct device *dev)
> {
> struct rga_core *core = dev_get_drvdata(dev);
> @@ -933,7 +912,7 @@ static int __maybe_unused rga_runtime_resume(struct device *dev)
> return clk_bulk_prepare_enable(core->num_clks, core->clks);
> }
>
> -static const struct dev_pm_ops rga_pm = {
> +static const struct dev_pm_ops rga_core_pm = {
> SET_RUNTIME_PM_OPS(rga_runtime_suspend,
> rga_runtime_resume, NULL)
> };
[Severity: High]
This is a pre-existing issue, but does this runtime suspend callback lack
a synchronize_irq() call?
When the hardware lacks an internal IOMMU, the driver requests a shared
interrupt line (IRQF_SHARED). If a shared interrupt fires while the device
is runtime-suspended, the interrupt handler rga_isr() unconditionally
reads hardware registers via rga_handle_irq() without verifying if the
device is powered on using pm_runtime_get_if_active().
Does disabling clocks here in rga_runtime_suspend():
drivers/media/platform/rockchip/rga/rga.c:rga_runtime_suspend() {
...
clk_bulk_disable_unprepare(core->num_clks, core->clks);
...
}
allow the interrupt handler to execute mid-flight while the hardware is
powered down, potentially leading to a synchronous external abort?
[ ... ]
> +static void rga_unbind(struct device *dev)
> +{
> + component_unbind_all(dev, NULL);
> +}
[Severity: Critical]
Does unbinding the master device cause a use-after-free in the active
interrupt handler?
Because the componentization moved the v4l2 m2m_dev allocation to the master
device, but the devres-managed IRQ request is on the core component device,
the shared interrupt handler rga_isr() remains active indefinitely.
When the master device is unbound, rga->m2m_dev is freed:
drivers/media/platform/rockchip/rga/rga.c:rga_unbind() {
...
v4l2_m2m_release(rga->m2m_dev);
...
}
If a shared interrupt fires after the master is unbound, will rga_isr()
access the freed m2m_dev and rga structures when calling
v4l2_m2m_job_finish()?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=9
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 10/17] media: rockchip: rga: move rockchip_rga allocation to master probe
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (8 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 09/17] media: rockchip: rga: use components to manage multiple cores Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:23 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 11/17] media: rockchip: rga: move video device to the master Sven Püschel
` (6 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Move the rockchip_rga struct allocation to the master component probe
function in preparation of enabling all cores. This also adjusts the
allocation to use the actual number of cores found in the of tree
instead of being fixed to one core.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 32 ++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 178f45b8da940..11912bf5b6906 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -741,7 +741,7 @@ static int rga_parse_dt(struct rga_core *core)
static int rga_core_bind(struct device *dev, struct device *master, void *data)
{
struct platform_device *pdev = to_platform_device(dev);
- struct rockchip_rga *rga;
+ struct rockchip_rga *rga = data;
struct rga_core *core;
struct video_device *vfd;
int ret = 0;
@@ -750,17 +750,6 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
if (!pdev->dev.of_node)
return -ENODEV;
- rga = devm_kzalloc(&pdev->dev, sizeof(*rga) + 1 * sizeof(*rga->cores), GFP_KERNEL);
- if (!rga)
- return -ENOMEM;
-
- rga->hw = of_device_get_match_data(&pdev->dev);
- if (!rga->hw)
- return dev_err_probe(&pdev->dev, -ENODEV, "failed to get match data\n");
-
- spin_lock_init(&rga->ctrl_lock);
- mutex_init(&rga->mutex);
-
core = devm_kzalloc(&pdev->dev, sizeof(*core), GFP_KERNEL);
core->rga = rga;
core->dev = &pdev->dev;
@@ -947,9 +936,10 @@ static struct platform_driver rga_core_pdrv = {
static int rga_bind(struct device *dev)
{
+ struct rockchip_rga *rga = dev_get_drvdata(dev);
int ret;
- ret = component_bind_all(dev, NULL);
+ ret = component_bind_all(dev, rga);
if (ret) {
dev_err(dev, "component bind failed\n");
return ret;
@@ -974,6 +964,8 @@ static int rga_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct component_match *match = NULL;
struct device_node *core_node;
+ struct rockchip_rga *rga;
+ u8 num_cores = 0;
if (!match_desc)
return dev_err_probe(dev, -ENODEV, "missing platform data\n");
@@ -985,6 +977,7 @@ static int rga_probe(struct platform_device *pdev)
of_node_get(core_node);
component_match_add_release(dev, &match, component_release_of,
component_compare_of, core_node);
+ num_cores++;
/*
* As multi core is not implemented yet,
@@ -1000,6 +993,19 @@ static int rga_probe(struct platform_device *pdev)
dev, -ENODEV,
"no matching available component devices found\n");
+ rga = devm_kzalloc(dev, sizeof(*rga) + num_cores * sizeof(*rga->cores), GFP_KERNEL);
+ if (!rga)
+ return -ENOMEM;
+
+ rga->hw = match_desc->data;
+ if (!rga->hw)
+ return dev_err_probe(dev, -ENODEV, "failed to get match data\n");
+
+ spin_lock_init(&rga->ctrl_lock);
+ mutex_init(&rga->mutex);
+
+ dev_set_drvdata(dev, rga);
+
return component_master_add_with_match(dev, &rga_master_ops, match);
}
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 10/17] media: rockchip: rga: move rockchip_rga allocation to master probe
2026-06-05 22:06 ` [PATCH 10/17] media: rockchip: rga: move rockchip_rga allocation to master probe Sven Püschel
@ 2026-06-05 22:23 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:23 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Missing IS_ERR() check on the `match` pointer after `component_match_add_release()` can lead to a kernel panic.
--
commit 7be0968d7d9ce3f1b58fa4c450d0c7a606ff6014
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: move rockchip_rga allocation to master probe
Move the rockchip_rga struct allocation to the master component probe
function in preparation of enabling all cores. This also adjusts the
allocation to use the actual number of cores found in the of tree
instead of being fixed to one core.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index 178f45b8da940..11912bf5b6906 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
[ ... ]
> @@ -985,6 +977,7 @@ static int rga_probe(struct platform_device *pdev)
> of_node_get(core_node);
> component_match_add_release(dev, &match, component_release_of,
> component_compare_of, core_node);
> + num_cores++;
>
> /*
> * As multi core is not implemented yet,
> @@ -1000,6 +993,19 @@ static int rga_probe(struct platform_device *pdev)
> dev, -ENODEV,
> "no matching available component devices found\n");
[Severity: High]
If component_match_add_release() fails to allocate memory internally, it can
update the match pointer to ERR_PTR(-ENOMEM).
Since this code only checks if (!match), the error pointer bypasses the check
and is passed into component_master_add_with_match() at the end of rga_probe().
Could this lead to an invalid page fault when component_master_add_with_match()
unconditionally dereferences the match pointer?
> + rga = devm_kzalloc(dev, sizeof(*rga) + num_cores * sizeof(*rga->cores), GFP_KERNEL);
> + if (!rga)
> + return -ENOMEM;
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=10
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 11/17] media: rockchip: rga: move video device to the master
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (9 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 10/17] media: rockchip: rga: move rockchip_rga allocation to master probe Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:21 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 12/17] media: rockchip: rga: move core initialization from bind to probe Sven Püschel
` (5 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Move the video device allocation and registration to the master
component bind function in preparation for binding multiple cores
to the master. Moving it to the master bind function allows to
only register the v4l2 device when all cores have been successfully
bound to the master device. This also causes the video device to be
bound against the master platform device instead of a specific core.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 96 ++++++++++++++++---------------
1 file changed, 50 insertions(+), 46 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 11912bf5b6906..952377ae467f5 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -743,7 +743,6 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
struct platform_device *pdev = to_platform_device(dev);
struct rockchip_rga *rga = data;
struct rga_core *core;
- struct video_device *vfd;
int ret = 0;
int irq;
@@ -789,33 +788,11 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
goto err_put_clk;
}
- ret = v4l2_device_register(&pdev->dev, &rga->v4l2_dev);
- if (ret)
- goto err_put_clk;
- vfd = video_device_alloc();
- if (!vfd) {
- v4l2_err(&rga->v4l2_dev, "Failed to allocate video device\n");
- ret = -ENOMEM;
- goto unreg_v4l2_dev;
- }
- *vfd = rga_videodev;
- vfd->lock = &rga->mutex;
- vfd->v4l2_dev = &rga->v4l2_dev;
-
- video_set_drvdata(vfd, rga);
- rga->vfd = vfd;
-
platform_set_drvdata(pdev, core);
- rga->m2m_dev = v4l2_m2m_init(&rga_m2m_ops);
- if (IS_ERR(rga->m2m_dev)) {
- v4l2_err(&rga->v4l2_dev, "Failed to init mem2mem device\n");
- ret = PTR_ERR(rga->m2m_dev);
- goto rel_vdev;
- }
ret = pm_runtime_resume_and_get(core->dev);
if (ret < 0)
- goto rel_m2m;
+ goto err_put_clk;
rga->version = rga->hw->get_version(core);
@@ -824,23 +801,8 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
pm_runtime_put(core->dev);
- ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
- if (ret) {
- v4l2_err(&rga->v4l2_dev, "Failed to register video device\n");
- goto rel_m2m;
- }
-
- v4l2_info(&rga->v4l2_dev, "Registered %s as /dev/%s\n",
- vfd->name, video_device_node_name(vfd));
-
return 0;
-rel_m2m:
- v4l2_m2m_release(rga->m2m_dev);
-rel_vdev:
- video_device_release(vfd);
-unreg_v4l2_dev:
- v4l2_device_unregister(&rga->v4l2_dev);
err_put_clk:
pm_runtime_disable(core->dev);
@@ -851,13 +813,6 @@ static void rga_core_unbind(struct device *dev, struct device *master,
void *data)
{
struct rga_core *core = dev_get_drvdata(dev);
- struct rockchip_rga *rga = core->rga;
-
- v4l2_info(&rga->v4l2_dev, "Removing\n");
-
- v4l2_m2m_release(rga->m2m_dev);
- video_unregister_device(rga->vfd);
- v4l2_device_unregister(&rga->v4l2_dev);
pm_runtime_disable(core->dev);
}
@@ -937,6 +892,7 @@ static struct platform_driver rga_core_pdrv = {
static int rga_bind(struct device *dev)
{
struct rockchip_rga *rga = dev_get_drvdata(dev);
+ struct video_device *vfd;
int ret;
ret = component_bind_all(dev, rga);
@@ -945,11 +901,59 @@ static int rga_bind(struct device *dev)
return ret;
}
+ ret = v4l2_device_register(dev, &rga->v4l2_dev);
+ if (ret)
+ return ret;
+ vfd = video_device_alloc();
+ if (!vfd) {
+ v4l2_err(&rga->v4l2_dev, "Failed to allocate video device\n");
+ ret = -ENOMEM;
+ goto unreg_v4l2_dev;
+ }
+ *vfd = rga_videodev;
+ vfd->lock = &rga->mutex;
+ vfd->v4l2_dev = &rga->v4l2_dev;
+
+ video_set_drvdata(vfd, rga);
+ rga->vfd = vfd;
+
+ rga->m2m_dev = v4l2_m2m_init(&rga_m2m_ops);
+ if (IS_ERR(rga->m2m_dev)) {
+ v4l2_err(&rga->v4l2_dev, "Failed to init mem2mem device\n");
+ ret = PTR_ERR(rga->m2m_dev);
+ goto rel_vdev;
+ }
+
+ ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
+ if (ret) {
+ v4l2_err(&rga->v4l2_dev, "Failed to register video device\n");
+ goto rel_m2m;
+ }
+
+ v4l2_info(&rga->v4l2_dev, "Registered %s as /dev/%s\n",
+ vfd->name, video_device_node_name(vfd));
+
return 0;
+
+rel_m2m:
+ v4l2_m2m_release(rga->m2m_dev);
+rel_vdev:
+ video_device_release(vfd);
+unreg_v4l2_dev:
+ v4l2_device_unregister(&rga->v4l2_dev);
+ return ret;
}
static void rga_unbind(struct device *dev)
{
+ struct rockchip_rga *rga = dev_get_drvdata(dev);
+
+ v4l2_info(&rga->v4l2_dev, "Removing\n");
+
+ v4l2_m2m_release(rga->m2m_dev);
+ video_unregister_device(rga->vfd);
+ v4l2_device_unregister(&rga->v4l2_dev);
+
component_unbind_all(dev, NULL);
}
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 11/17] media: rockchip: rga: move video device to the master
2026-06-05 22:06 ` [PATCH 11/17] media: rockchip: rga: move video device to the master Sven Püschel
@ 2026-06-05 22:21 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:21 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [High] The master component bind function (`rga_bind`) fails to unbind previously bound components when a subsequent V4L2 initialization step fails.
- [High] Improper coupling of component lifecycle and devres managed allocations leads to potential resource leaks or use-after-free upon master probe failure.
- [Low] The `rga_core_bind` function uses an unregistered `v4l2_device` for logging, resulting in a missing device name prefix.
Pre-existing issues:
- [High] The driver conditionally uses a shared interrupt (`IRQF_SHARED`) but lacks necessary PM runtime synchronizations, risking invalid hardware access.
--
commit f2c043582ae1f347c444077ff268ed480a98b5bf
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: move video device to the master
This commit moves the video device allocation and registration to the master
component bind function. This prepares the driver for binding multiple cores
to the master device, ensuring the v4l2 device is only registered after all
cores have successfully bound.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index 11912bf5b6906..952377ae467f5 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
[ ... ]
> @@ -789,33 +788,11 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
> goto err_put_clk;
> }
>
> - ret = v4l2_device_register(&pdev->dev, &rga->v4l2_dev);
> - if (ret)
> - goto err_put_clk;
[ ... ]
> ret = pm_runtime_resume_and_get(core->dev);
> if (ret < 0)
> - goto rel_m2m;
> + goto err_put_clk;
[Severity: High]
This is a pre-existing issue, but the driver appears to use IRQF_SHARED
conditionally for its interrupts. Does rga_isr() need to use
pm_runtime_get_if_active() to ensure the device is not runtime-suspended
before accessing hardware registers?
Additionally, does rga_runtime_suspend() need to call synchronize_irq()
before disabling clocks to prevent in-flight shared IRQs from executing
while hardware is powered down?
[ ... ]
> @@ -937,6 +892,7 @@ static struct platform_driver rga_core_pdrv = {
> static int rga_bind(struct device *dev)
> {
> struct rockchip_rga *rga = dev_get_drvdata(dev);
> + struct video_device *vfd;
> int ret;
>
> ret = component_bind_all(dev, rga);
> @@ -945,11 +901,59 @@ static int rga_bind(struct device *dev)
> return ret;
> }
>
> + ret = v4l2_device_register(dev, &rga->v4l2_dev);
[Severity: Low]
Since component_bind_all() executes rga_core_bind() before
v4l2_device_register() initializes rga->v4l2_dev.name, does the v4l2_info()
logging call inside rga_core_bind() end up printing an empty device name
prefix?
[Severity: High]
Since rga_core_probe() allocates the shared IRQ using devm_request_irq(),
the IRQ remains active and tied to the core device's devres lifecycle.
If rga_bind() returns an error, the master probe fails and its devres
unwinds, freeing the master's rga structure. If the shared IRQ fires
after this point, could rga_isr() trigger a use-after-free or NULL pointer
dereference when accessing core->rga?
[ ... ]
> +rel_m2m:
> + v4l2_m2m_release(rga->m2m_dev);
> +rel_vdev:
> + video_device_release(vfd);
> +unreg_v4l2_dev:
> + v4l2_device_unregister(&rga->v4l2_dev);
> + return ret;
> }
[Severity: High]
If initialization fails at v4l2_device_register() or later, the function
jumps to these error paths. Should component_unbind_all() be called here
to prevent leaving the child cores in a bound state and leaking resources
acquired during their bind callbacks?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=11
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 12/17] media: rockchip: rga: move core initialization from bind to probe
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (10 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 11/17] media: rockchip: rga: move video device to the master Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:20 ` sashiko-bot
2026-06-05 22:06 ` [PATCH 13/17] media: rockchip: rga: bind all cores to the master Sven Püschel
` (4 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Move the core initialization from the core binding function to the core
probing function. This better matches the actual sequence, where the
core probe initializes most things and the bind function just binds the
core to the actual rga struct from the master device.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 83 ++++++++++++++++---------------
1 file changed, 42 insertions(+), 41 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 952377ae467f5..0413b8518dfc8 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -740,21 +740,49 @@ static int rga_parse_dt(struct rga_core *core)
static int rga_core_bind(struct device *dev, struct device *master, void *data)
{
- struct platform_device *pdev = to_platform_device(dev);
struct rockchip_rga *rga = data;
+ struct rga_core *core = dev_get_drvdata(dev);
+ int ret = 0;
+
+ core->rga = rga;
+
+ ret = pm_runtime_resume_and_get(core->dev);
+ if (ret < 0)
+ return ret;
+
+ rga->version = rga->hw->get_version(core);
+
+ v4l2_info(&rga->v4l2_dev, "HW Version: 0x%02x.%02x\n",
+ rga->version.major, rga->version.minor);
+
+ pm_runtime_put(core->dev);
+
+ rga->cores[0] = core;
+
+ return 0;
+}
+
+static const struct component_ops rga_core_ops = {
+ .bind = rga_core_bind,
+};
+
+static int rga_core_probe(struct platform_device *pdev)
+{
struct rga_core *core;
+ const struct rga_hw *hw;
int ret = 0;
int irq;
if (!pdev->dev.of_node)
return -ENODEV;
+ hw = of_device_get_match_data(&pdev->dev);
+ if (!hw)
+ return dev_err_probe(&pdev->dev, -ENODEV, "failed to get match data\n");
+
core = devm_kzalloc(&pdev->dev, sizeof(*core), GFP_KERNEL);
- core->rga = rga;
core->dev = &pdev->dev;
- rga->cores[0] = core;
-
ret = rga_parse_dt(core);
if (ret)
return dev_err_probe(&pdev->dev, ret, "Unable to parse OF data\n");
@@ -775,7 +803,7 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
}
ret = devm_request_irq(core->dev, irq, rga_isr,
- rga_has_internal_iommu(rga) ? 0 : IRQF_SHARED,
+ hw->has_internal_iommu ? 0 : IRQF_SHARED,
dev_name(core->dev), core);
if (ret < 0) {
dev_err(core->dev, "failed to request irq\n");
@@ -790,42 +818,6 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
platform_set_drvdata(pdev, core);
- ret = pm_runtime_resume_and_get(core->dev);
- if (ret < 0)
- goto err_put_clk;
-
- rga->version = rga->hw->get_version(core);
-
- v4l2_info(&rga->v4l2_dev, "HW Version: 0x%02x.%02x\n",
- rga->version.major, rga->version.minor);
-
- pm_runtime_put(core->dev);
-
- return 0;
-
-err_put_clk:
- pm_runtime_disable(core->dev);
-
- return ret;
-}
-
-static void rga_core_unbind(struct device *dev, struct device *master,
- void *data)
-{
- struct rga_core *core = dev_get_drvdata(dev);
-
- pm_runtime_disable(core->dev);
-}
-
-static const struct component_ops rga_core_ops = {
- .bind = rga_core_bind,
- .unbind = rga_core_unbind,
-};
-
-static int rga_core_probe(struct platform_device *pdev)
-{
- int ret = 0;
-
ret = component_add(&pdev->dev, &rga_core_ops);
if (ret < 0) {
dev_err(&pdev->dev, "failed to register component: %d", ret);
@@ -833,11 +825,20 @@ static int rga_core_probe(struct platform_device *pdev)
}
return 0;
+
+err_put_clk:
+ pm_runtime_disable(core->dev);
+
+ return ret;
}
static void rga_core_remove(struct platform_device *pdev)
{
+ struct rga_core *core = platform_get_drvdata(pdev);
+
component_del(&pdev->dev, &rga_core_ops);
+
+ pm_runtime_disable(core->dev);
}
static int __maybe_unused rga_runtime_suspend(struct device *dev)
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 12/17] media: rockchip: rga: move core initialization from bind to probe
2026-06-05 22:06 ` [PATCH 12/17] media: rockchip: rga: move core initialization from bind to probe Sven Püschel
@ 2026-06-05 22:20 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:20 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
New issues:
- [High] NULL pointer dereference in `rga_isr()` due to early shared IRQ registration.
- [Medium] Missing `pm_runtime_disable()` on `component_add()` error path.
Pre-existing issues:
- [High] Unprotected hardware access in shared interrupt handler.
- [High] Missing `synchronize_irq()` in runtime suspend for shared IRQs.
- [High] Use-After-Free when the master device is unbound.
- [High] Missing NULL check for `devm_kzalloc` allocation.
--
commit 8a3476b206ac8426da5cf08410222d4a74238c92
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: move core initialization from bind to probe
This commit moves the core initialization from the core binding function to
the core probing function to better match the actual initialization sequence.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index 952377ae467f5..0413b8518dfc8 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
[ ... ]
> @@ -740,21 +740,49 @@ static int rga_parse_dt(struct rga_core *core)
>
> static int rga_core_probe(struct platform_device *pdev)
> {
> struct rga_core *core;
> const struct rga_hw *hw;
> int ret = 0;
> int irq;
>
> if (!pdev->dev.of_node)
> return -ENODEV;
>
> + hw = of_device_get_match_data(&pdev->dev);
> + if (!hw)
> + return dev_err_probe(&pdev->dev, -ENODEV, "failed to get match data\n");
> +
> core = devm_kzalloc(&pdev->dev, sizeof(*core), GFP_KERNEL);
> - core->rga = rga;
> core->dev = &pdev->dev;
[Severity: High]
This isn't a bug introduced by this patch, but does this devm_kzalloc()
allocation need a null pointer check?
If the allocation fails under memory pressure, the immediate assignment to
core->dev = &pdev->dev will cause a null pointer dereference.
[ ... ]
> @@ -775,7 +803,7 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
> }
>
> ret = devm_request_irq(core->dev, irq, rga_isr,
> - rga_has_internal_iommu(rga) ? 0 : IRQF_SHARED,
> + hw->has_internal_iommu ? 0 : IRQF_SHARED,
> dev_name(core->dev), core);
[Severity: High]
Since devm_request_irq() is now called here in rga_core_probe(), could this
trigger an early null pointer dereference in rga_isr()?
When hw->has_internal_iommu is false, the interrupt is registered as
IRQF_SHARED. For shared interrupts, the handler can be invoked immediately by
another device sharing the line. However, core->rga is not initialized until
rga_core_bind() runs later.
If the interrupt service routine fires early, it will evaluate
rga->hw->handle_irq(core) and unconditionally dereference the null core->rga
pointer.
[Severity: High]
This is a pre-existing issue, but does the shared interrupt handler safely
access hardware registers?
Because rga_isr() is registered as IRQF_SHARED, it can be invoked by another
device while this device is runtime-suspended and unclocked. The handler
unconditionally calls rga->hw->handle_irq(core), which accesses registers.
Reading powered-off registers typically causes a synchronous external abort.
Should the interrupt service routine use pm_runtime_get_if_active() before
accessing registers?
[Severity: High]
This is another pre-existing issue related to the shared interrupt, but is
there a missing synchronize_irq() in the runtime suspend path?
In rga_runtime_suspend(), clocks are disabled without synchronizing the
interrupt. If an interrupt fires on another CPU just before clocks are
disabled, the handler could actively read registers exactly when the hardware
is powered down.
[Severity: High]
This is a pre-existing issue, but could tying the shared interrupt to the
core device lifetime cause a use-after-free when the master device is
unbound?
The shared interrupt is requested via devm_request_irq(core->dev, ...). When
the master device is unbound, its rga structure is freed. Since the core
device is not removed during unbind, the devm-managed interrupt is not freed,
leaving the shared handler active.
A subsequent interrupt will execute rga_isr(), which dereferences the
freed core->rga pointer.
[ ... ]
> @@ -833,11 +825,20 @@ static int rga_core_probe(struct platform_device *pdev)
>
> ret = component_add(&pdev->dev, &rga_core_ops);
> if (ret < 0) {
> dev_err(&pdev->dev, "failed to register component: %d", ret);
> return ret;
> }
[Severity: Medium]
If component_add() fails, does this bypass the pm_runtime_disable() cleanup
at the err_put_clk label?
Returning directly instead of using a goto would leak the device's Runtime
PM usage counter.
> return 0;
> +
> +err_put_clk:
> + pm_runtime_disable(core->dev);
> +
> + return ret;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=12
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 13/17] media: rockchip: rga: bind all cores to the master
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (11 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 12/17] media: rockchip: rga: move core initialization from bind to probe Sven Püschel
@ 2026-06-05 22:06 ` Sven Püschel
2026-06-05 22:23 ` sashiko-bot
2026-06-05 22:07 ` [PATCH 14/17] media: rockchip: rga: put all cores into first core iommu domain Sven Püschel
` (3 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Bind all core components to the master component. Previously only the
first core has been added to the master device to avoid creating
multiple video devices. As the video device creation has been moved to
the master component, it allows us to bind all cores without creating
additional video devices.
We expect that all cores to report the same version number, as we only
add cores with the same compatible value. This is important, as we
setup the command buffer before actually scheduling the work to a
specific core. Therefore adjusting command buffers depending on the
version register only works when all cores have the same value.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 22 +++++++++++-----------
drivers/media/platform/rockchip/rga/rga.h | 1 +
2 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 0413b8518dfc8..6add6c510c127 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -742,6 +742,7 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
{
struct rockchip_rga *rga = data;
struct rga_core *core = dev_get_drvdata(dev);
+ struct rockchip_rga_version version;
int ret = 0;
core->rga = rga;
@@ -750,14 +751,21 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
if (ret < 0)
return ret;
- rga->version = rga->hw->get_version(core);
+ version = rga->hw->get_version(core);
v4l2_info(&rga->v4l2_dev, "HW Version: 0x%02x.%02x\n",
- rga->version.major, rga->version.minor);
+ version.major, version.minor);
+
+ if (rga->num_cores) {
+ /* we are not the first core, expect that we have the same version */
+ if (rga->version.major != version.major || rga->version.minor != version.minor)
+ v4l2_warn(&rga->v4l2_dev, "Detected multi-core setup with different core versions!\n");
+ } else
+ rga->version = version;
pm_runtime_put(core->dev);
- rga->cores[0] = core;
+ rga->cores[rga->num_cores++] = core;
return 0;
}
@@ -983,14 +991,6 @@ static int rga_probe(struct platform_device *pdev)
component_match_add_release(dev, &match, component_release_of,
component_compare_of, core_node);
num_cores++;
-
- /*
- * As multi core is not implemented yet,
- * break out of the loop to only have one core per rockchip_rga struct.
- * Also put the node, which otherwise would've been done by the loop iteration.
- */
- of_node_put(core_node);
- break;
}
if (!match)
diff --git a/drivers/media/platform/rockchip/rga/rga.h b/drivers/media/platform/rockchip/rga/rga.h
index fcf1ef7d2029f..6237436b984eb 100644
--- a/drivers/media/platform/rockchip/rga/rga.h
+++ b/drivers/media/platform/rockchip/rga/rga.h
@@ -88,6 +88,7 @@ struct rockchip_rga {
const struct rga_hw *hw;
+ u8 num_cores;
struct rga_core *cores[];
};
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 13/17] media: rockchip: rga: bind all cores to the master
2026-06-05 22:06 ` [PATCH 13/17] media: rockchip: rga: bind all cores to the master Sven Püschel
@ 2026-06-05 22:23 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:23 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] Unbounded increment of `rga->num_cores` during component re-binding leads to a heap out-of-bounds write on `rga->cores`.
- [High] Missing `component_unbind_all()` in the error paths of `rga_bind()`.
- [Medium] Missing `pm_runtime_disable()` when `component_add()` fails in `rga_core_probe()`.
Pre-existing issues:
- [High] Missing protection against shared interrupts while the device is runtime-suspended causes invalid hardware accesses.
- [High] Prematurely freeing `m2m_dev` in `rga_unbind` causes a use-after-free when open file descriptors are subsequently closed or used concurrently.
--
commit 89fd9460c00f5ff4577ab76e9bec7d1677f60844
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: bind all cores to the master
This commit modifies the RGA driver to bind all core components to the
master device instead of only the first one. It ensures all added cores
report the same hardware version before initialization.
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
> @@ -750,14 +751,21 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
[ ... ]
> pm_runtime_put(core->dev);
>
> - rga->cores[0] = core;
> + rga->cores[rga->num_cores++] = core;
[Severity: High]
If the master device is unbound and then re-bound, does this continuously
increment rga->num_cores without bounds checking?
Looking at rga_unbind(), rga->num_cores is never reset to 0. If a re-bind
occurs, could this write the core pointer out-of-bounds into adjacent heap
memory?
[ ... ]
[Severity: High]
If component_bind_all() succeeds in rga_bind(), but a subsequent step like
v4l2_device_register() fails, the function returns the error directly:
rga_bind() {
...
ret = v4l2_device_register(dev, &rga->v4l2_dev);
if (ret)
return ret;
...
}
Does this skip calling component_unbind_all()? If so, could this permanently
leak the component bindings, and if retried, cause rga_core_bind() to execute
again and increment rga->num_cores out-of-bounds?
[Severity: Medium]
In rga_core_probe(), pm_runtime_enable() is called, but if component_add()
fails:
rga_core_probe() {
...
ret = component_add(&pdev->dev, &rga_core_ops);
if (ret < 0) {
dev_err(&pdev->dev, "failed to register component: %d", ret);
return ret;
}
...
}
Will returning directly here instead of jumping to the err_put_clk cleanup
label leak the runtime PM enablement state for the device?
[Severity: High]
This is a pre-existing issue, but the driver uses a shared IRQ (IRQF_SHARED)
when lacking an internal IOMMU. The interrupt handler rga_isr() accesses
hardware registers without checking if the device is active:
rga_isr() {
...
if (rga->hw->handle_irq(core)) {
...
}
Additionally, rga_runtime_suspend() disables clocks without synchronizing:
rga_runtime_suspend() {
...
clk_bulk_disable_unprepare(core->num_clks, core->clks);
...
}
Should rga_isr() use pm_runtime_get_if_active() before access, and should
rga_runtime_suspend() call synchronize_irq() to prevent unclocked hardware
access if a shared interrupt fires while the device is suspended?
[Severity: High]
This isn't a bug introduced by this patch, but rga_unbind() frees the
memory-to-memory device structure prematurely:
rga_unbind() {
...
v4l2_m2m_release(rga->m2m_dev);
video_unregister_device(rga->vfd);
...
}
If userspace applications still hold open file descriptors, closing them
later triggers rga_release() which calls v4l2_m2m_ctx_release(). Can this
lead to a use-after-free when dereferencing the already freed m2m_dev?
Might it be safer to defer the release of m2m_dev until the final V4L2
release callback?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=13
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 14/17] media: rockchip: rga: put all cores into first core iommu domain
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (12 preceding siblings ...)
2026-06-05 22:06 ` [PATCH 13/17] media: rockchip: rga: bind all cores to the master Sven Püschel
@ 2026-06-05 22:07 ` Sven Püschel
2026-06-05 22:23 ` sashiko-bot
2026-06-05 22:07 ` [PATCH 15/17] media: rockchip: rga: schedule jobs to multiple cores Sven Püschel
` (2 subsequent siblings)
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:07 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Put all cores into the iommu domain of the first core to allow them to
be used by any core. All buffers accessed by the hardware are allocated
on the first core, as the scheduling to a specific core is done after
the allocation. Therefore put all cores into the same domain to have the
same iommu mapping on all cores.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 6add6c510c127..9cebb461b3fd2 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -10,6 +10,7 @@
#include <linux/delay.h>
#include <linux/fs.h>
#include <linux/interrupt.h>
+#include <linux/iommu.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_platform.h>
@@ -757,6 +758,19 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
version.major, version.minor);
if (rga->num_cores) {
+ /* Attach to the first cores iommu */
+ struct iommu_domain *domain = iommu_get_domain_for_dev(rga->cores[0]->dev);
+
+ if (IS_ERR(domain)) {
+ dev_err(core->dev, "Couldn't get domain of the first core\n");
+ return PTR_ERR(domain);
+ }
+ ret = iommu_attach_device(domain, core->dev);
+ if (ret) {
+ dev_err(core->dev, "Couldn't attach to the domain of the first core\n");
+ return ret;
+ }
+
/* we are not the first core, expect that we have the same version */
if (rga->version.major != version.major || rga->version.minor != version.minor)
v4l2_warn(&rga->v4l2_dev, "Detected multi-core setup with different core versions!\n");
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 14/17] media: rockchip: rga: put all cores into first core iommu domain
2026-06-05 22:07 ` [PATCH 14/17] media: rockchip: rga: put all cores into first core iommu domain Sven Püschel
@ 2026-06-05 22:23 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:23 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Critical] Sharing the primary core's default IOMMU domain with secondary cores creates a severe lifetime mismatch leading to a Use-After-Free if the primary core is removed.
- [High] Incorrect error checking of `iommu_get_domain_for_dev()` using `IS_ERR()` causes a NULL pointer dereference and masks errors.
- [High] The new error paths for IOMMU attachment fail to release the PM runtime reference.
- [High] Missing `.unbind` callback leads to IOMMU domain reference leaks, teardown asymmetry, and array out-of-bounds writes upon driver rebind.
--
commit 3fbaee68845794dcb5e7c93168eef670817da976
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: put all cores into first core iommu domain
Put all cores into the iommu domain of the first core to allow them to
be used by any core.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
> @@ -757,6 +758,19 @@ static int rga_core_bind(struct device *dev, struct device *master, void *data)
> version.major, version.minor);
>
> if (rga->num_cores) {
> + /* Attach to the first cores iommu */
> + struct iommu_domain *domain = iommu_get_domain_for_dev(rga->cores[0]->dev);
> +
> + if (IS_ERR(domain)) {
[Severity: High]
Since iommu_get_domain_for_dev() returns NULL on failure (not an ERR_PTR),
won't this IS_ERR() check fail to catch a NULL domain?
If domain is NULL, it will be passed to iommu_attach_device(), which could
lead to a NULL pointer dereference when accessing domain->owner in
domain_iommu_ops_compatible().
> + dev_err(core->dev, "Couldn't get domain of the first core\n");
> + return PTR_ERR(domain);
[Severity: High]
Does this early return leak the PM runtime reference?
pm_runtime_resume_and_get(core->dev) is called earlier in this function, but
this return path doesn't call pm_runtime_put(core->dev).
> + }
> + ret = iommu_attach_device(domain, core->dev);
[Severity: Critical]
Does sharing the primary core's default IOMMU domain with secondary cores
create a lifetime mismatch?
The IOMMU subsystem couples the lifetime of a default domain strictly to its
original device group. If the platform device for the first core is removed,
the IOMMU core frees the default domain in iommu_deinit_device().
Secondary cores would still retain pointers to this freed domain, leading to a
use-after-free during DMA operations or unbinding.
[Severity: High]
Since we are attaching the device to the IOMMU domain here, how is it
detached when the driver is unbound?
There does not appear to be a corresponding .unbind callback in
rga_core_ops to call iommu_detach_device(). This would leave the IOMMU
group permanently attached, causing subsequent rebinds to fail with -EBUSY
or leak references.
Additionally, without an unbind callback to reset rga->num_cores, won't
re-binding the components cause rga->cores[rga->num_cores++] to continue
incrementing and overflow the array?
> + if (ret) {
> + dev_err(core->dev, "Couldn't attach to the domain of the first core\n");
> + return ret;
[Severity: High]
Could this return path also leak the PM runtime reference acquired by
pm_runtime_resume_and_get(core->dev)?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=14
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 15/17] media: rockchip: rga: schedule jobs to multiple cores
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (13 preceding siblings ...)
2026-06-05 22:07 ` [PATCH 14/17] media: rockchip: rga: put all cores into first core iommu domain Sven Püschel
@ 2026-06-05 22:07 ` Sven Püschel
2026-06-05 22:25 ` sashiko-bot
2026-06-05 22:07 ` [PATCH 16/17] arm64: dts: rockchip: add rga3 dt nodes to rk3588 Sven Püschel
2026-06-05 22:07 ` [PATCH 17/17] iommu/rockchip: disable fetch dte time limit Sven Püschel
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:07 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Schedule jobs to multiple cores to utilize all RGA cores. To avoid race
conditions when selecting the next free core a dedicated spinlock is added.
Note that this doesn't increase the max frame rate of a single
stream, as a context will wait for the job to finish before starting
the next device_run call.
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
drivers/media/platform/rockchip/rga/rga.c | 22 +++++++++++++++++++---
drivers/media/platform/rockchip/rga/rga.h | 1 +
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
index 9cebb461b3fd2..f00b7f99f2521 100644
--- a/drivers/media/platform/rockchip/rga/rga.c
+++ b/drivers/media/platform/rockchip/rga/rga.c
@@ -38,15 +38,31 @@ static void device_run(void *prv)
{
struct rga_ctx *ctx = prv;
struct rockchip_rga *rga = ctx->rga;
- struct rga_core *core = rga->cores[0];
+ struct rga_core *core = NULL;
struct vb2_v4l2_buffer *src, *dst;
unsigned long flags;
int ret;
+ unsigned int i;
+
+ spin_lock_irqsave(&rga->cores_lock, flags);
+ for (i = 0; i < rga->num_cores; i++) {
+ if (!rga->cores[i]->curr) {
+ core = rga->cores[i];
+ core->curr = ctx;
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&rga->cores_lock, flags);
+
+ WARN_ONCE(!core, "No free core although max parallel jobs matches the core count!\n");
+ if (!core)
+ return;
ret = pm_runtime_resume_and_get(core->dev);
if (ret < 0) {
v4l2_m2m_buf_done_and_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx,
VB2_BUF_STATE_ERROR);
+ core->curr = NULL;
return;
}
@@ -58,8 +74,6 @@ static void device_run(void *prv)
}
spin_unlock_irqrestore(&rga->ctrl_lock, flags);
- core->curr = ctx;
-
src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
src->sequence = ctx->osequence++;
@@ -946,6 +960,7 @@ static int rga_bind(struct device *dev)
ret = PTR_ERR(rga->m2m_dev);
goto rel_vdev;
}
+ v4l2_m2m_set_max_parallel_jobs(rga->m2m_dev, rga->num_cores);
ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
if (ret) {
@@ -1021,6 +1036,7 @@ static int rga_probe(struct platform_device *pdev)
return dev_err_probe(dev, -ENODEV, "failed to get match data\n");
spin_lock_init(&rga->ctrl_lock);
+ spin_lock_init(&rga->cores_lock);
mutex_init(&rga->mutex);
dev_set_drvdata(dev, rga);
diff --git a/drivers/media/platform/rockchip/rga/rga.h b/drivers/media/platform/rockchip/rga/rga.h
index 6237436b984eb..c0dfacdb6f212 100644
--- a/drivers/media/platform/rockchip/rga/rga.h
+++ b/drivers/media/platform/rockchip/rga/rga.h
@@ -85,6 +85,7 @@ struct rockchip_rga {
struct mutex mutex;
/* ctrl parm lock */
spinlock_t ctrl_lock;
+ spinlock_t cores_lock;
const struct rga_hw *hw;
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 15/17] media: rockchip: rga: schedule jobs to multiple cores
2026-06-05 22:07 ` [PATCH 15/17] media: rockchip: rga: schedule jobs to multiple cores Sven Püschel
@ 2026-06-05 22:25 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:25 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] Race condition in device_run() error path and unhandled early return permanently deadlock the mem2mem queue.
- [Medium] Data race on core->curr pointer when clearing job assignments.
Pre-existing issues:
- [High] Out-of-bounds array write on unbind/rebind of components because rga->num_cores is never reset.
- [High] Missing pm_runtime_get_if_active() before hardware access in shared IRQ handler.
- [High] Missing synchronize_irq() in runtime suspend callback for shared IRQ.
--
commit f03021dff03be4350cbe7b0ab4ce4a9a54a7602f
Author: Sven Püschel <s.pueschel@pengutronix.de>
media: rockchip: rga: schedule jobs to multiple cores
This commit modifies the Rockchip RGA driver to schedule jobs across
multiple cores to better utilize hardware. A dedicated spinlock is added
to protect the selection of the next free core from race conditions.
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index 9cebb461b3fd2..f00b7f99f2521 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
> @@ -38,15 +38,31 @@ static void device_run(void *prv)
> {
> struct rga_ctx *ctx = prv;
> struct rockchip_rga *rga = ctx->rga;
> - struct rga_core *core = rga->cores[0];
> + struct rga_core *core = NULL;
> struct vb2_v4l2_buffer *src, *dst;
> unsigned long flags;
> int ret;
> + unsigned int i;
> +
> + spin_lock_irqsave(&rga->cores_lock, flags);
> + for (i = 0; i < rga->num_cores; i++) {
> + if (!rga->cores[i]->curr) {
> + core = rga->cores[i];
> + core->curr = ctx;
> + break;
> + }
> + }
> + spin_unlock_irqrestore(&rga->cores_lock, flags);
> +
> + WARN_ONCE(!core, "No free core although max parallel jobs matches the core count!\n");
> + if (!core)
> + return;
>
> ret = pm_runtime_resume_and_get(core->dev);
> if (ret < 0) {
> v4l2_m2m_buf_done_and_job_finish(rga->m2m_dev, ctx->fh.m2m_ctx,
> VB2_BUF_STATE_ERROR);
> + core->curr = NULL;
[Severity: High]
Could this error path introduce a race condition that permanently deadlocks the
mem2mem queue?
v4l2_m2m_buf_done_and_job_finish() is called before core->curr is cleared,
which asynchronously schedules the next job via a workqueue.
If the worker executes device_run() on another CPU before the current CPU marks
the core as free, the worker will see the core as busy. If no other core is
free, device_run() will trigger the WARN_ONCE and return early without
notifying the v4l2 mem2mem framework:
if (!core)
return;
Because the new job is already marked as TRANS_RUNNING, returning early without
signaling job completion permanently hangs the job queue, and can cause a
D-state deadlock if userspace attempts to close the device file descriptor.
[Severity: Medium]
Additionally, could there be a data race when clearing core->curr here?
While cores_lock is acquired when scanning and assigning a core earlier in
device_run(), it is not held when releasing the core in this error path.
Concurrent unprotected reads and writes to core->curr across CPUs could lead to
undefined behavior.
> return;
> }
[ ... ]
[Severity: Medium]
This isn't a bug introduced by this patch, but does modifying core->curr in
rga_isr() without holding cores_lock introduce a data race?
Since cores_lock was introduced to protect against concurrent selections,
shouldn't it also be held when releasing the core in the interrupt handler?
drivers/media/platform/rockchip/rga/rga.c:rga_isr() {
...
core->curr = NULL;
...
}
[Severity: High]
This isn't a bug introduced by this patch, but does the unbind/rebind path
cause an out-of-bounds array write in rga->cores[]?
During component bind, rga_core_bind() appends to the rga->cores array:
drivers/media/platform/rockchip/rga/rga.c:rga_core_bind() {
...
rga->cores[rga->num_cores++] = core;
...
}
However, the master unbind callback and the component system do not reset
rga->num_cores to 0:
drivers/media/platform/rockchip/rga/rga.c:rga_unbind() {
...
component_unbind_all(dev, NULL);
}
The struct rockchip_rga structure is not freed when a component is unbound. If
a component is unbound and subsequently re-bound, rga_core_bind() will continue
incrementing rga->num_cores past the initially allocated length, corrupting
memory.
[Severity: High]
This isn't a bug introduced by this patch, but is the driver missing a
pm_runtime_get_if_active() check before accessing hardware in the shared IRQ
handler?
The driver registers rga_isr() with IRQF_SHARED if there is no internal IOMMU.
When an interrupt is triggered by another device on the shared line, rga_isr()
blindly calls handle_irq(), which reads hardware registers:
drivers/media/platform/rockchip/rga/rga.c:rga_isr() {
...
if (rga->hw->handle_irq(core)) {
...
}
If the RGA core is runtime-suspended and its clocks are disabled, accessing its
registers will cause an external abort / SError.
[Severity: High]
This isn't a bug introduced by this patch, but is there a missing
synchronize_irq() in the runtime suspend callback for shared IRQs?
In rga_runtime_suspend(), the driver disables clocks unconditionally:
drivers/media/platform/rockchip/rga/rga.c:rga_runtime_suspend() {
...
clk_bulk_disable_unprepare(core->num_clks, core->clks);
...
}
Because the driver can use a shared IRQ, an interrupt handler could be
mid-flight on another CPU when the clocks are disabled, causing an external
abort when the handler accesses the now-unclocked hardware.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=15
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 16/17] arm64: dts: rockchip: add rga3 dt nodes to rk3588
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (14 preceding siblings ...)
2026-06-05 22:07 ` [PATCH 15/17] media: rockchip: rga: schedule jobs to multiple cores Sven Püschel
@ 2026-06-05 22:07 ` Sven Püschel
2026-06-05 22:07 ` [PATCH 17/17] iommu/rockchip: disable fetch dte time limit Sven Püschel
16 siblings, 0 replies; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:07 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel
Add devicetree nodes for the RGA3 (Raster Graphics Acceleration 3)
peripheral in the RK3588.
The existing rga node refers to the RGA2-Enhanced peripheral. The RK3588
contains one RGA2-Enhanced core and two RGA3 cores. Both feature a similar
functionality of scaling, cropping and rotating of up to two input
images into one output image. Key differences of the RGA3 are:
- supports 10bit YUV output formats
- supports 8x8 tiles and FBCD as inputs and outputs
- supports BT2020 color space conversion
- max output resolution of (8192-64)x(8192-64)
- MMU can map up to 32G DDR RAM
- fully planar formats (3 planes) are not supported
- max scale up/down factor of 8 (RGA2 allows up to 16)
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
Link: https://patch.msgid.link/20260521-spu-rga3-v7-28-3f33e8c7145f@pengutronix.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---
(cherry picked from commit 25ee898961a2c661e4cd72bc98f0060f1cd11222)
picked from linux-next to have the necessary RGA3 nodes available.
---
arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 44 +++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
index 4fb8888c281c8..a4f44af512375 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
@@ -1262,6 +1262,50 @@ vpu121_mmu: iommu@fdb50800 {
#iommu-cells = <0>;
};
+ rga3_core0: rga@fdb60000 {
+ compatible = "rockchip,rk3588-rga3";
+ reg = <0x0 0xfdb60000 0x0 0x200>;
+ interrupts = <GIC_SPI 114 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_RGA3_0>, <&cru HCLK_RGA3_0>, <&cru CLK_RGA3_0_CORE>;
+ clock-names = "aclk", "hclk", "sclk";
+ resets = <&cru SRST_RGA3_0_CORE>, <&cru SRST_A_RGA3_0>, <&cru SRST_H_RGA3_0>;
+ reset-names = "core", "axi", "ahb";
+ power-domains = <&power RK3588_PD_RGA30>;
+ iommus = <&rga3_0_mmu>;
+ };
+
+ rga3_0_mmu: iommu@fdb60f00 {
+ compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+ reg = <0x0 0xfdb60f00 0x0 0x100>;
+ interrupts = <GIC_SPI 114 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_RGA3_0>, <&cru HCLK_RGA3_0>;
+ clock-names = "aclk", "iface";
+ #iommu-cells = <0>;
+ power-domains = <&power RK3588_PD_RGA30>;
+ };
+
+ rga3_core1: rga@fdb70000 {
+ compatible = "rockchip,rk3588-rga3";
+ reg = <0x0 0xfdb70000 0x0 0x200>;
+ interrupts = <GIC_SPI 115 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_RGA3_1>, <&cru HCLK_RGA3_1>, <&cru CLK_RGA3_1_CORE>;
+ clock-names = "aclk", "hclk", "sclk";
+ resets = <&cru SRST_RGA3_1_CORE>, <&cru SRST_A_RGA3_1>, <&cru SRST_H_RGA3_1>;
+ reset-names = "core", "axi", "ahb";
+ power-domains = <&power RK3588_PD_RGA31>;
+ iommus = <&rga3_1_mmu>;
+ };
+
+ rga3_1_mmu: iommu@fdb70f00 {
+ compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+ reg = <0x0 0xfdb70f00 0x0 0x100>;
+ interrupts = <GIC_SPI 115 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_RGA3_1>, <&cru HCLK_RGA3_1>;
+ clock-names = "aclk", "iface";
+ #iommu-cells = <0>;
+ power-domains = <&power RK3588_PD_RGA31>;
+ };
+
rga: rga@fdb80000 {
compatible = "rockchip,rk3588-rga", "rockchip,rk3288-rga";
reg = <0x0 0xfdb80000 0x0 0x180>;
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH 17/17] iommu/rockchip: disable fetch dte time limit
2026-06-05 22:06 [PATCH 00/17] media: rockchip: rga: Add multi-core support Sven Püschel
` (15 preceding siblings ...)
2026-06-05 22:07 ` [PATCH 16/17] arm64: dts: rockchip: add rga3 dt nodes to rk3588 Sven Püschel
@ 2026-06-05 22:07 ` Sven Püschel
2026-06-05 22:26 ` sashiko-bot
16 siblings, 1 reply; 30+ messages in thread
From: Sven Püschel @ 2026-06-05 22:07 UTC (permalink / raw)
To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Philipp Zabel
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
kernel, Detlev Casanova, Michael Tretter, Sven Püschel,
Simon Xue, Joerg Roedel
From: Simon Xue <xxm@rock-chips.com>
Disable the Bit 31 of the AUTO_GATING iommu register, as it causes
hangups with the RGA3 (Raster Graphics Acceleration 3) peripheral.
The RGA3 register description of the TRM already states that the bit
must be set to 1. The vendor kernel sets the bit unconditionally to
1 to fix VOP (Video Output Processor) screen black issues. This patch
squashes the 2 vendor kernel commits with the following commit messages:
Master fetch data and cpu update page table may work in parallel, may
have the following procedure:
master cpu
fetch dte update page tabl
| |
(make dte invalid) <- zap iotlb entry
| |
fetch dte again
(make dte invalid) <- zap iotlb entry
| |
fetch dte again
(make dte invalid) <- zap iotlb entry
| |
fetch dte again
(make iommu block) <- zap iotlb entry
New iommu version has the above bug, if fetch dte consecutively four
times, then it will be blocked. Fortunately, we can set bit 31 of
register MMU_AUTO_GATING to 1 to make it work as old version which does
not have this issue.
This issue only appears on RV1126 so far, so make a workaround dedicated
to "rockchip,rv1126" machine type.
iommu/rockchip: fix vop blocked and screen black on RK356X and RK3588
RK3568 and RK3588 has the same issue as RV1126/RV1109 that caused by
dte fetch time limit, So we can set BIT(31) of register 0x24 default
to 1 as a workaround.
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
(cherry picked from commit 8d4346ecd4950ae08cc76a6de327c264e846758c)
picked from the next branch of
https://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git
for a convenient usage of the patches.
---
drivers/iommu/rockchip-iommu.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 0013cf196c573..87ae036d64145 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -76,6 +76,8 @@
#define SPAGE_ORDER 12
#define SPAGE_SIZE (1 << SPAGE_ORDER)
+#define DISABLE_FETCH_DTE_TIME_LIMIT BIT(31)
+
/*
* Support mapping any size that fits in one page table:
* 4 KiB to 4 MiB
@@ -930,6 +932,7 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
struct iommu_domain *domain = iommu->domain;
struct rk_iommu_domain *rk_domain = to_rk_domain(domain);
int ret, i;
+ u32 auto_gate;
ret = clk_bulk_enable(iommu->num_clocks, iommu->clocks);
if (ret)
@@ -948,6 +951,11 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
rk_ops->mk_dtentries(rk_domain->dt_dma));
rk_iommu_base_command(iommu->bases[i], RK_MMU_CMD_ZAP_CACHE);
rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+
+ /* Workaround for iommu blocked, BIT(31) default to 1 */
+ auto_gate = rk_iommu_read(iommu->bases[i], RK_MMU_AUTO_GATING);
+ auto_gate |= DISABLE_FETCH_DTE_TIME_LIMIT;
+ rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, auto_gate);
}
ret = rk_iommu_enable_paging(iommu);
--
2.54.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH 17/17] iommu/rockchip: disable fetch dte time limit
2026-06-05 22:07 ` [PATCH 17/17] iommu/rockchip: disable fetch dte time limit Sven Püschel
@ 2026-06-05 22:26 ` sashiko-bot
0 siblings, 0 replies; 30+ messages in thread
From: sashiko-bot @ 2026-06-05 22:26 UTC (permalink / raw)
To: Sven Püschel; +Cc: linux-media
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Unconditional application of a hardware workaround to all Rockchip IOMMU versions.
--
commit 927dc2f6e4d804f46fcc9640f9c6e784b79d0d3b
Author: Simon Xue <xxm@rock-chips.com>
iommu/rockchip: disable fetch dte time limit
This commit sets bit 31 of the AUTO_GATING IOMMU register to 1 to fix an
issue where the IOMMU blocks after fetching DTE consecutively four times.
It squashes two vendor commits covering RV1126, RK3568, and RK3588 hardware.
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 0013cf196c573..87ae036d64145 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
[ ... ]
> @@ -948,6 +951,11 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
> rk_ops->mk_dtentries(rk_domain->dt_dma));
> rk_iommu_base_command(iommu->bases[i], RK_MMU_CMD_ZAP_CACHE);
> rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
> +
> + /* Workaround for iommu blocked, BIT(31) default to 1 */
> + auto_gate = rk_iommu_read(iommu->bases[i], RK_MMU_AUTO_GATING);
> + auto_gate |= DISABLE_FETCH_DTE_TIME_LIMIT;
> + rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, auto_gate);
> }
[Severity: High]
The commit message indicates that this issue affects new IOMMU versions
(specifically RV1126, RK3568, and RK3588) and mentions that older versions
do not have this bug.
Does this code apply the workaround unconditionally to all hardware versions
managed by the driver?
If the RK_MMU_AUTO_GATING register is unimplemented, reserved, or has a
different meaning on legacy v1 silicon (such as RK3288 or RK3399), could
reading and writing it trigger an asynchronous bus error or corrupt
undocumented hardware state?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260606-spu-rga3multicore-v1-0-3ec2b15675f7@pengutronix.de?part=17
^ permalink raw reply [flat|nested] 30+ messages in thread