Linux Media Controller development
* [PATCH 00/17] media: rockchip: rga: Add multi-core support
@ 2026-06-05 22:06 Sven Püschel
From: Sven Püschel @ 2026-06-05 22:06 UTC (permalink / raw)
  To: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
	Heiko Stuebner, Philipp Zabel
  Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
	kernel, Detlev Casanova, Michael Tretter, Sven Püschel,
	Simon Xue, Joerg Roedel

Add multi-core support to the RGA (Raster Graphic Accelerator) driver
for Rockchip SoCs. This works by scheduling the given work to multiple
identical RGA cores. Previously, additional identical cores were rejected
with -ENODEV during probe. This avoided exposing multiple video devices
for identical cores, which would have broken the ABI once in-kernel
scheduling was added.

This series targets the RK3588 SoC, which has one RGA2-Enhance core
and two RGA3 cores (see [1] for an overview of the different RGA cores).
The slimmed down RK3576 SoC also features two RGA2-Pro
(also described as RGA2.5) cores, but is currently not supported by
the driver. Tests are done on a Radxa Rock 5T SBC.

The scheduling is done only on a context level, so a single stream (which
uses only one mem2mem context) sees no performance gain. At least N
parallel streams are therefore necessary to utilize N cores. This avoids
the more complex buffer handling required to keep the frame ordering
intact when one core is slightly faster than the other (e.g. due to
memory transfer timings or different clocks).
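
As a rough userspace model of this context-level scheduling (not the
driver code; every model_* name below is made up for illustration, and
the one-job-per-context rule is my reading of the paragraph above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CORES 8

/* Hypothetical stand-in for one mem2mem context (one stream). */
struct model_ctx {
	unsigned int pending;	/* frames queued by userspace */
	bool running;		/* one frame of this context is on a core */
};

/* Hypothetical stand-in for the set of identical cores. */
struct model_cores {
	struct model_ctx *owner[MODEL_MAX_CORES];	/* NULL means idle */
	size_t num_cores;
};

/*
 * Hand at most one frame per context to an idle core. A context that
 * already has a frame in flight is skipped, so its frames can never
 * overtake each other on a faster core. Returns the number of jobs
 * started by this call.
 */
static size_t model_dispatch(struct model_cores *cores,
			     struct model_ctx *ctxs, size_t num_ctxs)
{
	size_t started = 0;

	assert(cores->num_cores <= MODEL_MAX_CORES);

	for (size_t i = 0; i < num_ctxs; i++) {
		struct model_ctx *ctx = &ctxs[i];

		if (!ctx->pending || ctx->running)
			continue;

		for (size_t c = 0; c < cores->num_cores; c++) {
			if (cores->owner[c])
				continue;
			cores->owner[c] = ctx;
			ctx->running = true;
			ctx->pending--;
			started++;
			break;
		}
	}

	return started;
}
```

In this model, a single context with five queued frames occupies only
one of three cores per call, while four contexts with queued frames
occupy all three and leave the fourth context waiting.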

While the work is based on Detlev Casanova's multi-core series for the
rkvdec driver [2], it differs in two major aspects:

(1) It doesn't directly call v4l2_m2m_job_finish to mark the current job
as finished in the device_run callback. Detlev used this to trick the
m2m framework into directly scheduling the next job. This looked like a
dirty hack and had me running into some of its pitfalls (e.g. the
difference between the v4l2_m2m_buf_done and the newly introduced
v4l2_m2m_buf_done_manual function).
Instead I've dropped the curr_ctx member of the v4l2_m2m_dev struct and
added a max_parallel_jobs member to specify the maximum number of
parallel jobs. A driver sets this limit with the newly introduced
v4l2_m2m_set_max_parallel_jobs function. The RGA driver uses it to set
its number of parallel jobs to its number of available cores. The m2m
framework then schedules the first N jobs on its job queue to the
device_run callback instead of only one.

(2) Instead of attaching an identical RGA core on probe to the first
probed RGA core instance, use component helpers to add all cores as
components to a virtual platform device. This has the advantage of only
creating the video device after all cores have been probed successfully
and tearing it down if one core is being removed (e.g. via sysfs),
which otherwise could lead to nasty memory bugs. The implementation is
based on the etnaviv GPU driver. As the virtual platform device
doesn't have an iommu, we still allocate all relevant drives on the first
core, which shares its iommu domain with all other cores.
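
The lifetime rule from point (2) can be sketched as a small userspace
model. This is not the component helper API and not the driver code.
The model_* names and the video_registered flag are hypothetical and
only illustrate "register once all cores are present, unregister as
soon as one goes away":

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the virtual master device. */
struct model_master {
	unsigned int expected_cores;	/* cores the master waits for */
	unsigned int bound_cores;	/* cores that are currently present */
	bool video_registered;		/* models the video device lifetime */
};

/* The video device exists only while every expected core is present. */
static void model_master_update(struct model_master *m)
{
	m->video_registered = (m->bound_cores == m->expected_cores);
}

/* A core finished probing and was added as a component. */
static void model_core_add(struct model_master *m)
{
	assert(m->bound_cores < m->expected_cores);
	m->bound_cores++;
	model_master_update(m);
}

/* A core was removed, e.g. unbound via sysfs. */
static void model_core_del(struct model_master *m)
{
	assert(m->bound_cores > 0);
	m->bound_cores--;
	model_master_update(m);
}
```

With expected_cores set to 2, the flag stays false after the first
model_core_add(), becomes true after the second, and drops back to
false on the first model_core_del().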

v4l2-compliance results:
    v4l2-compliance 1.32.0, 64 bits, 64-bit time_t
    ...
    	Card type        : rga2
    ...
    Total for rockchip-rga device /dev/video0: 48, Succeeded: 48, Failed: 0, Warnings: 0
    
    v4l2-compliance 1.32.0, 64 bits, 64-bit time_t
    ...
    	Card type        : rga3
    ...
    Total for rockchip-rga device /dev/video1: 48, Succeeded: 48, Failed: 0, Warnings: 0

The DTS and iommu changes at the end are picked out of other next trees
to provide an easy way to actually test the changes with an RGA3 on a
rk3588 SoC. They'll be dropped when they get into media/next.

Patch 1-3 address review comments from my last RGA3 patch series
Patch 4 additional driver cleanup
Patch 5 implements support for parallel jobs in the m2m framework
Patch 6-8 add multi core preparations to the driver
Patch 9-13 rework the driver to use component helpers
Patch 14 puts all cores into the same iommu domain
Patch 15 enables the multi-core support
Patch 16-17 just pick patches required for testing

[1] https://codeberg.org/airockchip/librga/src/branch/main/docs/Rockchip_Developer_Guide_RGA_EN.md#design-index
[2] https://lore.kernel.org/linux-media/20260409-rkvdec-multicore-v1-0-62b316abf0f7@collabora.com/

Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
---
Simon Xue (1):
      iommu/rockchip: disable fetch dte time limit

Sven Püschel (16):
      media: rockchip: rga: zero cmdbuf in shared code
      media: rockchip: rga: add comment about pixel alignment for YUV formats
      media: rockchip: rga: move early return into if condition in vidioc_enum_fmt
      media: rockchip: rga: removed unused regmap member
      media: v4l2-mem2mem: support running multiple jobs in parallel
      media: rockchip: rga:  move power handling to device_run
      media: rockchip: rga: adjust get_version to return the version
      media: rockchip: rga: add rga_core structure
      media: rockchip: rga: use components to manage multiple cores
      media: rockchip: rga: move rockchip_rga allocation to master probe
      media: rockchip: rga: move video device to the master
      media: rockchip: rga: move core initialization from bind to probe
      media: rockchip: rga: bind all cores to the master
      media: rockchip: rga: put all cores into first core iommu domain
      media: rockchip: rga: schedule jobs to multiple cores
      arm64: dts: rockchip: add rga3 dt nodes to rk3588

 arch/arm64/boot/dts/rockchip/rk3588-base.dtsi |  44 +++
 drivers/iommu/rockchip-iommu.c                |   8 +
 drivers/media/platform/rockchip/rga/rga-buf.c |  16 +-
 drivers/media/platform/rockchip/rga/rga-hw.c  |  40 +-
 drivers/media/platform/rockchip/rga/rga.c     | 501 +++++++++++++++++++-------
 drivers/media/platform/rockchip/rga/rga.h     |  45 ++-
 drivers/media/platform/rockchip/rga/rga3-hw.c |  32 +-
 drivers/media/v4l2-core/v4l2-mem2mem.c        |  89 +++--
 include/media/v4l2-mem2mem.h                  |   3 +
 9 files changed, 541 insertions(+), 237 deletions(-)
---
base-commit: 6a75e3d4f6428b90f398354212e3a2e0172851d6
change-id: 20260602-spu-rga3multicore-ae8c8caf01e9

Best regards,
--  
Sven Püschel <s.pueschel@pengutronix.de>

