From: MidG971 <midgy971@gmail.com>
To: Tomeu Vizoso <tomeu@tomeuvizoso.net>, Oded Gabbay <ogabbay@kernel.org>
Cc: Rob Herring <robh@kernel.org>,
Krzysztof Kozlowski <krzk+dt@kernel.org>,
Conor Dooley <conor+dt@kernel.org>,
Heiko Stuebner <heiko@sntech.de>,
dri-devel@lists.freedesktop.org, devicetree@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
Midgy BALON <midgy971@gmail.com>
Subject: [PATCH v2 0/4] accel: rocket: Add RK3568 NPU support
Date: Fri, 29 May 2026 17:58:20 +0200
Message-ID: <20260529155824.3099831-1-midgy971@gmail.com>
From: Midgy BALON <midgy971@gmail.com>
This series adds Rockchip RK3568 support to the upstream Rocket accel
driver (drivers/accel/rocket/), tested on a Radxa ROCK 3B board running
Linux 6.19-rc5.
The RK3568 carries a single NVDLA-derived NPU core (0.8 TOPS), the same
IP family as the three-core RK3588 NPU already supported by the driver.
The hardware register layout (pc/cna/core regions, interrupt, IOMMU) is
identical; the differences are:
- 32-bit DMA address limit (NPU AXI bus and IOMMU page walker are 32-bit)
- Requires explicit PVTPLL initialisation via two TF-A SCMI calls before
the NPU NOC bus can be de-idled
- Requires explicit PMU writes to power on the NPU domain (because the
RK3568 power domain RK3568_PD_NPU is always_on so the generic
pm-domains callback is a no-op) and de-idle the NPU NOC bus
Patch 1 introduces a per-SoC rocket_soc_data abstraction (dma_bits and
optional noc_init callback) plumbed via of_device_get_match_data(), and
adds RK3568 SoC support on top of it. The DMA mask for the parent
DRM facade device is chosen based on the narrowest core present
(32-bit if any RK3568 core is in the system).
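For reviewers who want the mask selection spelled out, here is a minimal userspace model of the idea, not the code from patch 1. The struct name and the dma_bits/noc_init fields follow the description above; the 40-bit value for RK3588, the stub callback and both helper functions are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Hypothetical userspace model of the per-SoC data described above. */
struct rocket_soc_data {
	unsigned int dma_bits;        /* DMA address width of this SoC's NPU */
	int (*noc_init)(void *core);  /* optional; NULL when nothing is needed */
};

/* Stand-in for the RK3568 callback; the real one would do SCMI/PMU work. */
int rk3568_noc_init_stub(void *core)
{
	(void)core;
	return 0;
}

const struct rocket_soc_data rk3568_soc_data = {
	.dma_bits = 32,
	.noc_init = rk3568_noc_init_stub,
};

/* Assumption: 40 bits for RK3588; check the driver for the real value. */
const struct rocket_soc_data rk3588_soc_data = {
	.dma_bits = 40,
	.noc_init = NULL,
};

/*
 * Pick the narrowest DMA width among all cores present, so the parent
 * device never hands out an address that some core cannot reach.
 * Returns 64 when the list is empty.
 */
unsigned int narrowest_dma_bits(const struct rocket_soc_data *const *cores,
				size_t n)
{
	unsigned int bits = 64;

	for (size_t i = 0; i < n; i++)
		if (cores[i]->dma_bits < bits)
			bits = cores[i]->dma_bits;
	return bits;
}

/* Userspace equivalent of the kernel's DMA_BIT_MASK() for 1..64 bits. */
unsigned long long dma_bit_mask(unsigned int bits)
{
	return bits >= 64 ? ~0ULL : (1ULL << bits) - 1;
}
```

In this model, a core list containing any RK3568 entry yields 32, and dma_bit_mask(32) gives 0xffffffff, which is the value the parent device would then use.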
Patch 2 documents the new rk3568-rknn-core compatible and the
rockchip,pmu phandle that RK3568 requires; the sram-supply property
becomes conditional (RK3588-only).
Patches 3-4 add the RK3568 NPU and IOMMU nodes to rk356x-base.dtsi and
enable them on the Radxa ROCK 3B.
Verified on Radxa ROCK 3B (RK3568, 8 GB RAM):
- /dev/accel/accel0 created at boot
- dmesg: "Rockchip NPU core 0 version: 0"
- IOMMU domain attached per open()
- Job submission path complete: regcmd reaches the NPU's program
controller, PC processes all 135 regcmd entries, broadcasts to
sub-units, and advances to BSP-equivalent completion state
(PC_TASKST=0x11000)
Status of end-to-end inference: NOT YET WORKING. After 12 days of
investigation comparing rocket's behaviour against the vendor BSP RKNPU
driver, the NPU's MMIO state at submission time matches BSP byte-for-byte
(CNA configs, sub-unit OP_ENABLE registers, CBUF_CON0, etc.), but no
sub-unit transitions to its EXECUTER state and the completion IRQ never
fires. The kernel driver and DT infrastructure in this series stand on
their own (the driver loads, the IOMMU domain is attached, regcmd reaches
the NPU, and the PC state machine matches BSP), but a mesa-side regcmd
issue (or another piece we have not yet found) prevents the final
convolution from firing.
I am sending this series now because the kernel and DT pieces are
self-contained, verifiable, and ready for review. A separate RFC on
mesa-dev will follow with the userspace findings. Detailed investigation
notes are available on request; relevant highlights for the maintainer:
1. Mesa rocket userspace (src/gallium/drivers/rocket/) targets RK3588.
For RK3568, several encoded values need adjustment. Most notably, the
sub-unit OP_ENABLE register offset on RK3568 is 0x_00c, not 0x_008
(the _ stands for the sub-unit block digit). Mesa emits writes at
0x1008/0x2008/0x3008/0x4008/0x5008, but BSP regcmd captures show no
writes at these offsets across two distinct conv shapes (YOLOv5s 6x6/s2
and MobileNet 3x3/s2). BSP writes OP_ENABLE at offset 0x_00c with
multi-bit values (CMAC=0x1, ACCU=0x0, DPU=0x108, DPU_RDMA=0x13f), not
bit-0 booleans. This and a handful of other shape-independent value
differences will be filed as a mesa RFC.
2. The vendor BSP RKNPU driver writes the userspace task_base_addr to
PC_DMA_BASE_ADDR (PC offset 0x34); the rocket driver does not. The PC's
TASK_DMA engine reads struct rknpu_task descriptors from there. With
task_pp_en=1 in TASK_CON and a kernel-allocated descriptor BO, the
PC's task counter state machine advances from "stuck at 0xf000" to
the BSP completion state. This is the most invasive piece of the
investigation and is held back for a follow-on patch (not in this
series); the current series gets the driver to a working /dev/accel/
node and an attached IOMMU domain, which is the intended scope of v2.
3. The NPU's master AXI port is 32-bit, but dma_alloc_coherent() through
the dma-iommu framework silently ignores GFP_DMA32 even with a 32-bit
dma_mask set on the device. When BOs for the NPU are allocated kernel-
side, __get_free_pages(GFP_DMA32 | __GFP_ZERO, order) + dma_map_single()
is the working pattern. Not in this series, but might be a useful
documentation note for other 32-bit AXI accelerators using dma-iommu.
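To make highlight 1 concrete, the sketch below computes the OP_ENABLE offset per sub-unit block for both SoCs. It is an illustration only, not mesa or BSP code: the enum, the function name and the 0-based block index are hypothetical, and the block bases 0x1000 through 0x5000 are inferred from the offsets quoted in highlight 1. No mapping from block index to sub-unit name (CMAC, ACCU, DPU, ...) is assumed.

```c
#include <stdint.h>

/* Hypothetical SoC selector, for illustration only. */
enum npu_soc {
	NPU_SOC_RK3588,
	NPU_SOC_RK3568,
};

/*
 * Offset of OP_ENABLE for sub-unit block 0..4.
 * Block bases 0x1000..0x5000 are inferred from the 0x1008..0x5008
 * writes quoted in highlight 1; 0x008 is what mesa emits for RK3588,
 * 0x00c is where the BSP captures show the writes on RK3568.
 */
uint32_t op_enable_offset(enum npu_soc soc, unsigned int block)
{
	uint32_t base = (uint32_t)(block + 1) * 0x1000u;
	uint32_t reg = (soc == NPU_SOC_RK3568) ? 0x00cu : 0x008u;

	return base + reg;
}
```

With this model, block 0 maps to 0x1008 on RK3588 and 0x100c on RK3568. The written value is a separate matter: per the BSP captures it is a multi-bit field, not a plain enable bit.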
This series builds against current v6.19-rc5 with no checkpatch warnings;
the dtb builds and dtbs_check passes. The April v1 series included a
fifth patch ("Use of_find_matching_node() instead of for_each_of_allnodes")
which is no longer required: as of v6.19-rc5, upstream rocket already
uses for_each_compatible_node().
Changes since v1 (April 2026, never sent on-list):
- Rebased to v6.19-rc5
- Patch 1 absorbed v1 patch 1 (obsolete) and now includes the
rocket_soc_data abstraction needed to support both RK3568 and
RK3588 cores in the same driver
- Cover letter expanded with current investigation status
Assisted by Claude Sonnet/Opus 4.x throughout the investigation. All
findings empirically verified via BSP register captures and side-by-side
rocket execution traces on the same board.
Midgy BALON (4):
accel: rocket: Add support for Rockchip RK3568
dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568 support
arm64: dts: rockchip: rk356x: Add NPU and its IOMMU
arm64: dts: rockchip: rk3568-rock-3b: Enable NPU
Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml | 18 ++++++++++++++--
arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 31 +++++++++++++++++++++++++++
arch/arm64/boot/dts/rockchip/rk3568-rock-3b.dts | 9 ++++++++
drivers/accel/rocket/rocket_core.c | 21 +++++++++++++++++-
drivers/accel/rocket/rocket_core.h | 18 ++++++++++++++--
drivers/accel/rocket/rocket_device.c | 23 +++++++++++++++++--
drivers/accel/rocket/rocket_drv.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
7 files changed, 192 insertions(+), 7 deletions(-)
Midgy BALON (4):
accel: rocket: Add support for Rockchip RK3568
dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568 support
arm64: dts: rockchip: rk356x: Add NPU and its IOMMU
arm64: dts: rockchip: rk3568-rock-3b: Enable NPU
.../npu/rockchip,rk3588-rknn-core.yaml | 18 ++++-
.../boot/dts/rockchip/rk3568-rock-3b.dts | 9 +++
arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 31 ++++++++
drivers/accel/rocket/rocket_core.c | 18 ++++-
drivers/accel/rocket/rocket_core.h | 16 +++++
drivers/accel/rocket/rocket_device.c | 25 ++++++-
drivers/accel/rocket/rocket_drv.c | 71 ++++++++++++++++++-
7 files changed, 182 insertions(+), 6 deletions(-)
--
2.39.5
Thread overview: 9+ messages
2026-05-29 15:58 MidG971 [this message]
2026-05-29 15:58 ` [PATCH v2 1/4] accel: rocket: Add support for Rockchip RK3568 MidG971
2026-05-29 18:19 ` Heiko Stuebner
2026-05-29 15:58 ` [PATCH v2 2/4] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568 support MidG971
2026-05-29 16:18 ` Krzysztof Kozlowski
2026-05-29 15:58 ` [PATCH v2 3/4] arm64: dts: rockchip: rk356x: Add NPU and its IOMMU MidG971
2026-05-29 15:58 ` [PATCH v2 4/4] arm64: dts: rockchip: rk3568-rock-3b: Enable NPU MidG971
2026-05-29 16:17 ` [PATCH v2 0/4] accel: rocket: Add RK3568 NPU support Krzysztof Kozlowski
2026-05-29 18:04 ` Heiko Stuebner