* [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso,
Robert Foss, Jeff Hugo, Krzysztof Kozlowski
This series adds a new driver for the NPU that Rockchip includes in its
newer SoCs. Rockchip developed this NPU on the basis of NVDLA.
In its current form, it supports the specific NPU in the RK3588 SoC.
The userspace driver is part of Mesa and an initial draft can be found at:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
Changes in v9:
- Rename the DT reference for the IOMMU for core 0
- Link to v8: https://lore.kernel.org/r/20250713-6-10-rocket-v8-0-64fa3115e910@tomeuvizoso.net
Changes in v8:
- Kconfig improvements
- Removed notion of top core, all cores are equivalent now
- Explicitly allocate DMA addresses
- Sync BOs always in both directions
- UAPI improvements
- Simplified job scheduling
- Misc. style improvements
- Link to v7: https://lore.kernel.org/r/20250606-6-10-rocket-v7-0-dc16cfe6fe4e@tomeuvizoso.net
Changes in v7:
- Actually enable process isolation by allocating its own IOMMU domain
to each DRM client.
- Link to v6: https://lore.kernel.org/r/20250604-6-10-rocket-v6-0-237ac75ddb5e@tomeuvizoso.net
Changes in v6:
- Make all cores depend on pclk and npu clocks
- Fix BO sync direction logic
- Misc. cleanups
- Link to v5: https://lore.kernel.org/r/20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net
Changes in v5:
- Use bulk clk API
- Rename bindings file
- Syntax improvement to bindings
- Link to v4: https://lore.kernel.org/r/20250519-6-10-rocket-v4-0-d6dff6b4c0ae@tomeuvizoso.net
Changes in v4:
- Several fixes to DT bindings.
- Link to v3: https://lore.kernel.org/r/20250516-6-10-rocket-v3-0-7051ac9225db@tomeuvizoso.net
Changes in v3:
- Reference in the device tree only the register blocks that are
actually used.
- Several style and robustness fixes suggested in the mailing list.
- Added patches from Nicolas Frattaroli that add support to the NPU for
the Rock 5B board.
- Link to v2: https://lore.kernel.org/r/20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net
Changes in v2:
- Drop patch adding the rk3588 compatible to rockchip-iommu (Sebastian Reichel)
- Drop patch adding support for multiple power domains to rockchip-iommu (Sebastian Reichel)
- Link to v1: https://lore.kernel.org/r/20240612-6-10-rocket-v1-0-060e48eea250@tomeuvizoso.net
---
Nicolas Frattaroli (2):
arm64: dts: rockchip: add pd_npu label for RK3588 power domains
arm64: dts: rockchip: enable NPU on ROCK 5B
Tomeu Vizoso (8):
accel/rocket: Add registers header
accel/rocket: Add a new driver for Rockchip's NPU
accel/rocket: Add IOCTL for BO creation
accel/rocket: Add job submission IOCTL
accel/rocket: Add IOCTLs for synchronizing memory accesses
dt-bindings: npu: rockchip,rknn: Add bindings
arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base
arm64: dts: rockchip: Enable the NPU on quartzpro64
Documentation/accel/index.rst | 1 +
Documentation/accel/rocket/index.rst | 19 +
.../bindings/npu/rockchip,rk3588-rknn-core.yaml | 112 +
MAINTAINERS | 10 +
arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 93 +-
.../arm64/boot/dts/rockchip/rk3588-quartzpro64.dts | 30 +
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtsi | 57 +
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 1 +
drivers/accel/rocket/Kconfig | 24 +
drivers/accel/rocket/Makefile | 10 +
drivers/accel/rocket/rocket_core.c | 110 +
drivers/accel/rocket/rocket_core.h | 64 +
drivers/accel/rocket/rocket_device.c | 60 +
drivers/accel/rocket/rocket_device.h | 30 +
drivers/accel/rocket/rocket_drv.c | 290 ++
drivers/accel/rocket/rocket_drv.h | 30 +
drivers/accel/rocket/rocket_gem.c | 181 +
drivers/accel/rocket/rocket_gem.h | 34 +
drivers/accel/rocket/rocket_job.c | 636 +++
drivers/accel/rocket/rocket_job.h | 52 +
drivers/accel/rocket/rocket_registers.h | 4404 ++++++++++++++++++++
include/uapi/drm/rocket_accel.h | 142 +
23 files changed, 6390 insertions(+), 1 deletion(-)
---
base-commit: 156faa3ffe21347203b35a3edb6d2bcb663f429b
change-id: 20240612-6-10-rocket-9316defc14c7
Best regards,
--
Tomeu Vizoso <tomeu@tomeuvizoso.net>
* [PATCH v9 02/10] accel/rocket: Add a new driver for Rockchip's NPU
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso,
Robert Foss, Jeff Hugo
This initial version supports the NPU as shipped in the RK3588 SoC and
described in the first part of its TRM, in Chapter 36.
This NPU contains 3 independent cores that the driver can submit jobs
to.
This commit adds just hardware initialization and power management.
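The per-core bookkeeping that this hardware layout implies can be pictured
with a small userspace model. This is only a sketch under stated assumptions,
not the driver code: struct npu_device, core_probe() and core_remove() are
hypothetical names, and the limit of three cores simply mirrors the RK3588.
The first core to bind brings up the shared device and the last one to go
away tears it down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CORES 3 /* assumption: three cores, as on the RK3588 */

/* Hypothetical stand-in for the shared device that fronts all cores. */
struct npu_device {
	bool initialized;        /* shared state has been set up */
	unsigned int num_cores;  /* cores currently bound */
	int core_ids[MAX_CORES]; /* identifier of each bound core */
};

/* Bind one core; the first bind initializes the shared device. */
int core_probe(struct npu_device *ndev, int core_id)
{
	assert(ndev != NULL);

	if (ndev->num_cores >= MAX_CORES)
		return -1;

	if (!ndev->initialized)
		ndev->initialized = true;

	ndev->core_ids[ndev->num_cores++] = core_id;
	return 0;
}

/* Unbind one core; the last unbind tears the shared device down. */
int core_remove(struct npu_device *ndev, int core_id)
{
	assert(ndev != NULL);

	for (unsigned int i = 0; i < ndev->num_cores; i++) {
		if (ndev->core_ids[i] != core_id)
			continue;

		/* Keep the array dense: move the last entry into the hole. */
		ndev->core_ids[i] = ndev->core_ids[ndev->num_cores - 1];
		ndev->num_cores--;

		if (ndev->num_cores == 0)
			ndev->initialized = false;
		return 0;
	}

	return -1; /* unknown core */
}
```

A caller would start from a zero-initialized struct npu_device and call
core_probe() once per core. Keeping the array dense on removal is a choice of
this model, so that lookups only ever walk bound cores.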
v2:
- Split cores and IOMMUs as independent devices (Sebastian Reichel)
- Add some documentation (Jeffrey Hugo)
- Be more explicit in the Kconfig documentation (Jeffrey Hugo)
- Remove resets, as these haven't been found useful so far (Zenghui Yu)
- Repack structs (Jeffrey Hugo)
- Use DEFINE_DRM_ACCEL_FOPS (Jeffrey Hugo)
- Use devm_drm_dev_alloc (Jeffrey Hugo)
- Use probe log helper (Jeffrey Hugo)
- Introduce UABI header in a later patch (Jeffrey Hugo)
v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
Frattaroli)
- Move registers header to its own commit (Thomas Zimmermann)
- Misc. cleanups (Thomas Zimmermann and Jeff Hugo)
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
- PM improvements (Nicolas Frattaroli)
v4:
- Use bulk clk API (Krzysztof Kozlowski)
v6:
- Remove mention of NVDLA, as the hardware is only incidentally related
(Kever Yang)
- Use calloc instead of GFP_ZERO (Jeff Hugo)
- Explicitly include linux/container_of.h (Jeff Hugo)
- pclk and npu clocks are now needed by all cores (Rob Herring)
v7:
- Assign its own IOMMU domain to each client, for isolation (Daniel
Stone and Robin Murphy)
v8:
- Kconfig: fix depends to be more explicit about Rockchip, and remove
superfluous selects (Robin Murphy)
- Use reset lines to reset the cores (Robin Murphy)
- Reference count the module
- Set dma_set_max_seg_size
- Correctly acquire a reference to the IOMMU (Robin Murphy)
- Remove notion of top core (Robin Murphy)
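The per-client IOMMU domain introduced in v7 is shared through reference
counting: each user takes a reference and the domain is freed when the last
reference is dropped. A minimal userspace sketch of that get/put pattern
follows. It uses C11 atomics as a stand-in for the kernel's kref, and
struct shared_domain, domain_create(), domain_get() and domain_put() are
hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a per-client domain. */
struct shared_domain {
	atomic_uint refcount;
	int client_id;
};

/* Allocate a domain holding one reference, owned by the caller. */
struct shared_domain *domain_create(int client_id)
{
	struct shared_domain *d = malloc(sizeof(*d));

	if (!d)
		return NULL;

	atomic_init(&d->refcount, 1);
	d->client_id = client_id;
	return d;
}

/* Take an extra reference on behalf of another user of the domain. */
struct shared_domain *domain_get(struct shared_domain *d)
{
	assert(d != NULL);

	atomic_fetch_add(&d->refcount, 1);
	return d;
}

/* Drop a reference; returns true when this call freed the object. */
bool domain_put(struct shared_domain *d)
{
	assert(d != NULL);

	/* fetch_sub returns the previous value: 1 means this was the last one. */
	if (atomic_fetch_sub(&d->refcount, 1) == 1) {
		free(d);
		return true;
	}
	return false;
}
```

Every domain_get() must be balanced by exactly one domain_put(), and the
pointer must not be touched after the put that returns true.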
Reviewed-by: Robert Foss <rfoss@kernel.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
Documentation/accel/index.rst | 1 +
Documentation/accel/rocket/index.rst | 19 +++
MAINTAINERS | 10 ++
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 1 +
drivers/accel/rocket/Kconfig | 24 ++++
drivers/accel/rocket/Makefile | 8 ++
drivers/accel/rocket/rocket_core.c | 100 ++++++++++++++
drivers/accel/rocket/rocket_core.h | 49 +++++++
drivers/accel/rocket/rocket_device.c | 56 ++++++++
drivers/accel/rocket/rocket_device.h | 28 ++++
drivers/accel/rocket/rocket_drv.c | 261 +++++++++++++++++++++++++++++++++++
drivers/accel/rocket/rocket_drv.h | 23 +++
13 files changed, 581 insertions(+)
diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
index bc85f26533d88891dde482f91e26c99991b22869..d8fa332d60a890dbb617454d2a26d9b6f9b196aa 100644
--- a/Documentation/accel/index.rst
+++ b/Documentation/accel/index.rst
@@ -10,6 +10,7 @@ Compute Accelerators
introduction
amdxdna/index
qaic/index
+ rocket/index
.. only:: subproject and html
diff --git a/Documentation/accel/rocket/index.rst b/Documentation/accel/rocket/index.rst
new file mode 100644
index 0000000000000000000000000000000000000000..70f97bccf100022550ac7a0718dc77094f1a8c28
--- /dev/null
+++ b/Documentation/accel/rocket/index.rst
@@ -0,0 +1,19 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+=====================================
+ accel/rocket Rockchip NPU driver
+=====================================
+
+The accel/rocket driver supports the Neural Processing Units (NPUs) inside some
+Rockchip SoCs such as the RK3588. Rockchip calls it RKNN and sometimes RKNPU.
+
+The hardware is described in chapter 36 in the RK3588 TRM.
+
+This driver just powers the hardware on and off, allocates and maps buffers to
+the device and submits jobs to the frontend unit. Everything else is done in
+userspace, as a Gallium driver (also called rocket) that is part of the Mesa3D
+project.
+
+Hardware currently supported:
+
+* RK3588
diff --git a/MAINTAINERS b/MAINTAINERS
index a92290fffa163f9fe8fe3f04bf66426f9a894409..3ae890e178d1455d99323a5941972a50e82b70b6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7390,6 +7390,16 @@ T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
F: drivers/accel/ivpu/
F: include/uapi/drm/ivpu_accel.h
+DRM ACCEL DRIVER FOR ROCKCHIP NPU
+M: Tomeu Vizoso <tomeu@tomeuvizoso.net>
+L: dri-devel@lists.freedesktop.org
+S: Supported
+T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F: Documentation/accel/rocket/
+F: Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
+F: drivers/accel/rocket/
+F: include/uapi/drm/rocket_accel.h
+
DRM COMPUTE ACCELERATORS DRIVERS AND FRAMEWORK
M: Oded Gabbay <ogabbay@kernel.org>
L: dri-devel@lists.freedesktop.org
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
index 5b9490367a39fd12d35a8d9021768aa186c09308..bb01cebc42bf16ebf02e938040f339ff94869e33 100644
--- a/drivers/accel/Kconfig
+++ b/drivers/accel/Kconfig
@@ -28,5 +28,6 @@ source "drivers/accel/amdxdna/Kconfig"
source "drivers/accel/habanalabs/Kconfig"
source "drivers/accel/ivpu/Kconfig"
source "drivers/accel/qaic/Kconfig"
+source "drivers/accel/rocket/Kconfig"
endif
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index a301fb6089d4c515430175c5e2ba9190f6dc9158..ffc3fa58866616d933184a7659573cd4d4780a8d 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_DRM_ACCEL_AMDXDNA) += amdxdna/
obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
+obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
\ No newline at end of file
diff --git a/drivers/accel/rocket/Kconfig b/drivers/accel/rocket/Kconfig
new file mode 100644
index 0000000000000000000000000000000000000000..43d6cd98ec8e4e3448df4d032da720932e2db9c3
--- /dev/null
+++ b/drivers/accel/rocket/Kconfig
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config DRM_ACCEL_ROCKET
+ tristate "Rocket (support for Rockchip NPUs)"
+ depends on DRM
+ depends on (ARCH_ROCKCHIP && ARM64) || COMPILE_TEST
+ depends on ROCKCHIP_IOMMU || COMPILE_TEST
+ depends on MMU
+ select DRM_SCHED
+ select DRM_GEM_SHMEM_HELPER
+ help
+ Choose this option if you have a Rockchip SoC that contains a
+ compatible Neural Processing Unit (NPU), such as the RK3588. Called by
+ Rockchip either RKNN or RKNPU, it accelerates inference of neural
+ networks.
+
+ The interface exposed to userspace is described in
+ include/uapi/drm/rocket_accel.h and is used by the Rocket userspace
+ driver in Mesa3D.
+
+ If unsure, say N.
+
+ To compile this driver as a module, choose M here: the
+ module will be called rocket.
diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile
new file mode 100644
index 0000000000000000000000000000000000000000..abdd75f2492eaecf8bf5e78a2ac150ea19ac3e96
--- /dev/null
+++ b/drivers/accel/rocket/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_DRM_ACCEL_ROCKET) := rocket.o
+
+rocket-y := \
+ rocket_core.o \
+ rocket_device.o \
+ rocket_drv.o
diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c
new file mode 100644
index 0000000000000000000000000000000000000000..9be964b5fbaef31f9b283b8da6fe15e6c540e916
--- /dev/null
+++ b/drivers/accel/rocket/rocket_core.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dev_printk.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/reset.h>
+
+#include "rocket_core.h"
+
+int rocket_core_init(struct rocket_core *core)
+{
+ struct device *dev = core->dev;
+ struct platform_device *pdev = to_platform_device(dev);
+ u32 version;
+ int err = 0;
+
+ core->resets[0].id = "srst_a";
+ core->resets[1].id = "srst_h";
+ err = devm_reset_control_bulk_get_exclusive(&pdev->dev, ARRAY_SIZE(core->resets),
+ core->resets);
+ if (err)
+ return dev_err_probe(dev, err, "failed to get resets for core %d\n", core->index);
+
+ err = devm_clk_bulk_get(dev, ARRAY_SIZE(core->clks), core->clks);
+ if (err)
+ return dev_err_probe(dev, err, "failed to get clocks for core %d\n", core->index);
+
+ core->pc_iomem = devm_platform_ioremap_resource_byname(pdev, "pc");
+ if (IS_ERR(core->pc_iomem)) {
+ dev_err(dev, "couldn't find PC registers %ld\n", PTR_ERR(core->pc_iomem));
+ return PTR_ERR(core->pc_iomem);
+ }
+
+ core->cna_iomem = devm_platform_ioremap_resource_byname(pdev, "cna");
+ if (IS_ERR(core->cna_iomem)) {
+ dev_err(dev, "couldn't find CNA registers %ld\n", PTR_ERR(core->cna_iomem));
+ return PTR_ERR(core->cna_iomem);
+ }
+
+ core->core_iomem = devm_platform_ioremap_resource_byname(pdev, "core");
+ if (IS_ERR(core->core_iomem)) {
+ dev_err(dev, "couldn't find CORE registers %ld\n", PTR_ERR(core->core_iomem));
+ return PTR_ERR(core->core_iomem);
+ }
+
+ dma_set_max_seg_size(dev, UINT_MAX);
+
+ err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
+ if (err)
+ return err;
+
+ core->iommu_group = iommu_group_get(dev);
+
+ pm_runtime_use_autosuspend(dev);
+
+ /*
+ * As this NPU will be most often used as part of a media pipeline that
+ * ends presenting in a display, choose 50 ms (~3 frames at 60Hz) as an
+ * autosuspend delay as that will keep the device powered up while the
+ * pipeline is running.
+ */
+ pm_runtime_set_autosuspend_delay(dev, 50);
+
+ pm_runtime_enable(dev);
+
+ err = pm_runtime_get_sync(dev);
+
+ version = rocket_pc_readl(core, VERSION);
+ version += rocket_pc_readl(core, VERSION_NUM) & 0xffff;
+
+ pm_runtime_mark_last_busy(dev);
+ pm_runtime_put_autosuspend(dev);
+
+ dev_info(dev, "Rockchip NPU core %d version: %d\n", core->index, version);
+
+ return 0;
+}
+
+void rocket_core_fini(struct rocket_core *core)
+{
+ pm_runtime_dont_use_autosuspend(core->dev);
+ pm_runtime_disable(core->dev);
+ iommu_group_put(core->iommu_group);
+ core->iommu_group = NULL;
+}
+
+void rocket_core_reset(struct rocket_core *core)
+{
+ reset_control_bulk_assert(ARRAY_SIZE(core->resets), core->resets);
+
+ udelay(10);
+
+ reset_control_bulk_deassert(ARRAY_SIZE(core->resets), core->resets);
+}
diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
new file mode 100644
index 0000000000000000000000000000000000000000..660de2d70f7d9294bc4db8b235b3796941136307
--- /dev/null
+++ b/drivers/accel/rocket/rocket_core.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#ifndef __ROCKET_CORE_H__
+#define __ROCKET_CORE_H__
+
+#include <drm/gpu_scheduler.h>
+#include <linux/clk.h>
+#include <linux/io.h>
+#include <linux/mutex_types.h>
+#include <linux/reset.h>
+
+#include "rocket_registers.h"
+
+#define rocket_pc_readl(core, reg) \
+ readl((core)->pc_iomem + (REG_PC_##reg))
+#define rocket_pc_writel(core, reg, value) \
+ writel(value, (core)->pc_iomem + (REG_PC_##reg))
+
+#define rocket_cna_readl(core, reg) \
+ readl((core)->cna_iomem + (REG_CNA_##reg) - REG_CNA_S_STATUS)
+#define rocket_cna_writel(core, reg, value) \
+ writel(value, (core)->cna_iomem + (REG_CNA_##reg) - REG_CNA_S_STATUS)
+
+#define rocket_core_readl(core, reg) \
+ readl((core)->core_iomem + (REG_CORE_##reg) - REG_CORE_S_STATUS)
+#define rocket_core_writel(core, reg, value) \
+ writel(value, (core)->core_iomem + (REG_CORE_##reg) - REG_CORE_S_STATUS)
+
+struct rocket_core {
+ struct device *dev;
+ struct rocket_device *rdev;
+ unsigned int index;
+
+ int irq;
+ void __iomem *pc_iomem;
+ void __iomem *cna_iomem;
+ void __iomem *core_iomem;
+ struct clk_bulk_data clks[4];
+ struct reset_control_bulk_data resets[2];
+
+ struct iommu_group *iommu_group;
+};
+
+int rocket_core_init(struct rocket_core *core);
+void rocket_core_fini(struct rocket_core *core);
+void rocket_core_reset(struct rocket_core *core);
+
+#endif
diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c
new file mode 100644
index 0000000000000000000000000000000000000000..b05a0df91d48385ce4ada22137842b3e819f8266
--- /dev/null
+++ b/drivers/accel/rocket/rocket_device.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#include <drm/drm_drv.h>
+#include <linux/array_size.h>
+#include <linux/clk.h>
+#include <linux/dma-mapping.h>
+#include <linux/platform_device.h>
+#include <linux/of.h>
+
+#include "rocket_device.h"
+
+struct rocket_device *rocket_device_init(struct platform_device *pdev,
+ const struct drm_driver *rocket_drm_driver)
+{
+ struct device *dev = &pdev->dev;
+ struct device_node *core_node;
+ struct rocket_device *rdev;
+ struct drm_device *ddev;
+ unsigned int num_cores = 0;
+ int err;
+
+ rdev = devm_drm_dev_alloc(dev, rocket_drm_driver, struct rocket_device, ddev);
+ if (IS_ERR(rdev))
+ return rdev;
+
+ ddev = &rdev->ddev;
+ dev_set_drvdata(dev, rdev);
+
+ for_each_compatible_node(core_node, NULL, "rockchip,rk3588-rknn-core")
+ if (of_device_is_available(core_node))
+ num_cores++;
+
+ rdev->cores = devm_kcalloc(dev, num_cores, sizeof(*rdev->cores), GFP_KERNEL);
+ if (!rdev->cores)
+ return ERR_PTR(-ENOMEM);
+
+ dma_set_max_seg_size(dev, UINT_MAX);
+
+ err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
+ if (err)
+ return ERR_PTR(err);
+
+ err = drm_dev_register(ddev, 0);
+ if (err)
+ return ERR_PTR(err);
+
+ return rdev;
+}
+
+void rocket_device_fini(struct rocket_device *rdev)
+{
+ WARN_ON(rdev->num_cores > 0);
+
+ drm_dev_unregister(&rdev->ddev);
+}
diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h
new file mode 100644
index 0000000000000000000000000000000000000000..a5a5857bb1991161ebfbdcbffacd75bbb579c572
--- /dev/null
+++ b/drivers/accel/rocket/rocket_device.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#ifndef __ROCKET_DEVICE_H__
+#define __ROCKET_DEVICE_H__
+
+#include <drm/drm_device.h>
+#include <linux/clk.h>
+#include <linux/container_of.h>
+#include <linux/iommu.h>
+#include <linux/platform_device.h>
+
+#include "rocket_core.h"
+
+struct rocket_device {
+ struct drm_device ddev;
+
+ struct rocket_core *cores;
+ unsigned int num_cores;
+};
+
+struct rocket_device *rocket_device_init(struct platform_device *pdev,
+ const struct drm_driver *rocket_drm_driver);
+void rocket_device_fini(struct rocket_device *rdev);
+#define to_rocket_device(drm_dev) \
+ ((struct rocket_device *)(container_of((drm_dev), struct rocket_device, ddev)))
+
+#endif /* __ROCKET_DEVICE_H__ */
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
new file mode 100644
index 0000000000000000000000000000000000000000..a5df94f6b1259ae335fbccd0105ba44f3432999c
--- /dev/null
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -0,0 +1,261 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#include <drm/drm_accel.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_ioctl.h>
+#include <linux/clk.h>
+#include <linux/err.h>
+#include <linux/iommu.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+
+#include "rocket_drv.h"
+
+/*
+ * Facade device, used to expose a single DRM device to userspace, that
+ * schedules jobs to any RKNN cores in the system.
+ */
+static struct platform_device *drm_dev;
+static struct rocket_device *rdev;
+
+static void
+rocket_iommu_domain_destroy(struct kref *kref)
+{
+ struct rocket_iommu_domain *domain = container_of(kref, struct rocket_iommu_domain, kref);
+
+ iommu_domain_free(domain->domain);
+ domain->domain = NULL;
+ kfree(domain);
+}
+
+static struct rocket_iommu_domain*
+rocket_iommu_domain_create(struct device *dev)
+{
+ struct rocket_iommu_domain *domain = kmalloc(sizeof(*domain), GFP_KERNEL);
+ void *err;
+
+ if (!domain)
+ return ERR_PTR(-ENOMEM);
+
+ domain->domain = iommu_paging_domain_alloc(dev);
+ if (IS_ERR(domain->domain)) {
+ err = ERR_CAST(domain->domain);
+ kfree(domain);
+ return err;
+ }
+ kref_init(&domain->kref);
+
+ return domain;
+}
+
+struct rocket_iommu_domain *
+rocket_iommu_domain_get(struct rocket_file_priv *rocket_priv)
+{
+ kref_get(&rocket_priv->domain->kref);
+ return rocket_priv->domain;
+}
+
+void
+rocket_iommu_domain_put(struct rocket_iommu_domain *domain)
+{
+ kref_put(&domain->kref, rocket_iommu_domain_destroy);
+}
+
+static int
+rocket_open(struct drm_device *dev, struct drm_file *file)
+{
+ struct rocket_device *rdev = to_rocket_device(dev);
+ struct rocket_file_priv *rocket_priv;
+ int ret;
+
+ if (!try_module_get(THIS_MODULE))
+ return -EINVAL;
+
+ rocket_priv = kzalloc(sizeof(*rocket_priv), GFP_KERNEL);
+ if (!rocket_priv) {
+ ret = -ENOMEM;
+ goto err_put_mod;
+ }
+
+ rocket_priv->rdev = rdev;
+ rocket_priv->domain = rocket_iommu_domain_create(rdev->cores[0].dev);
+ if (IS_ERR(rocket_priv->domain)) {
+ ret = PTR_ERR(rocket_priv->domain);
+ goto err_free;
+ }
+
+ file->driver_priv = rocket_priv;
+
+ return 0;
+
+err_free:
+ kfree(rocket_priv);
+err_put_mod:
+ module_put(THIS_MODULE);
+ return ret;
+}
+
+static void
+rocket_postclose(struct drm_device *dev, struct drm_file *file)
+{
+ struct rocket_file_priv *rocket_priv = file->driver_priv;
+
+ rocket_iommu_domain_put(rocket_priv->domain);
+ kfree(rocket_priv);
+ module_put(THIS_MODULE);
+}
+
+static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
+#define ROCKET_IOCTL(n, func) \
+ DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0)
+};
+
+DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
+
+/*
+ * Rocket driver version:
+ * - 1.0 - initial interface
+ */
+static const struct drm_driver rocket_drm_driver = {
+ .driver_features = DRIVER_COMPUTE_ACCEL,
+ .open = rocket_open,
+ .postclose = rocket_postclose,
+ .ioctls = rocket_drm_driver_ioctls,
+ .num_ioctls = ARRAY_SIZE(rocket_drm_driver_ioctls),
+ .fops = &rocket_accel_driver_fops,
+ .name = "rocket",
+ .desc = "rocket DRM",
+};
+
+static int rocket_probe(struct platform_device *pdev)
+{
+ if (rdev == NULL) {
+ /* First core probing, initialize DRM device. */
+ rdev = rocket_device_init(drm_dev, &rocket_drm_driver);
+ if (IS_ERR(rdev)) {
+ dev_err(&pdev->dev, "failed to initialize rocket device\n");
+ return PTR_ERR(rdev);
+ }
+ }
+
+ unsigned int core = rdev->num_cores;
+
+ dev_set_drvdata(&pdev->dev, rdev);
+
+ rdev->cores[core].rdev = rdev;
+ rdev->cores[core].dev = &pdev->dev;
+ rdev->cores[core].index = core;
+
+ rdev->num_cores++;
+
+ return rocket_core_init(&rdev->cores[core]);
+}
+
+static void rocket_remove(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+
+ for (unsigned int core = 0; core < rdev->num_cores; core++) {
+ if (rdev->cores[core].dev == dev) {
+ rocket_core_fini(&rdev->cores[core]);
+ rdev->num_cores--;
+ break;
+ }
+ }
+
+ if (rdev->num_cores == 0) {
+ /* Last core removed, deinitialize DRM device. */
+ rocket_device_fini(rdev);
+ rdev = NULL;
+ }
+}
+
+static const struct of_device_id dt_match[] = {
+ { .compatible = "rockchip,rk3588-rknn-core" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, dt_match);
+
+static int find_core_for_dev(struct device *dev)
+{
+ struct rocket_device *rdev = dev_get_drvdata(dev);
+
+ for (unsigned int core = 0; core < rdev->num_cores; core++) {
+ if (dev == rdev->cores[core].dev)
+ return core;
+ }
+
+ return -1;
+}
+
+static int rocket_device_runtime_resume(struct device *dev)
+{
+ struct rocket_device *rdev = dev_get_drvdata(dev);
+ int core = find_core_for_dev(dev);
+ int err = 0;
+
+ if (core < 0)
+ return -ENODEV;
+
+ err = clk_bulk_prepare_enable(ARRAY_SIZE(rdev->cores[core].clks), rdev->cores[core].clks);
+ if (err) {
+ dev_err(dev, "failed to enable (%d) clocks for core %d\n", err, core);
+ return err;
+ }
+
+ return 0;
+}
+
+static int rocket_device_runtime_suspend(struct device *dev)
+{
+ struct rocket_device *rdev = dev_get_drvdata(dev);
+ int core = find_core_for_dev(dev);
+
+ if (core < 0)
+ return -ENODEV;
+
+ clk_bulk_disable_unprepare(ARRAY_SIZE(rdev->cores[core].clks), rdev->cores[core].clks);
+
+ return 0;
+}
+
+EXPORT_GPL_DEV_PM_OPS(rocket_pm_ops) = {
+ RUNTIME_PM_OPS(rocket_device_runtime_suspend, rocket_device_runtime_resume, NULL)
+ SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+};
+
+static struct platform_driver rocket_driver = {
+ .probe = rocket_probe,
+ .remove = rocket_remove,
+ .driver = {
+ .name = "rocket",
+ .pm = pm_ptr(&rocket_pm_ops),
+ .of_match_table = dt_match,
+ },
+};
+
+static int __init rocket_register(void)
+{
+ drm_dev = platform_device_register_simple("rknn", -1, NULL, 0);
+ if (IS_ERR(drm_dev))
+ return PTR_ERR(drm_dev);
+
+ return platform_driver_register(&rocket_driver);
+}
+
+static void __exit rocket_unregister(void)
+{
+ platform_driver_unregister(&rocket_driver);
+
+ platform_device_unregister(drm_dev);
+}
+
+module_init(rocket_register);
+module_exit(rocket_unregister);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("DRM driver for the Rockchip NPU IP");
+MODULE_AUTHOR("Tomeu Vizoso");
diff --git a/drivers/accel/rocket/rocket_drv.h b/drivers/accel/rocket/rocket_drv.h
new file mode 100644
index 0000000000000000000000000000000000000000..36b1291b0ead388b8843965758c57a0405315519
--- /dev/null
+++ b/drivers/accel/rocket/rocket_drv.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#ifndef __ROCKET_DRV_H__
+#define __ROCKET_DRV_H__
+
+#include "rocket_device.h"
+
+struct rocket_iommu_domain {
+ struct iommu_domain *domain;
+ struct kref kref;
+};
+
+struct rocket_file_priv {
+ struct rocket_device *rdev;
+
+ struct rocket_iommu_domain *domain;
+};
+
+struct rocket_iommu_domain *rocket_iommu_domain_get(struct rocket_file_priv *rocket_priv);
+void rocket_iommu_domain_put(struct rocket_iommu_domain *domain);
+
+#endif
--
2.50.0
* [PATCH v9 03/10] accel/rocket: Add IOCTL for BO creation
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso
This uses the SHMEM DRM helpers and we map right away to the CPU and NPU
sides, as all buffers are expected to be accessed from both.
v2:
- Sync the IOMMUs for the other cores when mapping and unmapping.
v3:
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
v6:
- Use mutex guards (Markus Elfring)
v7:
- Assign its own IOMMU domain to each client, for isolation (Daniel
Stone and Robin Murphy)
v8:
- Correctly acquire a reference to the IOMMU (Robin Murphy)
- Allocate DMA address ourselves with drm_mm (Robin Murphy)
- Use refcount_read (Heiko Stuebner)
- Remove superfluous dma_sync_sgtable_for_device (Robin Murphy)
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
drivers/accel/rocket/Makefile | 3 +-
drivers/accel/rocket/rocket_drv.c | 15 ++++-
drivers/accel/rocket/rocket_drv.h | 4 ++
drivers/accel/rocket/rocket_gem.c | 125 ++++++++++++++++++++++++++++++++++++++
drivers/accel/rocket/rocket_gem.h | 30 +++++++++
include/uapi/drm/rocket_accel.h | 44 ++++++++++++++
6 files changed, 219 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile
index abdd75f2492eaecf8bf5e78a2ac150ea19ac3e96..4deef267f9e1238c4d8bd108dcc8afd9dc8b2b8f 100644
--- a/drivers/accel/rocket/Makefile
+++ b/drivers/accel/rocket/Makefile
@@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_ROCKET) := rocket.o
rocket-y := \
rocket_core.o \
rocket_device.o \
- rocket_drv.o
+ rocket_drv.o \
+ rocket_gem.o
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index a5df94f6b1259ae335fbccd0105ba44f3432999c..8b7fbe9226f424b69d409e47b58651cba8c42bcf 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -5,6 +5,7 @@
#include <drm/drm_drv.h>
#include <drm/drm_gem.h>
#include <drm/drm_ioctl.h>
+#include <drm/rocket_accel.h>
#include <linux/clk.h>
#include <linux/err.h>
#include <linux/iommu.h>
@@ -13,6 +14,7 @@
#include <linux/pm_runtime.h>
#include "rocket_drv.h"
+#include "rocket_gem.h"
/*
* Facade device, used to expose a single DRM device to userspace, that
@@ -69,6 +71,7 @@ rocket_open(struct drm_device *dev, struct drm_file *file)
{
struct rocket_device *rdev = to_rocket_device(dev);
struct rocket_file_priv *rocket_priv;
+ u64 start, end;
int ret;
if (!try_module_get(THIS_MODULE))
@@ -89,6 +92,11 @@ rocket_open(struct drm_device *dev, struct drm_file *file)
file->driver_priv = rocket_priv;
+ start = rocket_priv->domain->domain->geometry.aperture_start;
+ end = rocket_priv->domain->domain->geometry.aperture_end;
+ drm_mm_init(&rocket_priv->mm, start, end - start + 1);
+ mutex_init(&rocket_priv->mm_lock);
+
return 0;
err_free:
@@ -103,6 +111,8 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file)
{
struct rocket_file_priv *rocket_priv = file->driver_priv;
+ mutex_destroy(&rocket_priv->mm_lock);
+ drm_mm_takedown(&rocket_priv->mm);
rocket_iommu_domain_put(rocket_priv->domain);
kfree(rocket_priv);
module_put(THIS_MODULE);
@@ -111,6 +121,8 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file)
static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
#define ROCKET_IOCTL(n, func) \
DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0)
+
+ ROCKET_IOCTL(CREATE_BO, create_bo),
};
DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
@@ -120,9 +132,10 @@ DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
* - 1.0 - initial interface
*/
static const struct drm_driver rocket_drm_driver = {
- .driver_features = DRIVER_COMPUTE_ACCEL,
+ .driver_features = DRIVER_COMPUTE_ACCEL | DRIVER_GEM,
.open = rocket_open,
.postclose = rocket_postclose,
+ .gem_create_object = rocket_gem_create_object,
.ioctls = rocket_drm_driver_ioctls,
.num_ioctls = ARRAY_SIZE(rocket_drm_driver_ioctls),
.fops = &rocket_accel_driver_fops,
diff --git a/drivers/accel/rocket/rocket_drv.h b/drivers/accel/rocket/rocket_drv.h
index 36b1291b0ead388b8843965758c57a0405315519..2944e0136ab991da61fb8f66f7e9c1ba214878a6 100644
--- a/drivers/accel/rocket/rocket_drv.h
+++ b/drivers/accel/rocket/rocket_drv.h
@@ -4,6 +4,8 @@
#ifndef __ROCKET_DRV_H__
#define __ROCKET_DRV_H__
+#include <drm/drm_mm.h>
+
#include "rocket_device.h"
struct rocket_iommu_domain {
@@ -15,6 +17,8 @@ struct rocket_file_priv {
struct rocket_device *rdev;
struct rocket_iommu_domain *domain;
+ struct drm_mm mm;
+ struct mutex mm_lock;
};
struct rocket_iommu_domain *rocket_iommu_domain_get(struct rocket_file_priv *rocket_priv);
diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c
new file mode 100644
index 0000000000000000000000000000000000000000..05cf46040865c01fe14a169c865227780f2db679
--- /dev/null
+++ b/drivers/accel/rocket/rocket_gem.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#include <drm/drm_device.h>
+#include <drm/drm_utils.h>
+#include <drm/rocket_accel.h>
+#include <linux/dma-mapping.h>
+#include <linux/iommu.h>
+
+#include "rocket_drv.h"
+#include "rocket_gem.h"
+
+static void rocket_gem_bo_free(struct drm_gem_object *obj)
+{
+ struct rocket_gem_object *bo = to_rocket_bo(obj);
+ struct rocket_file_priv *rocket_priv = bo->driver_priv;
+ size_t unmapped;
+
+ drm_WARN_ON(obj->dev, refcount_read(&bo->base.pages_use_count) > 1);
+
+ unmapped = iommu_unmap(bo->domain->domain, bo->mm.start, bo->size);
+ drm_WARN_ON(obj->dev, unmapped != bo->size);
+
+ mutex_lock(&rocket_priv->mm_lock);
+ drm_mm_remove_node(&bo->mm);
+ mutex_unlock(&rocket_priv->mm_lock);
+
+ rocket_iommu_domain_put(bo->domain);
+ bo->domain = NULL;
+
+ drm_gem_shmem_free(&bo->base);
+}
+
+static const struct drm_gem_object_funcs rocket_gem_funcs = {
+ .free = rocket_gem_bo_free,
+ .print_info = drm_gem_shmem_object_print_info,
+ .pin = drm_gem_shmem_object_pin,
+ .unpin = drm_gem_shmem_object_unpin,
+ .get_sg_table = drm_gem_shmem_object_get_sg_table,
+ .vmap = drm_gem_shmem_object_vmap,
+ .vunmap = drm_gem_shmem_object_vunmap,
+ .mmap = drm_gem_shmem_object_mmap,
+ .vm_ops = &drm_gem_shmem_vm_ops,
+};
+
+struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size)
+{
+ struct rocket_gem_object *obj;
+
+ obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+ if (!obj)
+ return ERR_PTR(-ENOMEM);
+
+ obj->base.base.funcs = &rocket_gem_funcs;
+
+ return &obj->base.base;
+}
+
+int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+ struct rocket_file_priv *rocket_priv = file->driver_priv;
+ struct drm_rocket_create_bo *args = data;
+ struct drm_gem_shmem_object *shmem_obj;
+ struct rocket_gem_object *rkt_obj;
+ struct drm_gem_object *gem_obj;
+ struct sg_table *sgt;
+ int ret;
+
+ shmem_obj = drm_gem_shmem_create(dev, args->size);
+ if (IS_ERR(shmem_obj))
+ return PTR_ERR(shmem_obj);
+
+ gem_obj = &shmem_obj->base;
+ rkt_obj = to_rocket_bo(gem_obj);
+
+ rkt_obj->driver_priv = rocket_priv;
+ rkt_obj->domain = rocket_iommu_domain_get(rocket_priv);
+ rkt_obj->size = args->size;
+ rkt_obj->offset = 0;
+
+ ret = drm_gem_handle_create(file, gem_obj, &args->handle);
+ drm_gem_object_put(gem_obj);
+ if (ret)
+ goto err;
+
+ sgt = drm_gem_shmem_get_pages_sgt(shmem_obj);
+ if (IS_ERR(sgt)) {
+ ret = PTR_ERR(sgt);
+ goto err;
+ }
+
+ mutex_lock(&rocket_priv->mm_lock);
+ ret = drm_mm_insert_node_generic(&rocket_priv->mm, &rkt_obj->mm,
+ rkt_obj->size, PAGE_SIZE,
+ 0, 0);
+ mutex_unlock(&rocket_priv->mm_lock);
+
+ ret = iommu_map_sgtable(rocket_priv->domain->domain,
+ rkt_obj->mm.start,
+ shmem_obj->sgt,
+ IOMMU_READ | IOMMU_WRITE);
+ if (ret < 0 || ret < args->size) {
+ drm_err(dev, "failed to map buffer: size=%d request_size=%u\n",
+ ret, args->size);
+ ret = -ENOMEM;
+ goto err_remove_node;
+ }
+
+ /* iommu_map_sgtable might have aligned the size */
+ rkt_obj->size = ret;
+ args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
+ args->dma_address = rkt_obj->mm.start;
+
+ return 0;
+
+err_remove_node:
+ mutex_lock(&rocket_priv->mm_lock);
+ drm_mm_remove_node(&rkt_obj->mm);
+ mutex_unlock(&rocket_priv->mm_lock);
+
+err:
+ drm_gem_shmem_object_free(gem_obj);
+
+ return ret;
+}
diff --git a/drivers/accel/rocket/rocket_gem.h b/drivers/accel/rocket/rocket_gem.h
new file mode 100644
index 0000000000000000000000000000000000000000..91a1fc09c56ce483ebe80959e1a7ff934867bedc
--- /dev/null
+++ b/drivers/accel/rocket/rocket_gem.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#ifndef __ROCKET_GEM_H__
+#define __ROCKET_GEM_H__
+
+#include <drm/drm_gem_shmem_helper.h>
+
+struct rocket_gem_object {
+ struct drm_gem_shmem_object base;
+
+ struct rocket_file_priv *driver_priv;
+
+ struct rocket_iommu_domain *domain;
+ struct drm_mm_node mm;
+ size_t size;
+ u32 offset;
+};
+
+struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size);
+
+int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
+static inline
+struct rocket_gem_object *to_rocket_bo(struct drm_gem_object *obj)
+{
+ return container_of(to_drm_gem_shmem_obj(obj), struct rocket_gem_object, base);
+}
+
+#endif
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
new file mode 100644
index 0000000000000000000000000000000000000000..95720702b7c4413d72b89c1f0f59abb22dc8c6b3
--- /dev/null
+++ b/include/uapi/drm/rocket_accel.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 Tomeu Vizoso
+ */
+#ifndef __DRM_UAPI_ROCKET_ACCEL_H__
+#define __DRM_UAPI_ROCKET_ACCEL_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+#define DRM_ROCKET_CREATE_BO 0x00
+
+#define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
+
+/**
+ * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
+ *
+ */
+struct drm_rocket_create_bo {
+ /** Input: Size of the requested BO. */
+ __u32 size;
+
+ /** Output: GEM handle for the BO. */
+ __u32 handle;
+
+ /**
+ * Output: DMA address for the BO in the NPU address space. This address
+ * is private to the DRM fd and is valid for the lifetime of the GEM
+ * handle.
+ */
+ __u64 dma_address;
+
+ /** Output: Offset into the drm node to use for subsequent mmap call. */
+ __u64 offset;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __DRM_UAPI_ROCKET_ACCEL_H__ */
--
2.50.0
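
For readers who want to exercise this ioctl from userspace, the following is a minimal sketch, not taken from the Mesa driver. It mirrors struct drm_rocket_create_bo from the header added above under a local, hypothetical name (rocket_create_bo_args) so that it builds without the new UAPI header. The helper names rocket_create_bo() and rocket_map_bo() are assumptions made for illustration; the literals 'd' and 0x40 are the values of DRM_IOCTL_BASE and DRM_COMMAND_BASE from drm.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Local mirror of struct drm_rocket_create_bo; the name is hypothetical. */
struct rocket_create_bo_args {
	uint32_t size;        /* in: requested size in bytes */
	uint32_t handle;      /* out: GEM handle */
	uint64_t dma_address; /* out: address in the NPU address space */
	uint64_t offset;      /* out: fake offset to pass to mmap() */
};

/* Same encoding as DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, ...). */
#define ROCKET_IOCTL_CREATE_BO \
	_IOWR('d', 0x40 + 0x00, struct rocket_create_bo_args)

/* Returns 0 on success, -1 with errno set on failure. */
static int rocket_create_bo(int fd, uint32_t size,
			    struct rocket_create_bo_args *out)
{
	assert(out != NULL);

	/* Zero the outputs so stale data is never mistaken for a result. */
	memset(out, 0, sizeof(*out));
	out->size = size;

	return ioctl(fd, ROCKET_IOCTL_CREATE_BO, out);
}

/* Maps the BO for CPU access; returns MAP_FAILED on error. */
static void *rocket_map_bo(int fd, const struct rocket_create_bo_args *bo)
{
	return mmap(NULL, bo->size, PROT_READ | PROT_WRITE, MAP_SHARED,
		    fd, (off_t)bo->offset);
}
```

A caller would open the accel node (the exact path, for example /dev/accel/accel0, depends on the system), call rocket_create_bo(fd, 4096, &bo), and then hand bo.dma_address to the NPU while filling the buffer through the pointer returned by rocket_map_bo().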
* [PATCH v9 04/10] accel/rocket: Add job submission IOCTL
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso,
Jeff Hugo
Use the DRM GPU scheduler infrastructure, with one scheduler per core.
Userspace can request that a series of tasks be executed sequentially
on the same core, to take advantage of SRAM locality.
The job submission code was initially based on Panfrost.
v2:
- Remove hardcoded number of cores
- Misc. style fixes (Jeffrey Hugo)
- Repack IOCTL struct (Jeffrey Hugo)
v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
Frattaroli)
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
- Use drm_* logging functions (Thomas Zimmermann)
- Rename reg i/o macros (Thomas Zimmermann)
- Add padding to ioctls and check for zero (Jeff Hugo)
- Improve error handling (Nicolas Frattaroli)
v6:
- Use mutex guards (Markus Elfring)
- Use u64_to_user_ptr (Jeff Hugo)
- Drop rocket_fence (Rob Herring)
v7:
- Assign its own IOMMU domain to each client, for isolation (Daniel
Stone and Robin Murphy)
v8:
- Use reset lines to reset the cores (Robin Murphy)
- Use the macros to compute the values for the bitfields (Robin Murphy)
- More descriptive name for the IRQ (Robin Murphy)
- Simplify job interrupt handling (Robin Murphy)
- Correctly acquire a reference to the IOMMU (Robin Murphy)
- Specify the size of the embedded structs in the IOCTLs for future
extensibility (Rob Herring)
- Expose only 32 bits for the address of the regcmd BO (Robin Murphy)
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
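
The v8 note about carrying the size of the embedded structs in the IOCTLs is easiest to see in isolation. Below is a plain userspace sketch of the strided copy that rocket_copy_tasks() in this patch performs: the receiver walks the array using the stride supplied by the sender, copies only the fields it knows about, and rejects strides that are too small. The types task_v1 and task_v2 and the function copy_tasks_strided() are hypothetical names, memcpy() stands in for copy_from_user(), and the field layout is an assumption rather than the actual UAPI.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout known to the receiving side of this sketch. */
struct task_v1 {
	uint32_t regcmd;
	uint32_t regcmd_count;
};

/* Hypothetical larger layout that a newer sender might pass. */
struct task_v2 {
	uint32_t regcmd;
	uint32_t regcmd_count;
	uint64_t extra; /* unknown to the receiver, silently skipped */
};

/*
 * Copy count elements spaced stride bytes apart from src into dst.
 * Returns 0 on success or -EINVAL if the stride is smaller than the
 * known layout or an element has regcmd_count == 0.
 */
static int copy_tasks_strided(struct task_v1 *dst, const void *src,
			      size_t stride, size_t count)
{
	const unsigned char *p = src;
	size_t i;

	assert(dst != NULL && src != NULL);

	if (stride < sizeof(*dst))
		return -EINVAL;

	for (i = 0; i < count; i++) {
		/* Only the leading, known part of each element is read. */
		memcpy(&dst[i], p + i * stride, sizeof(*dst));
		if (dst[i].regcmd_count == 0)
			return -EINVAL;
	}

	return 0;
}
```

Because the stride comes from the sender, a newer userspace can append fields to its struct without breaking a receiver that only understands the original prefix.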
drivers/accel/rocket/Makefile | 3 +-
drivers/accel/rocket/rocket_core.c | 10 +
drivers/accel/rocket/rocket_core.h | 15 +
drivers/accel/rocket/rocket_device.c | 4 +
drivers/accel/rocket/rocket_device.h | 2 +
drivers/accel/rocket/rocket_drv.c | 14 +
drivers/accel/rocket/rocket_drv.h | 3 +
drivers/accel/rocket/rocket_job.c | 636 +++++++++++++++++++++++++++++++++++
drivers/accel/rocket/rocket_job.h | 52 +++
include/uapi/drm/rocket_accel.h | 64 ++++
10 files changed, 802 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile
index 4deef267f9e1238c4d8bd108dcc8afd9dc8b2b8f..3713dfe223d6ec6293ced3ef9291af2f3d144131 100644
--- a/drivers/accel/rocket/Makefile
+++ b/drivers/accel/rocket/Makefile
@@ -6,4 +6,5 @@ rocket-y := \
rocket_core.o \
rocket_device.o \
rocket_drv.o \
- rocket_gem.o
+ rocket_gem.o \
+ rocket_job.o
diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c
index 9be964b5fbaef31f9b283b8da6fe15e6c540e916..72fb5e5798fac8d01f351f16a1da0539f042d02e 100644
--- a/drivers/accel/rocket/rocket_core.c
+++ b/drivers/accel/rocket/rocket_core.c
@@ -12,6 +12,7 @@
#include <linux/reset.h>
#include "rocket_core.h"
+#include "rocket_job.h"
int rocket_core_init(struct rocket_core *core)
{
@@ -57,6 +58,10 @@ int rocket_core_init(struct rocket_core *core)
core->iommu_group = iommu_group_get(dev);
+ err = rocket_job_init(core);
+ if (err)
+ return err;
+
pm_runtime_use_autosuspend(dev);
/*
@@ -70,6 +75,10 @@ int rocket_core_init(struct rocket_core *core)
pm_runtime_enable(dev);
err = pm_runtime_get_sync(dev);
+ if (err < 0) {
+ rocket_job_fini(core);
+ return err;
+ }
version = rocket_pc_readl(core, VERSION);
version += rocket_pc_readl(core, VERSION_NUM) & 0xffff;
@@ -88,6 +97,7 @@ void rocket_core_fini(struct rocket_core *core)
pm_runtime_disable(core->dev);
iommu_group_put(core->iommu_group);
core->iommu_group = NULL;
+ rocket_job_fini(core);
}
void rocket_core_reset(struct rocket_core *core)
diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index 660de2d70f7d9294bc4db8b235b3796941136307..f6d7382854ca9eaf53971fe70a1b5341b73bb76e 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -40,6 +40,21 @@ struct rocket_core {
struct reset_control_bulk_data resets[2];
struct iommu_group *iommu_group;
+
+ struct mutex job_lock;
+ struct rocket_job *in_flight_job;
+
+ spinlock_t fence_lock;
+
+ struct {
+ struct workqueue_struct *wq;
+ struct work_struct work;
+ atomic_t pending;
+ } reset;
+
+ struct drm_gpu_scheduler sched;
+ u64 fence_context;
+ u64 emit_seqno;
};
int rocket_core_init(struct rocket_core *core);
diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c
index b05a0df91d48385ce4ada22137842b3e819f8266..46e6ee1e72c5f23048ef631b7cc0fe8cb4349f46 100644
--- a/drivers/accel/rocket/rocket_device.c
+++ b/drivers/accel/rocket/rocket_device.c
@@ -41,6 +41,10 @@ struct rocket_device *rocket_device_init(struct platform_device *pdev,
if (err)
return ERR_PTR(err);
+ err = devm_mutex_init(dev, &rdev->sched_lock);
+ if (err)
+ return ERR_PTR(err);
+
err = drm_dev_register(ddev, 0);
if (err)
return ERR_PTR(err);
diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h
index a5a5857bb1991161ebfbdcbffacd75bbb579c572..ce662abc01d3d1c384d3c4bc2f0ded5400b57c7f 100644
--- a/drivers/accel/rocket/rocket_device.h
+++ b/drivers/accel/rocket/rocket_device.h
@@ -15,6 +15,8 @@
struct rocket_device {
struct drm_device ddev;
+ struct mutex sched_lock;
+
struct rocket_core *cores;
unsigned int num_cores;
};
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index 8b7fbe9226f424b69d409e47b58651cba8c42bcf..a21aa9aa189ba585c70fbf57d2a41fb578357efd 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -15,6 +15,7 @@
#include "rocket_drv.h"
#include "rocket_gem.h"
+#include "rocket_job.h"
/*
* Facade device, used to expose a single DRM device to userspace, that
@@ -97,8 +98,16 @@ rocket_open(struct drm_device *dev, struct drm_file *file)
drm_mm_init(&rocket_priv->mm, start, end - start + 1);
mutex_init(&rocket_priv->mm_lock);
+ ret = rocket_job_open(rocket_priv);
+ if (ret)
+ goto err_mm_takedown;
+
return 0;
+err_mm_takedown:
+ mutex_destroy(&rocket_priv->mm_lock);
+ drm_mm_takedown(&rocket_priv->mm);
+ rocket_iommu_domain_put(rocket_priv->domain);
err_free:
kfree(rocket_priv);
err_put_mod:
@@ -111,6 +120,7 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file)
{
struct rocket_file_priv *rocket_priv = file->driver_priv;
+ rocket_job_close(rocket_priv);
mutex_destroy(&rocket_priv->mm_lock);
drm_mm_takedown(&rocket_priv->mm);
rocket_iommu_domain_put(rocket_priv->domain);
@@ -123,6 +133,7 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0)
ROCKET_IOCTL(CREATE_BO, create_bo),
+ ROCKET_IOCTL(SUBMIT, submit),
};
DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
@@ -230,6 +241,9 @@ static int rocket_device_runtime_suspend(struct device *dev)
if (core < 0)
return -ENODEV;
+ if (!rocket_job_is_idle(&rdev->cores[core]))
+ return -EBUSY;
+
clk_bulk_disable_unprepare(ARRAY_SIZE(rdev->cores[core].clks), rdev->cores[core].clks);
return 0;
diff --git a/drivers/accel/rocket/rocket_drv.h b/drivers/accel/rocket/rocket_drv.h
index 2944e0136ab991da61fb8f66f7e9c1ba214878a6..f50634935b605c542cce16a2b91c1e43ec16bc81 100644
--- a/drivers/accel/rocket/rocket_drv.h
+++ b/drivers/accel/rocket/rocket_drv.h
@@ -5,6 +5,7 @@
#define __ROCKET_DRV_H__
#include <drm/drm_mm.h>
+#include <drm/gpu_scheduler.h>
#include "rocket_device.h"
@@ -19,6 +20,8 @@ struct rocket_file_priv {
struct rocket_iommu_domain *domain;
struct drm_mm mm;
struct mutex mm_lock;
+
+ struct drm_sched_entity sched_entity;
};
struct rocket_iommu_domain *rocket_iommu_domain_get(struct rocket_file_priv *rocket_priv);
diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
new file mode 100644
index 0000000000000000000000000000000000000000..e731da15ebffca12e74035d2739a666a8e02d747
--- /dev/null
+++ b/drivers/accel/rocket/rocket_job.c
@@ -0,0 +1,636 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2019 Linaro, Ltd, Rob Herring <robh@kernel.org> */
+/* Copyright 2019 Collabora ltd. */
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#include <drm/drm_print.h>
+#include <drm/drm_file.h>
+#include <drm/drm_gem.h>
+#include <drm/rocket_accel.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+
+#include "rocket_core.h"
+#include "rocket_device.h"
+#include "rocket_drv.h"
+#include "rocket_job.h"
+#include "rocket_registers.h"
+
+#define JOB_TIMEOUT_MS 500
+
+static struct rocket_job *
+to_rocket_job(struct drm_sched_job *sched_job)
+{
+ return container_of(sched_job, struct rocket_job, base);
+}
+
+static const char *rocket_fence_get_driver_name(struct dma_fence *fence)
+{
+ return "rocket";
+}
+
+static const char *rocket_fence_get_timeline_name(struct dma_fence *fence)
+{
+ return "rockchip-npu";
+}
+
+static const struct dma_fence_ops rocket_fence_ops = {
+ .get_driver_name = rocket_fence_get_driver_name,
+ .get_timeline_name = rocket_fence_get_timeline_name,
+};
+
+static struct dma_fence *rocket_fence_create(struct rocket_core *core)
+{
+ struct dma_fence *fence;
+
+ fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+ if (!fence)
+ return ERR_PTR(-ENOMEM);
+
+ dma_fence_init(fence, &rocket_fence_ops, &core->fence_lock,
+ core->fence_context, ++core->emit_seqno);
+
+ return fence;
+}
+
+static int
+rocket_copy_tasks(struct drm_device *dev,
+ struct drm_file *file_priv,
+ struct drm_rocket_job *job,
+ struct rocket_job *rjob)
+{
+ int ret = 0;
+
+ if (job->task_struct_size < sizeof(struct drm_rocket_task))
+ return -EINVAL;
+
+ rjob->task_count = job->task_count;
+
+ if (!rjob->task_count)
+ return 0;
+
+ rjob->tasks = kvmalloc_array(job->task_count, sizeof(*rjob->tasks), GFP_KERNEL);
+ if (!rjob->tasks) {
+ drm_dbg(dev, "Failed to allocate task array\n");
+ return -ENOMEM;
+ }
+
+ for (int i = 0; i < rjob->task_count; i++) {
+ struct drm_rocket_task task = {0};
+
+ if (copy_from_user(&task,
+ u64_to_user_ptr(job->tasks) + i * job->task_struct_size,
+ sizeof(task))) {
+ drm_dbg(dev, "Failed to copy incoming tasks\n");
+ ret = -EFAULT;
+ goto fail;
+ }
+
+ if (task.regcmd_count == 0) {
+ drm_dbg(dev, "regcmd_count field in drm_rocket_task should be > 0.\n");
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ rjob->tasks[i].regcmd = task.regcmd;
+ rjob->tasks[i].regcmd_count = task.regcmd_count;
+ }
+
+ return 0;
+
+fail:
+ kvfree(rjob->tasks);
+ return ret;
+}
+
+static void rocket_job_hw_submit(struct rocket_core *core, struct rocket_job *job)
+{
+ struct rocket_task *task;
+ unsigned int extra_bit;
+
+ /* Don't queue the job if a reset is in progress */
+ if (atomic_read(&core->reset.pending))
+ return;
+
+ /* GO ! */
+
+ task = &job->tasks[job->next_task_idx];
+ job->next_task_idx++;
+
+ rocket_pc_writel(core, BASE_ADDRESS, 0x1);
+
+ /* From rknpu, in the TRM this bit is marked as reserved */
+ extra_bit = 0x10000000 * core->index;
+ rocket_cna_writel(core, S_POINTER, CNA_S_POINTER_POINTER_PP_EN(1) |
+ CNA_S_POINTER_EXECUTER_PP_EN(1) |
+ CNA_S_POINTER_POINTER_PP_MODE(1) |
+ extra_bit);
+
+ rocket_core_writel(core, S_POINTER, CORE_S_POINTER_POINTER_PP_EN(1) |
+ CORE_S_POINTER_EXECUTER_PP_EN(1) |
+ CORE_S_POINTER_POINTER_PP_MODE(1) |
+ extra_bit);
+
+ rocket_pc_writel(core, BASE_ADDRESS, task->regcmd);
+ rocket_pc_writel(core, REGISTER_AMOUNTS,
+ PC_REGISTER_AMOUNTS_PC_DATA_AMOUNT((task->regcmd_count + 1) / 2 - 1));
+
+ rocket_pc_writel(core, INTERRUPT_MASK, PC_INTERRUPT_MASK_DPU_0 | PC_INTERRUPT_MASK_DPU_1);
+ rocket_pc_writel(core, INTERRUPT_CLEAR, PC_INTERRUPT_CLEAR_DPU_0 | PC_INTERRUPT_CLEAR_DPU_1);
+
+ rocket_pc_writel(core, TASK_CON, PC_TASK_CON_RESERVED_0(1) |
+ PC_TASK_CON_TASK_COUNT_CLEAR(1) |
+ PC_TASK_CON_TASK_NUMBER(1) |
+ PC_TASK_CON_TASK_PP_EN(1));
+
+ rocket_pc_writel(core, TASK_DMA_BASE_ADDR, PC_TASK_DMA_BASE_ADDR_DMA_BASE_ADDR(0x0));
+
+ rocket_pc_writel(core, OPERATION_ENABLE, PC_OPERATION_ENABLE_OP_EN(1));
+
+ dev_dbg(core->dev, "Submitted regcmd at 0x%llx to core %d\n", task->regcmd, core->index);
+}
+
+static int rocket_acquire_object_fences(struct drm_gem_object **bos,
+ int bo_count,
+ struct drm_sched_job *job,
+ bool is_write)
+{
+ int i, ret;
+
+ for (i = 0; i < bo_count; i++) {
+ ret = dma_resv_reserve_fences(bos[i]->resv, 1);
+ if (ret)
+ return ret;
+
+ ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
+ is_write);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static void rocket_attach_object_fences(struct drm_gem_object **bos,
+ int bo_count,
+ struct dma_fence *fence)
+{
+ int i;
+
+ for (i = 0; i < bo_count; i++)
+ dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE);
+}
+
+static int rocket_job_push(struct rocket_job *job)
+{
+ struct rocket_device *rdev = job->rdev;
+ struct drm_gem_object **bos;
+ struct ww_acquire_ctx acquire_ctx;
+ int ret = 0;
+
+ bos = kvmalloc_array(job->in_bo_count + job->out_bo_count, sizeof(void *),
+ GFP_KERNEL);
+ memcpy(bos, job->in_bos, job->in_bo_count * sizeof(void *));
+ memcpy(&bos[job->in_bo_count], job->out_bos, job->out_bo_count * sizeof(void *));
+
+ ret = drm_gem_lock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+ if (ret)
+ goto err;
+
+ scoped_guard(mutex, &rdev->sched_lock) {
+ drm_sched_job_arm(&job->base);
+
+ job->inference_done_fence = dma_fence_get(&job->base.s_fence->finished);
+
+ ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count, &job->base, false);
+ if (ret)
+ goto err_unlock;
+
+ ret = rocket_acquire_object_fences(job->out_bos, job->out_bo_count, &job->base, true);
+ if (ret)
+ goto err_unlock;
+
+ kref_get(&job->refcount); /* put by scheduler job completion */
+
+ drm_sched_entity_push_job(&job->base);
+ }
+
+ rocket_attach_object_fences(job->out_bos, job->out_bo_count, job->inference_done_fence);
+
+err_unlock:
+ drm_gem_unlock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+err:
+ kvfree(bos);
+
+ return ret;
+}
+
+static void rocket_job_cleanup(struct kref *ref)
+{
+ struct rocket_job *job = container_of(ref, struct rocket_job,
+ refcount);
+ unsigned int i;
+
+ rocket_iommu_domain_put(job->domain);
+
+ dma_fence_put(job->done_fence);
+ dma_fence_put(job->inference_done_fence);
+
+ if (job->in_bos) {
+ for (i = 0; i < job->in_bo_count; i++)
+ drm_gem_object_put(job->in_bos[i]);
+
+ kvfree(job->in_bos);
+ }
+
+ if (job->out_bos) {
+ for (i = 0; i < job->out_bo_count; i++)
+ drm_gem_object_put(job->out_bos[i]);
+
+ kvfree(job->out_bos);
+ }
+
+ kvfree(job->tasks);
+
+ kfree(job);
+}
+
+static void rocket_job_put(struct rocket_job *job)
+{
+ kref_put(&job->refcount, rocket_job_cleanup);
+}
+
+static void rocket_job_free(struct drm_sched_job *sched_job)
+{
+ struct rocket_job *job = to_rocket_job(sched_job);
+
+ drm_sched_job_cleanup(sched_job);
+
+ rocket_job_put(job);
+}
+
+static struct rocket_core *sched_to_core(struct rocket_device *rdev,
+ struct drm_gpu_scheduler *sched)
+{
+ unsigned int core;
+
+ for (core = 0; core < rdev->num_cores; core++) {
+ if (&rdev->cores[core].sched == sched)
+ return &rdev->cores[core];
+ }
+
+ return NULL;
+}
+
+static struct dma_fence *rocket_job_run(struct drm_sched_job *sched_job)
+{
+ struct rocket_job *job = to_rocket_job(sched_job);
+ struct rocket_device *rdev = job->rdev;
+ struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
+ struct dma_fence *fence = NULL;
+ int ret;
+
+ if (unlikely(job->base.s_fence->finished.error))
+ return NULL;
+
+ /*
+ * Nothing to execute: can happen if the job has finished while
+ * we were resetting the NPU.
+ */
+ if (job->next_task_idx == job->task_count)
+ return NULL;
+
+ fence = rocket_fence_create(core);
+ if (IS_ERR(fence))
+ return fence;
+
+ if (job->done_fence)
+ dma_fence_put(job->done_fence);
+ job->done_fence = dma_fence_get(fence);
+
+ ret = pm_runtime_get_sync(core->dev);
+ if (ret < 0)
+ return fence;
+
+ ret = iommu_attach_group(job->domain->domain, core->iommu_group);
+ if (ret < 0)
+ return fence;
+
+ scoped_guard(mutex, &core->job_lock) {
+ core->in_flight_job = job;
+ rocket_job_hw_submit(core, job);
+ }
+
+ return fence;
+}
+
+static void rocket_job_handle_irq(struct rocket_core *core)
+{
+ pm_runtime_mark_last_busy(core->dev);
+
+ rocket_pc_writel(core, OPERATION_ENABLE, 0x0);
+ rocket_pc_writel(core, INTERRUPT_CLEAR, 0x1ffff);
+
+ scoped_guard(mutex, &core->job_lock)
+ if (core->in_flight_job) {
+ if (core->in_flight_job->next_task_idx < core->in_flight_job->task_count) {
+ rocket_job_hw_submit(core, core->in_flight_job);
+ return;
+ }
+
+ iommu_detach_group(NULL, core->iommu_group);
+ dma_fence_signal(core->in_flight_job->done_fence);
+ pm_runtime_put_autosuspend(core->dev);
+ core->in_flight_job = NULL;
+ }
+}
+
+static void
+rocket_reset(struct rocket_core *core, struct drm_sched_job *bad)
+{
+ if (!atomic_read(&core->reset.pending))
+ return;
+
+ drm_sched_stop(&core->sched, bad);
+
+ /*
+ * Remaining interrupts have been handled, but we might still have
+ * stuck jobs. Let's make sure the PM counters stay balanced by
+ * manually calling pm_runtime_put_noidle().
+ */
+ scoped_guard(mutex, &core->job_lock) {
+ if (core->in_flight_job)
+ pm_runtime_put_noidle(core->dev);
+
+ iommu_detach_group(NULL, core->iommu_group);
+
+ core->in_flight_job = NULL;
+ }
+
+ /* Proceed with reset now. */
+ rocket_core_reset(core);
+
+ /* NPU has been reset, we can clear the reset pending bit. */
+ atomic_set(&core->reset.pending, 0);
+
+ /* Restart the scheduler */
+ drm_sched_start(&core->sched, 0);
+}
+
+static enum drm_gpu_sched_stat rocket_job_timedout(struct drm_sched_job *sched_job)
+{
+ struct rocket_job *job = to_rocket_job(sched_job);
+ struct rocket_device *rdev = job->rdev;
+ struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
+
+ dev_err(core->dev, "NPU job timed out\n");
+
+ atomic_set(&core->reset.pending, 1);
+ rocket_reset(core, sched_job);
+
+ return DRM_GPU_SCHED_STAT_NOMINAL;
+}
+
+static void rocket_reset_work(struct work_struct *work)
+{
+ struct rocket_core *core;
+
+ core = container_of(work, struct rocket_core, reset.work);
+ rocket_reset(core, NULL);
+}
+
+static const struct drm_sched_backend_ops rocket_sched_ops = {
+ .run_job = rocket_job_run,
+ .timedout_job = rocket_job_timedout,
+ .free_job = rocket_job_free
+};
+
+static irqreturn_t rocket_job_irq_handler_thread(int irq, void *data)
+{
+ struct rocket_core *core = data;
+
+ rocket_job_handle_irq(core);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t rocket_job_irq_handler(int irq, void *data)
+{
+ struct rocket_core *core = data;
+ u32 raw_status = rocket_pc_readl(core, INTERRUPT_RAW_STATUS);
+
+ WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_READ_ERROR);
+ WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_WRITE_ERROR);
+
+ if (!(raw_status & PC_INTERRUPT_RAW_STATUS_DPU_0 ||
+ raw_status & PC_INTERRUPT_RAW_STATUS_DPU_1))
+ return IRQ_NONE;
+
+ rocket_pc_writel(core, INTERRUPT_MASK, 0x0);
+
+ return IRQ_WAKE_THREAD;
+}
+
+int rocket_job_init(struct rocket_core *core)
+{
+ struct drm_sched_init_args args = {
+ .ops = &rocket_sched_ops,
+ .num_rqs = DRM_SCHED_PRIORITY_COUNT,
+ .credit_limit = 1,
+ .timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
+ .name = dev_name(core->dev),
+ .dev = core->dev,
+ };
+ int ret;
+
+ INIT_WORK(&core->reset.work, rocket_reset_work);
+ spin_lock_init(&core->fence_lock);
+ mutex_init(&core->job_lock);
+
+ core->irq = platform_get_irq(to_platform_device(core->dev), 0);
+ if (core->irq < 0)
+ return core->irq;
+
+ ret = devm_request_threaded_irq(core->dev, core->irq,
+ rocket_job_irq_handler,
+ rocket_job_irq_handler_thread,
+ IRQF_SHARED, dev_name(core->dev),
+ core);
+ if (ret) {
+ dev_err(core->dev, "failed to request job irq\n");
+ return ret;
+ }
+
+ core->reset.wq = alloc_ordered_workqueue("rocket-reset-%d", 0, core->index);
+ if (!core->reset.wq)
+ return -ENOMEM;
+
+ core->fence_context = dma_fence_context_alloc(1);
+
+ args.timeout_wq = core->reset.wq;
+ ret = drm_sched_init(&core->sched, &args);
+ if (ret) {
+ dev_err(core->dev, "Failed to create scheduler: %d\n", ret);
+ goto err_sched;
+ }
+
+ return 0;
+
+err_sched:
+ drm_sched_fini(&core->sched);
+
+ destroy_workqueue(core->reset.wq);
+ return ret;
+}
+
+void rocket_job_fini(struct rocket_core *core)
+{
+ drm_sched_fini(&core->sched);
+
+ cancel_work_sync(&core->reset.work);
+ destroy_workqueue(core->reset.wq);
+}
+
+int rocket_job_open(struct rocket_file_priv *rocket_priv)
+{
+ struct rocket_device *rdev = rocket_priv->rdev;
+ struct drm_gpu_scheduler **scheds = kmalloc_array(rdev->num_cores, sizeof(*scheds),
+ GFP_KERNEL);
+ unsigned int core;
+ int ret;
+
+ for (core = 0; core < rdev->num_cores; core++)
+ scheds[core] = &rdev->cores[core].sched;
+
+ ret = drm_sched_entity_init(&rocket_priv->sched_entity,
+ DRM_SCHED_PRIORITY_NORMAL,
+ scheds,
+ rdev->num_cores, NULL);
+ if (WARN_ON(ret))
+ return ret;
+
+ return 0;
+}
+
+void rocket_job_close(struct rocket_file_priv *rocket_priv)
+{
+ struct drm_sched_entity *entity = &rocket_priv->sched_entity;
+
+ kfree(entity->sched_list);
+ drm_sched_entity_destroy(entity);
+}
+
+int rocket_job_is_idle(struct rocket_core *core)
+{
+ /* If there are any jobs in this HW queue, we're not idle */
+ if (atomic_read(&core->sched.credit_count))
+ return false;
+
+ return true;
+}
+
+static int rocket_ioctl_submit_job(struct drm_device *dev, struct drm_file *file,
+ struct drm_rocket_job *job)
+{
+ struct rocket_device *rdev = to_rocket_device(dev);
+ struct rocket_file_priv *file_priv = file->driver_priv;
+ struct rocket_job *rjob = NULL;
+ int ret = 0;
+
+ if (job->task_count == 0)
+ return -EINVAL;
+
+ rjob = kzalloc(sizeof(*rjob), GFP_KERNEL);
+ if (!rjob)
+ return -ENOMEM;
+
+ kref_init(&rjob->refcount);
+
+ rjob->rdev = rdev;
+
+ ret = drm_sched_job_init(&rjob->base,
+ &file_priv->sched_entity,
+ 1, NULL);
+ if (ret)
+ goto out_put_job;
+
+ ret = rocket_copy_tasks(dev, file, job, rjob);
+ if (ret)
+ goto out_cleanup_job;
+
+ ret = drm_gem_objects_lookup(file, u64_to_user_ptr(job->in_bo_handles),
+ job->in_bo_handle_count, &rjob->in_bos);
+ if (ret)
+ goto out_cleanup_job;
+
+ rjob->in_bo_count = job->in_bo_handle_count;
+
+ ret = drm_gem_objects_lookup(file, u64_to_user_ptr(job->out_bo_handles),
+ job->out_bo_handle_count, &rjob->out_bos);
+ if (ret)
+ goto out_cleanup_job;
+
+ rjob->out_bo_count = job->out_bo_handle_count;
+
+ rjob->domain = rocket_iommu_domain_get(file_priv);
+
+ ret = rocket_job_push(rjob);
+ if (ret)
+ goto out_cleanup_job;
+
+out_cleanup_job:
+ if (ret)
+ drm_sched_job_cleanup(&rjob->base);
+out_put_job:
+ rocket_job_put(rjob);
+
+ return ret;
+}
+
+int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file)
+{
+ struct drm_rocket_submit *args = data;
+ struct drm_rocket_job *jobs;
+ int ret = 0;
+ unsigned int i = 0;
+
+ if (args->job_count == 0)
+ return 0;
+
+ if (args->job_struct_size < sizeof(struct drm_rocket_job)) {
+ drm_dbg(dev, "job_struct_size field in drm_rocket_submit struct is too small.\n");
+ return -EINVAL;
+ }
+
+ if (args->reserved != 0) {
+ drm_dbg(dev, "Reserved field in drm_rocket_submit struct should be 0.\n");
+ return -EINVAL;
+ }
+
+ jobs = kvmalloc_array(args->job_count, sizeof(*jobs), GFP_KERNEL);
+ if (!jobs) {
+ drm_dbg(dev, "Failed to allocate incoming job array\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < args->job_count; i++) {
+ if (copy_from_user(&jobs[i],
+ u64_to_user_ptr(args->jobs) + i * args->job_struct_size,
+ sizeof(*jobs))) {
+ ret = -EFAULT;
+ drm_dbg(dev, "Failed to copy incoming job array\n");
+ goto exit;
+ }
+ }
+
+
+ for (i = 0; i < args->job_count; i++)
+ rocket_ioctl_submit_job(dev, file, &jobs[i]);
+
+exit:
+ kvfree(jobs);
+
+ return ret;
+}
diff --git a/drivers/accel/rocket/rocket_job.h b/drivers/accel/rocket/rocket_job.h
new file mode 100644
index 0000000000000000000000000000000000000000..4ae00feec3b939a592b99ee6c32854f788acd395
--- /dev/null
+++ b/drivers/accel/rocket/rocket_job.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso <tomeu@tomeuvizoso.net> */
+
+#ifndef __ROCKET_JOB_H__
+#define __ROCKET_JOB_H__
+
+#include <drm/drm_drv.h>
+#include <drm/gpu_scheduler.h>
+
+#include "rocket_core.h"
+#include "rocket_drv.h"
+
+struct rocket_task {
+ u64 regcmd;
+ u32 regcmd_count;
+};
+
+struct rocket_job {
+ struct drm_sched_job base;
+
+ struct rocket_device *rdev;
+
+ struct drm_gem_object **in_bos;
+ struct drm_gem_object **out_bos;
+
+ u32 in_bo_count;
+ u32 out_bo_count;
+
+ struct rocket_task *tasks;
+ u32 task_count;
+ u32 next_task_idx;
+
+ /* Fence to be signaled by drm-sched once it's done with the job */
+ struct dma_fence *inference_done_fence;
+
+ /* Fence to be signaled by IRQ handler when the job is complete. */
+ struct dma_fence *done_fence;
+
+ struct rocket_iommu_domain *domain;
+
+ struct kref refcount;
+};
+
+int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_job_init(struct rocket_core *core);
+void rocket_job_fini(struct rocket_core *core);
+int rocket_job_open(struct rocket_file_priv *rocket_priv);
+void rocket_job_close(struct rocket_file_priv *rocket_priv);
+int rocket_job_is_idle(struct rocket_core *core);
+
+#endif
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index 95720702b7c4413d72b89c1f0f59abb22dc8c6b3..374f8370ac9df6944fdb6ef06e56f15226e072ba 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -12,8 +12,10 @@ extern "C" {
#endif
#define DRM_ROCKET_CREATE_BO 0x00
+#define DRM_ROCKET_SUBMIT 0x01
#define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
+#define DRM_IOCTL_ROCKET_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
/**
* struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -37,6 +39,68 @@ struct drm_rocket_create_bo {
__u64 offset;
};
+/**
+ * struct drm_rocket_task - A task to be run on the NPU
+ *
+ * A task is the smallest unit of work that can be run on the NPU.
+ */
+struct drm_rocket_task {
+ /** Input: DMA address to NPU mapping of register command buffer */
+ __u32 regcmd;
+
+ /** Input: Number of commands in the register command buffer */
+ __u32 regcmd_count;
+};
+
+/**
+ * struct drm_rocket_job - A job to be run on the NPU
+ *
+ * The kernel will schedule the execution of this job taking into account its
+ * dependencies with other jobs. All tasks in the same job will be executed
+ * sequentially on the same core, to benefit from memory residency in SRAM.
+ */
+struct drm_rocket_job {
+ /** Input: Pointer to an array of struct drm_rocket_task. */
+ __u64 tasks;
+
+ /** Input: Pointer to a u32 array of the BOs that are read by the job. */
+ __u64 in_bo_handles;
+
+ /** Input: Pointer to a u32 array of the BOs that are written to by the job. */
+ __u64 out_bo_handles;
+
+ /** Input: Number of tasks passed in. */
+ __u32 task_count;
+
+ /** Input: Size in bytes of the structs in the @tasks field. */
+ __u32 task_struct_size;
+
+ /** Input: Number of input BO handles passed in (size is that times 4). */
+ __u32 in_bo_handle_count;
+
+ /** Input: Number of output BO handles passed in (size is that times 4). */
+ __u32 out_bo_handle_count;
+};
+
+/**
+ * struct drm_rocket_submit - ioctl argument for submitting commands to the NPU.
+ *
+ * The kernel will schedule the execution of these jobs in dependency order.
+ */
+struct drm_rocket_submit {
+ /** Input: Pointer to an array of struct drm_rocket_job. */
+ __u64 jobs;
+
+ /** Input: Number of jobs passed in. */
+ __u32 job_count;
+
+ /** Input: Size in bytes of the structs in the @jobs field. */
+ __u32 job_struct_size;
+
+ /** Reserved, must be zero. */
+ __u64 reserved;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.50.0
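A minimal userspace sketch of how the submit argument described above might be filled in. The struct definitions are local mirrors of the UAPI structs added by this patch, renamed with a sketch_ prefix so the example builds without the new header; the helper names are hypothetical and not part of the driver or of Mesa. It only prepares the argument, it does not issue the ioctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the UAPI structs from this patch (assumption: same layout). */
struct sketch_rocket_task {
	uint32_t regcmd;
	uint32_t regcmd_count;
};

struct sketch_rocket_job {
	uint64_t tasks;
	uint64_t in_bo_handles;
	uint64_t out_bo_handles;
	uint32_t task_count;
	uint32_t task_struct_size;
	uint32_t in_bo_handle_count;
	uint32_t out_bo_handle_count;
};

struct sketch_rocket_submit {
	uint64_t jobs;
	uint32_t job_count;
	uint32_t job_struct_size;
	uint64_t reserved;
};

/* The layouts contain no implicit padding, so the sizes are fixed. */
_Static_assert(sizeof(struct sketch_rocket_task) == 8, "unexpected task layout");
_Static_assert(sizeof(struct sketch_rocket_job) == 40, "unexpected job layout");
_Static_assert(sizeof(struct sketch_rocket_submit) == 24, "unexpected submit layout");

/* Hypothetical helper: describe one job made of task_count tasks. */
void sketch_fill_job(struct sketch_rocket_job *job,
		     const struct sketch_rocket_task *tasks, uint32_t task_count,
		     const uint32_t *in_handles, uint32_t in_count,
		     const uint32_t *out_handles, uint32_t out_count)
{
	assert(job != NULL);
	memset(job, 0, sizeof(*job));
	job->tasks = (uint64_t)(uintptr_t)tasks;
	job->in_bo_handles = (uint64_t)(uintptr_t)in_handles;
	job->out_bo_handles = (uint64_t)(uintptr_t)out_handles;
	job->task_count = task_count;
	job->task_struct_size = (uint32_t)sizeof(struct sketch_rocket_task);
	job->in_bo_handle_count = in_count;
	job->out_bo_handle_count = out_count;
}

/* Hypothetical helper: wrap an array of jobs in the ioctl argument. */
void sketch_fill_submit(struct sketch_rocket_submit *submit,
			const struct sketch_rocket_job *jobs, uint32_t job_count)
{
	assert(submit != NULL);
	memset(submit, 0, sizeof(*submit));
	submit->jobs = (uint64_t)(uintptr_t)jobs;
	submit->job_count = job_count;
	/* The handler rejects a job_struct_size smaller than its own struct. */
	submit->job_struct_size = (uint32_t)sizeof(struct sketch_rocket_job);
	/* reserved stays zero; the handler returns -EINVAL otherwise. */
}
```

Passing the element size explicitly lets the kernel step through the array with the caller's stride while copying only the fields it knows about, which is what the copy_from_user() loop in rocket_ioctl_submit() does.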
* [PATCH v9 05/10] accel/rocket: Add IOCTLs for synchronizing memory accesses
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (2 preceding siblings ...)
2025-07-21 9:17 ` [PATCH v9 04/10] accel/rocket: Add job submission IOCTL Tomeu Vizoso
@ 2025-07-21 9:17 ` Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 06/10] dt-bindings: npu: rockchip,rknn: Add bindings Tomeu Vizoso
` (7 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso,
Jeff Hugo
The NPU cores have their own access to the memory bus, and this isn't
cache coherent with the CPUs.
Add IOCTLs so userspace can mark when the caches need to be flushed, and
also when a writer job needs to be waited for before the buffer can be
accessed from the CPU.
Initially based on the same IOCTLs from the Etnaviv driver.
v2:
- Don't break UABI by reordering the IOCTL IDs (Jeff Hugo)
v3:
- Check that padding fields in IOCTLs are zero (Jeff Hugo)
v6:
- Fix conversion logic to make sure we use DMA_BIDIRECTIONAL when needed
(Lucas Stach)
v8:
- Always sync BOs in both directions (Robin Murphy)
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
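A minimal userspace sketch of how the two IOCTLs might be used around CPU access to a BO. The structs are local mirrors of the UAPI structs added by this patch, and the request numbers are rebuilt by hand from the DRM ioctl type 'd' and DRM_COMMAND_BASE (0x40); the sketch_ names are hypothetical. Since the handler converts timeout_ns with drm_timeout_abs_to_jiffies(), the sketch assumes the value is an absolute CLOCK_MONOTONIC time and derives it from a relative wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <time.h>

/* Local mirrors of the UAPI structs from this patch (assumption: same layout). */
struct sketch_rocket_prep_bo {
	uint32_t handle;
	uint32_t reserved;
	int64_t timeout_ns;
};

struct sketch_rocket_fini_bo {
	uint32_t handle;
	uint32_t reserved;
};

_Static_assert(sizeof(struct sketch_rocket_prep_bo) == 16, "unexpected prep layout");
_Static_assert(sizeof(struct sketch_rocket_fini_bo) == 8, "unexpected fini layout");

/* Equivalent of DRM_IOW(DRM_COMMAND_BASE + nr, type), written out by hand. */
#define SKETCH_IOCTL_PREP_BO _IOW('d', 0x40 + 0x02, struct sketch_rocket_prep_bo)
#define SKETCH_IOCTL_FINI_BO _IOW('d', 0x40 + 0x03, struct sketch_rocket_fini_bo)

/* Turn a relative wait into an absolute CLOCK_MONOTONIC deadline in ns. */
int64_t sketch_abs_timeout_ns(int64_t relative_ns)
{
	struct timespec now;

	assert(relative_ns >= 0);
	clock_gettime(CLOCK_MONOTONIC, &now);
	return (int64_t)now.tv_sec * 1000000000LL + now.tv_nsec + relative_ns;
}

/* Begin CPU ownership: wait for NPU writers, then sync caches for the CPU. */
int sketch_prep_bo(int fd, uint32_t handle, int64_t relative_timeout_ns)
{
	struct sketch_rocket_prep_bo args;

	memset(&args, 0, sizeof(args));	/* reserved must be zero */
	args.handle = handle;
	args.timeout_ns = sketch_abs_timeout_ns(relative_timeout_ns);
	return ioctl(fd, SKETCH_IOCTL_PREP_BO, &args);
}

/* End CPU ownership: sync caches so the NPU sees the CPU's writes. */
int sketch_fini_bo(int fd, uint32_t handle)
{
	struct sketch_rocket_fini_bo args;

	memset(&args, 0, sizeof(args));	/* reserved must be zero */
	args.handle = handle;
	return ioctl(fd, SKETCH_IOCTL_FINI_BO, &args);
}
```

A caller would bracket each CPU read or write of a mapped BO with sketch_prep_bo() and sketch_fini_bo(), and treat only a negative return value as failure.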
drivers/accel/rocket/rocket_drv.c | 2 ++
drivers/accel/rocket/rocket_gem.c | 56 +++++++++++++++++++++++++++++++++++++++
drivers/accel/rocket/rocket_gem.h | 4 +++
include/uapi/drm/rocket_accel.h | 34 ++++++++++++++++++++++++
4 files changed, 96 insertions(+)
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index a21aa9aa189ba585c70fbf57d2a41fb578357efd..5c0b63f0a8f00dc71060e7177d0ed1ca15755ec4 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -134,6 +134,8 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
ROCKET_IOCTL(CREATE_BO, create_bo),
ROCKET_IOCTL(SUBMIT, submit),
+ ROCKET_IOCTL(PREP_BO, prep_bo),
+ ROCKET_IOCTL(FINI_BO, fini_bo),
};
DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c
index 05cf46040865c01fe14a169c865227780f2db679..0551e11cc184143a582d1718a621e22086217ad9 100644
--- a/drivers/accel/rocket/rocket_gem.c
+++ b/drivers/accel/rocket/rocket_gem.c
@@ -123,3 +123,59 @@ int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *
return ret;
}
+
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+ struct drm_rocket_prep_bo *args = data;
+ unsigned long timeout = drm_timeout_abs_to_jiffies(args->timeout_ns);
+ struct drm_gem_object *gem_obj;
+ struct drm_gem_shmem_object *shmem_obj;
+ long ret = 0;
+
+ if (args->reserved != 0) {
+ drm_dbg(dev, "Reserved field in drm_rocket_prep_bo struct should be 0.\n");
+ return -EINVAL;
+ }
+
+ gem_obj = drm_gem_object_lookup(file, args->handle);
+ if (!gem_obj)
+ return -ENOENT;
+
+ ret = dma_resv_wait_timeout(gem_obj->resv, DMA_RESV_USAGE_WRITE, true, timeout);
+ if (!ret)
+ ret = timeout ? -ETIMEDOUT : -EBUSY;
+
+ shmem_obj = &to_rocket_bo(gem_obj)->base;
+
+ dma_sync_sgtable_for_cpu(dev->dev, shmem_obj->sgt, DMA_BIDIRECTIONAL);
+
+ drm_gem_object_put(gem_obj);
+
+ return ret;
+}
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+ struct drm_rocket_fini_bo *args = data;
+ struct drm_gem_shmem_object *shmem_obj;
+ struct rocket_gem_object *rkt_obj;
+ struct drm_gem_object *gem_obj;
+
+ if (args->reserved != 0) {
+ drm_dbg(dev, "Reserved field in drm_rocket_fini_bo struct should be 0.\n");
+ return -EINVAL;
+ }
+
+ gem_obj = drm_gem_object_lookup(file, args->handle);
+ if (!gem_obj)
+ return -ENOENT;
+
+ rkt_obj = to_rocket_bo(gem_obj);
+ shmem_obj = &rkt_obj->base;
+
+ dma_sync_sgtable_for_device(dev->dev, shmem_obj->sgt, DMA_BIDIRECTIONAL);
+
+ drm_gem_object_put(gem_obj);
+
+ return 0;
+}
diff --git a/drivers/accel/rocket/rocket_gem.h b/drivers/accel/rocket/rocket_gem.h
index 91a1fc09c56ce483ebe80959e1a7ff934867bedc..24043033450941cb866a21378875810c6e8b9323 100644
--- a/drivers/accel/rocket/rocket_gem.h
+++ b/drivers/accel/rocket/rocket_gem.h
@@ -21,6 +21,10 @@ struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t s
int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file);
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
static inline
struct rocket_gem_object *to_rocket_bo(struct drm_gem_object *obj)
{
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index 374f8370ac9df6944fdb6ef06e56f15226e072ba..14b2e12b7c49288a84e645570cdeb815cd632d96 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -13,9 +13,13 @@ extern "C" {
#define DRM_ROCKET_CREATE_BO 0x00
#define DRM_ROCKET_SUBMIT 0x01
+#define DRM_ROCKET_PREP_BO 0x02
+#define DRM_ROCKET_FINI_BO 0x03
#define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
#define DRM_IOCTL_ROCKET_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
+#define DRM_IOCTL_ROCKET_PREP_BO DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_PREP_BO, struct drm_rocket_prep_bo)
+#define DRM_IOCTL_ROCKET_FINI_BO DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_FINI_BO, struct drm_rocket_fini_bo)
/**
* struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -39,6 +43,36 @@ struct drm_rocket_create_bo {
__u64 offset;
};
+/**
+ * struct drm_rocket_prep_bo - ioctl argument for starting CPU ownership of the BO.
+ *
+ * Takes care of waiting for any NPU jobs that might still use the BO and performs cache
+ * synchronization.
+ */
+struct drm_rocket_prep_bo {
+ /** Input: GEM handle of the buffer object. */
+ __u32 handle;
+
+ /** Reserved, must be zero. */
+ __u32 reserved;
+
+ /** Input: Absolute timeout in nanoseconds (CLOCK_MONOTONIC) to wait for NPU jobs. */
+ __s64 timeout_ns;
+};
+
+/**
+ * struct drm_rocket_fini_bo - ioctl argument for finishing CPU ownership of the BO.
+ *
+ * Synchronize caches for NPU access.
+ */
+struct drm_rocket_fini_bo {
+ /** Input: GEM handle of the buffer object. */
+ __u32 handle;
+
+ /** Reserved, must be zero. */
+ __u32 reserved;
+};
+
/**
* struct drm_rocket_task - A task to be run on the NPU
*
--
2.50.0
* [PATCH v9 06/10] dt-bindings: npu: rockchip,rknn: Add bindings
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (3 preceding siblings ...)
2025-07-21 9:17 ` [PATCH v9 05/10] accel/rocket: Add IOCTLs for synchronizing memory accesses Tomeu Vizoso
@ 2025-07-21 9:17 ` Tomeu Vizoso
2025-07-21 15:00 ` Rob Herring (Arm)
2025-07-21 9:17 ` [PATCH v9 07/10] arm64: dts: rockchip: add pd_npu label for RK3588 power domains Tomeu Vizoso
` (6 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso,
Krzysztof Kozlowski
Add the bindings for the Neural Processing Unit IP from Rockchip.
v2:
- Adapt to new node structure (one node per core, each with its own
IOMMU)
- Several misc. fixes from Sebastian Reichel
v3:
- Split register block in its constituent subblocks, and only require
the ones that the kernel would ever use (Nicolas Frattaroli)
- Group supplies (Rob Herring)
- Explain the way in which the top core is special (Rob Herring)
v4:
- Change required node name to npu@ (Rob Herring and Krzysztof Kozlowski)
- Remove unneeded items: (Krzysztof Kozlowski)
- Fix use of minItems/maxItems (Krzysztof Kozlowski)
- Add reg-names to list of required properties (Krzysztof Kozlowski)
- Fix example (Krzysztof Kozlowski)
v5:
- Rename file to rockchip,rk3588-rknn-core.yaml (Krzysztof Kozlowski)
- Streamline compatible property (Krzysztof Kozlowski)
v6:
- Remove mention of NVDLA, as the hardware is only incidentally related
(Kever Yang)
- Mark pclk and npu clocks as required by all cores (Rob Herring)
v7:
- Remove allOf section, not needed now that all nodes require 4 clocks
(Heiko Stübner)
v8:
- Remove notion of top core (Robin Murphy)
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
.../bindings/npu/rockchip,rk3588-rknn-core.yaml | 112 +++++++++++++++++++++
1 file changed, 112 insertions(+)
diff --git a/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..caca2a4903cd1556226fd2bff6ea9a63dbd375c2
--- /dev/null
+++ b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
@@ -0,0 +1,112 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/npu/rockchip,rk3588-rknn-core.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Neural Processing Unit IP from Rockchip
+
+maintainers:
+ - Tomeu Vizoso <tomeu@tomeuvizoso.net>
+
+description:
+ Rockchip IP for accelerating inference of neural networks.
+
+ There is to be one node per NPU core in the SoC, and each core should reference all the
+ resources that it needs to function, such as clocks, power domains, and resets.
+
+properties:
+ $nodename:
+ pattern: '^npu@[a-f0-9]+$'
+
+ compatible:
+ enum:
+ - rockchip,rk3588-rknn-core
+
+ reg:
+ maxItems: 3
+
+ reg-names:
+ items:
+ - const: pc # Program Control-related registers
+ - const: cna # Convolution Neural Network Accelerator registers
+ - const: core # Main NPU core processing unit registers
+
+ clocks:
+ maxItems: 4
+
+ clock-names:
+ items:
+ - const: aclk
+ - const: hclk
+ - const: npu
+ - const: pclk
+
+ interrupts:
+ maxItems: 1
+
+ iommus:
+ maxItems: 1
+
+ npu-supply: true
+
+ power-domains:
+ maxItems: 1
+
+ resets:
+ maxItems: 2
+
+ reset-names:
+ items:
+ - const: srst_a
+ - const: srst_h
+
+ sram-supply: true
+
+required:
+ - compatible
+ - reg
+ - reg-names
+ - clocks
+ - clock-names
+ - interrupts
+ - iommus
+ - power-domains
+ - resets
+ - reset-names
+ - npu-supply
+ - sram-supply
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/rockchip,rk3588-cru.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/power/rk3588-power.h>
+ #include <dt-bindings/reset/rockchip,rk3588-cru.h>
+
+ bus {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ npu@fdab0000 {
+ compatible = "rockchip,rk3588-rknn-core";
+ reg = <0x0 0xfdab0000 0x0 0x1000>,
+ <0x0 0xfdab1000 0x0 0x1000>,
+ <0x0 0xfdab3000 0x0 0x1000>;
+ reg-names = "pc", "cna", "core";
+ clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>,
+ <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+ clock-names = "aclk", "hclk", "npu", "pclk";
+ interrupts = <GIC_SPI 110 IRQ_TYPE_LEVEL_HIGH 0>;
+ iommus = <&rknn_mmu_0>;
+ npu-supply = <&vdd_npu_s0>;
+ power-domains = <&power RK3588_PD_NPUTOP>;
+ resets = <&cru SRST_A_RKNN0>, <&cru SRST_H_RKNN0>;
+ reset-names = "srst_a", "srst_h";
+ sram-supply = <&vdd_npu_mem_s0>;
+ };
+ };
+...
--
2.50.0
* [PATCH v9 07/10] arm64: dts: rockchip: add pd_npu label for RK3588 power domains
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (4 preceding siblings ...)
2025-07-21 9:17 ` [PATCH v9 06/10] dt-bindings: npu: rockchip,rknn: Add bindings Tomeu Vizoso
@ 2025-07-21 9:17 ` Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 08/10] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base Tomeu Vizoso
` (5 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso
From: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
The NPU of the RK3588 has an external supply. This supply also affects
the power domain of the NPU, not just the NPU device nodes themselves.
Since correctly modelled boards will want the power domain to be aware
of the regulator so that it doesn't always have to be on, add a label to
the NPU power domain node so board files can reference it.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
index 70f03e68ba550d6b9142131dcca86e8ded36e2f1..1eddc69fd9c9ed95cdc810ba48d9683e3f82489a 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
@@ -841,7 +841,7 @@ power: power-controller {
status = "okay";
/* These power domains are grouped by VD_NPU */
- power-domain@RK3588_PD_NPU {
+ pd_npu: power-domain@RK3588_PD_NPU {
reg = <RK3588_PD_NPU>;
#power-domain-cells = <0>;
#address-cells = <1>;
--
2.50.0
* [PATCH v9 08/10] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (5 preceding siblings ...)
2025-07-21 9:17 ` [PATCH v9 07/10] arm64: dts: rockchip: add pd_npu label for RK3588 power domains Tomeu Vizoso
@ 2025-07-21 9:17 ` Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 09/10] arm64: dts: rockchip: Enable the NPU on quartzpro64 Tomeu Vizoso
` (4 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso
See Chapter 36 "RKNN" from the RK3588 TRM (Part 1).
The IP is divided into three cores, programmed independently. The first
core, though, is special, being able to delegate work to the other cores.
The IOMMU of the first core is also special in that it has two subunits
(read/write?) that need to be programmed in sync.
v2:
- Have one device for each NPU core (Sebastian Reichel)
- Have one device for each IOMMU (Sebastian Reichel)
- Correctly sort nodes (Diederik de Haas)
- Add rockchip,iommu compatible to IOMMU nodes (Sebastian Reichel)
v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
Frattaroli)
v4:
- Adapt to changes in bindings
v6:
- pclk and npu clocks are needed by all cores (Rob Herring)
v8:
- Remove notion of top core (Robin Murphy)
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
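The register layout of the nodes added below is regular, which the following sketch spells out as arithmetic. The base address, the per-core stride and the block offsets are read off the reg properties in this patch only; the names are hypothetical and the TRM may describe further blocks that are not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the reg properties in this patch (assumption: no gaps). */
#define SKETCH_NPU_BASE   0xfdab0000u	/* core 0, "pc" block */
#define SKETCH_NPU_STRIDE 0x00010000u	/* distance between consecutive cores */
#define SKETCH_NPU_CORES  3u

enum sketch_npu_block {
	SKETCH_OFF_PC   = 0x0000,	/* reg-names "pc" */
	SKETCH_OFF_CNA  = 0x1000,	/* reg-names "cna" */
	SKETCH_OFF_CORE = 0x3000,	/* reg-names "core" */
	SKETCH_OFF_MMU  = 0xa000	/* IOMMU block present on every core */
};

/* Physical address of a register block for the given core index. */
uint32_t sketch_npu_reg(unsigned int core, enum sketch_npu_block block)
{
	assert(core < SKETCH_NPU_CORES);
	return SKETCH_NPU_BASE + core * SKETCH_NPU_STRIDE + (uint32_t)block;
}
```

Core 0 additionally lists a second IOMMU block at offset 0x9000, which the sketch leaves out because the other two cores do not have it.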
arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 91 +++++++++++++++++++++++++++
1 file changed, 91 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
index 1eddc69fd9c9ed95cdc810ba48d9683e3f82489a..a18aa1e6c3f1cd92fe26d657bf26784dc1f84127 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
@@ -1140,6 +1140,97 @@ power-domain@RK3588_PD_SDMMC {
};
};
+ rknn_core_0: npu@fdab0000 {
+ compatible = "rockchip,rk3588-rknn-core";
+ reg = <0x0 0xfdab0000 0x0 0x1000>,
+ <0x0 0xfdab1000 0x0 0x1000>,
+ <0x0 0xfdab3000 0x0 0x1000>;
+ reg-names = "pc", "cna", "core";
+ interrupts = <GIC_SPI 110 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>,
+ <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+ clock-names = "aclk", "hclk", "npu", "pclk";
+ assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+ assigned-clock-rates = <200000000>;
+ resets = <&cru SRST_A_RKNN0>, <&cru SRST_H_RKNN0>;
+ reset-names = "srst_a", "srst_h";
+ power-domains = <&power RK3588_PD_NPUTOP>;
+ iommus = <&rknn_mmu_0>;
+ status = "disabled";
+ };
+
+ rknn_mmu_0: iommu@fdab9000 {
+ compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+ reg = <0x0 0xfdab9000 0x0 0x100>,
+ <0x0 0xfdaba000 0x0 0x100>;
+ interrupts = <GIC_SPI 110 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>;
+ clock-names = "aclk", "iface";
+ #iommu-cells = <0>;
+ power-domains = <&power RK3588_PD_NPUTOP>;
+ status = "disabled";
+ };
+
+ rknn_core_1: npu@fdac0000 {
+ compatible = "rockchip,rk3588-rknn-core";
+ reg = <0x0 0xfdac0000 0x0 0x1000>,
+ <0x0 0xfdac1000 0x0 0x1000>,
+ <0x0 0xfdac3000 0x0 0x1000>;
+ reg-names = "pc", "cna", "core";
+ interrupts = <GIC_SPI 111 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>,
+ <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+ clock-names = "aclk", "hclk", "npu", "pclk";
+ assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+ assigned-clock-rates = <200000000>;
+ resets = <&cru SRST_A_RKNN1>, <&cru SRST_H_RKNN1>;
+ reset-names = "srst_a", "srst_h";
+ power-domains = <&power RK3588_PD_NPU1>;
+ iommus = <&rknn_mmu_1>;
+ status = "disabled";
+ };
+
+ rknn_mmu_1: iommu@fdaca000 {
+ compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+ reg = <0x0 0xfdaca000 0x0 0x100>;
+ interrupts = <GIC_SPI 111 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>;
+ clock-names = "aclk", "iface";
+ #iommu-cells = <0>;
+ power-domains = <&power RK3588_PD_NPU1>;
+ status = "disabled";
+ };
+
+ rknn_core_2: npu@fdad0000 {
+ compatible = "rockchip,rk3588-rknn-core";
+ reg = <0x0 0xfdad0000 0x0 0x1000>,
+ <0x0 0xfdad1000 0x0 0x1000>,
+ <0x0 0xfdad3000 0x0 0x1000>;
+ reg-names = "pc", "cna", "core";
+ interrupts = <GIC_SPI 112 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_NPU2>, <&cru HCLK_NPU2>,
+ <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+ clock-names = "aclk", "hclk", "npu", "pclk";
+ assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+ assigned-clock-rates = <200000000>;
+ resets = <&cru SRST_A_RKNN2>, <&cru SRST_H_RKNN2>;
+ reset-names = "srst_a", "srst_h";
+ power-domains = <&power RK3588_PD_NPU2>;
+ iommus = <&rknn_mmu_2>;
+ status = "disabled";
+ };
+
+ rknn_mmu_2: iommu@fdada000 {
+ compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+ reg = <0x0 0xfdada000 0x0 0x100>;
+ interrupts = <GIC_SPI 112 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&cru ACLK_NPU2>, <&cru HCLK_NPU2>;
+ clock-names = "aclk", "iface";
+ #iommu-cells = <0>;
+ power-domains = <&power RK3588_PD_NPU2>;
+ status = "disabled";
+ };
+
vpu121: video-codec@fdb50000 {
compatible = "rockchip,rk3588-vpu121", "rockchip,rk3568-vpu";
reg = <0x0 0xfdb50000 0x0 0x800>;
--
2.50.0
* [PATCH v9 09/10] arm64: dts: rockchip: Enable the NPU on quartzpro64
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (6 preceding siblings ...)
2025-07-21 9:17 ` [PATCH v9 08/10] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base Tomeu Vizoso
@ 2025-07-21 9:17 ` Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 10/10] arm64: dts: rockchip: enable NPU on ROCK 5B Tomeu Vizoso
` (3 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso
Enable the nodes added in a previous commit to the rk3588-base device tree.
v2:
- Split nodes (Sebastian Reichel)
- Sort nodes (Sebastian Reichel)
- Add board regulators (Sebastian Reichel)
v8:
- Remove notion of top core (Robin Murphy)
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
.../arm64/boot/dts/rockchip/rk3588-quartzpro64.dts | 30 ++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
index 78aaa6635b5d20a650aba8d8c2d0d4f498ff0d33..b2336c36da01af3b67fe347d5ff0b7c4ee6b0556 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
@@ -415,6 +415,36 @@ &pcie3x4 {
status = "okay";
};
+&rknn_core_0 {
+ npu-supply = <&vdd_npu_s0>;
+ sram-supply = <&vdd_npu_mem_s0>;
+ status = "okay";
+};
+
+&rknn_core_1 {
+ npu-supply = <&vdd_npu_s0>;
+ sram-supply = <&vdd_npu_mem_s0>;
+ status = "okay";
+};
+
+&rknn_core_2 {
+ npu-supply = <&vdd_npu_s0>;
+ sram-supply = <&vdd_npu_mem_s0>;
+ status = "okay";
+};
+
+&rknn_mmu_0 {
+ status = "okay";
+};
+
+&rknn_mmu_1 {
+ status = "okay";
+};
+
+&rknn_mmu_2 {
+ status = "okay";
+};
+
&saradc {
vref-supply = <&vcc_1v8_s0>;
status = "okay";
--
2.50.0
* [PATCH v9 10/10] arm64: dts: rockchip: enable NPU on ROCK 5B
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (7 preceding siblings ...)
2025-07-21 9:17 ` [PATCH v9 09/10] arm64: dts: rockchip: Enable the NPU on quartzpro64 Tomeu Vizoso
@ 2025-07-21 9:17 ` Tomeu Vizoso
2025-07-21 14:55 ` [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Jeff Hugo
` (2 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Tomeu Vizoso @ 2025-07-21 9:17 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Tomeu Vizoso
From: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
The NPU on the ROCK 5B uses the same regulator for both the sram-supply
and the NPU's supply. Add this regulator, and enable all the NPU bits.
Also add the regulator as a domain-supply to the pd_npu power domain.
v8:
- Remove notion of top core (Robin Murphy)
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
---
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtsi | 57 ++++++++++++++++++++++++
1 file changed, 57 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtsi
index 6052787d2560978d2bae6cfbeea5fc1d419d583a..a1f3571b177fe00b1c169f62b7dd1d27024a663f 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtsi
@@ -309,6 +309,29 @@ regulator-state-mem {
};
};
+&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1m2_xfer>;
+ status = "okay";
+
+ vdd_npu_s0: regulator@42 {
+ compatible = "rockchip,rk8602";
+ reg = <0x42>;
+ fcs,suspend-voltage-selector = <1>;
+ regulator-name = "vdd_npu_s0";
+ regulator-boot-on;
+ regulator-enable-ramp-delay = <500>;
+ regulator-min-microvolt = <550000>;
+ regulator-max-microvolt = <950000>;
+ regulator-ramp-delay = <2300>;
+ vin-supply = <&vcc5v0_sys>;
+
+ regulator-state-mem {
+ regulator-off-in-suspend;
+ };
+ };
+};
+
&i2c6 {
status = "okay";
@@ -433,6 +456,10 @@ &pd_gpu {
domain-supply = <&vdd_gpu_s0>;
};
+&pd_npu {
+ domain-supply = <&vdd_npu_s0>;
+};
+
&pinctrl {
hdmirx {
hdmirx_hpd: hdmirx-5v-detection {
@@ -487,6 +514,36 @@ &pwm1 {
status = "okay";
};
+&rknn_core_0 {
+ npu-supply = <&vdd_npu_s0>;
+ sram-supply = <&vdd_npu_s0>;
+ status = "okay";
+};
+
+&rknn_core_1 {
+ npu-supply = <&vdd_npu_s0>;
+ sram-supply = <&vdd_npu_s0>;
+ status = "okay";
+};
+
+&rknn_core_2 {
+ npu-supply = <&vdd_npu_s0>;
+ sram-supply = <&vdd_npu_s0>;
+ status = "okay";
+};
+
+&rknn_mmu_0 {
+ status = "okay";
+};
+
+&rknn_mmu_1 {
+ status = "okay";
+};
+
+&rknn_mmu_2 {
+ status = "okay";
+};
+
&saradc {
vref-supply = <&avcc_1v8_s0>;
status = "okay";
--
2.50.0
* Re: [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (8 preceding siblings ...)
2025-07-21 9:17 ` [PATCH v9 10/10] arm64: dts: rockchip: enable NPU on ROCK 5B Tomeu Vizoso
@ 2025-07-21 14:55 ` Jeff Hugo
2025-07-21 15:24 ` Heiko Stübner
2025-07-25 16:14 ` Jeff Hugo
2025-08-11 7:52 ` (subset) " Heiko Stuebner
11 siblings, 1 reply; 16+ messages in thread
From: Jeff Hugo @ 2025-07-21 14:55 UTC (permalink / raw)
To: Tomeu Vizoso, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Heiko Stuebner, Oded Gabbay, Jonathan Corbet, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, Sebastian Reichel,
Nicolas Frattaroli, Kever Yang, Robin Murphy, Daniel Stone,
Da Xue, Philipp Zabel
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Robert Foss,
Krzysztof Kozlowski
On 7/21/2025 3:17 AM, Tomeu Vizoso wrote:
> This series adds a new driver for the NPU that Rockchip includes in its
> newer SoCs, developed by them on the NVDLA base.
>
> In its current form, it supports the specific NPU in the RK3588 SoC.
>
> The userspace driver is part of Mesa and an initial draft can be found at:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
>
> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
This (and the userspace component) appear ready for merge from what I
can tell. Tomeu is still working on his drm-misc access so I've offered
to merge on his behalf. Planning on waiting until Friday for any final
feedback to come in before doing so.
-Jeff
* Re: [PATCH v9 06/10] dt-bindings: npu: rockchip,rknn: Add bindings
2025-07-21 9:17 ` [PATCH v9 06/10] dt-bindings: npu: rockchip,rknn: Add bindings Tomeu Vizoso
@ 2025-07-21 15:00 ` Rob Herring (Arm)
0 siblings, 0 replies; 16+ messages in thread
From: Rob Herring (Arm) @ 2025-07-21 15:00 UTC (permalink / raw)
To: Tomeu Vizoso
Cc: Sumit Semwal, Nicolas Frattaroli, Krzysztof Kozlowski,
Jonathan Corbet, linux-rockchip, linaro-mm-sig, Thomas Zimmermann,
Robin Murphy, Conor Dooley, linux-arm-kernel, Simona Vetter,
David Airlie, Maarten Lankhorst, linux-kernel, Sebastian Reichel,
linux-media, Oded Gabbay, Kever Yang, Christian König,
devicetree, Philipp Zabel, linux-doc, Heiko Stuebner,
Daniel Stone, Da Xue, Krzysztof Kozlowski, dri-devel, Jeff Hugo,
Maxime Ripard
On Mon, 21 Jul 2025 11:17:33 +0200, Tomeu Vizoso wrote:
> Add the bindings for the Neural Processing Unit IP from Rockchip.
>
> v2:
> - Adapt to new node structure (one node per core, each with its own
> IOMMU)
> - Several misc. fixes from Sebastian Reichel
>
> v3:
> - Split register block in its constituent subblocks, and only require
> the ones that the kernel would ever use (Nicolas Frattaroli)
> - Group supplies (Rob Herring)
> - Explain the way in which the top core is special (Rob Herring)
>
> v4:
> - Change required node name to npu@ (Rob Herring and Krzysztof Kozlowski)
> - Remove unneeded items: (Krzysztof Kozlowski)
> - Fix use of minItems/maxItems (Krzysztof Kozlowski)
> - Add reg-names to list of required properties (Krzysztof Kozlowski)
> - Fix example (Krzysztof Kozlowski)
>
> v5:
> - Rename file to rockchip,rk3588-rknn-core.yaml (Krzysztof Kozlowski)
> - Streamline compatible property (Krzysztof Kozlowski)
>
> v6:
> - Remove mention of NVDLA, as the hardware is only incidentally related
> (Kever Yang)
> - Mark pclk and npu clocks as required by all cores (Rob Herring)
>
> v7:
> - Remove allOf section, not needed now that all nodes require 4 clocks
> (Heiko Stübner)
>
> v8:
> - Remove notion of top core (Robin Murphy)
>
> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> Tested-by: Heiko Stuebner <heiko@sntech.de>
> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
> ---
> .../bindings/npu/rockchip,rk3588-rknn-core.yaml | 112 +++++++++++++++++++++
> 1 file changed, 112 insertions(+)
>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
* Re: [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU
2025-07-21 14:55 ` [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Jeff Hugo
@ 2025-07-21 15:24 ` Heiko Stübner
2025-07-21 15:52 ` Jeff Hugo
0 siblings, 1 reply; 16+ messages in thread
From: Heiko Stübner @ 2025-07-21 15:24 UTC (permalink / raw)
To: Tomeu Vizoso, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Robert Foss,
Krzysztof Kozlowski
Hi Jeff,
On Monday, 21 July 2025, 16:55:01 Central European Summer Time, Jeff Hugo wrote:
> On 7/21/2025 3:17 AM, Tomeu Vizoso wrote:
> > This series adds a new driver for the NPU that Rockchip includes in its
> > newer SoCs, developed by them on the NVDLA base.
> >
> > In its current form, it supports the specific NPU in the RK3588 SoC.
> >
> > The userspace driver is part of Mesa and an initial draft can be found at:
> >
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
> >
> > Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
>
> This (and the userspace component) appear ready for merge from what I
> can tell. Tomeu is still working on his drm-misc access so I've offered
> to merge on his behalf. Planning on waiting until Friday for any final
> feedback to come in before doing so.
sounds great.
Just to make sure, you're planning to merge patches 1-6 (driver + binding)
into drm-misc and I'll pick up the "arm64: dts: " patches 7-10 afterwards?
Heiko
* Re: [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU
2025-07-21 15:24 ` Heiko Stübner
@ 2025-07-21 15:52 ` Jeff Hugo
0 siblings, 0 replies; 16+ messages in thread
From: Jeff Hugo @ 2025-07-21 15:52 UTC (permalink / raw)
To: Heiko Stübner, Tomeu Vizoso, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Oded Gabbay, Jonathan Corbet,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Sumit Semwal, Christian König,
Sebastian Reichel, Nicolas Frattaroli, Kever Yang, Robin Murphy,
Daniel Stone, Da Xue, Philipp Zabel
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Robert Foss,
Krzysztof Kozlowski
On 7/21/2025 9:24 AM, Heiko Stübner wrote:
> Hi Jeff,
>
> On Monday, 21 July 2025, 16:55:01 Central European Summer Time, Jeff Hugo wrote:
>> On 7/21/2025 3:17 AM, Tomeu Vizoso wrote:
>>> This series adds a new driver for the NPU that Rockchip includes in its
>>> newer SoCs, developed by them on the NVDLA base.
>>>
>>> In its current form, it supports the specific NPU in the RK3588 SoC.
>>>
>>> The userspace driver is part of Mesa and an initial draft can be found at:
>>>
>>> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
>>>
>>> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
>>
>> This (and the userspace component) appear ready for merge from what I
>> can tell. Tomeu is still working on his drm-misc access so I've offered
>> to merge on his behalf. Planning on waiting until Friday for any final
>> feedback to come in before doing so.
>
> sounds great.
>
> Just to make sure, you're planning to merge patches 1-6 (driver + binding)
> into drm-misc and I'll pick up the "arm64: dts: " patches 7-10 afterwards?
That works for me. I'll plan on merging 1-6 and leaving 7-10 for you.
-Jeff
* Re: [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (9 preceding siblings ...)
2025-07-21 14:55 ` [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Jeff Hugo
@ 2025-07-25 16:14 ` Jeff Hugo
2025-08-11 7:52 ` (subset) " Heiko Stuebner
11 siblings, 0 replies; 16+ messages in thread
From: Jeff Hugo @ 2025-07-25 16:14 UTC (permalink / raw)
To: Tomeu Vizoso, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Heiko Stuebner, Oded Gabbay, Jonathan Corbet, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, Sebastian Reichel,
Nicolas Frattaroli, Kever Yang, Robin Murphy, Daniel Stone,
Da Xue, Philipp Zabel
Cc: devicetree, linux-arm-kernel, linux-rockchip, linux-kernel,
dri-devel, linux-doc, linux-media, linaro-mm-sig, Robert Foss,
Krzysztof Kozlowski
On 7/21/2025 3:17 AM, Tomeu Vizoso wrote:
> This series adds a new driver for the NPU that Rockchip includes in its
> newer SoCs, developed by them on the NVDLA base.
>
> In its current form, it supports the specific NPU in the RK3588 SoC.
>
> The userspace driver is part of Mesa and an initial draft can be found at:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
>
> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Patches 1-6 pushed to drm-misc-next.
-Jeff
* Re: (subset) [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
` (10 preceding siblings ...)
2025-07-25 16:14 ` Jeff Hugo
@ 2025-08-11 7:52 ` Heiko Stuebner
11 siblings, 0 replies; 16+ messages in thread
From: Heiko Stuebner @ 2025-08-11 7:52 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Oded Gabbay,
Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, Sebastian Reichel, Nicolas Frattaroli,
Kever Yang, Robin Murphy, Daniel Stone, Da Xue, Philipp Zabel,
Jeff Hugo, Tomeu Vizoso
Cc: Heiko Stuebner, devicetree, linux-arm-kernel, linux-rockchip,
linux-kernel, dri-devel, linux-doc, linux-media, linaro-mm-sig,
Robert Foss, Krzysztof Kozlowski
On Mon, 21 Jul 2025 11:17:27 +0200, Tomeu Vizoso wrote:
> This series adds a new driver for the NPU that Rockchip includes in its
> newer SoCs, developed by them on the NVDLA base.
>
> In its current form, it supports the specific NPU in the RK3588 SoC.
>
> The userspace driver is part of Mesa and an initial draft can be found at:
>
> [...]
Applied, thanks!
[07/10] arm64: dts: rockchip: add pd_npu label for RK3588 power domains
commit: 6d64bceb97a1c93b3cc2131f7e023ef2f9cf33f2
[08/10] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base
commit: a31dfc060a747f08705ace36d8de006bc13318fa
[09/10] arm64: dts: rockchip: Enable the NPU on quartzpro64
commit: 640366d644b1e282771a09c72be37162b6eda438
[10/10] arm64: dts: rockchip: enable NPU on ROCK 5B
commit: 3af6a83fc85033e44ce5cd0e1de54dc20b7e15af
Best regards,
--
Heiko Stuebner <heiko@sntech.de>
end of thread, other threads:[~2025-08-11 8:21 UTC | newest]
Thread overview: 16+ messages
2025-07-21 9:17 [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 02/10] accel/rocket: Add a new driver for Rockchip's NPU Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 03/10] accel/rocket: Add IOCTL for BO creation Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 04/10] accel/rocket: Add job submission IOCTL Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 05/10] accel/rocket: Add IOCTLs for synchronizing memory accesses Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 06/10] dt-bindings: npu: rockchip,rknn: Add bindings Tomeu Vizoso
2025-07-21 15:00 ` Rob Herring (Arm)
2025-07-21 9:17 ` [PATCH v9 07/10] arm64: dts: rockchip: add pd_npu label for RK3588 power domains Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 08/10] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 09/10] arm64: dts: rockchip: Enable the NPU on quartzpro64 Tomeu Vizoso
2025-07-21 9:17 ` [PATCH v9 10/10] arm64: dts: rockchip: enable NPU on ROCK 5B Tomeu Vizoso
2025-07-21 14:55 ` [PATCH v9 00/10] New DRM accel driver for Rockchip's RKNN NPU Jeff Hugo
2025-07-21 15:24 ` Heiko Stübner
2025-07-21 15:52 ` Jeff Hugo
2025-07-25 16:14 ` Jeff Hugo
2025-08-11 7:52 ` (subset) " Heiko Stuebner