Linux-ARM-Kernel Archive on lore.kernel.org
* [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
@ 2026-06-04 13:52 Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 1/9] accel: rocket: Introduce per-SoC rocket_soc_data Midgy BALON
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

RFC, not for merge. End-to-end inference does not produce correct output
yet (see Status), so per the v2 discussion this is a request for design
feedback. It now probes, attaches, and submits cleanly on a stock
v7.1-rc6 tree; what remains is one hardware-internal issue.

The RK3568 has a single NVDLA-derived NPU core, the same IP family as the
RK3588 NPU the driver already supports; the register layout matches. The
RK3568 differences are a 32-bit NPU AXI/IOMMU (vs 40-bit) and explicit
PVTPLL/PMU bring-up to power and de-idle the NPU before it is reachable.

Patches:
  1-2  rocket: per-SoC data struct, then derive DMA width and core count
       from match data (refactors, no functional change).
  3    rocket: RK3568 SoC data + PVTPLL/PMU/NOC bring-up.
  4    rocket: reset the NPU before detaching the IOMMU on a job timeout
       (the detach otherwise stalls a wedged AXI master and WARNs).
  5    rocket: keep the IOMMU domain attached across jobs instead of
       re-attaching per job (the per-job rk_iommu handshake on the idle
       NPU MMU is slow and noisy).
  6    iommu/rockchip: clear AUTO_GATING bit 1 on the RK356x v1 IOMMU so
       the page-walker keeps its clock (else a TLB-miss walk never
       completes).
  7    dt-bindings: add the RK3568 NPU compatible.
  8-9  arm64 dts: add the NPU and its IOMMU, and enable them on ROCK 3B.

Dependency. The NPU MMU is rockchip-iommu v1 (32-bit) while the rest of
the RK3568 uses v2 (40-bit). They cannot coexist until the driver carries
per-device ops; this series is developed on top of Simon Xue's
"iommu/rockchip: Drop global rk_ops in favor of per-device ops" [1].
Without it the NPU IOMMU fails to probe on a full RK3568 boot.

Power bring-up. The NPU is brought up through the power-domain layer (no
driver hack): the NPU power-domain keeps its clocks but drops the pm_qos
phandle (qos_npu sits behind the gated NPU NoC, so genpd's power-off QoS
save faults reading it), and vdd_npu is marked always-on so the rail is
up before genpd de-idles the NoC at power-on. The PMU de-idle then ACKs
without PVTPLL running; PVTPLL is only needed for compute.

Status. On v7.1-rc6 the driver probes, creates /dev/accel/accel0,
attaches an IOMMU domain, and submits jobs; the program controller
fetches and broadcasts the command list. Inference output is still wrong,
and the cause is split across three layers:
  - kernel (this series): the RK3568 differences appear handled;
  - mesa/Teflon userspace: still emits RK3588-tuned config, wrong for
    RK3568 (to be filed separately on mesa-dev);
  - hardware: with corrected config the NPU's DMA reads the full input
    and weight tensors (confirmed via its DMA bandwidth counters), but
    the MAC/output stage never completes, the job times out, and the
    output stays at the buffer's zero-point. I have not found the missing
    step; it is not in the command list (replaying the vendor's
    byte-exact command list behaves the same). Pointers welcome,
    especially from anyone with RK3568 NPU experience.

Known residual. On the first IOMMU attach the NPU MMU is idle with paging
already enabled; the rk_iommu stall/reset handshake does not complete in
that state and logs one burst of timeouts before the (kept) domain
settles. It is harmless here because the job times out regardless, but it
points at an idle-MMU reconfiguration corner the rk_iommu code does not
handle on this block.

[1] https://lore.kernel.org/linux-rockchip/20260310105303.128859-1-xxm@rock-chips.com/

Changes since v2:
  - Tagged RFC; now tested on a stock v7.1-rc6 tree.
  - Bring-up moved into the power-domain/DT layer (no initcall hack).
  - Added the IOMMU detach-on-timeout and attach-once driver fixes.
  - Split the driver patch (Heiko): soc_data / match-data / RK3568.
  - Derive DMA width and core count from match data; drop the DT rescans.
  - Binding describes the hardware; added the missing $ref on rockchip,pmu.
  - Disclosed the per-device-ops IOMMU dependency.

Midgy BALON (9):
  accel: rocket: Introduce per-SoC rocket_soc_data
  accel: rocket: Derive DMA width and core count from match data
  accel: rocket: Add RK3568 SoC support
  accel: rocket: Reset the NPU before detaching the IOMMU on timeout
  accel: rocket: Keep the IOMMU domain attached across jobs
  iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
  dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568
  arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU
  arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU

 .../npu/rockchip,rk3588-rknn-core.yaml        | 18 ++++-
 .../boot/dts/rockchip/rk3568-rock-3b.dts      | 14 +++-
 arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 38 +++++++++++
 drivers/accel/rocket/rocket_core.c            | 22 ++++++-
 drivers/accel/rocket/rocket_core.h            | 19 ++++++
 drivers/accel/rocket/rocket_device.c          | 15 ++---
 drivers/accel/rocket/rocket_device.h          |  3 +-
 drivers/accel/rocket/rocket_drv.c             | 66 ++++++++++++++++++-
 drivers/accel/rocket/rocket_job.c             | 35 ++++++++--
 drivers/iommu/rockchip-iommu.c                | 12 ++++
 10 files changed, 219 insertions(+), 23 deletions(-)


base-commit: 52c800fdcf11888ebeb50c3d707f782cc15b66eb
-- 
2.39.5




* [RFC PATCH v3 1/9] accel: rocket: Introduce per-SoC rocket_soc_data
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 2/9] accel: rocket: Derive DMA width and core count from match data Midgy BALON
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

Add a per-SoC data structure carried in the OF match table, currently
holding only the NPU AXI address width, and use it for the per-core DMA
mask instead of a hardcoded 40-bit value.  No functional change: the
RK3588 AXI master is 40-bit.  This prepares for SoCs with a narrower
address width.
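
For readers unfamiliar with the mask arithmetic, here is a small userspace
sketch of what the address width implies. The macro is equivalent to the
long-standing kernel definition of DMA_BIT_MASK(); dma_addr_reachable() is
a hypothetical helper that exists only for this illustration and is not
part of the driver.

```c
#include <stdint.h>

/* Equivalent to the long-standing kernel definition of DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper (illustration only): returns non-zero when addr
 * fits within an AXI master that can drive dma_bits address bits.
 */
static inline int dma_addr_reachable(uint64_t addr, unsigned int dma_bits)
{
	return (addr & ~DMA_BIT_MASK(dma_bits)) == 0;
}
```

With dma_bits = 40 an address at the 4 GiB boundary is reachable; with
dma_bits = 32 it is not, which is why a narrower SoC needs its own value.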

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 drivers/accel/rocket/rocket_core.c |  7 ++++++-
 drivers/accel/rocket/rocket_core.h | 11 +++++++++++
 drivers/accel/rocket/rocket_drv.c  |  6 +++++-
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c
index b3b2fa9ba645a..09c445af7de73 100644
--- a/drivers/accel/rocket/rocket_core.c
+++ b/drivers/accel/rocket/rocket_core.c
@@ -7,6 +7,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/iommu.h>
+#include <linux/of.h>
 #include <linux/platform_device.h>
 #include <linux/pm_runtime.h>
 #include <linux/reset.h>
@@ -21,6 +22,10 @@ int rocket_core_init(struct rocket_core *core)
 	u32 version;
 	int err = 0;
 
+	core->soc_data = of_device_get_match_data(dev);
+	if (!core->soc_data)
+		return dev_err_probe(dev, -EINVAL, "missing SoC match data\n");
+
 	core->resets[0].id = "srst_a";
 	core->resets[1].id = "srst_h";
 	err = devm_reset_control_bulk_get_exclusive(&pdev->dev, ARRAY_SIZE(core->resets),
@@ -52,7 +57,7 @@ int rocket_core_init(struct rocket_core *core)
 
 	dma_set_max_seg_size(dev, UINT_MAX);
 
-	err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
+	err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(core->soc_data->dma_bits));
 	if (err)
 		return err;
 
diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index f6d7382854ca9..8ee105a0be40e 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -12,6 +12,16 @@
 
 #include "rocket_registers.h"
 
+struct rocket_core;
+
+/**
+ * struct rocket_soc_data - per-SoC configuration data
+ * @dma_bits: Physical address width reachable by the NPU's AXI master.
+ */
+struct rocket_soc_data {
+	unsigned int dma_bits;
+};
+
 #define rocket_pc_readl(core, reg) \
 	readl((core)->pc_iomem + (REG_PC_##reg))
 #define rocket_pc_writel(core, reg, value) \
@@ -31,6 +41,7 @@ struct rocket_core {
 	struct device *dev;
 	struct rocket_device *rdev;
 	unsigned int index;
+	const struct rocket_soc_data *soc_data;
 
 	int irq;
 	void __iomem *pc_iomem;
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index 8bbbce594883e..384c38e13acce 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -213,8 +213,12 @@ static void rocket_remove(struct platform_device *pdev)
 	}
 }
 
+static const struct rocket_soc_data rk3588_soc_data = {
+	.dma_bits = 40,
+};
+
 static const struct of_device_id dt_match[] = {
-	{ .compatible = "rockchip,rk3588-rknn-core" },
+	{ .compatible = "rockchip,rk3588-rknn-core", .data = &rk3588_soc_data },
 	{}
 };
 MODULE_DEVICE_TABLE(of, dt_match);
-- 
2.39.5




* [RFC PATCH v3 2/9] accel: rocket: Derive DMA width and core count from match data
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 1/9] accel: rocket: Introduce per-SoC rocket_soc_data Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 3/9] accel: rocket: Add RK3568 SoC support Midgy BALON
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

The probe already has the per-SoC match data, which now records the core
count and DMA width.  Use it for the cores array allocation and the
device DMA mask instead of re-scanning the device tree for available core
nodes.
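
As a rough userspace model of the lookup this relies on: the kernel walks
the OF match table and returns the .data pointer of the entry whose
compatible string matches. The struct and function names below
(soc_data, match_entry, lookup_match_data) are assumptions made for the
sketch; the driver itself uses of_device_get_match_data().

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-ins for rocket_soc_data and of_device_id. */
struct soc_data {
	unsigned int num_cores;
	unsigned int dma_bits;
};

struct match_entry {
	const char *compatible;
	const struct soc_data *data;
};

static const struct soc_data rk3588_data = { .num_cores = 3, .dma_bits = 40 };

/* The table ends with an empty sentinel entry, like an of_device_id array. */
static const struct match_entry match_table[] = {
	{ "rockchip,rk3588-rknn-core", &rk3588_data },
	{ NULL, NULL },
};

/* Return the per-SoC data for a compatible string, or NULL if unknown. */
static inline const struct soc_data *lookup_match_data(const char *compatible)
{
	for (const struct match_entry *e = match_table; e->compatible; e++)
		if (strcmp(e->compatible, compatible) == 0)
			return e->data;
	return NULL;
}
```

A NULL result corresponds to the -EINVAL path in the probe: without match
data there is no core count to size the array with.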

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 drivers/accel/rocket/rocket_core.h   |  2 ++
 drivers/accel/rocket/rocket_device.c | 15 +++++----------
 drivers/accel/rocket/rocket_device.h |  3 ++-
 drivers/accel/rocket/rocket_drv.c    |  7 ++++++-
 4 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index 8ee105a0be40e..d6421251670dc 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -16,9 +16,11 @@ struct rocket_core;
 
 /**
  * struct rocket_soc_data - per-SoC configuration data
+ * @num_cores: Number of NPU cores in this SoC.
  * @dma_bits: Physical address width reachable by the NPU's AXI master.
  */
 struct rocket_soc_data {
+	unsigned int num_cores;
 	unsigned int dma_bits;
 };
 
diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c
index 46e6ee1e72c5f..6186f4faa3a2a 100644
--- a/drivers/accel/rocket/rocket_device.c
+++ b/drivers/accel/rocket/rocket_device.c
@@ -6,18 +6,16 @@
 #include <linux/clk.h>
 #include <linux/dma-mapping.h>
 #include <linux/platform_device.h>
-#include <linux/of.h>
 
 #include "rocket_device.h"
 
 struct rocket_device *rocket_device_init(struct platform_device *pdev,
-					 const struct drm_driver *rocket_drm_driver)
+					 const struct drm_driver *rocket_drm_driver,
+					 const struct rocket_soc_data *soc_data)
 {
 	struct device *dev = &pdev->dev;
-	struct device_node *core_node;
 	struct rocket_device *rdev;
 	struct drm_device *ddev;
-	unsigned int num_cores = 0;
 	int err;
 
 	rdev = devm_drm_dev_alloc(dev, rocket_drm_driver, struct rocket_device, ddev);
@@ -27,17 +25,14 @@ struct rocket_device *rocket_device_init(struct platform_device *pdev,
 	ddev = &rdev->ddev;
 	dev_set_drvdata(dev, rdev);
 
-	for_each_compatible_node(core_node, NULL, "rockchip,rk3588-rknn-core")
-		if (of_device_is_available(core_node))
-			num_cores++;
-
-	rdev->cores = devm_kcalloc(dev, num_cores, sizeof(*rdev->cores), GFP_KERNEL);
+	rdev->cores = devm_kcalloc(dev, soc_data->num_cores, sizeof(*rdev->cores),
+				   GFP_KERNEL);
 	if (!rdev->cores)
 		return ERR_PTR(-ENOMEM);
 
 	dma_set_max_seg_size(dev, UINT_MAX);
 
-	err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
+	err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(soc_data->dma_bits));
 	if (err)
 		return ERR_PTR(err);
 
diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h
index ce662abc01d3d..2f74e078974e3 100644
--- a/drivers/accel/rocket/rocket_device.h
+++ b/drivers/accel/rocket/rocket_device.h
@@ -22,7 +22,8 @@ struct rocket_device {
 };
 
 struct rocket_device *rocket_device_init(struct platform_device *pdev,
-					 const struct drm_driver *rocket_drm_driver);
+					 const struct drm_driver *rocket_drm_driver,
+					 const struct rocket_soc_data *soc_data);
 void rocket_device_fini(struct rocket_device *rdev);
 #define to_rocket_device(drm_dev) \
 	((struct rocket_device *)(container_of((drm_dev), struct rocket_device, ddev)))
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index 384c38e13acce..c18840e5aff76 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -159,11 +159,15 @@ static const struct drm_driver rocket_drm_driver = {
 
 static int rocket_probe(struct platform_device *pdev)
 {
+	const struct rocket_soc_data *soc_data = of_device_get_match_data(&pdev->dev);
 	int ret;
 
+	if (!soc_data)
+		return -EINVAL;
+
 	if (rdev == NULL) {
 		/* First core probing, initialize DRM device. */
-		rdev = rocket_device_init(drm_dev, &rocket_drm_driver);
+		rdev = rocket_device_init(drm_dev, &rocket_drm_driver, soc_data);
 		if (IS_ERR(rdev)) {
 			dev_err(&pdev->dev, "failed to initialize rocket device\n");
 			return PTR_ERR(rdev);
@@ -214,6 +218,7 @@ static void rocket_remove(struct platform_device *pdev)
 }
 
 static const struct rocket_soc_data rk3588_soc_data = {
+	.num_cores = 3,
 	.dma_bits = 40,
 };
 
-- 
2.39.5




* [RFC PATCH v3 3/9] accel: rocket: Add RK3568 SoC support
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 1/9] accel: rocket: Introduce per-SoC rocket_soc_data Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 2/9] accel: rocket: Derive DMA width and core count from match data Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 4/9] accel: rocket: Reset the NPU before detaching the IOMMU on timeout Midgy BALON
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

The RK3568 has a single core of the same NVDLA-derived NPU IP as the
RK3588, with a 32-bit AXI master.  Unlike the RK3588 it must be powered
on and de-idled through the PMU, and its PVTPLL clock started via SCMI,
before the NPU is reachable.  Add rk3568_soc_data with an noc_init
callback performing this bring-up.
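
The PMU registers touched by the callback use a write-mask layout (the
upper 16 bits select which of the lower 16 bits a write may change). The
following userspace sketch models that protocol. hiword_update() and
hiword_apply() are hypothetical names for this illustration, and
hiword_apply() is only a software model of the register behaviour, not
anything read from hardware.

```c
#include <stdint.h>

#define BIT(n) (1U << (n))

/*
 * Build a write value that changes only `bit`: the enable flag goes in
 * the upper half, the new bit value (0 or 1) in the lower half.
 */
static inline uint32_t hiword_update(unsigned int bit, int set)
{
	return BIT(bit + 16) | (set ? BIT(bit) : 0);
}

/*
 * Software model of a write-masked 16-bit register: lower-half bits are
 * updated only where the matching upper-half bit of the write is set.
 */
static inline uint32_t hiword_apply(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;

	return (reg & ~mask & 0xffffU) | (write & mask);
}
```

Under this model, a write of BIT(1 + 16) is hiword_update(1, 0): it clears
bit 1 and leaves every other bit of the register as it was.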

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 drivers/accel/rocket/rocket_core.c |  9 +++++
 drivers/accel/rocket/rocket_core.h |  3 ++
 drivers/accel/rocket/rocket_drv.c  | 53 ++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+)

diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c
index 09c445af7de73..a8de876365873 100644
--- a/drivers/accel/rocket/rocket_core.c
+++ b/drivers/accel/rocket/rocket_core.c
@@ -88,6 +88,15 @@ int rocket_core_init(struct rocket_core *core)
 		return err;
 	}
 
+	if (core->soc_data->noc_init) {
+		err = core->soc_data->noc_init(core);
+		if (err) {
+			pm_runtime_put_sync(dev);
+			rocket_job_fini(core);
+			return err;
+		}
+	}
+
 	version = rocket_pc_readl(core, VERSION);
 	version += rocket_pc_readl(core, VERSION_NUM) & 0xffff;
 
diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index d6421251670dc..66d138a8ed773 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -18,10 +18,13 @@ struct rocket_core;
  * struct rocket_soc_data - per-SoC configuration data
  * @num_cores: Number of NPU cores in this SoC.
  * @dma_bits: Physical address width reachable by the NPU's AXI master.
+ * @noc_init: Optional callback to power on and de-idle the NPU NOC bus.
+ *            Required on RK3568, where this is done through the PMU.
  */
 struct rocket_soc_data {
 	unsigned int num_cores;
 	unsigned int dma_bits;
+	int (*noc_init)(struct rocket_core *core);
 };
 
 #define rocket_pc_readl(core, reg) \
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index c18840e5aff76..5a72d0b5f4dff 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -9,9 +9,11 @@
 #include <linux/clk.h>
 #include <linux/err.h>
 #include <linux/iommu.h>
+#include <linux/mfd/syscon.h>
 #include <linux/of.h>
 #include <linux/platform_device.h>
 #include <linux/pm_runtime.h>
+#include <linux/regmap.h>
 
 #include "rocket_device.h"
 #include "rocket_drv.h"
@@ -217,12 +219,63 @@ static void rocket_remove(struct platform_device *pdev)
 	}
 }
 
+/*
+ * On RK3568 the NPU NOC bus is gated and idle out of reset and must be
+ * powered on and de-idled through the PMU before the NPU is reachable.  PMU
+ * registers use a write-mask protocol: the upper 16 bits enable writes to the
+ * matching lower 16 bits.
+ *
+ * The NPU's high-speed clock is a PVTPLL managed by TF-A via SCMI and must be
+ * running before the NOC acknowledges the de-idle request.  Force a real SCMI
+ * rate change (an intermediate rate defeats the clock framework's
+ * unchanged-rate shortcut) now that the power domain is on and clocks enabled.
+ */
+#define ROCKET_RK3568_SCMI_CLK	2
+
+static int rk3568_noc_init(struct rocket_core *core)
+{
+	struct regmap *pmu;
+	unsigned int val;
+	int ret;
+
+	clk_set_rate(core->clks[ROCKET_RK3568_SCMI_CLK].clk, 600000000UL);
+	clk_set_rate(core->clks[ROCKET_RK3568_SCMI_CLK].clk, 1000000000UL);
+
+	pmu = syscon_regmap_lookup_by_phandle(core->dev->of_node, "rockchip,pmu");
+	if (IS_ERR(pmu))
+		return dev_err_probe(core->dev, PTR_ERR(pmu),
+				     "failed to get PMU regmap\n");
+
+	/* Power on the NPU power domain (PWR_GATE_SFTCON bit 1 = 0). */
+	regmap_write(pmu, 0xa0, BIT(1 + 16));
+
+	/* Disable NPU NOC auto-idle (NOC_AUTO_CON0 bit 2). */
+	regmap_write(pmu, 0x70, BIT(2 + 16));
+
+	/* Request NPU bus de-idle (BUS_IDLE_SFTCON0 bit 2 = 0). */
+	regmap_write(pmu, 0x50, BIT(2 + 16));
+
+	/* Wait for the bus to report active (BUS_IDLE_ST bit 2 = 0). */
+	ret = regmap_read_poll_timeout(pmu, 0x68, val, !(val & BIT(2)), 10, 1000);
+	if (ret)
+		dev_err(core->dev, "timed out waiting for NPU bus de-idle\n");
+
+	return ret;
+}
+
+static const struct rocket_soc_data rk3568_soc_data = {
+	.num_cores = 1,
+	.dma_bits = 32,
+	.noc_init = rk3568_noc_init,
+};
+
 static const struct rocket_soc_data rk3588_soc_data = {
 	.num_cores = 3,
 	.dma_bits = 40,
 };
 
 static const struct of_device_id dt_match[] = {
+	{ .compatible = "rockchip,rk3568-rknn-core", .data = &rk3568_soc_data },
 	{ .compatible = "rockchip,rk3588-rknn-core", .data = &rk3588_soc_data },
 	{}
 };
-- 
2.39.5




* [RFC PATCH v3 4/9] accel: rocket: Reset the NPU before detaching the IOMMU on timeout
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
                   ` (2 preceding siblings ...)
  2026-06-04 13:52 ` [RFC PATCH v3 3/9] accel: rocket: Add RK3568 SoC support Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 5/9] accel: rocket: Keep the IOMMU domain attached across jobs Midgy BALON
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

On a job timeout the NPU AXI master can be left wedged with
outstanding transactions. rocket_reset() detached the IOMMU group
before resetting the hardware, so iommu_detach_group() ->
__iommu_group_set_core_domain() asked the rk_iommu to stall and wait
for the in-flight transactions to drain. They never did, the stall
request timed out (-ETIMEDOUT) and the IOMMU core WARNed:

  WARNING: drivers/iommu/iommu.c:157 __iommu_group_set_core_domain
    iommu_detach_group
    rocket_reset
    rocket_job_timedout

Assert the core reset first: it quiesces the AXI master so the
following IOMMU detach completes cleanly. Move the detach after
rocket_core_reset() and out of the job_lock (it does not touch
in_flight_job).
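
A toy model of why the order matters, under the assumption (taken from the
description above) that a wedged master makes the detach time out and that
a reset clears the wedged state. The type and function names are invented
for this sketch and do not exist in the driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy state: is the NPU AXI master stuck with outstanding transactions? */
struct npu_model {
	bool axi_wedged;
};

/* Models rocket_core_reset(): the reset quiesces the AXI master. */
static inline void model_core_reset(struct npu_model *npu)
{
	npu->axi_wedged = false;
}

/* Models the detach: the stall request cannot drain a wedged master. */
static inline int model_iommu_detach(const struct npu_model *npu)
{
	return npu->axi_wedged ? -ETIMEDOUT : 0;
}
```

Calling the detach model before the reset model returns -ETIMEDOUT; after
the reset it returns 0, mirroring the reordering in this patch.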

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 drivers/accel/rocket/rocket_job.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
index ac51bff39833f..e25234261536b 100644
--- a/drivers/accel/rocket/rocket_job.c
+++ b/drivers/accel/rocket/rocket_job.c
@@ -364,14 +364,20 @@ rocket_reset(struct rocket_core *core, struct drm_sched_job *bad)
 		if (core->in_flight_job)
 			pm_runtime_put_noidle(core->dev);
 
-		iommu_detach_group(NULL, core->iommu_group);
-
 		core->in_flight_job = NULL;
 	}
 
-	/* Proceed with reset now. */
+	/*
+	 * Reset the NPU hardware before detaching the IOMMU. A timed-out job
+	 * leaves the NPU AXI master wedged; detaching the IOMMU then issues a
+	 * stall request that never drains and times out (warning in the IOMMU
+	 * core). Asserting the core reset first quiesces the master so the
+	 * detach completes cleanly.
+	 */
 	rocket_core_reset(core);
 
+	iommu_detach_group(NULL, core->iommu_group);
+
 	/* NPU has been reset, we can clear the reset pending bit. */
 	atomic_set(&core->reset.pending, 0);
 
-- 
2.39.5




* [RFC PATCH v3 5/9] accel: rocket: Keep the IOMMU domain attached across jobs
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
                   ` (3 preceding siblings ...)
  2026-06-04 13:52 ` [RFC PATCH v3 4/9] accel: rocket: Reset the NPU before detaching the IOMMU on timeout Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU Midgy BALON
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

The rocket driver attached the job's IOMMU domain in rocket_job_run() and
detached it again on every completion and reset. Each attach/detach
toggles the rk_iommu stall/force-reset/paging handshake, and on
RK3568 the NPU MMU is idle between jobs, so that handshake times out
and logs a burst of "stall/paging request timed out" errors for
every job.

Attach the per-context domain once and keep it: track the attached
domain in the core, swap it only when a job from a different context
runs, and detach it at core teardown. A reference on the attached
domain is held so it outlives the job that first attached it and is
released on swap/teardown.
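
The bookkeeping can be modelled in userspace roughly as follows. This is a
simplified sketch: a plain integer stands in for the kref, attach_calls
counts how often the (expensive) attach would run, and all names are
assumptions made for the illustration.

```c
#include <stddef.h>

/* Stand-in for rocket_iommu_domain; refs models its reference count. */
struct domain {
	int refs;
};

/* Stand-in for rocket_core, reduced to the attach-tracking state. */
struct core {
	struct domain *attached;
	unsigned int attach_calls;
};

/* Attach only when the job's domain differs from the attached one. */
static inline void core_run_job(struct core *core, struct domain *job_domain)
{
	if (core->attached == job_domain)
		return;

	if (core->attached) {
		core->attached->refs--;	/* drop the reference held by the core */
		core->attached = NULL;
	}

	core->attach_calls++;		/* models iommu_attach_group() */
	job_domain->refs++;		/* the core now holds a reference */
	core->attached = job_domain;
}

/* Teardown: detach and release whatever is still attached. */
static inline void core_fini(struct core *core)
{
	if (core->attached) {
		core->attached->refs--;
		core->attached = NULL;
	}
}
```

Running jobs from contexts A, A, B, A in this model performs three
attaches instead of four, and a workload that stays in one context
attaches exactly once.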

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 drivers/accel/rocket/rocket_core.c |  6 ++++++
 drivers/accel/rocket/rocket_core.h |  3 +++
 drivers/accel/rocket/rocket_job.c  | 27 +++++++++++++++++++++------
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c
index a8de876365873..634f78dfe2887 100644
--- a/drivers/accel/rocket/rocket_core.c
+++ b/drivers/accel/rocket/rocket_core.c
@@ -13,6 +13,7 @@
 #include <linux/reset.h>
 
 #include "rocket_core.h"
+#include "rocket_drv.h"
 #include "rocket_job.h"
 
 int rocket_core_init(struct rocket_core *core)
@@ -112,6 +113,11 @@ void rocket_core_fini(struct rocket_core *core)
 {
 	pm_runtime_dont_use_autosuspend(core->dev);
 	pm_runtime_disable(core->dev);
+	if (core->attached_domain) {
+		iommu_detach_group(NULL, core->iommu_group);
+		rocket_iommu_domain_put(core->attached_domain);
+		core->attached_domain = NULL;
+	}
 	iommu_group_put(core->iommu_group);
 	core->iommu_group = NULL;
 	rocket_job_fini(core);
diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index 66d138a8ed773..05a197a9c0113 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -42,6 +42,8 @@ struct rocket_soc_data {
 #define rocket_core_writel(core, reg, value) \
 	writel(value, (core)->core_iomem + (REG_CORE_##reg) - REG_CORE_S_STATUS)
 
+struct rocket_iommu_domain;
+
 struct rocket_core {
 	struct device *dev;
 	struct rocket_device *rdev;
@@ -56,6 +58,7 @@ struct rocket_core {
 	struct reset_control_bulk_data resets[2];
 
 	struct iommu_group *iommu_group;
+	struct rocket_iommu_domain *attached_domain;
 
 	struct mutex job_lock;
 	struct rocket_job *in_flight_job;
diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
index e25234261536b..b248371be8a1e 100644
--- a/drivers/accel/rocket/rocket_job.c
+++ b/drivers/accel/rocket/rocket_job.c
@@ -9,6 +9,7 @@
 #include <drm/rocket_accel.h>
 #include <linux/interrupt.h>
 #include <linux/iommu.h>
+#include <linux/kref.h>
 #include <linux/platform_device.h>
 #include <linux/pm_runtime.h>
 
@@ -314,9 +315,26 @@ static struct dma_fence *rocket_job_run(struct drm_sched_job *sched_job)
 	if (ret < 0)
 		return fence;
 
-	ret = iommu_attach_group(job->domain->domain, core->iommu_group);
-	if (ret < 0)
-		return fence;
+	/*
+	 * Attach the job's IOMMU domain only when it differs from the one
+	 * already attached. Re-attaching per job toggles the rk_iommu
+	 * stall/reset handshake on an idle NPU MMU, which is slow and
+	 * noisy; keep the domain attached across jobs instead.
+	 */
+	if (core->attached_domain != job->domain) {
+		if (core->attached_domain) {
+			iommu_detach_group(NULL, core->iommu_group);
+			rocket_iommu_domain_put(core->attached_domain);
+			core->attached_domain = NULL;
+		}
+
+		ret = iommu_attach_group(job->domain->domain, core->iommu_group);
+		if (ret < 0)
+			return fence;
+
+		kref_get(&job->domain->kref);
+		core->attached_domain = job->domain;
+	}
 
 	scoped_guard(mutex, &core->job_lock) {
 		core->in_flight_job = job;
@@ -340,7 +358,6 @@ static void rocket_job_handle_irq(struct rocket_core *core)
 				return;
 			}
 
-			iommu_detach_group(NULL, iommu_group_get(core->dev));
 			dma_fence_signal(core->in_flight_job->done_fence);
 			pm_runtime_put_autosuspend(core->dev);
 			core->in_flight_job = NULL;
@@ -376,8 +393,6 @@ rocket_reset(struct rocket_core *core, struct drm_sched_job *bad)
 	 */
 	rocket_core_reset(core);
 
-	iommu_detach_group(NULL, core->iommu_group);
-
 	/* NPU has been reset, we can clear the reset pending bit. */
 	atomic_set(&core->reset.pending, 0);
 
-- 
2.39.5




* [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
                   ` (4 preceding siblings ...)
  2026-06-04 13:52 ` [RFC PATCH v3 5/9] accel: rocket: Keep the IOMMU domain attached across jobs Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 14:20   ` Tomeu Vizoso
  2026-06-05  1:59   ` Chaoyi Chen
  2026-06-04 13:52 ` [RFC PATCH v3 7/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568 Midgy BALON
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

On the RK356x v1 IOMMU, RK_MMU_AUTO_GATING resets to 0x3. Bit 1 enables
auto clock-gating of the page-table walker, so the walker's AXI master
loses its clock between transactions; a TLB-miss page walk then never
completes and the IOMMU is left stuck (PAGING_ENABLED, never IDLE).

Clear bit 1 (keeping bit 0, the slave-port gate) once paging is enabled
so the walker keeps its clock. This is required for the RK3568 NPU MMU.
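
The bit arithmetic described above, as a standalone sketch. The bit names
are assumptions derived from this description (bit 0 = slave-port gate,
bit 1 = page-walker gate), not names from a datasheet or from the driver.

```c
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Assumed bit meanings, taken from the commit description only. */
#define AUTO_GATING_SLAVE	BIT(0)	/* slave-port auto clock-gating */
#define AUTO_GATING_WALKER	BIT(1)	/* page-walker auto clock-gating */

/* Clear the walker gate and leave every other bit untouched. */
static inline uint32_t auto_gating_keep_walker_clock(uint32_t val)
{
	return val & ~AUTO_GATING_WALKER;
}
```

Starting from the reset value 0x3, clearing bit 1 while keeping bit 0
yields 0x1.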

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 drivers/iommu/rockchip-iommu.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 4da80136933c4..e3d8b6e9ca12b 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -953,6 +953,18 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
 
 	ret = rk_iommu_enable_paging(iommu);
 
+	if (!ret) {
+		/*
+		 * RK356x v1 IOMMU: RK_MMU_AUTO_GATING bit 1 enables page-walker
+		 * auto clock-gating; the walker's AXI master then loses its clock
+		 * between transactions and a TLB-miss page walk never completes,
+		 * leaving the IOMMU stuck (PAGING_ENABLED, never IDLE).  Clear
+		 * bit 1 (keep bit 0, the slave-port gate) once paging is enabled.
+		 */
+		for (i = 0; i < iommu->num_mmu; i++)
+			rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, 0x1);
+	}
+
 out_disable_stall:
 	rk_iommu_disable_stall(iommu);
 out_disable_clocks:
-- 
2.39.5




* [RFC PATCH v3 7/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
                   ` (5 preceding siblings ...)
  2026-06-04 13:52 ` [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 8/9] arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU Midgy BALON
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

The RK3568 carries a single core of the same NVDLA-derived NPU IP as the
RK3588.  Add its compatible.

On RK3568 the NPU NOC bus-idle and power gating are controlled through the
system PMU rather than a dedicated register block, so add a rockchip,pmu
phandle to that syscon.  The RK3568 NPU has no dedicated SRAM rail, so
sram-supply is required only on RK3588.

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 .../npu/rockchip,rk3588-rknn-core.yaml         | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
index caca2a4903cd1..af9936b32e9fe 100644
--- a/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
+++ b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
@@ -21,6 +21,7 @@ properties:
 
   compatible:
     enum:
+      - rockchip,rk3568-rknn-core
       - rockchip,rk3588-rknn-core
 
   reg:
@@ -50,6 +51,13 @@ properties:
 
   npu-supply: true
 
+  rockchip,pmu:
+    $ref: /schemas/types.yaml#/definitions/phandle
+    description:
+      Phandle to the PMU syscon.  On RK3568 the NPU's NOC bus-idle and
+      power gating are controlled through the PMU; this points to that
+      syscon so those registers can be reached.
+
   power-domains:
     maxItems: 1
 
@@ -75,7 +83,15 @@ required:
   - resets
   - reset-names
   - npu-supply
-  - sram-supply
+
+if:
+  properties:
+    compatible:
+      contains:
+        const: rockchip,rk3588-rknn-core
+then:
+  required:
+    - sram-supply
 
 additionalProperties: false
 
-- 
2.39.5



* [RFC PATCH v3 8/9] arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
                   ` (6 preceding siblings ...)
  2026-06-04 13:52 ` [RFC PATCH v3 7/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568 Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-04 13:52 ` [RFC PATCH v3 9/9] arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU Midgy BALON
  2026-06-05  1:36 ` [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Chaoyi Chen
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

The RK3568 has an NVDLA-derived NPU at fde40000 with its own IOMMU at
fde4b000. Add both nodes (disabled by default) and the NPU power-domain
child under the PMU power-controller, and point rockchip,pmu at the PMU
syscon that controls the NPU NoC bus-idle.

The power-domain deliberately carries no pm_qos: qos_npu sits behind the
NPU NoC, which is gated until the NPU is brought up, so a genpd power-off
QoS save would fault reading it.

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi b/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi
index 64bdd8b7754b5..50ce5a5e4fc24 100644
--- a/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi
@@ -512,6 +512,13 @@ power-domain@RK3568_PD_GPU {
 				#power-domain-cells = <0>;
 			};
 
+			pd_npu: power-domain@RK3568_PD_NPU {
+				reg = <RK3568_PD_NPU>;
+				clocks = <&cru ACLK_NPU_PRE>,
+					 <&cru HCLK_NPU_PRE>;
+				#power-domain-cells = <0>;
+			};
+
 			/* These power domains are grouped by VD_LOGIC */
 			power-domain@RK3568_PD_VI {
 				reg = <RK3568_PD_VI>;
@@ -948,6 +955,37 @@ qos_rga_wr: qos@fe158300 {
 		reg = <0x0 0xfe158300 0x0 0x20>;
 	};
 
+	rknn_core_0: npu@fde40000 {
+		compatible = "rockchip,rk3568-rknn-core";
+		reg = <0x0 0xfde40000 0x0 0x1000>,
+		      <0x0 0xfde41000 0x0 0x1000>,
+		      <0x0 0xfde43000 0x0 0x1000>;
+		reg-names = "pc", "cna", "core";
+		interrupts = <GIC_SPI 151 IRQ_TYPE_LEVEL_HIGH>;
+		clocks = <&cru ACLK_NPU>, <&cru HCLK_NPU>,
+			 <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_PRE>;
+		clock-names = "aclk", "hclk", "npu", "pclk";
+		assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+		assigned-clock-rates = <200000000>;
+		resets = <&cru SRST_A_NPU>, <&cru SRST_H_NPU>;
+		reset-names = "srst_a", "srst_h";
+		power-domains = <&power RK3568_PD_NPU>;
+		rockchip,pmu = <&pmu>;
+		iommus = <&rknn_mmu_0>;
+		status = "disabled";
+	};
+
+	rknn_mmu_0: iommu@fde4b000 {
+		compatible = "rockchip,iommu";
+		reg = <0x0 0xfde4b000 0x0 0x40>;
+		interrupts = <GIC_SPI 151 IRQ_TYPE_LEVEL_HIGH>;
+		clock-names = "aclk", "iface";
+		clocks = <&cru ACLK_NPU>, <&cru HCLK_NPU>;
+		power-domains = <&power RK3568_PD_NPU>;
+		#iommu-cells = <0>;
+		status = "disabled";
+	};
+
 	qos_npu: qos@fe180000 {
 		compatible = "rockchip,rk3568-qos", "syscon";
 		reg = <0x0 0xfe180000 0x0 0x20>;
-- 
2.39.5



* [RFC PATCH v3 9/9] arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
                   ` (7 preceding siblings ...)
  2026-06-04 13:52 ` [RFC PATCH v3 8/9] arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU Midgy BALON
@ 2026-06-04 13:52 ` Midgy BALON
  2026-06-05  1:36 ` [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Chaoyi Chen
  9 siblings, 0 replies; 26+ messages in thread
From: Midgy BALON @ 2026-06-04 13:52 UTC (permalink / raw)
  To: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will
  Cc: robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

Enable the NPU and its IOMMU on ROCK 3B.

vdd_npu is marked always-on so the rail is up before genpd de-idles the
NPU NoC at power-on: the PMU de-idle handshake needs the rail powered.
The PVTPLL compute clock is brought up later by the driver.

Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 arch/arm64/boot/dts/rockchip/rk3568-rock-3b.dts | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3568-rock-3b.dts b/arch/arm64/boot/dts/rockchip/rk3568-rock-3b.dts
index 69001e453732e..7ac780ed313d5 100644
--- a/arch/arm64/boot/dts/rockchip/rk3568-rock-3b.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3568-rock-3b.dts
@@ -330,8 +330,10 @@ regulator-state-mem {
 
 			vdd_npu: DCDC_REG4 {
 				regulator-name = "vdd_npu";
+				regulator-always-on;
+				regulator-boot-on;
 				regulator-initial-mode = <0x2>;
-				regulator-min-microvolt = <500000>;
+				regulator-min-microvolt = <825000>;
 				regulator-max-microvolt = <1350000>;
 				regulator-ramp-delay = <6001>;
 
@@ -787,3 +789,13 @@ vp0_out_hdmi: endpoint@ROCKCHIP_VOP2_EP_HDMI0 {
 		remote-endpoint = <&hdmi_in_vp0>;
 	};
 };
+
+&rknn_core_0 {
+	npu-supply = <&vdd_npu>;
+	status = "okay";
+};
+
+&rknn_mmu_0 {
+	status = "okay";
+};
+
-- 
2.39.5



* Re: [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
  2026-06-04 13:52 ` [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU Midgy BALON
@ 2026-06-04 14:20   ` Tomeu Vizoso
  2026-06-05  1:59   ` Chaoyi Chen
  1 sibling, 0 replies; 26+ messages in thread
From: Tomeu Vizoso @ 2026-06-04 14:20 UTC (permalink / raw)
  To: Midgy BALON
  Cc: ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will, robin.murphy,
	dri-devel, linux-rockchip, devicetree, linux-arm-kernel, iommu,
	linux-kernel

On Thu, Jun 4, 2026 at 3:53 PM Midgy BALON <midgy971@gmail.com> wrote:
>
> On the RK356x v1 IOMMU, RK_MMU_AUTO_GATING resets to 0x3. Bit 1 enables
> auto clock-gating of the page-table walker, so the walker's AXI master
> loses its clock between transactions; a TLB-miss page walk then never
> completes and the IOMMU is left stuck (PAGING_ENABLED, never IDLE).
>
> Clear bit 1 (keeping bit 0, the slave-port gate) once paging is enabled
> so the walker keeps its clock. This is required for the RK3568 NPU MMU.

Hi,

I'm not able to review this patch myself, but maybe it can be
submitted separately while we work on the NPU bits?

Regards,

Tomeu

> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> ---
>  drivers/iommu/rockchip-iommu.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 4da80136933c4..e3d8b6e9ca12b 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -953,6 +953,18 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
>
>         ret = rk_iommu_enable_paging(iommu);
>
> +       if (!ret) {
> +               /*
> +                * RK356x v1 IOMMU: RK_MMU_AUTO_GATING bit 1 enables page-walker
> +                * auto clock-gating; the walker's AXI master then loses its clock
> +                * between transactions and a TLB-miss page walk never completes,
> +                * leaving the IOMMU stuck (PAGING_ENABLED, never IDLE).  Clear
> +                * bit 1 (keep bit 0, the slave-port gate) once paging is enabled.
> +                */
> +               for (i = 0; i < iommu->num_mmu; i++)
> +                       rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, 0x2);
> +       }
> +
>  out_disable_stall:
>         rk_iommu_disable_stall(iommu);
>  out_disable_clocks:
> --
> 2.39.5
>


* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
                   ` (8 preceding siblings ...)
  2026-06-04 13:52 ` [RFC PATCH v3 9/9] arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU Midgy BALON
@ 2026-06-05  1:36 ` Chaoyi Chen
  2026-06-07 21:03   ` Midgy Balon
  9 siblings, 1 reply; 26+ messages in thread
From: Chaoyi Chen @ 2026-06-05  1:36 UTC (permalink / raw)
  To: Midgy BALON
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

Hello Midgy,

On 6/4/2026 9:52 PM, Midgy BALON wrote:
> RFC, not for merge. End-to-end inference does not produce correct output
> yet (see Status), so per the v2 discussion this is a request for design
> feedback. It now probes, attaches, and submits cleanly on a stock
> v7.1-rc6 tree; what remains is one hardware-internal issue.
> 
> The RK3568 has a single NVDLA-derived NPU core, the same IP family as the
> RK3588 NPU the driver already supports; the register layout matches. The
> RK3568 differences are a 32-bit NPU AXI/IOMMU (vs 40-bit) and explicit
> PVTPLL/PMU bring-up to power and de-idle the NPU before it is reachable.
> 
> Patches:
>   1-2  rocket: per-SoC data struct, then derive DMA width and core count
>        from match data (refactors, no functional change).
>   3    rocket: RK3568 SoC data + PVTPLL/PMU/NOC bring-up.
>   4    rocket: reset the NPU before detaching the IOMMU on a job timeout
>        (the detach otherwise stalls a wedged AXI master and WARNs).
>   5    rocket: keep the IOMMU domain attached across jobs instead of
>        re-attaching per job (the per-job rk_iommu handshake on the idle
>        NPU MMU is slow and noisy).
>   6    iommu/rockchip: clear AUTO_GATING bit 1 on the RK356x v1 IOMMU so
>        the page-walker keeps its clock (else a TLB-miss walk never
>        completes).
>   7    dt-bindings: add the RK3568 NPU compatible.
>   8-9  arm64 dts: add the NPU and its IOMMU, and enable them on ROCK 3B.
> 
> Dependency. The NPU MMU is rockchip-iommu v1 (32-bit) while the rest of
> the RK3568 uses v2 (40-bit). They cannot coexist until the driver carries
> per-device ops; this series is developed on top of Simon Xue's
> "iommu/rockchip: Drop global rk_ops in favor of per-device ops" [1].
> Without it the NPU IOMMU fails to probe on a full RK3568 boot.
>

Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than
v1, implying it should support 40-bit PAs. Nevertheless, please note that
the upper limit for DTE is 32 bits.

> Power bring-up. The NPU is brought up through the power-domain layer (no
> driver hack): the NPU power-domain keeps its clocks but drops the pm_qos
> phandle (qos_npu sits behind the gated NPU NoC, so genpd's power-off QoS
> save faults reading it), and vdd_npu is marked always-on so the rail is
> up before genpd de-idles the NoC at power-on. The PMU de-idle then ACKs
> without PVTPLL running; PVTPLL is only needed for compute.
>

Can these operations not be completed via the pmdomain driver?
If some operations are controlled by TF-A, are you using open
source TF-A? Thank you.

> Status. On v7.1-rc6 the driver probes, creates /dev/accel/accel0,
> attaches an IOMMU domain, and submits jobs; the program controller
> fetches and broadcasts the command list. Inference output is still wrong,
> and the cause is split across three layers:
>   - kernel (this series): the RK3568 differences appear handled;
>   - mesa/Teflon userspace: still emits RK3588-tuned config, wrong for
>     RK3568 (to be filed separately on mesa-dev);
>   - hardware: with corrected config the NPU's DMA reads the full input
>     and weight tensors (confirmed via its DMA bandwidth counters), but
>     the MAC/output stage never completes, the job times out, and the
>     output stays at the buffer's zero-point. I have not found the missing
>     step; it is not in the command list (replaying the vendor's
>     byte-exact command list behaves the same). Pointers welcome,
>     especially from anyone with RK3568 NPU experience.
> 
> Known residual. On the first IOMMU attach the NPU MMU is idle with paging
> already enabled; the rk_iommu stall/reset handshake does not complete in
> that state and logs one burst of timeouts before the (kept) domain
> settles. It is harmless here because the job times out regardless, but it
> points at an idle-MMU reconfiguration corner the rk_iommu code does not
> handle on this block.
> 
> [1] https://lore.kernel.org/linux-rockchip/20260310105303.128859-1-xxm@rock-chips.com/
> 
> Changes since v2:
>   - Tagged RFC; now tested on a stock v7.1-rc6 tree.
>   - Bring-up moved into the power-domain/DT layer (no initcall hack).
>   - Added the IOMMU detach-on-timeout and attach-once driver fixes.
>   - Split the driver patch (Heiko): soc_data / match-data / RK3568.
>   - Derive DMA width and core count from match data; drop the DT rescans.
>   - Binding describes the hardware; added the missing $ref on rockchip,pmu.
>   - Disclosed the per-device-ops IOMMU dependency.
> 
> Midgy BALON (9):
>   accel: rocket: Introduce per-SoC rocket_soc_data
>   accel: rocket: Derive DMA width and core count from match data
>   accel: rocket: Add RK3568 SoC support
>   accel: rocket: Reset the NPU before detaching the IOMMU on timeout
>   accel: rocket: Keep the IOMMU domain attached across jobs
>   iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
>   dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568
>   arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU
>   arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU
> 
>  .../npu/rockchip,rk3588-rknn-core.yaml        | 18 ++++-
>  .../boot/dts/rockchip/rk3568-rock-3b.dts      | 14 +++-
>  arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 38 +++++++++++
>  drivers/accel/rocket/rocket_core.c            | 22 ++++++-
>  drivers/accel/rocket/rocket_core.h            | 19 ++++++
>  drivers/accel/rocket/rocket_device.c          | 15 ++---
>  drivers/accel/rocket/rocket_device.h          |  3 +-
>  drivers/accel/rocket/rocket_drv.c             | 66 ++++++++++++++++++-
>  drivers/accel/rocket/rocket_job.c             | 35 ++++++++--
>  drivers/iommu/rockchip-iommu.c                | 12 ++++
>  10 files changed, 219 insertions(+), 23 deletions(-)
> 
> 
> base-commit: 52c800fdcf11888ebeb50c3d707f782cc15b66eb

-- 
Best, 
Chaoyi


* Re: [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
  2026-06-04 13:52 ` [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU Midgy BALON
  2026-06-04 14:20   ` Tomeu Vizoso
@ 2026-06-05  1:59   ` Chaoyi Chen
  2026-06-07 21:05     ` Midgy Balon
  1 sibling, 1 reply; 26+ messages in thread
From: Chaoyi Chen @ 2026-06-05  1:59 UTC (permalink / raw)
  To: Midgy BALON
  Cc: Simon Xue, tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro,
	will, robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

Hello Midgy,

On 6/4/2026 9:52 PM, Midgy BALON wrote:
> On the RK356x v1 IOMMU, RK_MMU_AUTO_GATING resets to 0x3. Bit 1 enables
> auto clock-gating of the page-table walker, so the walker's AXI master
> loses its clock between transactions; a TLB-miss page walk then never
> completes and the IOMMU is left stuck (PAGING_ENABLED, never IDLE).
> 
> Clear bit 1 (keeping bit 0, the slave-port gate) once paging is enabled
> so the walker keeps its clock. This is required for the RK3568 NPU MMU.
> 
> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> ---
>  drivers/iommu/rockchip-iommu.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 4da80136933c4..e3d8b6e9ca12b 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -953,6 +953,18 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
>  
>  	ret = rk_iommu_enable_paging(iommu);
>  
> +	if (!ret) {
> +		/*
> +		 * RK356x v1 IOMMU: RK_MMU_AUTO_GATING bit 1 enables page-walker
> +		 * auto clock-gating; the walker's AXI master then loses its clock
> +		 * between transactions and a TLB-miss page walk never completes,
> +		 * leaving the IOMMU stuck (PAGING_ENABLED, never IDLE).  Clear
> +		 * bit 1 (keep bit 0, the slave-port gate) once paging is enabled.
> +		 */
> +		for (i = 0; i < iommu->num_mmu; i++)
> +			rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, 0x2);
> +	}
> +
>  out_disable_stall:
>  	rk_iommu_disable_stall(iommu);
>  out_disable_clocks:

As I said, it is v2. Could you please try using the code below
instead and see if it works? Thank you.

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 0013cf196c57..89e3a83a0251 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -930,6 +930,7 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
        struct iommu_domain *domain = iommu->domain;
        struct rk_iommu_domain *rk_domain = to_rk_domain(domain);
        int ret, i;
+       u32 auto_gate;
 
        ret = clk_bulk_enable(iommu->num_clocks, iommu->clocks);
        if (ret)
@@ -948,6 +949,10 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
                               rk_ops->mk_dtentries(rk_domain->dt_dma));
                rk_iommu_base_command(iommu->bases[i], RK_MMU_CMD_ZAP_CACHE);
                rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+
+               auto_gate = rk_iommu_read(iommu->bases[i], RK_MMU_AUTO_GATING);
+               auto_gate |= BIT(31);
+               rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, auto_gate);
        }
 
        ret = rk_iommu_enable_paging(iommu);

-- 
Best, 
Chaoyi


* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-05  1:36 ` [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Chaoyi Chen
@ 2026-06-07 21:03   ` Midgy Balon
  2026-06-08  1:40     ` Chaoyi Chen
  0 siblings, 1 reply; 26+ messages in thread
From: Midgy Balon @ 2026-06-07 21:03 UTC (permalink / raw)
  To: Chaoyi Chen
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

Hi Chaoyi,

Thanks a lot for looking at this -- input from Rockchip is exactly what this
series needs.

> Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than v1,
> implying it should support 40-bit PAs. Nevertheless, please note that the
> upper limit for DTE is 32 bits.

Understood, and that 32-bit-DTE note is the crux of the trouble I had, so let
me lay out what I see and ask how you'd prefer to solve it.

The mainline node is already v2 (rockchip,rk3568-iommu in rk356x-base.dtsi).
The problem on this 8 GiB board: with the v2 ops the page-table allocations
(gfp_flags == 0) can land above 4 GiB, so the DTE ends up > 32 bits and the
NPU's first translation faults with DMA_READ_ERROR. To work around that I had
switched the NPU MMU to the v1 compatible (rockchip,iommu), whose ops set
GFP_DMA32 and keep the DTE sub-4 GiB. That works in isolation, but because the
driver keeps a single global rk_ops, a v1 NPU MMU then trips
WARN_ON(rk_ops != ops) against the SoC's v2 instances (VOP/VDEC), which is why
I based the series on Simon's per-device-ops work.

So my question: with per-device ops in place, what's the intended way to keep
the NPU MMU on v2 *and* cap its DTE at 32 bits on boards with >4 GiB of RAM?
A v2 ops variant carrying GFP_DMA32 for this device, or is there a register/
config bit that constrains the DTE address? I'd rather follow the Rockchip
intent here than carry the v1 workaround. (Simon, cc'd -- this is right next to
your per-device-ops series.)
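
The constraint being asked about can be sketched in user space as follows. This
is only an illustration under stated assumptions: struct mock_iommu_cfg,
dte_addr_bits and dt_phys_ok() are hypothetical names that do not exist in
rockchip-iommu.c, and the sketch models nothing more than the check that a
directory-table physical address fits the width the DTE register can hold (a
condition a GFP_DMA32 allocation should satisfy by construction).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device limit; not a field of struct rk_iommu. */
struct mock_iommu_cfg {
	unsigned int dte_addr_bits;	/* address bits the DTE register can hold */
};

/* True if the directory table's physical address fits the DTE width. */
static bool dt_phys_ok(const struct mock_iommu_cfg *cfg, uint64_t dt_phys)
{
	assert(cfg && cfg->dte_addr_bits > 0 && cfg->dte_addr_bits <= 64);

	if (cfg->dte_addr_bits == 64)
		return true;	/* avoid an undefined 64-bit shift by 64 */

	return (dt_phys >> cfg->dte_addr_bits) == 0;
}
```

With dte_addr_bits = 32, a table at 0x100000000 (the first byte above 4 GiB) is
rejected, while the same address passes a 40-bit limit.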

> Can these operations not be completed via the pmdomain driver?
> If some operations are controlled by TF-A, are you using open source TF-A?

Most of it is in pmdomain already. Power-on and NoC de-idle are done by the
RK3568 NPU power domain (genpd) at power-on -- the driver no longer pokes the
PMU directly. Two things remain outside it:

 - vdd_npu: I mark it regulator-always-on in DT rather than wiring it as the
   domain's domain-supply, because as a domain-supply it created a device-link
   to the I2C PMIC (rk809) and genpd's power-off QoS-save path then hung
   reading the NPU QoS registers behind the (gated) NoC. If there's a clean way
   to let genpd own vdd_npu without that I2C ordering deadlock I'd much prefer
   that -- pointers welcome.

 - the NPU compute clock (PVTPLL): set from the driver via SCMI, and only
   needed for actual compute, not for bring-up.

One more pmdomain observation from testing, possibly relevant to how the NPU
domain should be modelled: the domain's power-off/on cycle doesn't reliably
re-de-idle the NoC. If the NPU is probed after genpd has already powered the
(unused) domain off, the power-on de-idle fails ("failed to set idle on domain
'npu'") and the NPU IOMMU then takes an external abort on its first MMIO access.
Probing the NPU before the unused-domain power-off, or marking the domain
always-on, both avoid it. Is the NoC de-idle expected to work on a genpd
re-power here, or should this domain effectively stay on?

On TF-A: yes -- bl31 is built from upstream arm-trusted-firmware
(github.com/ARM-software/arm-trusted-firmware, RK3568 platform), providing PSCI
and the SCMI clock service. The only closed blob in the boot chain is Rockchip's
DDR init (rkbin), which is the standard situation for mainline RK356x.

Kind regards,
Midgy

On Fri, Jun 5, 2026 at 3:36 AM Chaoyi Chen <chaoyi.chen@rock-chips.com> wrote:
>
> Hello Midgy,
>
> On 6/4/2026 9:52 PM, Midgy BALON wrote:
> > RFC, not for merge. End-to-end inference does not produce correct output
> > yet (see Status), so per the v2 discussion this is a request for design
> > feedback. It now probes, attaches, and submits cleanly on a stock
> > v7.1-rc6 tree; what remains is one hardware-internal issue.
> >
> > The RK3568 has a single NVDLA-derived NPU core, the same IP family as the
> > RK3588 NPU the driver already supports; the register layout matches. The
> > RK3568 differences are a 32-bit NPU AXI/IOMMU (vs 40-bit) and explicit
> > PVTPLL/PMU bring-up to power and de-idle the NPU before it is reachable.
> >
> > Patches:
> >   1-2  rocket: per-SoC data struct, then derive DMA width and core count
> >        from match data (refactors, no functional change).
> >   3    rocket: RK3568 SoC data + PVTPLL/PMU/NOC bring-up.
> >   4    rocket: reset the NPU before detaching the IOMMU on a job timeout
> >        (the detach otherwise stalls a wedged AXI master and WARNs).
> >   5    rocket: keep the IOMMU domain attached across jobs instead of
> >        re-attaching per job (the per-job rk_iommu handshake on the idle
> >        NPU MMU is slow and noisy).
> >   6    iommu/rockchip: clear AUTO_GATING bit 1 on the RK356x v1 IOMMU so
> >        the page-walker keeps its clock (else a TLB-miss walk never
> >        completes).
> >   7    dt-bindings: add the RK3568 NPU compatible.
> >   8-9  arm64 dts: add the NPU and its IOMMU, and enable them on ROCK 3B.
> >
> > Dependency. The NPU MMU is rockchip-iommu v1 (32-bit) while the rest of
> > the RK3568 uses v2 (40-bit). They cannot coexist until the driver carries
> > per-device ops; this series is developed on top of Simon Xue's
> > "iommu/rockchip: Drop global rk_ops in favor of per-device ops" [1].
> > Without it the NPU IOMMU fails to probe on a full RK3568 boot.
> >
>
> Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than
> v1, implying it should support 40-bit PAs. Nevertheless, please note that
> the upper limit for DTE is 32 bits.
>
> > Power bring-up. The NPU is brought up through the power-domain layer (no
> > driver hack): the NPU power-domain keeps its clocks but drops the pm_qos
> > phandle (qos_npu sits behind the gated NPU NoC, so genpd's power-off QoS
> > save faults reading it), and vdd_npu is marked always-on so the rail is
> > up before genpd de-idles the NoC at power-on. The PMU de-idle then ACKs
> > without PVTPLL running; PVTPLL is only needed for compute.
> >
>
> Can these operations not be completed via the pmdomain driver?
> If some operations are controlled by TF-A, are you using open
> source TF-A? Thank you.
>
> > Status. On v7.1-rc6 the driver probes, creates /dev/accel/accel0,
> > attaches an IOMMU domain, and submits jobs; the program controller
> > fetches and broadcasts the command list. Inference output is still wrong,
> > and the cause is split across three layers:
> >   - kernel (this series): the RK3568 differences appear handled;
> >   - mesa/Teflon userspace: still emits RK3588-tuned config, wrong for
> >     RK3568 (to be filed separately on mesa-dev);
> >   - hardware: with corrected config the NPU's DMA reads the full input
> >     and weight tensors (confirmed via its DMA bandwidth counters), but
> >     the MAC/output stage never completes, the job times out, and the
> >     output stays at the buffer's zero-point. I have not found the missing
> >     step; it is not in the command list (replaying the vendor's
> >     byte-exact command list behaves the same). Pointers welcome,
> >     especially from anyone with RK3568 NPU experience.
> >
> > Known residual. On the first IOMMU attach the NPU MMU is idle with paging
> > already enabled; the rk_iommu stall/reset handshake does not complete in
> > that state and logs one burst of timeouts before the (kept) domain
> > settles. It is harmless here because the job times out regardless, but it
> > points at an idle-MMU reconfiguration corner the rk_iommu code does not
> > handle on this block.
> >
> > [1] https://lore.kernel.org/linux-rockchip/20260310105303.128859-1-xxm@rock-chips.com/
> >
> > Changes since v2:
> >   - Tagged RFC; now tested on a stock v7.1-rc6 tree.
> >   - Bring-up moved into the power-domain/DT layer (no initcall hack).
> >   - Added the IOMMU detach-on-timeout and attach-once driver fixes.
> >   - Split the driver patch (Heiko): soc_data / match-data / RK3568.
> >   - Derive DMA width and core count from match data; drop the DT rescans.
> >   - Binding describes the hardware; added the missing $ref on rockchip,pmu.
> >   - Disclosed the per-device-ops IOMMU dependency.
> >
> > Midgy BALON (9):
> >   accel: rocket: Introduce per-SoC rocket_soc_data
> >   accel: rocket: Derive DMA width and core count from match data
> >   accel: rocket: Add RK3568 SoC support
> >   accel: rocket: Reset the NPU before detaching the IOMMU on timeout
> >   accel: rocket: Keep the IOMMU domain attached across jobs
> >   iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
> >   dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568
> >   arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU
> >   arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU
> >
> >  .../npu/rockchip,rk3588-rknn-core.yaml        | 18 ++++-
> >  .../boot/dts/rockchip/rk3568-rock-3b.dts      | 14 +++-
> >  arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 38 +++++++++++
> >  drivers/accel/rocket/rocket_core.c            | 22 ++++++-
> >  drivers/accel/rocket/rocket_core.h            | 19 ++++++
> >  drivers/accel/rocket/rocket_device.c          | 15 ++---
> >  drivers/accel/rocket/rocket_device.h          |  3 +-
> >  drivers/accel/rocket/rocket_drv.c             | 66 ++++++++++++++++++-
> >  drivers/accel/rocket/rocket_job.c             | 35 ++++++++--
> >  drivers/iommu/rockchip-iommu.c                | 12 ++++
> >  10 files changed, 219 insertions(+), 23 deletions(-)
> >
> >
> > base-commit: 52c800fdcf11888ebeb50c3d707f782cc15b66eb
>
> --
> Best,
> Chaoyi


* Re: [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
  2026-06-05  1:59   ` Chaoyi Chen
@ 2026-06-07 21:05     ` Midgy Balon
  2026-06-08  1:45       ` Chaoyi Chen
  0 siblings, 1 reply; 26+ messages in thread
From: Midgy Balon @ 2026-06-07 21:05 UTC (permalink / raw)
  To: Chaoyi Chen
  Cc: Simon Xue, tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro,
	will, robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

Hi Chaoyi,

> As I said, it is v2. Could you please try using the code below instead and
> see if it works?
> [ auto_gate = read(RK_MMU_AUTO_GATING); auto_gate |= BIT(31); write(...) ]

Thanks -- that's clearly the right shape (read-modify-write, before paging is
enabled, keeping the reset value instead of my clobbering 0x2).
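
For the record, the three write shapes discussed in this thread can be compared
with a small user-space sketch. The mock_reg type and the helper names are made
up for illustration (they are not the rk_iommu accessors), and the 0x3 reset
value is the one quoted in the patch description, not something read back here.

```c
#include <assert.h>
#include <stdint.h>

/* Reset value quoted in the patch description for RK_MMU_AUTO_GATING. */
#define AUTO_GATING_RESET	0x3u
#define BIT(n)			(1u << (n))

/* Hypothetical stand-in for one MMIO register. */
struct mock_reg {
	uint32_t val;
};

static uint32_t mock_read(const struct mock_reg *reg)
{
	assert(reg);
	return reg->val;
}

static void mock_write(struct mock_reg *reg, uint32_t val)
{
	assert(reg);
	reg->val = val;
}

/* v3 patch 6 as posted: a blind write of 0x2. */
static uint32_t gating_blind_write(struct mock_reg *reg)
{
	mock_write(reg, 0x2);
	return mock_read(reg);
}

/* What the v3 comment describes: clear bit 1, keep everything else. */
static uint32_t gating_clear_bit1(struct mock_reg *reg)
{
	mock_write(reg, mock_read(reg) & ~BIT(1));
	return mock_read(reg);
}

/* The suggested form: set bit 31, keep everything else. */
static uint32_t gating_set_bit31(struct mock_reg *reg)
{
	mock_write(reg, mock_read(reg) | BIT(31));
	return mock_read(reg);
}
```

Starting from 0x3, the blind write leaves 0x2 (bit 1 still set, bit 0 lost), the
bit-1 clear leaves 0x1, and the bit-31 set leaves 0x80000003.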

I rebuilt v7.1-rc6 (with the rocket RK3568 series + Simon's per-device-ops work)
using your bit-31 version and tested it on a ROCK 3B: the NPU IOMMU comes up and
services the NPU's DMA cleanly -- the NPU probes, attaches its domain, and runs
repeated conv submissions with no DMA_READ_ERROR and no page-walk stall. No
regression from the write.

To be precise about what I can and can't show: I tested both ways on v7.1-rc6 --
with your bit-31 write, and on the reset value (0x3) -- and the NPU IOMMU
services the NPU's reads with zero faults in both cases (no DMA_READ_ERROR, no
page-walk stall). So I don't have a failing baseline here that bit-31 visibly
fixes. Is the AUTO_GATING write needed on current mainline, or only under
conditions I'm not reproducing (a particular traffic pattern / silicon rev)?
I'll keep the patch in your form unless you'd prefer to drop it.

One question so I document it correctly: what does bit 31 of RK_MMU_AUTO_GATING
control on the v2 block -- is it a master "disable internal auto clock-gating"
for the page-table walker (i.e. so a TLB-miss walk's AXI master keeps its clock
to completion)? The RK3568 TRM I have doesn't cover the IOMMU registers, so a
one-line description would let me write an accurate comment.

Kind regards,
Midgy

On Fri, Jun 5, 2026 at 3:59 AM Chaoyi Chen <chaoyi.chen@rock-chips.com> wrote:
>
> Hello Midgy,
>
> On 6/4/2026 9:52 PM, Midgy BALON wrote:
> > On the RK356x v1 IOMMU, RK_MMU_AUTO_GATING resets to 0x3. Bit 1 enables
> > auto clock-gating of the page-table walker, so the walker's AXI master
> > loses its clock between transactions; a TLB-miss page walk then never
> > completes and the IOMMU is left stuck (PAGING_ENABLED, never IDLE).
> >
> > Clear bit 1 (keeping bit 0, the slave-port gate) once paging is enabled
> > so the walker keeps its clock. This is required for the RK3568 NPU MMU.
> >
> > Signed-off-by: Midgy BALON <midgy971@gmail.com>
> > ---
> >  drivers/iommu/rockchip-iommu.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> > index 4da80136933c4..e3d8b6e9ca12b 100644
> > --- a/drivers/iommu/rockchip-iommu.c
> > +++ b/drivers/iommu/rockchip-iommu.c
> > @@ -953,6 +953,18 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
> >
> >       ret = rk_iommu_enable_paging(iommu);
> >
> > +     if (!ret) {
> > +             /*
> > +              * RK356x v1 IOMMU: RK_MMU_AUTO_GATING bit 1 enables page-walker
> > +              * auto clock-gating; the walker's AXI master then loses its clock
> > +              * between transactions and a TLB-miss page walk never completes,
> > +              * leaving the IOMMU stuck (PAGING_ENABLED, never IDLE).  Clear
> > +              * bit 1 (keep bit 0, the slave-port gate) once paging is enabled.
> > +              */
> > +             for (i = 0; i < iommu->num_mmu; i++)
> > +                     rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, 0x2);
> > +     }
> > +
> >  out_disable_stall:
> >       rk_iommu_disable_stall(iommu);
> >  out_disable_clocks:
>
> As I said, it is v2. Could you please try using the code below
> instead and see if it works? Thank you.
>
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 0013cf196c57..89e3a83a0251 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -930,6 +930,7 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
>         struct iommu_domain *domain = iommu->domain;
>         struct rk_iommu_domain *rk_domain = to_rk_domain(domain);
>         int ret, i;
> +       u32 auto_gate;
>
>         ret = clk_bulk_enable(iommu->num_clocks, iommu->clocks);
>         if (ret)
> @@ -948,6 +949,10 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
>                                rk_ops->mk_dtentries(rk_domain->dt_dma));
>                 rk_iommu_base_command(iommu->bases[i], RK_MMU_CMD_ZAP_CACHE);
>                 rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
> +
> +               auto_gate = rk_iommu_read(iommu->bases[i], RK_MMU_AUTO_GATING);
> +               auto_gate |= BIT(31);
> +               rk_iommu_write(iommu->bases[i], RK_MMU_AUTO_GATING, auto_gate);
>         }
>
>         ret = rk_iommu_enable_paging(iommu);
>
> --
> Best,
> Chaoyi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-07 21:03   ` Midgy Balon
@ 2026-06-08  1:40     ` Chaoyi Chen
  2026-06-08  8:05       ` Midgy Balon
  0 siblings, 1 reply; 26+ messages in thread
From: Chaoyi Chen @ 2026-06-08  1:40 UTC (permalink / raw)
  To: Midgy Balon
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao

Hi Midgy,

On 6/8/2026 5:03 AM, Midgy Balon wrote:
> Hi Chaoyi,
> 
> Thanks a lot for looking at this -- input from Rockchip is exactly what this
> series needs.
> 
>> Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than v1,
>> implying it should support 40-bit PAs. Nevertheless, please note that the
>> upper limit for DTE is 32 bits.
> 
> Understood, and that 32-bit-DTE note is the crux of the trouble I had, so let
> me lay out what I see and ask how you'd prefer to solve it.
> 
> The mainline node is already v2 (rockchip,rk3568-iommu in rk356x-base.dtsi).
> The problem on this 8 GiB board: with the v2 ops the page-table allocations
> (gfp_flags == 0) can land above 4 GiB, so the DTE ends up > 32 bits and the
> NPU's first translation faults with DMA_READ_ERROR. To work around that I had
> switched the NPU MMU to the v1 compatible (rockchip,iommu), whose ops set
> GFP_DMA32 and keep the DTE sub-4 GiB. That works in isolation, but because the
> driver keeps a single global rk_ops, a v1 NPU MMU then trips
> WARN_ON(rk_ops != ops) against the SoC's v2 instances (VOP/VDEC), which is why
> I based the series on Simon's per-device-ops work.
> 
> So my question: with per-device ops in place, what's the intended way to keep
> the NPU MMU on v2 *and* cap its DTE at 32 bits on boards with >4 GiB of RAM?
> A v2 ops variant carrying GFP_DMA32 for this device, or is there a register/
> config bit that constrains the DTE address? I'd rather follow the Rockchip
> intent here than carry the v1 workaround. (Simon, cc'd -- this is right next to
> your per-device-ops series.)
>

If Simon's method works, please use it :)

>> Can these operations not be completed via the pmdomain driver?
>> If some operations are controlled by TF-A, are you using open source TF-A?
> 
> Most of it is in pmdomain already. Power-on and NoC de-idle are done by the
> RK3568 NPU power domain (genpd) at power-on -- the driver no longer pokes the
> PMU directly. Two things remain outside it:
> 
>  - vdd_npu: I mark it regulator-always-on in DT rather than wiring it as the
>    domain's domain-supply, because as a domain-supply it created a device-link
>    to the I2C PMIC (rk809) and genpd's power-off QoS-save path then hung
>    reading the NPU QoS registers behind the (gated) NoC. If there's a clean way
>    to let genpd own vdd_npu without that I2C ordering deadlock I'd much prefer
>    that -- pointers welcome.
>

Please refer to the patch below regarding the RK3588 NPU pmdomain.
In short, you need to set a "need_regulator" for the RK3568 NPU pmdomain.

https://lore.kernel.org/all/20251216055247.13150-1-rmxpzlb@gmail.com/

>  - the NPU compute clock (PVTPLL): set from the driver via SCMI, and only
>    needed for actual compute, not for bring-up.
> 
> One more pmdomain observation from testing, possibly relevant to how the NPU
> domain should be modelled: the domain's power-off/on cycle doesn't reliably
> re-de-idle the NoC. If the NPU is probed after genpd has already powered the
> (unused) domain off, the power-on de-idle fails ("failed to set idle on domain
> 'npu'") and the NPU IOMMU then takes an external abort on its first MMIO access.
> Probing the NPU before the unused-domain power-off, or marking the domain
> always-on, both avoid it. Is the NoC de-idle expected to work on a genpd
> re-power here, or should this domain effectively stay on?
>

Not quite sure what's going on with PVTPLL and NOC.
Maybe @Finley knows about this?

> On TF-A: yes -- bl31 is built from upstream arm-trusted-firmware
> (github.com/ARM-software/arm-trusted-firmware, RK3568 platform), providing PSCI
> and the SCMI clock service. The only closed blob in the boot chain is Rockchip's
> DDR init (rkbin), which is the standard situation for mainline RK356x.

-- 
Best, 
Chaoyi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
  2026-06-07 21:05     ` Midgy Balon
@ 2026-06-08  1:45       ` Chaoyi Chen
  2026-06-08  3:40         ` Chaoyi Chen
  0 siblings, 1 reply; 26+ messages in thread
From: Chaoyi Chen @ 2026-06-08  1:45 UTC (permalink / raw)
  To: Midgy Balon
  Cc: Simon Xue, tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro,
	will, robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

Hi Midgy,

On 6/8/2026 5:05 AM, Midgy Balon wrote:
> Hi Chaoyi,
> 
>> As I said, it is v2. Could you please try using the code below instead and
>> see if it works?
>> [ auto_gate = read(RK_MMU_AUTO_GATING); auto_gate |= BIT(31); write(...) ]
> 
> Thanks -- that's clearly the right shape (read-modify-write, before paging is
> enabled, keeping the reset value instead of my clobbering 0x2).
> 
> I rebuilt v7.1-rc6 (with the rocket RK3568 series + your per-device-ops work)
> using your bit-31 version and tested it on a ROCK 3B: the NPU IOMMU comes up and
> services the NPU's DMA cleanly -- the NPU probes, attaches its domain, and runs
> repeated conv submissions with no DMA_READ_ERROR and no page-walk stall. No
> regression from the write.
> 
> To be precise about what I can and can't show: I tested both ways on v7.1-rc6 --
> with your bit-31 write, and on the reset value (0x3) -- and the NPU IOMMU
> services the NPU's reads with zero faults in both cases (no DMA_READ_ERROR, no
> page-walk stall). So I don't have a failing baseline here that bit-31 visibly
> fixes. Is the AUTO_GATING write needed on current mainline, or only under
> conditions I'm not reproducing (a particular traffic pattern / silicon rev)?
> I'll keep the patch in your form unless you'd prefer to drop it.
> 
> One question so I document it correctly: what does bit 31 of RK_MMU_AUTO_GATING
> control on the v2 block -- is it a master "disable internal auto clock-gating"
> for the page-table walker (i.e. so a TLB-miss walk's AXI master keeps its clock
> to completion)? The RK3568 TRM I have doesn't cover the IOMMU registers, so a
> one-line description would let me write an accurate comment.
> 

Glad to hear this works. Please refer to the commits below.

[0]: https://github.com/rockchip-linux/kernel/commit/7f8158fb41b5cc8e738aaeebc3637c50ebd74cae
[1]: https://github.com/rockchip-linux/kernel/commit/6a355e5f9a2069a2309e240791bc3aad63b7324e

-- 
Best, 
Chaoyi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
  2026-06-08  1:45       ` Chaoyi Chen
@ 2026-06-08  3:40         ` Chaoyi Chen
  0 siblings, 0 replies; 26+ messages in thread
From: Chaoyi Chen @ 2026-06-08  3:40 UTC (permalink / raw)
  To: Midgy Balon
  Cc: Simon Xue, tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro,
	will, robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel

On 6/8/2026 9:45 AM, Chaoyi Chen wrote:
> Hi Midgy,
> 
> On 6/8/2026 5:05 AM, Midgy Balon wrote:
>> Hi Chaoyi,
>>
>>> As I said, it is v2. Could you please try using the code below instead and
>>> see if it works?
>>> [ auto_gate = read(RK_MMU_AUTO_GATING); auto_gate |= BIT(31); write(...) ]
>>
>> Thanks -- that's clearly the right shape (read-modify-write, before paging is
>> enabled, keeping the reset value instead of my clobbering 0x2).
>>
>> I rebuilt v7.1-rc6 (with the rocket RK3568 series + your per-device-ops work)
>> using your bit-31 version and tested it on a ROCK 3B: the NPU IOMMU comes up and
>> services the NPU's DMA cleanly -- the NPU probes, attaches its domain, and runs
>> repeated conv submissions with no DMA_READ_ERROR and no page-walk stall. No
>> regression from the write.
>>
>> To be precise about what I can and can't show: I tested both ways on v7.1-rc6 --
>> with your bit-31 write, and on the reset value (0x3) -- and the NPU IOMMU
>> services the NPU's reads with zero faults in both cases (no DMA_READ_ERROR, no
>> page-walk stall). So I don't have a failing baseline here that bit-31 visibly
>> fixes. Is the AUTO_GATING write needed on current mainline, or only under
>> conditions I'm not reproducing (a particular traffic pattern / silicon rev)?
>> I'll keep the patch in your form unless you'd prefer to drop it.
>>
>> One question so I document it correctly: what does bit 31 of RK_MMU_AUTO_GATING
>> control on the v2 block -- is it a master "disable internal auto clock-gating"
>> for the page-table walker (i.e. so a TLB-miss walk's AXI master keeps its clock
>> to completion)? The RK3568 TRM I have doesn't cover the IOMMU registers, so a
>> one-line description would let me write an accurate comment.
>>
> 
> Glad to hear this works. Please refer to the commits below.
> 
> [0]: https://github.com/rockchip-linux/kernel/commit/7f8158fb41b5cc8e738aaeebc3637c50ebd74cae
> [1]: https://github.com/rockchip-linux/kernel/commit/6a355e5f9a2069a2309e240791bc3aad63b7324e
> 

It looks like RGA needs this patch too, and it has already been merged :).

https://lore.kernel.org/all/20260428-spu-iommudtefix-v2-1-f592f579e508@pengutronix.de/

-- 
Best, 
Chaoyi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-08  1:40     ` Chaoyi Chen
@ 2026-06-08  8:05       ` Midgy Balon
  2026-06-08  9:14         ` Midgy Balon
  0 siblings, 1 reply; 26+ messages in thread
From: Midgy Balon @ 2026-06-08  8:05 UTC (permalink / raw)
  To: Chaoyi Chen
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao

Hello Chaoyi,

Thanks -- this is exactly what I needed.

- v2/DTE: will do. I'll keep building on Simon's per-device-ops series -- with
  that in place the NPU MMU can use the 32-bit-DTE ops (the per-ops GFP_DMA32
  that's already in mainline) without the global rk_ops conflict. I'll keep it
  as a stated dependency in the v4 cover letter.

- vdd_npu: I'll switch the RK3568 NPU power domain to need_regulator +
  domain-supply = <&vdd_npu> and drop the regulator-always-on workaround. I
  suspect that's also the right fix for the power-off/on de-idle issue I
  described -- the always-on was really just papering over the domain not
  being modelled with a regulator. I'll confirm on the board.

- AUTO_GATING: thanks for the commit references -- I'll keep the bit-31
  read-modify-write form with your Suggested-by and write the comment from
  those. For the record: on v7.1-rc6 the NPU MMU also completes translations
  on the reset value (I couldn't reproduce a page-walk stall without the
  write), so I'll note in the commit that it matches the vendor clock-gating
  handling rather than fixing a failure I can reproduce here -- happy to drop
  it if the iommu maintainers would prefer.

- PVTPLL/NoC: I'll follow up with Finley. First I'll check whether the
  need_regulator change resolves the NoC re-power de-idle on its own; if it
  still fails, I'll bring him the details (the genpd power-on de-idle ack and
  the BUS_IDLE_ST state).
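
To make the v2/DTE constraint concrete, here is a minimal sketch of the
address check involved. It is not taken from the driver: the helper names and
the example address are hypothetical, and it only encodes the two limits
mentioned in this thread (40-bit physical addresses for v2, 32 bits for the
DTE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a physical address is reachable with 40-bit PAs (v2 limit). */
static inline bool pa_fits_40bit(uint64_t pa)
{
	return pa < (UINT64_C(1) << 40);
}

/* True if a directory-table address fits the 32-bit DTE limit. */
static inline bool dte_fits_32bit(uint64_t dt_phys)
{
	return dt_phys <= UINT32_MAX;
}

/* Hypothetical directory table placed above 4 GiB on a large-RAM board. */
#define EXAMPLE_DT_ABOVE_4G UINT64_C(0x120000000)

static inline void dte_limit_selfcheck(void)
{
	/* Fine as a 40-bit PA, but too large for the DTE. */
	assert(pa_fits_40bit(EXAMPLE_DT_ABOVE_4G));
	assert(!dte_fits_32bit(EXAMPLE_DT_ABOVE_4G));

	/* A table below 4 GiB, as a GFP_DMA32 allocation would give. */
	assert(dte_fits_32bit(UINT64_C(0x7f000000)));
}
```

This is why the allocation has to be constrained for the directory table even
though the translated addresses themselves may sit above 4 GiB.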

I'll send a v4 with these. Thanks again for the quick, detailed answers.

Kind regards,
Midgy



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-08  8:05       ` Midgy Balon
@ 2026-06-08  9:14         ` Midgy Balon
  2026-06-08  9:38           ` Chaoyi Chen
  0 siblings, 1 reply; 26+ messages in thread
From: Midgy Balon @ 2026-06-08  9:14 UTC (permalink / raw)
  To: Chaoyi Chen
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao

Hello Chaoyi,

Following up on the need_regulator suggestion -- I implemented and tested it on
the board, and unfortunately it doesn't avoid the deadlock on RK3568; it moves
it from boot to the NPU job submit.

What I did: gave the RK3568 NPU power domain a regulator (a DOMAIN_M_R variant
with need_regulator = true), wired domain-supply = <&vdd_npu>, and dropped the
regulator-always-on workaround.

Boot is now clean and the NPU probes, but there is a warning during boot:

  rockchip-pm-domain ...: Failed to create device link (0x180) with supplier
  0-0020 for .../power-domain@6

(0-0020 is the rk809 PMIC that supplies vdd_npu.) Then on the first NPU job
submit the board hard-hangs with an RCU stall:

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu:     3-...!: (1 GPs behind) ...
  rcu: rcu_preempt kthread starved for 5115 jiffies! ... RCU_GP_WAIT_FQS(5)
  rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected

My reading: vdd_npu is on the rk809 *I2C* PMIC, so when genpd enables/disables
the regulator during the NPU's runtime-PM power transition, the I2C transfer
runs in a context that starves RCU and the box freezes. (I suspect
need_regulator is fine on the RK3588 NPU because its supply isn't behind an I2C
PMIC.) The always-on workaround avoids this precisely because genpd never
touches the I2C regulator in that path.

So: for an NPU domain whose supply is an I2C PMIC, is there a supported way to
let genpd own the regulator without performing the I2C op in the
power-transition path (a deferred/async regulator enable, or a flag), or should
RK3568 keep vdd_npu as regulator-always-on? For v4 I'll keep always-on unless
there's a cleaner path.


Thanks,
Midgy



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-08  9:14         ` Midgy Balon
@ 2026-06-08  9:38           ` Chaoyi Chen
  2026-06-09 11:11             ` Midgy Balon
  0 siblings, 1 reply; 26+ messages in thread
From: Chaoyi Chen @ 2026-06-08  9:38 UTC (permalink / raw)
  To: Midgy Balon
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao

Hi Midgy,

On 6/8/2026 5:14 PM, Midgy Balon wrote:
> Hello Chaoyi,
> 
> Following up on the need_regulator suggestion -- I implemented and tested it on
> the board, and unfortunately it doesn't avoid the deadlock on RK3568; it moves
> it from boot to the NPU job submit.
> 
> What I did: gave the RK3568 NPU power domain a regulator (a DOMAIN_M_R variant
> with need_regulator = true), wired domain-supply = <&vdd_npu>, and dropped the
> regulator-always-on workaround.
> 
> Boot is now clean and the NPU probes, but there is a warning during boot:
> 
>   rockchip-pm-domain ...: Failed to create device link (0x180) with supplier
>   0-0020 for .../power-domain@6
> 
> (0-0020 is the rk809 PMIC that supplies vdd_npu.) Then on the first NPU job
> submit the board hard-hangs with an RCU stall:
> 
>   rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>   rcu:     3-...!: (1 GPs behind) ...
>   rcu: rcu_preempt kthread starved for 5115 jiffies! ... RCU_GP_WAIT_FQS(5)
>   rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected
> 
> My reading: vdd_npu is on the rk809 *I2C* PMIC, so when genpd enables/disables
> the regulator during the NPU's runtime-PM power transition, the I2C transfer
> runs in a context that starves RCU and the box freezes. (I suspect
> need_regulator is fine on the RK3588 NPU because its supply isn't behind an I2C
> PMIC.) The always-on workaround avoids this precisely because genpd never
> touches the I2C regulator in that path.
>

No, they are all controlled by RK809.

And this looks weird. Is your rocket driver compiled as a module?
Please try compiling it as a module. When is the above error printed?
Please provide the complete boot log.

> So: for an NPU domain whose supply is an I2C PMIC, is there a supported way to
> let genpd own the regulator without performing the I2C op in the
> power-transition path (a deferred/async regulator enable, or a flag), or should
> RK3568 keep vdd_npu as regulator-always-on? For v4 I'll keep always-on unless
> there's a cleaner path.
> 

-- 
Best, 
Chaoyi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-08  9:38           ` Chaoyi Chen
@ 2026-06-09 11:11             ` Midgy Balon
  2026-06-10  1:14               ` Chaoyi Chen
  0 siblings, 1 reply; 26+ messages in thread
From: Midgy Balon @ 2026-06-09 11:11 UTC (permalink / raw)
  To: Chaoyi Chen
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao

[-- Attachment #1: Type: text/plain, Size: 4983 bytes --]

Hello Chaoyi,

You were right - building rocket as a module fixes it. Thanks for the pointer.

I rebuilt with CONFIG_DRM_ACCEL_ROCKET=m (everything else the same:
need_regulator on the RK3568 NPU power domain via a DOMAIN_M_R variant,
domain-supply = <&vdd_npu>, and the regulator-always-on workaround dropped).
The board now boots cleanly and, more importantly, an NPU job submit no longer
hangs: I ran the test workload five times with no RCU stall and no freeze.

So with rocket=m the need_regulator approach works on RK3568, and I'll keep it
for v4 (domain-supply + need_regulator, instead of marking vdd_npu always-on).
rocket=m is the normal configuration anyway; my earlier hang came from building
it =y in a self-contained image, so it probed in the initcalls (around 2 s) and
the genpd -> I2C-PMIC regulator transition ran before the system was ready. As
a module it loads from udev much later (~6.8 s here), after the I2C controller
and regulator core are fully up.

On your question of when the device-link error is printed - it is at
power-domain controller probe, not at the rocket probe:

  [    2.700618] vdd_npu: Bringing 500000uV into 825000-825000uV
  [    2.749637] rockchip-pm-domain fdd90000.power-management:power-controller:
                 Failed to create device link (0x180) with supplier 0-0020 for
                 /power-management@fdd90000/power-controller/power-domain@6
  [    2.945955] platform fde40000.npu: Adding to iommu group 3
  ...
  [    6.840374] rocket: loading out-of-tree module taints kernel.
  [    6.877647] [drm] Initialized rocket 0.0.0 for rknn on minor 0
  [    6.879950] rocket fde40000.npu: Rockchip NPU core 0 version: 0

So the device-link to the rk809 PMIC (0-0020) fails to form at ~2.75 s, well
before rocket loads at ~6.8 s. It is non-fatal here - the vdd_npu rail is
brought up by the regulator core and all jobs run - and there is no "failed to
get ack on domain npu" NoC warning this boot (the always-on kernel had one).
The complete boot log is attached.

Two notes / one question:
- This boot used fw_devlink=permissive on the command line. Is the "Failed to
  create device link ... supplier 0-0020" at pmdomain probe expected/benign, or
  is there a clean way to make it order correctly (so it also works without
  permissive, and a =y build wouldn't deadlock in the initcalls)?
- (The convolution output is still uniform zero-point / the job times out -
  that is the separate NPU compute-completion issue, unrelated to the
  power-domain work. Finley, that is the one I flagged earlier re PVTPLL/NoC.)
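
On the 0x180 in the device-link message: a small decoder for the flag word may
help when reading such logs. This is a sketch under an assumption, namely that
the DL_FLAG_* bit positions below match include/linux/device.h as I remember
them; please check them against the tree in use. The table and the function
name are hypothetical, not kernel API. If the positions are right, 0x180 is
SYNC_STATE_ONLY | INFERRED, which would be consistent with a link inferred by
fw_devlink in permissive mode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dl_flag_name {
	unsigned int bit;
	const char *name;
};

/* Assumed DL_FLAG_* bit positions (verify against include/linux/device.h). */
static const struct dl_flag_name dl_flag_names[] = {
	{ 0, "STATELESS" },
	{ 1, "AUTOREMOVE_CONSUMER" },
	{ 2, "PM_RUNTIME" },
	{ 3, "RPM_ACTIVE" },
	{ 4, "AUTOREMOVE_SUPPLIER" },
	{ 5, "AUTOPROBE_CONSUMER" },
	{ 6, "MANAGED" },
	{ 7, "SYNC_STATE_ONLY" },
	{ 8, "INFERRED" },
	{ 9, "CYCLE" },
};

/* Write the names of the set flags into buf, separated by '|'. */
static inline void dl_flags_decode(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < sizeof(dl_flag_names) / sizeof(dl_flag_names[0]); i++) {
		if (!(flags & (1u << dl_flag_names[i].bit)))
			continue;
		/* strncat always terminates; leave room for the NUL byte. */
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, dl_flag_names[i].name, len - strlen(buf) - 1);
	}
}
```

With a 64-byte buffer, dl_flags_decode(0x180, buf, sizeof(buf)) yields the two
names joined by '|'; unknown bits are silently ignored.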

Kind regards,
Midgy

Le lun. 8 juin 2026 à 11:38, Chaoyi Chen <chaoyi.chen@rock-chips.com> a écrit :
>
> Hi Midgy,
>
> On 6/8/2026 5:14 PM, Midgy Balon wrote:
> > Hello Chaoyi,
> >
> > Following up on the need_regulator suggestion -- I implemented and tested
> > it on the board, and unfortunately it doesn't avoid the deadlock on
> > RK3568; it moves it from boot to the NPU job submit.
> >
> > What I did: gave the RK3568 NPU power domain a regulator (a DOMAIN_M_R
> > variant with need_regulator = true), wired domain-supply = <&vdd_npu>,
> > and dropped the regulator-always-on workaround.
> >
> > Boot is now clean and the NPU probes, but there is a warning during boot:
> >
> >   rockchip-pm-domain ...: Failed to create device link (0x180) with supplier
> >   0-0020 for .../power-domain@6
> >
> > (0-0020 is the rk809 PMIC that supplies vdd_npu.) Then on the first NPU job
> > submit the board hard-hangs with an RCU stall:
> >
> >   rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> >   rcu:     3-...!: (1 GPs behind) ...
> >   rcu: rcu_preempt kthread starved for 5115 jiffies! ... RCU_GP_WAIT_FQS(5)
> >   rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected
> >
> > My reading: vdd_npu is on the rk809 *I2C* PMIC, so when genpd
> > enables/disables the regulator during the NPU's runtime-PM power
> > transition, the I2C transfer runs in a context that starves RCU and the
> > box freezes. (I suspect need_regulator is fine on the RK3588 NPU because
> > its supply isn't behind an I2C PMIC.) The always-on workaround avoids
> > this precisely because genpd never touches the I2C regulator in that
> > path.
> >
>
> No, they are all controlled by RK809.
>
> And this looks weird. Is your rocket driver compiled as a module?
> Please try compiling it as a module. When is the above error printed?
> Please provide the complete boot log.
>
> > So: for an NPU domain whose supply is an I2C PMIC, is there a supported
> > way to let genpd own the regulator without performing the I2C op in the
> > power-transition path (a deferred/async regulator enable, or a flag), or
> > should RK3568 keep vdd_npu as regulator-always-on? For v4 I'll keep
> > always-on unless there's a cleaner path.
> >
>
> --
> Best,
> Chaoyi

[-- Attachment #2: 2026-06-09_rocket-m-needreg.log --]
[-- Type: text/x-log, Size: 70113 bytes --]

sudo reboot
Password: 

Login incorrect
rock-3b login: radxa
Password: 
Linux rock-3b 6.19.0-rc5-00003-gb16c04e5e619-dirty #65 SMP PREEMPT Thu May 21 11:16:54 CEST 2026 aarch64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Mon Jun  8 14:14:44 CEST 2026 from 10.160.1.15 on pts/1
radxa@rock-3b:~$ [ 1800.807070] hdmi-audio-codec hdmi-audio-codec.11.auto: HDMI: Unknown ELD version 0
[ 1800.811885] hdmi-audio-codec hdmi-audio-codec.11.auto: HDMI: Unknown ELD version 0
[ 1800.813239] hdmi-audio-codec hdmi-audio-codec.11.auto: HDMI: Unknown ELD version 0
[ 1800.814408] hdmi-audio-codec hdmi-audio-codec.11.auto: HDMI: Unknown ELD version 0
[ 1800.815613] hdmi-audio-codec hdmi-audio-codec.11.auto: HDMI: Unknown ELD version 0
[ 1800.817175] hdmi-audio-codec hdmi-audio-codec.11.auto: HDMI: Unknown ELD version 0
[ 1800.821487] hdmi-audio-codec hdmi-audio-codec.11.auto: HDMI: Unknown ELD version 0
sudo reboot
         Stopping Session 42 of user radxa.
[  OK  ] Removed slice system-modprobe.slice.
[  OK  ] Removed slice system-ssh.slice.
[  OK  ] Stopped target Graphical Interface.
[  OK  ] Stopped target Multi-User System.
[  OK  ] Stopped target Login Prompts.
[  OK  ] Stopped target Remote Encrypted Volumes.
[  OK  ] Stopped target Sound Card.
[  OK  ] Stopped target Timers.
[  OK  ] Stopped Daily apt upgrade and clean activities.
[  OK  ] Stopped Daily apt download activities.
[  OK  ] Stopped Periodic ext4 Onli…ata Check for All Filesystems.
[  OK  ] Stopped Discard unused blocks once a week.
[  OK  ] Stopped Daily man-db regeneration.
[  OK  ] Stopped Daily Cleanup of Temporary Directories.
[  OK  ] Stopped target System Time Synchronized.
[  OK  ] Stopped target System Time Set.
[  OK  ] Stopped target Hardware activated USB gadget.
[  OK  ] Closed Load/Save RF Kill Switch Status /dev/rfkill Watch.
         Unmounting /config...
         Stopping Save/Restore Sound Card State...
         Stopping Avahi mDNS/DNS-SD Stack...
         Stopping Getty on tty1...
         Stopping Netdata, X-Ray Vi…on for your infrastructure!...
         Stopping Enable adbd on supported Radxa products...
         Stopping Enable USB Ethern…on supported Radxa products...
         Stopping Serial Getty on ttyS2...
         Stopping LSB: Set sysfs variables from /etc/sysfs.conf...
         Stopping Journal Service for Namespace netdata...
         Stopping Load/Save Random Seed...
         Stopping Tailscale node agent...
[  OK  ] Stopped Avahi mDNS/DNS-SD Stack.
[  OK  ] Stopped Getty on tty1.
[  OK  ] Stopped Serial Getty on ttyS2.
[  OK  ] Stopped Journal Service for Namespace netdata.
[  OK  ] Unmounted /config.
[  OK  ] Stopped Save/Restore Sound Card State.
[  OK  ] Stopped Load/Save Random Seed.
[  OK  ] Stopped Session 42 of user radxa.
[  OK  ] Stopped LSB: Set sysfs variables from /etc/sysfs.conf.
[  OK  ] Removed slice system-getty.slice.
[  OK  ] Removed slice system-serial\x2dgetty.slice.
         Stopping User Login Management...
         Stopping User Manager for UID 1000...
[  OK  ] Stopped User Manager for UID 1000.
[  OK  ] Stopped User Login Management.
         Stopping User Runtime Directory /run/user/1000...
[  OK  ] Unmounted /run/user/1000.
[  OK  ] Stopped User Runtime Directory /run/user/1000.
[  OK  ] Removed slice User Slice of UID 1000.
         Stopping Permit User Sessions...
[  OK  ] Stopped Permit User Sessions.
[  OK  ] Stopped target Remote File Systems.
[  OK  ] Stopped Tailscale node agent.
[  OK  ] Stopped Netdata, X-Ray Vision for your infrastructure!.
[  OK  ] Stopped target Network is Online.
[  OK  ] Stopped target Host and Network Name Lookups.
[  OK  ] Stopped Network Manager Wait Online.
[  OK  ] Stopped Wait for Network to be Configured.
[ 1809.095948] dwc3 fcc00000.usb: request 00000000d5d2d033 was not queued to ep0out
[ 1809.143743] unloading
[  OK  ] Stopped Enable USB Ethernet on supported Radxa products.
[  OK  ] Stopped Enable adbd on supported Radxa products.
[  OK  ] Stopped target Network.
         Stopping Network Manager...
         Stopping Raise network interfaces...
         Stopping Network Name Resolution...
         Stopping WPA supplicant...
[  OK  ] Stopped Raise network interfaces.
[  OK  ] Stopped Network Name Resolution.
         Stopping Network Service...
[  OK  ] Stopped Network Service.
[ 1809.276260] wlp1s0: deauthenticating from 38:ff:36:bb:04:ac by local choice (Reason: 3=DEAUTH_LEAVING)
[  OK  ] Stopped WPA supplicant.
[  OK  ] Stopped Network Manager.
[  OK  ] Stopped target Network (Pre).
         Stopping D-Bus System Message Bus...
[  OK  ] Stopped D-Bus System Message Bus.
[  OK  ] Stopped target Basic System.
[  OK  ] Stopped target Paths.
[  OK  ] Stopped target Slices.
[  OK  ] Removed slice User and Session Slice.
[  OK  ] Stopped target Sockets.
[  OK  ] Closed Avahi mDNS/DNS-SD Stack Activation Socket.
[  OK  ] Closed D-Bus System Message Bus Socket.
[  OK  ] Closed OpenBSD Secure Shell server socket.
[  OK  ] Closed Journal Varlink Socket for Namespace netdata.
[  OK  ] Removed slice system-syste…\x2djournald\x2dvarlink.slice.
[  OK  ] Closed Journal Socket for Namespace netdata.
[  OK  ] Removed slice system-systemd\x2djournald.slice.
[  OK  ] Stopped target System Initialization.
[  OK  ] Stopped target Local Encrypted Volumes.
[  OK  ] Stopped Dispatch Password …ts to Console Directory Watch.
[  OK  ] Stopped Forward Password R…uests to Wall Directory Watch.
[  OK  ] Stopped target Swap.
         Deactivating swap /mnt/nvme/swapfile...
[  OK  ] Stopped Apply Kernel Variables.
[  OK  ] Stopped Load Kernel Modules.
         Stopping Network Time Synchronization...
         Stopping Update UTMP about System Boot/Shutdown...
[  OK  ] Stopped Network Time Synchronization.
[  OK  ] Deactivated swap /mnt/nvme/swapfile.
         Unmounting /mnt/nvme...
[  OK  ] Stopped Update UTMP about System Boot/Shutdown.
[  OK  ] Stopped Create Volatile Files and Directories.
[  OK  ] Stopped target Local File Systems.
[  OK  ] Unset automount boot-efi.automount.
[  OK  ] Unset automount config.automount.
[ 1811.430689] EXT4-fs (nvme0n1): unmounting filesystem 0d9000fd-1edf-455d-9058-56e0855a1edb.
[  OK  ] Unmounted /mnt/nvme.
[  OK  ] Reached target Unmount All Filesystems.
[  OK  ] Stopped File System Check …d-1edf-455d-9058-56e0855a1edb.
[  OK  ] Removed slice system-systemd\x2dfsck.slice.
[  OK  ] Stopped target Local File Systems (Pre).
[  OK  ] Stopped Create Static Device Nodes in /dev.
[  OK  ] Stopped Create System Users.
[  OK  ] Stopped Remount Root and Kernel File Systems.
[  OK  ] Reached target Shutdown.
[  OK  ] Reached target Final Step.
[  OK  ] Finished Reboot.
[  OK  ] Reached target Reboot.
[ 1811.697989] watchdog: watchdog0: watchdog did not stop!
[ 1811.815097] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 1811.829482] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[ 1811.853568] systemd-journald[204]: Received SIGTERM from PID 1 (systemd-shutdow).
[ 1811.899622] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[ 1811.913025] systemd-shutdown[1]: Using hardware watchdog 'Synopsys DesignWare Watchdog', version 0, device /dev/watchdog
[ 1811.917532] systemd-shutdown[1]: Unmounting file systems.
[ 1811.920424] [11189]: Remounting '/' read-only in with options '(null)'.
[ 1811.983886] EXT4-fs (mmcblk0p3): re-mounted d210e617-6a4c-4771-b955-ddd835a32d2b ro.
[ 1812.006902] systemd-shutdown[1]: All filesystems unmounted.
[ 1812.007480] systemd-shutdown[1]: Deactivating swaps.
[ 1812.008275] systemd-shutdown[1]: All swaps deactivated.
[ 1812.008802] systemd-shutdown[1]: Detaching loop devices.
[ 1812.015813] systemd-shutdown[1]: All loop devices detached.
[ 1812.016332] systemd-shutdown[1]: Stopping MD devices.
[ 1812.017106] systemd-shutdown[1]: All MD devices stopped.
[ 1812.017591] systemd-shutdown[1]: Detaching DM devices.
[ 1812.018320] systemd-shutdown[1]: All DM devices detached.
[ 1812.018817] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
[ 1812.027209] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 1812.031042] systemd-shutdown[1]: Rebooting.
[ 1812.186088] kvm: exiting hardware virtualization
[ 1812.186550] reboot: Restarting system
DDR 03ea844c5d typ 24/09/03-10:42:57,fwver: v1.23
In
wdqs_if: 0x1010100
LP4/4x derate en, other dram:1x trefi
ddrconfig:7
MID:0xff
LPDDR4X, 324MHz
BW=32 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=16 Size=8192MB
tdqss_lf: cs0 dqs0: -24ps, dqs1: -96ps, dqs2: -48ps, dqs3: -168ps, 
tdqss_lf: cs1 dqs0: 24ps, dqs1: -72ps, dqs2: -48ps, dqs3: -144ps, 
tdqss_hf: cs0 dqs0: -24ps, dqs1: -96ps, dqs2: -48ps, dqs3: -168ps, 
tdqss_hf: cs1 dqs0: 24ps, dqs1: -72ps, dqs2: -48ps, dqs3: -144ps, 

change to: 324MHz
PHY drv:clk:36,ca:36,DQ:29,odt:240
vrefinner:16%, vrefout:41%
dram drv:40,odt:0
clk skew:0x62

change to: 528MHz
PHY drv:clk:36,ca:36,DQ:29,odt:240
vrefinner:16%, vrefout:41%
dram drv:40,odt:0
clk skew:0x58

change to: 780MHz
PHY drv:clk:36,ca:36,DQ:29,odt:60
vrefinner:16%, vrefout:41%
dram drv:40,odt:0
clk skew:0x58
rx vref: 14.6%
tx vref: 34.0%

change to: 1560MHz(final freq)
PHY drv:clk:36,ca:36,DQ:29,odt:60
vrefinner:16%, vrefout:22%
dram drv:40,odt:80
vref_ca:00000071
clk skew:0x26
rx vref: 15.6%
tx vref: 22.8%
cs 0:
rdtrn RS:
DQS0:0x30, DQS1:0x31, DQS2:0x33, DQS3:0x2c, 
min  : 0xd  0xe 0x10  0xe  0x1  0x2  0x8  0x5 , 0xa  0x7  0x2  0x3  0xc  0xb  0xd  0xa ,
      0x11  0xf  0xa  0xa  0x2  0x2  0x2  0x6 , 0xc  0x7  0x6  0x3 0x10 0x11  0xd 0x11 ,
mid  :0x26 0x26 0x29 0x27 0x1b 0x1c 0x22 0x1f ,0x24 0x21 0x1b 0x1c 0x26 0x25 0x26 0x25 ,
      0x2c 0x2a 0x24 0x24 0x1c 0x1b 0x1b 0x20 ,0x24 0x20 0x1e 0x1a 0x27 0x29 0x25 0x29 ,
max  :0x3f 0x3f 0x43 0x40 0x35 0x36 0x3c 0x39 ,0x3e 0x3c 0x35 0x36 0x41 0x3f 0x3f 0x40 ,
      0x47 0x45 0x3e 0x3e 0x36 0x35 0x34 0x3b ,0x3d 0x39 0x37 0x32 0x3f 0x41 0x3e 0x42 ,
range:0x32 0x31 0x33 0x32 0x34 0x34 0x34 0x34 ,0x34 0x35 0x33 0x33 0x35 0x34 0x32 0x36 ,
      0x36 0x36 0x34 0x34 0x34 0x33 0x32 0x35 ,0x31 0x32 0x31 0x2f 0x2f 0x30 0x31 0x31 ,
wrtrn RS:
DQS0:0x22, DQS1:0x13, DQS2:0x1d, DQS3:0x5, 
min  :0x76 0x79 0x7c 0x78 0x6c 0x6f 0x73 0x72 0x72 ,0x63 0x5f 0x59 0x59 0x65 0x63 0x64 0x62 0x5e ,
      0x6a 0x6a 0x66 0x64 0x5d 0x5c 0x5c 0x60 0x63 ,0x55 0x51 0x51 0x4c 0x58 0x5a 0x55 0x5b 0x53 ,
mid  :0x91 0x93 0x95 0x92 0x84 0x88 0x8c 0x8b 0x8a ,0x7c 0x79 0x73 0x73 0x7d 0x7b 0x7c 0x7b 0x77 ,
      0x85 0x84 0x7f 0x7e 0x76 0x75 0x75 0x7a 0x7b ,0x70 0x6b 0x6b 0x66 0x73 0x73 0x6f 0x75 0x6c ,
max  :0xac 0xae 0xaf 0xad 0x9c 0xa1 0xa5 0xa5 0xa2 ,0x96 0x94 0x8e 0x8d 0x96 0x94 0x95 0x94 0x90 ,
      0xa1 0x9f 0x98 0x98 0x90 0x8e 0x8f 0x94 0x94 ,0x8b 0x85 0x86 0x80 0x8f 0x8d 0x89 0x8f 0x86 ,
range:0x36 0x35 0x33 0x35 0x30 0x32 0x32 0x33 0x30 ,0x33 0x35 0x35 0x34 0x31 0x31 0x31 0x32 0x32 ,
      0x37 0x35 0x32 0x34 0x33 0x32 0x33 0x34 0x31 ,0x36 0x34 0x35 0x34 0x37 0x33 0x34 0x34 0x33 ,
cs 1:
rdtrn RS:
DQS0:0x30, DQS1:0x31, DQS2:0x33, DQS3:0x2c, 
min  : 0xd  0xe 0x10  0xe  0x1  0x2  0x8  0x5 , 0xa  0x7  0x2  0x3  0xc  0xb  0xd  0xa ,
      0x11  0xf  0xa  0xa  0x2  0x2  0x2  0x6 , 0xc  0x7  0x6  0x3 0x10 0x11  0xd 0x11 ,
mid  :0x26 0x26 0x29 0x27 0x1b 0x1c 0x22 0x1f ,0x24 0x21 0x1b 0x1c 0x26 0x25 0x26 0x25 ,
      0x2c 0x2a 0x24 0x24 0x1c 0x1b 0x1b 0x20 ,0x24 0x20 0x1e 0x1a 0x27 0x29 0x25 0x29 ,
max  :0x3f 0x3f 0x43 0x40 0x35 0x36 0x3c 0x39 ,0x3e 0x3c 0x35 0x36 0x41 0x3f 0x3f 0x40 ,
      0x47 0x45 0x3e 0x3e 0x36 0x35 0x34 0x3b ,0x3d 0x39 0x37 0x32 0x3f 0x41 0x3e 0x42 ,
range:0x32 0x31 0x33 0x32 0x34 0x34 0x34 0x34 ,0x34 0x35 0x33 0x33 0x35 0x34 0x32 0x36 ,
      0x36 0x36 0x34 0x34 0x34 0x33 0x32 0x35 ,0x31 0x32 0x31 0x2f 0x2f 0x30 0x31 0x31 ,
wrtrn RS:
DQS0:0x22, DQS1:0x13, DQS2:0x1d, DQS3:0x5, 
min  :0x76 0x79 0x7c 0x78 0x6c 0x6f 0x73 0x72 0x72 ,0x63 0x5f 0x59 0x59 0x65 0x63 0x64 0x62 0x5e ,
      0x6a 0x6a 0x66 0x64 0x5d 0x5c 0x5c 0x60 0x63 ,0x55 0x51 0x51 0x4c 0x58 0x5a 0x55 0x5b 0x53 ,
mid  :0x91 0x93 0x95 0x92 0x84 0x88 0x8c 0x8b 0x8a ,0x7c 0x79 0x73 0x73 0x7d 0x7b 0x7c 0x7b 0x77 ,
      0x85 0x84 0x7f 0x7e 0x76 0x75 0x75 0x7a 0x7b ,0x70 0x6b 0x6b 0x66 0x73 0x73 0x6f 0x75 0x6c ,
max  :0xac 0xae 0xaf 0xad 0x9c 0xa1 0xa5 0xa5 0xa2 ,0x96 0x94 0x8e 0x8d 0x96 0x94 0x95 0x94 0x90 ,
      0xa1 0x9f 0x98 0x98 0x90 0x8e 0x8f 0x94 0x94 ,0x8b 0x85 0x86 0x80 0x8f 0x8d 0x89 0x8f 0x86 ,
range:0x36 0x35 0x33 0x35 0x30 0x32 0x32 0x33 0x30 ,0x33 0x35 0x35 0x34 0x31 0x31 0x31 0x32 0x32 ,
      0x37 0x35 0x32 0x34 0x33 0x32 0x33 0x34 0x31 ,0x36 0x34 0x35 0x34 0x37 0x33 0x34 0x34 0x33 ,
CBT RS:
cs:0 min  :0x43 0x39 0x38 0x2d 0x36 0x27 0x3c ,0x44 0x36 0x37 0x2a 0x35 0x2a 0x3d ,
cs:0 mid  :0x7d 0x7d 0x71 0x71 0x70 0x6c 0x69 ,0x7d 0x7a 0x70 0x6d 0x6e 0x6e 0x69 ,
cs:0 max  :0xb7 0xc1 0xab 0xb5 0xab 0xb1 0x97 ,0xb7 0xbf 0xa9 0xb0 0xa7 0xb2 0x96 ,
cs:0 range:0x74 0x88 0x73 0x88 0x75 0x8a 0x5b ,0x73 0x89 0x72 0x86 0x72 0x88 0x59 ,
cs:1 min  :0x42 0x3e 0x39 0x33 0x38 0x2f 0x40 ,0x43 0x3c 0x35 0x2f 0x37 0x30 0x3f ,
cs:1 mid  :0x7f 0x7f 0x75 0x72 0x74 0x6e 0x6e ,0x7f 0x7c 0x72 0x6e 0x73 0x6f 0x6e ,
cs:1 max  :0xbd 0xc0 0xb2 0xb2 0xb0 0xad 0x9c ,0xbc 0xbc 0xb0 0xae 0xaf 0xae 0x9d ,
cs:1 range:0x7b 0x82 0x79 0x7f 0x78 0x7e 0x5c ,0x79 0x80 0x7b 0x7f 0x78 0x7e 0x5e ,
out

<debug_uart>
dmc
pinctrl
serial@fe660000

U-Boot SPL 2025.07-rc4-dirty (Mar 25 2026 - 16:29:49 +0100)
mmc@fe310000
clock-controller@fdd20000
clock-controller@fdd00000
mmc@fe2b0000
Trying to boot from MMC1
## Checking hash(es) for config config-1 ... OK
## Checking hash(es) for Image atf-1 ... sha256+ OK
## Checking hash(es) for Image u-boot ... sha256+ OK
## Checking hash(es) for Image fdt-1 ... sha256+ OK
## Checking hash(es) for Image atf-2 ... sha256+ OK
## Checking hash(es) for Image atf-3 ... sha256+ OK
NOTICE:  BL31: v2.14.0(release):8dae086
NOTICE:  BL31: Built : 14:26:06, Mar 25 2026
NOTICE:  BL31: Rockchip release version: v1.0

<debug_uart>
A
B
C
D
E
F
G
H
pinctrl
serial@fe660000


U-Boot 2025.07-rc4-dirty (Mar 25 2026 - 16:29:49 +0100)

clock-controller@fdd20000
clock-controller@fdd00000
Model: Radxa ROCK 3B
nvmem@fe38c000
SoC:   RK3568
I
DRAM:  dmc
J
8 GiB (total 7.7 GiB)
io-domains
clock-controller@fdd00000
pinctrl
i2c@fdd40000
clock-controller@fdd20000
pmic@20
PMIC:  RK809 (on=0x02, off=0x00)
LDO_REG6
LDO_REG4
LDO_REG5
DCDC_REG5
SWITCH_REG1
DCDC_REG1
DCDC_REG2
DCDC_REG3
LDO_REG2
LDO_REG3
LDO_REG7
LDO_REG8
SWITCH_REG2
led-0
gpio@fdd60000
serial@fe660000
Core:  340 devices, 33 uclasses, devicetree: separate
MMC:   mmc@fe310000
mmc@fe2b0000
mmc@fe2b0000: 1, mmc@fe310000: 0
Loading Environment from nowhere... OK
In:    serial@fe660000
Out:   serial@fe660000
Err:   serial@fe660000
Model: Radxa ROCK 3B
nvmem@fe38c000
SoC:   RK3568
saradc@fe720000
reset
Net:   ethernet@fe010000
gpio@fe760000
ethernet@fe2a0000
eth1: ethernet@fe010000, eth0: ethernet@fe2a0000
Hit any key to stop autoboot:  0
bootstd
Scanning for bootflows in all bootdevs
Seq  Method       State   Uclass    Part  Name                      Filename
---  -----------  ------  --------  ----  ------------------------  ----------------
vbe_simple
vbe_simple
Scanning global bootmeth 'efi_mgr':
mmc@fe2b0000.blk
Card did not respond to voltage select! : -110
mmc@fe310000.blk
rng@fe388000
psci
  0  efi_mgr      ready   (none)       0  <NULL>                    
** Booting bootflow '<NULL>' with efi_mgr
Loading Boot0000 'mmc 0' failed
EFI boot manager: Cannot load any image
Boot failed (err=-14)
Scanning bootdev 'mmc@fe2b0000.bootdev':
mmc@fe2b0000.blk
Card did not respond to voltage select! : -110
Scanning bootdev 'mmc@fe310000.bootdev':
  1  extlinux     ready   mmc          3  mmc@fe310000.bootdev.part /boot/extlinux/extlinux.conf
** Booting bootflow 'mmc@fe310000.bootdev.part_3' with extlinux
U-Boot menu
1:	Mainline 6.19 NPU IOMMU (default)
2:	Mainline 6.19 NPU non-IOMMU (fallback)
3:	Mainline 6.19 Rocket accel driver (test)
4:	Mainline 7.1-rc6 Rocket NPU (test)
5:	Mainline 7.1-rc6 Rocket NPU + Chaoyi AUTO_GATING BIT(31) (test)
6:	Mainline 7.1-rc6 NPU need_regulator + rocket=m (test)
Enter choice: 6
6:	Mainline 7.1-rc6 NPU need_regulator + rocket=m (test)
Retrieving file: /boot/Image-7.1-needreg-m
Retrieving file: /boot/initrd.img-7.1.0-rc6-00007-g043be7a551c4
append: root=UUID=d210e617-6a4c-4771-b955-ddd835a32d2b rw rootwait earlycon console=ttyFIQ0,1500000n8 console=ttyS2,1500000n8 clk_ignore_unused cma=128M kernel.panic=5 fw_devlink=permissive
Retrieving file: /boot/rk3568-rock-3b-7.1-needreg-m.dtb
## Flattened Device Tree blob at 12000000
   Booting using the fdt blob at 0x12000000
Working FDT set to 12000000
   Loading Ramdisk to eb2c0000, end ecead483 ... OK
   Loading Device Tree to 00000000eb2ac000, end 00000000eb2bf1e3 ... OK
Working FDT set to eb2ac000

Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x412fd050]
[    0.000000] Linux version 7.1.0-rc6-chaoyi-00011-ga31e2e6fae27 (radxa@rock-3b) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #5 SMP PREEMPT Tue Jun  9 12:40:22 CEST 2026
[    0.000000] KASLR enabled
[    0.000000] Machine model: Radxa ROCK 3B
[    0.000000] efi: UEFI not found.
[    0.000000] earlycon: uart0 at MMIO32 0x00000000fe660000 (options '1500000n8')
[    0.000000] printk: legacy bootconsole [uart0] enabled
[    0.000000] OF: reserved mem: 0x000000000010f000..0x000000000010f0ff (0 KiB) nomap non-reusable shmem@10f000
[    0.000000] NUMA: Faking a node at [mem 0x0000000000200000-0x00000001ffffffff]
[    0.000000] NODE_DATA(0) allocated [mem 0x1ff01df80-0x1ff02067f]
[    0.000000] cma: Reserved 128 MiB at 0x00000000e3200000
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv1.1 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: MIGRATE_INFO_TYPE not supported.
[    0.000000] psci: SMC Calling Convention v1.5
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000200000-0x00000000ffffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   [mem 0x0000000100000000-0x00000001ffffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000200000-0x00000000efffffff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x00000001ffffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000001ffffffff]
[    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
[    0.000000] percpu: Embedded 26 pages/cpu s67288 r8192 d31016 u106496
[    0.000000] Detected VIPT I-cache on CPU0
[    0.000000] CPU features: detected: GICv3 CPU interface
[    0.000000] CPU features: detected: Virtualization Host Extensions
[    0.000000] CPU features: kernel page table isolation forced ON by KASLR
[    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[    0.000000] CPU features: detected: ARM errata 1165522, 1319367, or 1530923
[    0.000000] alternatives: applying boot alternatives
[    0.000000] Kernel command line: root=UUID=d210e617-6a4c-4771-b955-ddd835a32d2b rw rootwait earlycon console=ttyFIQ0,1500000n8 console=ttyS2,1500000n8 clk_ignore_unused cma=128M kernel.panic=5 fw_devlink=permissive
[    0.000000] printk: log buffer data + meta data: 131072 + 458752 = 589824 bytes
[    0.000000] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[    0.000000] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[    0.000000] software IO TLB: area num 4.
[    0.000000] software IO TLB: mapped [mem 0x00000000df200000-0x00000000e3200000] (64MB)
[    0.000000] Fallback order for Node 0: 0 
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 2031104
[    0.000000] Policy zone: Normal
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu: 	RCU event tracing is enabled.
[    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=4.
[    0.000000] 	Trampoline variant of Tasks RCU enabled.
[    0.000000] 	Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1 rcu_task_cpu_ids=4.
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] GIC: enabling workaround for GICv3: non-coherent attribute
[    0.000000] GICv3: GIC: Using split EOI/Deactivate mode
[    0.000000] GICv3: 320 SPIs implemented
[    0.000000] GICv3: 0 Extended SPIs implemented
[    0.000000] GICv3: MBI range [296:319]
[    0.000000] GICv3: Using MBI frame 0x00000000fd410000
[    0.000000] Root IRQ handler: gic_handle_irq
[    0.000000] GICv3: GICv3 features: 16 PPIs
[    0.000000] GICv3: GICD_CTLR.DS=0, SCR_EL3.FIQ=0
[    0.000000] GICv3: CPU0: found redistributor 0 region 0:0x00000000fd460000
[    0.000000] ITS [mem 0xfd440000-0xfd45ffff]
[    0.000000] GIC: enabling workaround for ITS: Rockchip erratum RK3568002
[    0.000000] GIC: enabling workaround for ITS: non-coherent attribute
[    0.000000] ITS@0x00000000fd440000: allocated 8192 Devices @410000 (indirect, esz 8, psz 64K, shr 0)
[    0.000000] ITS@0x00000000fd440000: allocated 32768 Interrupt Collections @420000 (flat, esz 2, psz 64K, shr 0)
[    0.000000] ITS: using cache flushing for cmd queue
[    0.000000] GICv3: using LPI property table @0x0000000000430000
[    0.000000] GIC: using cache flushing for LPI property table
[    0.000000] GICv3: CPU0: using allocated LPI pending table @0x0000000000440000
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] arch_timer: cp15 timer running at 24.00MHz (phys).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
[    0.000001] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns
[    0.004345] Console: colour dummy device 80x25
[    0.004930] Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=96000)
[    0.005946] pid_max: default: 32768 minimum: 301
[    0.006823] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.007612] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.008917] VFS: Finished mounting rootfs on nullfs
[    0.012680] rcu: Hierarchical SRCU implementation.
[    0.013171] rcu: 	Max phase no-delay instances is 1000.
[    0.014123] Timer migration: 1 hierarchy levels; 8 children per group; 1 crossnode level
[    0.015214] fsl-mc MSI: msi-controller@fd440000 domain created
[    0.020987] EFI services will not be available.
[    0.021934] smp: Bringing up secondary CPUs ...
[    0.023391] Detected VIPT I-cache on CPU1
[    0.023553] GICv3: CPU1: found redistributor 100 region 0:0x00000000fd480000
[    0.023577] GICv3: CPU1: using allocated LPI pending table @0x0000000000450000
[    0.023640] CPU1: Booted secondary processor 0x0000000100 [0x412fd050]
[    0.024763] Detected VIPT I-cache on CPU2
[    0.024907] GICv3: CPU2: found redistributor 200 region 0:0x00000000fd4a0000
[    0.024929] GICv3: CPU2: using allocated LPI pending table @0x0000000000460000
[    0.024978] CPU2: Booted secondary processor 0x0000000200 [0x412fd050]
[    0.026138] Detected VIPT I-cache on CPU3
[    0.026279] GICv3: CPU3: found redistributor 300 region 0:0x00000000fd4c0000
[    0.026303] GICv3: CPU3: using allocated LPI pending table @0x0000000000470000
[    0.026352] CPU3: Booted secondary processor 0x0000000300 [0x412fd050]
[    0.026541] smp: Brought up 1 node, 4 CPUs
[    0.034241] SMP: Total of 4 processors activated.
[    0.034710] CPU: All CPU(s) started at EL2
[    0.035116] CPU features: detected: 32-bit EL0 Support
[    0.035620] CPU features: detected: 32-bit EL1 Support
[    0.036127] CPU features: detected: Data cache clean to the PoU not required for I/D coherence
[    0.036969] CPU features: detected: Common not Private translations
[    0.037582] CPU features: detected: CRC32 instructions
[    0.038125] CPU features: detected: RCpc load-acquire (LDAPR)
[    0.038694] CPU features: detected: LSE atomic instructions
[    0.039244] CPU features: detected: Privileged Access Never
[    0.039791] CPU features: detected: PMUv3
[    0.040186] CPU features: detected: RAS Extension Support
[    0.040719] CPU features: detected: XNX
[    0.041102] CPU features: detected: Speculative Store Bypassing Safe (SSBS)
[    0.041855] alternatives: applying system-wide alternatives
[    0.047418] CPU features: detected: ICV_DIR_EL1 trapping
[    0.048557] Memory: 7683552K/8124416K available (20736K kernel code, 5018K rwdata, 14920K rodata, 12480K init, 724K bss, 305152K reserved, 131072K cma-reserved)
[    0.053188] devtmpfs: initialized
[    0.076194] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.077193] posixtimers hash table entries: 2048 (order: 3, 32768 bytes, linear)
[    0.077991] futex hash table entries: 1024 (65536 bytes on 1 NUMA nodes, total 64 KiB, linear).
[    0.080642] 2G module region forced by RANDOMIZE_MODULE_REGION_FULL
[    0.081302] 0 pages in range for non-PLT usage
[    0.081310] 510752 pages in range for PLT usage
[    0.087203] DMI: not present or invalid.
[    0.092012] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.094182] DMA: preallocated 1024 KiB GFP_KERNEL pool for atomic allocations
[    0.095295] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[    0.096163] audit: initializing netlink subsys (disabled)
[    0.096954] audit: type=2000 audit(0.092:1): state=initialized audit_enabled=0 res=1
[    0.102019] thermal_sys: Registered thermal governor 'step_wise'
[    0.102040] thermal_sys: Registered thermal governor 'power_allocator'
[    0.102800] cpuidle: using governor menu
[    0.104380] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.105274] ASID allocator initialised with 32768 entries
[    0.111040] Serial: AMBA PL011 UART driver
[    0.140432] /vop@fe040000: Fixed dependency cycle(s) with /hdmi@fe0a0000
[    0.141206] /hdmi@fe0a0000: Fixed dependency cycle(s) with /vop@fe040000
[    0.157642] /pcie@fe260000: Fixed dependency cycle(s) with /pcie@fe260000/legacy-interrupt-controller
[    0.178928] rockchip-gpio fdd60000.gpio: probed /pinctrl/gpio@fdd60000
[    0.180729] rockchip-gpio fe740000.gpio: probed /pinctrl/gpio@fe740000
[    0.182380] rockchip-gpio fe750000.gpio: probed /pinctrl/gpio@fe750000
[    0.184237] rockchip-gpio fe760000.gpio: probed /pinctrl/gpio@fe760000
[    0.185880] rockchip-gpio fe770000.gpio: probed /pinctrl/gpio@fe770000
[    0.192768] /pcie@fe280000: Fixed dependency cycle(s) with /pcie@fe280000/legacy-interrupt-controller
[    0.196426] /hdmi@fe0a0000: Fixed dependency cycle(s) with /hdmi-con
[    0.197174] /hdmi-con: Fixed dependency cycle(s) with /hdmi@fe0a0000
[    0.207741] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
[    0.208425] HugeTLB: 0 KiB vmemmap can be freed for a 1.00 GiB page
[    0.209043] HugeTLB: registered 32.0 MiB page size, pre-allocated 0 pages
[    0.209708] HugeTLB: 0 KiB vmemmap can be freed for a 32.0 MiB page
[    0.210324] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[    0.210986] HugeTLB: 0 KiB vmemmap can be freed for a 2.00 MiB page
[    0.211638] HugeTLB: registered 64.0 KiB page size, pre-allocated 0 pages
[    0.212304] HugeTLB: 0 KiB vmemmap can be freed for a 64.0 KiB page
[    0.216832] ACPI: Interpreter disabled.
[    0.226115] iommu: Default domain type: Translated
[    0.226612] iommu: DMA domain TLB invalidation policy: strict mode
[    0.229120] SCSI subsystem initialized
[    0.230219] usbcore: registered new interface driver usbfs
[    0.230824] usbcore: registered new interface driver hub
[    0.231419] usbcore: registered new device driver usb
[    0.234769] pps_core: LinuxPPS API ver. 1 registered
[    0.235266] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.236205] PTP clock support registered
[    0.236952] EDAC MC: Ver: 3.0.0
[    0.238218] scmi_core: SCMI protocol bus registered
[    0.241592] FPGA manager framework
[    0.244246] vgaarb: loaded
[    0.245545] clocksource: Switched to clocksource arch_sys_counter
[    0.246553] VFS: Disk quotas dquot_6.6.0
[    0.246977] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.248448] pnp: PnP ACPI: disabled
[    0.262673] NET: Registered PF_INET protocol family
[    0.263673] IP idents hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.271846] tcp_listen_portaddr_hash hash table entries: 4096 (order: 4, 65536 bytes, linear)
[    0.272763] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.273581] TCP established hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.274889] TCP bind hash table entries: 65536 (order: 9, 2097152 bytes, linear)
[    0.277519] TCP: Hash tables configured (established 65536 bind 65536)
[    0.278369] UDP hash table entries: 4096 (order: 6, 262144 bytes, linear)
[    0.279544] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.280868] RPC: Registered named UNIX socket transport module.
[    0.281459] RPC: Registered udp transport module.
[    0.281967] RPC: Registered tcp transport module.
[    0.282433] RPC: Registered tcp-with-tls transport module.
[    0.282973] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.285066] PCI: CLS 0 bytes, default 64
[    0.285751] Unpacking initramfs...
[    0.294394] kvm [1]: nv: 570 coarse grained trap handlers
[    0.295328] kvm [1]: nv: 710 fine grained trap handlers
[    0.296530] kvm [1]: IPA Size Limit: 40 bits
[    0.296999] kvm [1]: GICv3: no GICV resource entry
[    0.297475] kvm [1]: disabling GICv2 emulation
[    0.297989] kvm [1]: GIC system register CPU interface enabled
[    0.298599] kvm [1]: vgic interrupt IRQ9
[    0.299039] kvm [1]: VHE mode initialized successfully
[    0.302206] Initialise system trusted keyrings
[    0.303030] workingset: timestamp_bits=42 (anon: 38) max_order=21 bucket_order=0 (anon: 0)
[    0.304530] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.305699] NFS: Registering the id_resolver key type
[    0.306250] Key type id_resolver registered
[    0.306668] Key type id_legacy registered
[    0.307105] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[    0.307767] nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver Registering...
[    0.308845] 9p: Installing v9fs 9p2000 file system support
[    0.378356] Key type asymmetric registered
[    0.378791] Asymmetric key parser 'x509' registered
[    0.379447] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 244)
[    0.380179] io scheduler mq-deadline registered
[    0.380628] io scheduler kyber registered
[    0.381094] io scheduler bfq registered
[    2.017986] Freeing initrd memory: 28596K
[    2.025771] ledtrig-cpu: registered to indicate activity on CPUs
[    2.036309] phy phy-fe8c0000.phy.3: lane number 0, val 1
[    2.036934] rockchip-dw-pcie 3c0800000.pcie: host bridge /pcie@fe280000 ranges:
[    2.037737] rockchip-dw-pcie 3c0800000.pcie:       IO 0x00f0100000..0x00f01fffff -> 0x00f0100000
[    2.038623] rockchip-dw-pcie 3c0800000.pcie:      MEM 0x00f0200000..0x00f1ffffff -> 0x00f0200000
[    2.039516] rockchip-dw-pcie 3c0800000.pcie:      MEM 0x0380000000..0x03bfffffff -> 0x0380000000
[    2.048377] rockchip-dw-pcie 3c0800000.pcie: iATU: unroll T, 8 ob, 8 ib, align 64K, limit 8G
[    2.349576] rockchip-dw-pcie 3c0800000.pcie: PCIe Gen.3 x2 link up
[    2.350587] rockchip-dw-pcie 3c0800000.pcie: PCI host bridge to bus 0002:20
[    2.351289] pci_bus 0002:20: root bus resource [bus 20-2f]
[    2.351840] pci_bus 0002:20: root bus resource [io  0x0000-0xfffff] (bus address [0xf0100000-0xf01fffff])
[    2.352778] pci_bus 0002:20: root bus resource [mem 0xf0200000-0xf1ffffff]
[    2.353456] pci_bus 0002:20: root bus resource [mem 0x380000000-0x3bfffffff]
[    2.354237] pci 0002:20:00.0: [1d87:3566] type 01 class 0x060400 PCIe Root Port
[    2.354983] pci 0002:20:00.0: ROM [mem 0x00000000-0x0000ffff pref]
[    2.355597] pci 0002:20:00.0: PCI bridge to [bus 01-ff]
[    2.356120] pci 0002:20:00.0:   bridge window [io  0x0000-0x0fff]
[    2.356722] pci 0002:20:00.0:   bridge window [mem 0x00000000-0x000fffff]
[    2.357397] pci 0002:20:00.0:   bridge window [mem 0x00000000-0x000fffff 64bit pref]
[    2.358247] pci 0002:20:00.0: supports D1 D2
[    2.358677] pci 0002:20:00.0: PME# supported from D0 D1 D3hot
[    2.364773] pci 0002:20:00.0: Primary bus is hard wired to 0
[    2.365345] pci 0002:20:00.0: bridge configuration invalid ([bus 01-ff]), reconfiguring
[    2.366452] pci 0002:21:00.0: [10ec:5765] type 00 class 0x010802 PCIe Endpoint
[    2.367380] pci 0002:21:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit]
[    2.368045] pci 0002:21:00.0: BAR 5 [mem 0x00000000-0x00001fff]
[    2.369416] pci 0002:21:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0002:20:00.0 (capable of 31.504 Gb/s with 8.0 GT/s PCIe x4 link)
[    2.377603] pci 0002:21:00.0: ASPM: default states L1
[    2.378208] pci_bus 0002:21: busn_res: [bus 21-2f] end is updated to 21
[    2.378898] pci 0002:20:00.0: bridge window [mem 0xf0200000-0xf02fffff]: assigned
[    2.379643] pci 0002:20:00.0: ROM [mem 0xf0300000-0xf030ffff pref]: assigned
[    2.380348] pci 0002:21:00.0: BAR 0 [mem 0xf0200000-0xf0203fff 64bit]: assigned
[    2.381111] pci 0002:21:00.0: BAR 5 [mem 0xf0204000-0xf0205fff]: assigned
[    2.381824] pci 0002:20:00.0: PCI bridge to [bus 21]
[    2.382325] pci 0002:20:00.0:   bridge window [mem 0xf0200000-0xf02fffff]
[    2.383002] pci_bus 0002:20: resource 4 [io  0x0000-0xfffff]
[    2.383562] pci_bus 0002:20: resource 5 [mem 0xf0200000-0xf1ffffff]
[    2.384181] pci_bus 0002:20: resource 6 [mem 0x380000000-0x3bfffffff]
[    2.384817] pci_bus 0002:21: resource 1 [mem 0xf0200000-0xf02fffff]
[    2.390163] pcieport 0002:20:00.0: PME: Signaling with IRQ 31
[    2.391304] pcieport 0002:20:00.0: AER: enabled with IRQ 32
[    2.461045] dma-pl330 fe530000.dma-controller: Loaded driver for PL330 DMAC-241330
[    2.461857] dma-pl330 fe530000.dma-controller: 	DBUFF-128x8bytes Num_Chans-8 Num_Peri-32 Num_Events-16
[    2.465804] dma-pl330 fe550000.dma-controller: Loaded driver for PL330 DMAC-241330
[    2.466560] dma-pl330 fe550000.dma-controller: 	DBUFF-128x8bytes Num_Chans-8 Num_Peri-32 Num_Events-16
[    2.506402] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    2.512775] printk: legacy console [ttyS2] disabled
[    2.513873] fe660000.serial: ttyS2 at MMIO 0xfe660000 (irq = 37, base_baud = 1500000) is a 16550A
[    2.514882] printk: legacy console [ttyS2] enabled
[    2.515784] printk: legacy bootconsole [uart0] disabled
[    2.523685] msm_serial: driver initialized
[    2.525677] SuperH (H)SCI(F) driver initialized
[    2.526792] STM32 USART driver initialized
[    2.538263] random: crng init done
[    2.538643] platform fdea0000.video-codec: Adding to iommu group 0
[    2.541147] platform fdee0000.video-codec: Adding to iommu group 1
[    2.543525] platform fe040000.vop: Adding to iommu group 2
[    2.545348] Error: Driver 'efi-framebuffer' is already registered, aborting...
[    2.556173] loop: module loaded
[    2.560241] megasas: 07.734.00.00-rc1
[    2.563304] nvme nvme0: pci function 0002:21:00.0
[    2.563774] nvme 0002:21:00.0: enabling device (0000 -> 0002)
[    2.581994] tun: Universal TUN/TAP device driver, 1.6
[    2.585959] thunder_xcv, ver 1.0
[    2.586353] thunder_bgx, ver 1.0
[    2.586717] nicpf, ver 1.0
[    2.591410] e1000: Intel(R) PRO/1000 Network Driver
[    2.591864] e1000: Copyright (c) 1999-2006 Intel Corporation.
[    2.592455] e1000e: Intel(R) PRO/1000 Network Driver
[    2.592904] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[    2.593517] igb: Intel(R) Gigabit Ethernet Network Driver
[    2.594043] igb: Copyright (c) 2007-2014 Intel Corporation.
[    2.594615] igbvf: Intel(R) Gigabit Virtual Function Network Driver
[    2.595175] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
[    2.596815] sky2: driver version 1.30
[    2.601784] rk_gmac-dwmac fe010000.ethernet: IRQ sfty not found
[    2.604039] rk_gmac-dwmac fe2a0000.ethernet: IRQ sfty not found
[    2.606627] usbcore: registered new device driver r8152-cfgselector
[    2.607267] usbcore: registered new interface driver r8152
[    2.607825] usbcore: registered new interface driver asix
[    2.608408] usbcore: registered new interface driver ax88179_178a
[    2.609582] VFIO - User Level meta-driver version: 0.3
[    2.624316] ehci-platform fd800000.usb: EHCI Host Controller
[    2.624342] ohci-platform fd840000.usb: Generic Platform OHCI controller
[    2.624869] ehci-platform fd800000.usb: new USB bus registered, assigned bus number 1
[    2.625452] ohci-platform fd840000.usb: new USB bus registered, assigned bus number 2
[    2.625481] usbcore: registered new interface driver usb-storage
[    2.626353] ehci-platform fd800000.usb: irq 46, io mem 0xfd800000
[    2.627002] ohci-platform fd840000.usb: irq 47, io mem 0xfd840000
[    2.636410] i2c_dev: i2c /dev entries driver
[    2.637620] ehci-platform fd800000.usb: USB 2.0 started, EHCI 1.00
[    2.639265] hub 1-0:1.0: USB hub found
[    2.639666] hub 1-0:1.0: 1 port detected
[    2.643281] fan53555-regulator 0-001c: FAN53555 Option[12] Rev[15] Detected!
[    2.686641] hub 2-0:1.0: USB hub found
[    2.687081] hub 2-0:1.0: 1 port detected
[    2.700618] vdd_npu: Bringing 500000uV into 825000-825000uV
[    2.701202] nvme nvme0: passthrough uses implicit buffer lengths
[    2.709720] vdda0v9_image: Bringing 600000uV into 900000-900000uV
[    2.714524] nvme nvme0: allocated 64 MiB host memory buffer (16 segments).
[    2.737292] vcca1v8_image: Bringing 600000uV into 1800000-1800000uV
[    2.749637] rockchip-pm-domain fdd90000.power-management:power-controller: Failed to create device link (0x180) with supplier 0-0020 for /power-management@fdd90000/power-controller/power-domain@6
[    2.750502] nvme nvme0: 4/0/0 default/read/poll queues
[    2.773746] nvme nvme0: Ignoring bogus Namespace Identifiers
[    2.788100] dwmmc_rockchip fe2b0000.mmc: IDMAC supports 32-bit address mode.
[    2.788831] dwmmc_rockchip fe2b0000.mmc: Using internal DMA controller.
[    2.789432] dwmmc_rockchip fe2b0000.mmc: Version ID is 270a
[    2.790037] dwmmc_rockchip fe2b0000.mmc: DW MMC controller at irq 80,32 bit host data width,256 deep fifo
[    2.802482] arm-scmi arm-scmi.6.auto: Using scmi_smc_transport
[    2.803032] arm-scmi arm-scmi.6.auto: SCMI max-rx-timeout: 30ms / max-msg-size: 104bytes / max-msg: 20
[    2.804109] scmi_protocol scmi_dev.1: Enabled polling mode TX channel - prot_id:16
[    2.805142] arm-scmi arm-scmi.6.auto: SCMI Notifications - Core Enabled.
[    2.805740] mmc_host mmc1: Bus speed = 375000Hz (req 400000Hz, actual 375000HZ div = 0)
[    2.806552] arm-scmi arm-scmi.6.auto: SCMI Protocol v2.0 'rockchip:' Firmware version 0x0
[    2.807391] arm-scmi arm-scmi.6.auto: Enabling SCMI Quirk [quirk_clock_rates_triplet_out_of_spec]
[    2.810214] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
[    2.815909] usbcore: registered new interface driver usbhid
[    2.816421] usbhid: USB HID core driver
[    2.825563] mmc0: SDHCI controller on fe310000.mmc [fe310000.mmc] using ADMA
[    2.829265] hw perfevents: enabled with armv8_cortex_a55 PMU driver, 7 (0,8000003f) counters available
[    2.849585] NET: Registered PF_INET6 protocol family
[    2.851218] Segment Routing with IPv6
[    2.851605] In-situ OAM (IOAM) with IPv6
[    2.852047] NET: Registered PF_PACKET protocol family
[    2.852619] 9pnet: Installing 9P2000 support
[    2.853103] Key type dns_resolver registered
[    2.878282] registered taskstats version 1
[    2.878959] Loading compiled-in X.509 certificates
[    2.885591] usb 1-1: new high-speed USB device number 2 using ehci-platform
[    2.889469] sdhci-dwcmshc fe310000.mmc: Can't reduce the clock below 52MHz in HS200/HS400 mode
[    2.890328] sdhci-dwcmshc fe310000.mmc: Can't reduce the clock below 52MHz in HS200/HS400 mode
[    2.891109] sdhci-dwcmshc fe310000.mmc: Can't reduce the clock below 52MHz in HS200/HS400 mode
[    2.892871] mmc0: new HS200 MMC card at address 0001
[    2.894399] mmcblk0: mmc0:0001 BJTD4R 29.1 GiB
[    2.894880] Demotion targets for Node 0: null
[    2.899582]  mmcblk0: p1 p2 p3
[    2.901085] mmcblk0boot0: mmc0:0001 BJTD4R 4.00 MiB
[    2.904012] mmcblk0boot1: mmc0:0001 BJTD4R 4.00 MiB
[    2.907113] mmcblk0rpmb: mmc0:0001 BJTD4R 4.00 MiB, chardev (509:0)
[    2.945955] platform fde40000.npu: Adding to iommu group 3
[    2.950953] rk_gmac-dwmac fe010000.ethernet: IRQ sfty not found
[    2.952440] rk_gmac-dwmac fe010000.ethernet: clock input or output? (input).
[    2.953079] rk_gmac-dwmac fe010000.ethernet: Can not read property: tx_delay.
[    2.953754] rk_gmac-dwmac fe010000.ethernet: set tx_delay to 0x30
[    2.954304] rk_gmac-dwmac fe010000.ethernet: Can not read property: rx_delay.
[    2.954936] rk_gmac-dwmac fe010000.ethernet: set rx_delay to 0x10
[    2.955485] rk_gmac-dwmac fe010000.ethernet: integrated PHY? (no).
[    2.956079] rk_gmac-dwmac fe010000.ethernet: clock input from PHY
[    2.961643] rk_gmac-dwmac fe010000.ethernet: init for RGMII_ID
[    2.962534] rk_gmac-dwmac fe010000.ethernet: User ID: 0x30, Synopsys ID: 0x51
[    2.963181] rk_gmac-dwmac fe010000.ethernet: 	DWMAC4/5
[    2.963645] rk_gmac-dwmac fe010000.ethernet: DMA HW capability register supported
[    2.964309] rk_gmac-dwmac fe010000.ethernet: Active PHY interface: RGMII (1)
[    2.964933] rk_gmac-dwmac fe010000.ethernet: RX Checksum Offload Engine supported
[    2.965638] rk_gmac-dwmac fe010000.ethernet: TX Checksum insertion supported
[    2.966267] rk_gmac-dwmac fe010000.ethernet: Wake-Up On Lan supported
[    2.966926] rk_gmac-dwmac fe010000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[    2.967641] rk_gmac-dwmac fe010000.ethernet: Enabled RFS Flow TC (entries=10)
[    2.968277] rk_gmac-dwmac fe010000.ethernet: TSO supported
[    2.968766] rk_gmac-dwmac fe010000.ethernet: TSO feature enabled
[    2.969301] rk_gmac-dwmac fe010000.ethernet: Using 32/32 bits DMA host/device width
[    3.035000] hub 1-1:1.0: USB hub found
[    3.035550] hub 1-1:1.0: 4 ports detected
[    3.073444] rk_gmac-dwmac fe2a0000.ethernet: IRQ sfty not found
[    3.075209] rk_gmac-dwmac fe2a0000.ethernet: clock input or output? (input).
[    3.075852] rk_gmac-dwmac fe2a0000.ethernet: Can not read property: tx_delay.
[    3.076485] rk_gmac-dwmac fe2a0000.ethernet: set tx_delay to 0x30
[    3.077027] rk_gmac-dwmac fe2a0000.ethernet: Can not read property: rx_delay.
[    3.077695] rk_gmac-dwmac fe2a0000.ethernet: set rx_delay to 0x10
[    3.078252] rk_gmac-dwmac fe2a0000.ethernet: integrated PHY? (no).
[    3.078851] rk_gmac-dwmac fe2a0000.ethernet: clock input from PHY
[    3.084415] rk_gmac-dwmac fe2a0000.ethernet: init for RGMII_ID
[    3.085289] rk_gmac-dwmac fe2a0000.ethernet: User ID: 0x30, Synopsys ID: 0x51
[    3.085968] rk_gmac-dwmac fe2a0000.ethernet: 	DWMAC4/5
[    3.086436] rk_gmac-dwmac fe2a0000.ethernet: DMA HW capability register supported
[    3.087099] rk_gmac-dwmac fe2a0000.ethernet: Active PHY interface: RGMII (1)
[    3.087724] rk_gmac-dwmac fe2a0000.ethernet: RX Checksum Offload Engine supported
[    3.088385] rk_gmac-dwmac fe2a0000.ethernet: TX Checksum insertion supported
[    3.089008] rk_gmac-dwmac fe2a0000.ethernet: Wake-Up On Lan supported
[    3.089683] rk_gmac-dwmac fe2a0000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[    3.090398] rk_gmac-dwmac fe2a0000.ethernet: Enabled RFS Flow TC (entries=10)
[    3.091034] rk_gmac-dwmac fe2a0000.ethernet: TSO supported
[    3.091521] rk_gmac-dwmac fe2a0000.ethernet: TSO feature enabled
[    3.092056] rk_gmac-dwmac fe2a0000.ethernet: Using 32/32 bits DMA host/device width
[    3.228427] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    3.244306] Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    3.245444] Loaded X.509 cert 'wens: 61c038651aabdcf94bd0ac7ff06c7248db18c600'
[    3.246270] faux_driver regulatory: Direct firmware load for regulatory.db failed with error -2
[    3.246879] clk: Not disabling unused clocks
[    3.247042] cfg80211: failed to load regulatory.db
[    3.247425] PM: genpd: Disabling unused power domains
[    3.248577] dw-apb-uart fe660000.serial: forbid DMA for kernel console
[    3.254849] Freeing unused kernel memory: 12480K
[    3.255399] Run /init as init process
Loading, please wait...
Starting version 247.3-7+deb11u4
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.36.1
[/sbin/fsck.ext4 (1) -- /dev/mmcblk0p3] fsck.ext4 -a -C0 /dev/mmcblk0p3 
rootfs: clean, 297753/1855392 files, 4946612/7548923 blocks
done.
[    4.081211] EXT4-fs (mmcblk0p3): mounted filesystem d210e617-6a4c-4771-b955-ddd835a32d2b r/w with ordered data mode. Quota mode: none.
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[    4.488149] systemd[1]: System time before build time, advancing clock.
[    4.545777] systemd[1]: systemd 247.3-7+deb11u4 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified)
[    4.549900] systemd[1]: Detected architecture arm64.

Welcome to ^[[1mDebian GNU/Linux 11 (bullseye)^[[0m!

[    4.563093] systemd[1]: Set hostname to <rock-3b>.
[    4.712335] block mmcblk0: the capability attribute has been deprecated.
[    5.090660] systemd[1]: Queued start job for default target Graphical Interface.
[    5.131927] systemd[1]: Created slice system-getty.slice.
[^[[0;32m  OK  ^[[0m] Created slice ^[[0;1;39msystem-getty.slice^[[0m.
[    5.134792] systemd[1]: Created slice system-modprobe.slice.
[^[[0;32m  OK  ^[[0m] Created slice ^[[0;1;39msystem-modprobe.slice^[[0m.
[    5.143177] systemd[1]: Created slice system-serial\x2dgetty.slice.
[^[[0;32m  OK  ^[[0m] Created slice ^[[0;1;39msystem-serial\x2dgetty.slice^[[0m.
[    5.151765] systemd[1]: Created slice system-systemd\x2dfsck.slice.
[^[[0;32m  OK  ^[[0m] Created slice ^[[0;1;39msystem-systemd\x2dfsck.slice^[[0m.
[    5.163903] systemd[1]: Created slice system-systemd\x2djournald.slice.
[^[[0;32m  OK  ^[[0m] Created slice ^[[0;1;39msystem-systemd\x2djournald.slice^[[0m.
[    5.175785] systemd[1]: Created slice system-systemd\x2djournald\x2dvarlink.slice.
[^[[0;32m  OK  ^[[0m] Created slice ^[[0;1;39msystem-syste…\x2djournald\x2dvarlink.slice^[[0m.
[    5.183362] systemd[1]: Created slice User and Session Slice.
[^[[0;32m  OK  ^[[0m] Created slice ^[[0;1;39mUser and Session Slice^[[0m.
[    5.190135] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mDispatch Password …ts to Console Directory Watch^[[0m.
[    5.202212] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mForward Password R…uests to Wall Directory Watch^[[0m.
[    5.210105] systemd[1]: Condition check resulted in Arbitrary Executable File Formats File System Automount Point being skipped.
[    5.211439] systemd[1]: Reached target Local Encrypted Volumes.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mLocal Encrypted Volumes^[[0m.
[    5.222078] systemd[1]: Reached target Network (Pre).
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mNetwork (Pre)^[[0m.
[    5.229939] systemd[1]: Reached target Paths.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mPaths^[[0m.
[    5.238057] systemd[1]: Reached target Remote Encrypted Volumes.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mRemote Encrypted Volumes^[[0m.
[    5.245910] systemd[1]: Reached target Remote File Systems.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mRemote File Systems^[[0m.
[    5.247592] systemd[1]: Reached target Slices.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mSlices^[[0m.
[    5.250927] systemd[1]: Listening on fsck to fsckd communication Socket.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mfsck to fsckd communication Socket^[[0m.
[    5.258649] systemd[1]: Listening on initctl Compatibility Named Pipe.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39minitctl Compatibility Named Pipe^[[0m.
[    5.267782] systemd[1]: Listening on Journal Audit Socket.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mJournal Audit Socket^[[0m.
[    5.270864] systemd[1]: systemd-journald-dev-log.socket: SO_PASSSEC failed: Operation not supported
[    5.271966] systemd[1]: Listening on Journal Socket (/dev/log).
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mJournal Socket (/dev/log)^[[0m.
[    5.274918] systemd[1]: systemd-journald.socket: SO_PASSSEC failed: Operation not supported
[    5.276098] systemd[1]: systemd-journald.socket: SO_PASSSEC failed: Operation not supported
[    5.277020] systemd[1]: Listening on Journal Socket.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mJournal Socket^[[0m.
[    5.280707] systemd[1]: Listening on Network Service Netlink Socket.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mNetwork Service Netlink Socket^[[0m.
[    5.291657] systemd[1]: Listening on udev Control Socket.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mudev Control Socket^[[0m.
[    5.294591] systemd[1]: Listening on udev Kernel Socket.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mudev Kernel Socket^[[0m.
[    5.321932] systemd[1]: Mounting Huge Pages File System...
         Mounting ^[[0;1;39mHuge Pages File System^[[0m...
[    5.334172] systemd[1]: Mounting POSIX Message Queue File System...
         Mounting ^[[0;1;39mPOSIX Message Queue File System^[[0m...
[    5.339943] systemd[1]: Mounting Kernel Debug File System...
         Mounting ^[[0;1;39mKernel Debug File System^[[0m...
[    5.346080] systemd[1]: Condition check resulted in Kernel Trace File System being skipped.
[    5.351151] systemd[1]: Starting Wait for network to be configured by ifupdown...
         Starting ^[[0;1;39mWait for network to be configured by ifupdown^[[0m...
[    5.357911] systemd[1]: Condition check resulted in Create list of static device nodes for the current kernel being skipped.
[    5.363081] systemd[1]: Starting Load Kernel Module configfs...
         Starting ^[[0;1;39mLoad Kernel Module configfs^[[0m...
[    5.368299] systemd[1]: Starting Load Kernel Module drm...
         Starting ^[[0;1;39mLoad Kernel Module drm^[[0m...
[    5.376882] systemd[1]: Starting Load Kernel Module fuse...
         Starting ^[[0;1;39mLoad Kernel Module fuse^[[0m...
[    5.386967] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
[    5.387937] systemd[1]: Condition check resulted in File System Check on Root Device being skipped.
[    5.398865] systemd[1]: Starting Journal Service...
         Starting ^[[0;1;39mJournal Service^[[0m...
[    5.407719] systemd[1]: Starting Load Kernel Modules...
         Starting ^[[0;1;39mLoad Kernel Modules^[[0m...
[    5.412414] systemd[1]: Starting Remount Root and Kernel File Systems...
         Starting ^[[0;1;39mRemount Root and Kernel File Systems^[[0m...
[    5.429850] systemd[1]: Starting Coldplug All udev Devices...
         Starting ^[[0;1;39mColdplug All udev Devices^[[0m...
[    5.447220] systemd[1]: Mounted Huge Pages File System.
[^[[0;32m  OK  ^[[0m] Mounted ^[[0;1;39mHuge Pages File System^[[0m.
[    5.449061] systemd[1]: Mounted POSIX Message Queue File System.
[^[[0;32m  OK  ^[[0m] Mounted ^[[0;1;39mPOSIX Message Queue File System^[[0m.
[    5.453688] EXT4-fs (mmcblk0p3): re-mounted d210e617-6a4c-4771-b955-ddd835a32d2b.
[    5.459654] systemd[1]: Mounted Kernel Debug File System.
[^[[0;32m  OK  ^[[0m] Mounted ^[[0;1;39mKernel Debug File System^[[0m.
[    5.467229] systemd[1]: Finished Wait for network to be configured by ifupdown.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mWait for network to be configured by ifupdown^[[0m.
[    5.474892] systemd[1]: modprobe@configfs.service: Succeeded.
[    5.476453] systemd[1]: Finished Load Kernel Module configfs.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mLoad Kernel Module configfs^[[0m.
[    5.486629] systemd[1]: modprobe@drm.service: Succeeded.
[    5.488053] systemd[1]: Finished Load Kernel Module drm.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mLoad Kernel Module drm^[[0m.
[    5.498681] systemd[1]: modprobe@fuse.service: Succeeded.
[    5.500190] systemd[1]: Finished Load Kernel Module fuse.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mLoad Kernel Module fuse^[[0m.
[    5.511197] systemd[1]: Finished Load Kernel Modules.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mLoad Kernel Modules^[[0m.
[    5.523376] systemd[1]: Finished Remount Root and Kernel File Systems.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mRemount Root and Kernel File Systems^[[0m.
[    5.534599] systemd[1]: Condition check resulted in FUSE Control File System being skipped.
[    5.550113] systemd[1]: Mounting Kernel Configuration File System...
         Mounting ^[[0;1;39mKernel Configuration File System^[[0m...
[    5.556704] systemd[1]: Condition check resulted in Rebuild Hardware Database being skipped.
[    5.557696] systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.
[    5.561425] systemd[1]: Starting Load/Save Random Seed...
         Starting ^[[0;1;39mLoad/Save Random Seed^[[0m...
[    5.572806] systemd[1]: Starting Apply Kernel Variables...
         Starting ^[[0;1;39mApply Kernel Variables^[[0m...
[    5.584810] systemd[1]: Starting Create System Users...
         Starting ^[[0;1;39mCreate System Users^[[0m...
[    5.596109] systemd[1]: Started Journal Service.
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mJournal Service^[[0m.
[^[[0;32m  OK  ^[[0m] Mounted ^[[0;1;39mKernel Configuration File System^[[0m.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mLoad/Save Random Seed^[[0m.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mApply Kernel Variables^[[0m.
         Starting ^[[0;1;39mFlush Journal to Persistent Storage^[[0m...
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mCreate System Users^[[0m.
[    5.637155] systemd-journald[230]: Received client request to flush runtime journal.
[    5.640079] systemd-journald[230]: File /var/log/journal/f26be486655e4e559a1282889eb20124/system.journal corrupted or uncleanly shut down, renaming and replacing.
         Starting ^[[0;1;39mCreate Static Device Nodes in /dev^[[0m...
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mCreate Static Device Nodes in /dev^[[0m.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mLocal File Systems (Pre)^[[0m.
[^[[0;32m  OK  ^[[0m] Set up automount ^[[0;1;39mboot-efi.automount^[[0m.
[^[[0;32m  OK  ^[[0m] Set up automount ^[[0;1;39mconfig.automount^[[0m.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mLocal File Systems^[[0m.
         Starting ^[[0;1;39mRule-based Manage…for Device Events and Files^[[0m...
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mFlush Journal to Persistent Storage^[[0m.
         Starting ^[[0;1;39mCreate Volatile Files and Directories^[[0m...
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mRule-based Manager for Device Events and Files^[[0m.
         Starting ^[[0;1;39mNetwork Service^[[0m...
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mCreate Volatile Files and Directories^[[0m.
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mEntropy Daemon based on the HAVEGE algorithm^[[0m.
         Starting ^[[0;1;39mNetwork Time Synchronization^[[0m...
         Starting ^[[0;1;39mUpdate UTMP about System Boot/Shutdown^[[0m...
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mColdplug All udev Devices^[[0m.
         Starting ^[[0;1;39mHelper to synchronize boot up for ifupdown^[[0m...
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mHelper to synchronize boot up for ifupdown^[[0m.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mUpdate UTMP about System Boot/Shutdown^[[0m.
         Starting ^[[0;1;39mRaise network interfaces^[[0m...
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mNetwork Service^[[0m.
         Starting ^[[0;1;39mWait for Network to be Configured^[[0m...
         Starting ^[[0;1;39mNetwork Name Resolution^[[0m...
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mNetwork Time Synchronization^[[0m.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mSystem Time Set^[[0m.
[^[[0;32m  OK  ^[[0m] Reached target ^[[0;1;39mSystem Time Synchronized^[[0m.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mRaise network interfaces^[[0m.
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mNetwork Name Resolution^[[0m.
[    6.840374] rocket: loading out-of-tree module taints kernel.
[    6.877647] [drm] Initialized rocket 0.0.0 for rknn on minor 0
[    6.879950] rocket fde40000.npu: Rockchip NPU core 0 version: 0
[^[[0;32m  OK  ^[[0m] Found device ^[[0;1;39mEDILOCA EN605 512GB^[[0m.
         Starting ^[[0;1;39mFile System Check…1edf-455d-9058-56e0855a1edb^[[0m...
[^[[0;32m  OK  ^[[0m] Started ^[[0;1;39mFile System Check Daemon to report status^[[0m.
[^[[0;32m  OK  ^[[0m] Listening on ^[[0;1;39mLoad/Save RF …itch Status /dev/rfkill Watch^[[0m.
[^[[0;32m  OK  ^[[0m] Finished ^[[0;1;39mFile System Check…d-1edf-455d-9058-56e0855a1edb^[[0m.
[^[[0;32m  OK  ^[[0m] Found device ^[[0;1;39m/dev/ttyS2^[[0m.
         Mounting ^[[0;1;39m/mnt/nvme^[[0m...
[    7.209346] EXT4-fs (nvme0n1): mounted filesystem 0d9000fd-1edf-455d-9058-56e0855a1edb r/w with ordered data mode. Quota mode: none.
[^[[0;32m  OK  ^[[0m] Mounted ^[[0;1;39m/mnt/nvme^[[0m.
         Activating swap ^[[0;1;39m/mnt/nvme/swapfile^[[0m...
[    7.236341] Adding 10485756k swap on /mnt/nvme/swapfile.  Priority:-1 extents:10 across:19374076k SS
[  OK  ] Activated swap /mnt/nvme/swapfile.
[  OK  ] Reached target Swap.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Periodic ext4 Onli…ata Check for All Filesystems.
[  OK  ] Started Discard unused blocks once a week.
[  OK  ] Started Daily man-db regeneration.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on Avahi mDNS/DNS-SD Stack Activation Socket.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on OpenBSD Secure Shell server socket.
[  OK  ] Listening on Journal Varli… Socket for Namespace netdata.
[  OK  ] Listening on Journal Socket for Namespace netdata.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting Avahi mDNS/DNS-SD Stack...
[  OK  ] Started D-Bus System Message Bus.
         Starting Network Manager...
         Starting Remove Stale Onli…t4 Metadata Check Snapshots...
[  OK  ] Started rsetup configuration service.
         Starting LSB: Set sysfs variables from /etc/sysfs.conf...
         Starting User Login Management...
         Starting WPA supplicant...
         Starting Linux zramswap setup...
[  OK  ] Started Avahi mDNS/DNS-SD Stack.
         Mounting /config...
[    7.640448] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[FAILED] Failed to start Linux zramswap setup.
See 'systemctl status zramswap.service' for details.
[  OK  ] Mounted /config.
[  OK  ] Started WPA supplicant.
[  OK  ] Started LSB: Set sysfs variables from /etc/sysfs.conf.
[  OK  ] Started User Login Management.
[  OK  ] Finished Remove Stale Onli…ext4 Metadata Check Snapshots.
[  OK  ] Started Network Manager.
[  OK  ] Reached target Network.
         Starting Network Manager Wait Online...
         Starting dnsmasq - A light…DHCP and caching DNS server...
[  OK  ] Started Enable adbd on supported Radxa products.
[  OK  ] Started Enable USB Ethernet on supported Radxa products.
         Starting Permit User Sessions...
         Starting Tailscale node agent...
[  OK  ] Finished Permit User Sessions.
[  OK  ] Started Getty on tty1.
[  OK  ] Started Serial Getty on ttyS2.
[  OK  ] Reached target Login Prompts.
         Starting Hostname Service...
[FAILED] Failed to start dnsmasq - …t DHCP and caching DNS server.
See 'systemctl status dnsmasq.service' for details.
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Started Hostname Service.
         Starting Network Manager Script Dispatcher Service...
[    8.192053] rk_gmac-dwmac fe010000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
[  OK  ] Started Network Manager Script Dispatcher Service.
[    8.221724] rk_gmac-dwmac fe010000.ethernet eth0: PHY [stmmac-1:01] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[    8.224669] dwmac4: Master AXI performs any burst length
[    8.225174] rk_gmac-dwmac fe010000.ethernet eth0: No Safety Features support found
[    8.227524] rk_gmac-dwmac fe010000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[    8.229192] rk_gmac-dwmac fe010000.ethernet eth0: registered PTP clock
[    8.229923] rk_gmac-dwmac fe010000.ethernet eth0: configuring for phy/rgmii-id link mode
[    8.284289] rk_gmac-dwmac fe2a0000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0
[  OK  ] Started Tailscale node agent.
[    8.313571] rk_gmac-dwmac fe2a0000.ethernet eth1: PHY [stmmac-0:01] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[    8.325571] dwmac4: Master AXI performs any burst length
[    8.326067] rk_gmac-dwmac fe2a0000.ethernet eth1: No Safety Features support found
[    8.327175] rk_gmac-dwmac fe2a0000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
[    8.328451] rk_gmac-dwmac fe2a0000.ethernet eth1: registered PTP clock
[    8.329047] rk_gmac-dwmac fe2a0000.ethernet eth1: configuring for phy/rgmii-id link mode
[  OK  ] Finished Wait for Network to be Configured.
[   12.359432] rk_gmac-dwmac fe010000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

Debian GNU/Linux 11 rock-3b ttyS2

rock-3b login: [   17.158640] platform sdio-pwrseq: deferred probe pending: pwrseq_simple: reset control not ready
[   17.159481] platform 3c0000000.pcie: deferred probe pending: rockchip-dw-pcie: failed to initialize the phy
[   17.160361] platform fcc00000.usb: deferred probe pending: dwc3: failed to initialize core
[   17.161108] platform fd000000.usb: deferred probe pending: dwc3: failed to initialize core
[   17.161993] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fd000000.usb
[   17.162945] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fcc00000.usb
[   17.163884] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fde60000.gpu
[   17.164821] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fdea0000.video-codec
[   17.165877] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fdeb0000.rga
[   17.166824] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fdee0000.video-codec
[   17.167821] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fe040000.vop
[   17.168757] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to fe0a0000.hdmi
[   17.169795] rockchip-pm-domain fdd90000.power-management:power-controller: sync_state() pending due to 3c0000000.pcie

rock-3b login: radxa
Password: 
Linux rock-3b 7.1.0-rc6-chaoyi-00011-ga31e2e6fae27 #5 SMP PREEMPT Tue Jun  9 12:40:22 CEST 2026 aarch64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Tue Jun  9 12:55:51 CEST 2026 on ttyS2
radxa@rock-3b:~$ lsmod | grep rocket
rocket                 24576  0
radxa@rock-3b:~$ sudo dmesg -C; python3 ~/npu-debug/teflon_test.py
Input: [ 1 80 80 16] <class 'numpy.uint8'>
Output: [  1  40  40 128] <class 'numpy.uint8'>
[  136.899509] rocket fde40000.npu: NPU job timed out
Run 0: elapsed=529.6ms out_sum=26214400
[  137.443518] rocket fde40000.npu: NPU job timed out
Run 1: elapsed=538.0ms out_sum=26214400
[  137.987451] rocket fde40000.npu: NPU job timed out
Run 2: elapsed=543.3ms out_sum=26214400
[  138.531447] rocket fde40000.npu: NPU job timed out
Run 3: elapsed=540.3ms out_sum=26214400
[  139.075418] rocket fde40000.npu: NPU job timed out
Run 4: elapsed=541.1ms out_sum=26214400
Done
radxa@rock-3b:~$


* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-09 11:11             ` Midgy Balon
@ 2026-06-10  1:14               ` Chaoyi Chen
  2026-06-10 10:05                 ` Diederik de Haas
  0 siblings, 1 reply; 26+ messages in thread
From: Chaoyi Chen @ 2026-06-10  1:14 UTC (permalink / raw)
  To: Midgy Balon
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao

Hi Midgy,

On 6/9/2026 7:11 PM, Midgy Balon wrote:
> Hello Chaoyi,
> 
> You were right - building rocket as a module fixes it. Thanks for the pointer.
> 
> I rebuilt with CONFIG_DRM_ACCEL_ROCKET=m (everything else the same:
> need_regulator on the RK3568 NPU power domain via a DOMAIN_M_R variant,
> domain-supply = <&vdd_npu>, and the regulator-always-on workaround
> dropped). The board now boots cleanly and, more importantly, an NPU job
> submit no longer hangs: I ran the test workload five times with no RCU
> stall and no freeze.
>
> So with rocket=m the need_regulator approach works on RK3568, and I'll
> keep it for v4 (domain-supply + need_regulator, instead of marking
> vdd_npu always-on). rocket=m is the normal configuration anyway; my
> earlier hang came from building it =y in a self-contained image, so it
> probed in the initcalls (around 2 s) and the genpd -> I2C-PMIC regulator
> transition ran before the system was ready. As a module it loads from
> udev much later (~6.8 s here), after the I2C controller and regulator
> core are fully up.
>
> On your question of when the device-link error is printed - it is at
> power-domain controller probe, not at the rocket probe:
> 
>   [    2.700618] vdd_npu: Bringing 500000uV into 825000-825000uV
>   [    2.749637] rockchip-pm-domain fdd90000.power-management:power-controller:
>                  Failed to create device link (0x180) with supplier 0-0020 for
>                  /power-management@fdd90000/power-controller/power-domain@6
>   [    2.945955] platform fde40000.npu: Adding to iommu group 3
>   ...
>   [    6.840374] rocket: loading out-of-tree module taints kernel.
>   [    6.877647] [drm] Initialized rocket 0.0.0 for rknn on minor 0
>   [    6.879950] rocket fde40000.npu: Rockchip NPU core 0 version: 0
> 
> So the device-link to the rk809 PMIC (0-0020) fails to form at ~2.75 s,
> well before rocket loads at ~6.8 s. It is non-fatal here - the vdd_npu
> rail is brought up by the regulator core and all jobs run - and there is
> no "failed to get ack on domain npu" NoC warning this boot (the
> always-on kernel had one). The complete boot log is attached.
>
> Two notes / one question:
> - This boot used fw_devlink=permissive on the command line. Is the
>   "Failed to create device link ... supplier 0-0020" at pmdomain probe
>   expected/benign, or is there a clean way to make it order correctly
>   (so it also works without permissive, and a =y build wouldn't deadlock
>   in the initcalls)?

We encountered the same issue on the RK3588 NPU before, and at the time
it was resolved with the following patch:

https://lore.kernel.org/all/20251216055247.13150-1-rmxpzlb@gmail.com/

Please compare the differences in NPU pmdomain and DTS configuration
between the RK3568 and RK3588.

> - (The convolution output is still uniform zero-point / the job times
>   out - that is the separate NPU compute-completion issue, unrelated to
>   the power-domain work. Finley, that is the one I flagged earlier re
>   PVTPLL/NoC.)
> 
> Kind regards,
> Midgy
> 

-- 
Best, 
Chaoyi



* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-10  1:14               ` Chaoyi Chen
@ 2026-06-10 10:05                 ` Diederik de Haas
  2026-06-10 13:38                   ` Midgy Balon
  0 siblings, 1 reply; 26+ messages in thread
From: Diederik de Haas @ 2026-06-10 10:05 UTC (permalink / raw)
  To: Chaoyi Chen, Midgy Balon
  Cc: tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro, will,
	robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao,
	Jonas Karlman

Hi,

On Wed Jun 10, 2026 at 3:14 AM CEST, Chaoyi Chen wrote:
> Hi Midgy,
>
> We encountered the same issue on the RK3588 NPU before. And it was
> resolved with the following patch at that time.
>
> https://lore.kernel.org/all/20251216055247.13150-1-rmxpzlb@gmail.com/
>
> Please compare the differences in NPU pmdomain and DTS configuration
> between the RK3568 and RK3588.

About a month ago on #linux-rockchip we were discussing PM 'stuff':
https://libera.catirclogs.org/linux-rockchip/2026-05-15#39939137
which references this paste:
https://paste.sr.ht/~diederik/89d9f84e22474e837b55286d213b67f03859ce2e
I've since removed the DCDC_REG2 for PineTab2, and the 'fix' should
likely be extended to cover all RK3566/RK3568 devices.

It's what I made at the time hoping to fix a suspend/resume issue when
trying upstream TF-A. It didn't fix the issue at the time, but may still
be useful/needed and I think it's what Chaoyi hinted at.

Just yesterday, Jonas posted this patch which may be useful/needed too:
https://lore.kernel.org/linux-rockchip/20260609154124.445182-1-jonas@kwiboo.se/

HTH,
  Diederik





* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-10 10:05                 ` Diederik de Haas
@ 2026-06-10 13:38                   ` Midgy Balon
  2026-06-10 14:28                     ` Diederik de Haas
  0 siblings, 1 reply; 26+ messages in thread
From: Midgy Balon @ 2026-06-10 13:38 UTC (permalink / raw)
  To: Diederik de Haas
  Cc: Chaoyi Chen, tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro,
	will, robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao,
	Jonas Karlman

Hello Chaoyi & Diederik,

I compared the RK3568 and RK3588 NPU power-domain + DTS as you suggested,
and it lines up exactly with what you described.

The difference is the `need_regulator` capability. RK3588's NPU domain is
`DOMAIN_RK3588("npu", …, false, true)`; the trailing `true` is
`regulator`/`need_regulator`. The mainline RK3568 macro
`DOMAIN_RK3568(name, pwr, req, wakeup)` has no regulator parameter at all,
so `RK3568_PD_NPU` can't be marked need_regulator. My v4 adds that: a
regulator-capable RK3568 NPU domain (need_regulator = true) plus
`domain-supply = <&vdd_npu>` on the NPU node, i.e. the same shape as
RK3588.

And the fix you referenced (Frank Zhang's "pmdomain: rockchip: Fix init
genpd as GENPD_STATE_ON before regulator ready", plus "quiet regulator
error on -EPROBE_DEFER") is already in my base (v7.1-rc6), so the
`if (need_regulator) rockchip_pd_power(pd, false)` default-off path is in
effect. That's what resolves the actual problem for me: with rocket built
as a module (the normal config), need_regulator on the NPU domain, and
those pmdomain patches in place, the board boots cleanly and NPU jobs run
with no RCU stall / no deadlock. My earlier hang was an artifact of a
self-contained rocket=y image probing in the initcalls before the I2C
regulator core was up; as a module it loads ~6.8 s in, well after, so
it's gone.

I also went back and checked the `fw_devlink=permissive` question myself,
and good news: it turns out it is NOT needed. I rebooted the exact same
kernel with permissive removed from the cmdline (strict fw_devlink, the
default), and the board boots cleanly, the NPU probes
(`rocket fde40000.npu: Rockchip NPU core 0 version: 0`), and NPU jobs
submit and run five times in a row with no deadlock and no RCU stall. So
strict fw_devlink resolves the NPU/PMIC ordering fine via deferred probe.
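
A quick way to confirm which mode a given boot actually used is to look
for fw_devlink= on the kernel command line. The sketch below is
illustrative only: the helper name is made up, the two command lines are
hypothetical examples (not copied from this board), and it assumes that a
missing parameter means the kernel default is reported as "on" and that a
later occurrence overrides an earlier one.

```python
def fw_devlink_mode(cmdline, default="on"):
    """Return the fw_devlink= value found in a kernel command line string.

    `default` is an assumption about what the kernel uses when the
    parameter is absent.
    """
    mode = default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "fw_devlink" and sep:
            # Assumption: a later occurrence overrides an earlier one.
            mode = value
    return mode


# Hypothetical command lines standing in for the two boots being compared.
permissive_boot = "console=ttyS2,1500000n8 rootwait rw fw_devlink=permissive"
default_boot = "console=ttyS2,1500000n8 rootwait rw"

print(fw_devlink_mode(permissive_boot))  # permissive
print(fw_devlink_mode(default_boot))     # on
```

On the board itself the same function can be fed the contents of
/proc/cmdline instead of a literal string.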

The one remaining thing is cosmetic: at power-domain-controller probe
(~2.94 s) I still get, in BOTH modes (with or without permissive):

  rockchip-pm-domain …: Failed to create device link (0x180) with supplier 0-0020 …power-domain@6

i.e. genpd can't form the link to the rk809 (the I2C PMIC supplying
vdd_npu) because the PMIC isn't registered yet at that point. It's
non-fatal: the domain defaults off (Frank's patch), the rail comes up via
the regulator core, the NPU probes a few seconds later, and all jobs run.

One question: on RK3588 with need_regulator, do you also see that
"Failed to create device link … supplier <pmic>" line at pmdomain probe,
or does it order cleanly? If RK3588 is clean, is there a DTS detail (e.g.
the regulator's bus/probe order) I should mirror on RK3568 to make the
link form in time, or is this line just expected/harmless and best left
as-is?

@Diederik: thanks; the DCDC_REG2 change and Jonas's USB-suspend series
look like generally useful RK356x robustness fixes, though for this
specific NPU device-link the need_regulator + Frank's pmdomain patches
seem to be the relevant piece. I'll keep them in mind for suspend.

The convolution-output / compute-completion issue is still separate and
open (@Finley: that's the PVTPLL/NoC one); the power-domain side is in
good shape for v4.
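
For reference, the out_sum printed by the teflon_test.py run in the boot
log posted earlier in this thread is consistent with a uniform output
tensor. A small sketch of the arithmetic, using only the output shape and
out_sum from that log. That the uniform value is the quantization
zero-point is an assumption, and an exact mean does not by itself prove
that every byte is identical:

```python
import math

# Values taken from the teflon_test.py run in the boot log.
output_shape = (1, 40, 40, 128)   # "Output: [  1  40  40 128]", dtype uint8
out_sum = 26214400                # identical for runs 0 to 4

n_elements = math.prod(output_shape)
mean_value, remainder = divmod(out_sum, n_elements)

print(n_elements)   # 204800
print(mean_value)   # 128
print(remainder)    # 0
```

A mean of exactly 128 with no remainder, repeated across all five runs, is
what one would expect if the NPU never wrote the output buffer and every
uint8 element stayed at a mid-range fill value.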

Thanks y'all for your help :)

Kind regards,
Midgy




* Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
  2026-06-10 13:38                   ` Midgy Balon
@ 2026-06-10 14:28                     ` Diederik de Haas
  0 siblings, 0 replies; 26+ messages in thread
From: Diederik de Haas @ 2026-06-10 14:28 UTC (permalink / raw)
  To: Midgy Balon, Diederik de Haas
  Cc: Chaoyi Chen, tomeu, ogabbay, heiko, robh, krzk+dt, conor+dt, joro,
	will, robin.murphy, dri-devel, linux-rockchip, devicetree,
	linux-arm-kernel, iommu, linux-kernel, Simon Xue, Finley Xiao,
	Jonas Karlman

On Wed Jun 10, 2026 at 3:36 PM CEST, Midgy Balon wrote:
> One question: on RK3588 with need_regulator, do you also see that
> "Failed to create device link … supplier <pmic>" line at pmdomain probe,
> or does it order cleanly? If RK3588 is clean, is there a DTS detail
> (e.g. the regulator's bus/probe order) I should mirror on RK3568 to make
> the link form in time, or is this line just expected/harmless and best
> left as-is?

[    2.110935] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.557459] sdhci-dwcmshc fe2e0000.mmc: Can't reduce the clock below 52MHz in HS200/HS400 mode
[    2.647174] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.945089] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier spi2.0 for /power-management@fd8d8000/power-controller/power-domain@12

power-domain@8 = NPU; power-domain@12 = GPU

This is on both nanopc-t6-lts and nanopc-t6-plus (both RK3588).
In a 6.18 dmesg output I have for the Rock 5B, I see roughly the same,
but with supplier 1-0042 instead of 2-0042.

I don't know if it's bad or harmless, but it is consistent.
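
To compare boards, the repeated lines can be tallied per supplier and
power domain. A minimal sketch, assuming the message format stays exactly
as in the excerpt above; the helper name and the regular expression are
illustrative and not part of any kernel tooling:

```python
import re
from collections import Counter

# Lines copied from the RK3588 dmesg excerpt above.
DMESG = """\
[    2.110935] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.557459] sdhci-dwcmshc fe2e0000.mmc: Can't reduce the clock below 52MHz in HS200/HS400 mode
[    2.647174] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.945089] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier spi2.0 for /power-management@fd8d8000/power-controller/power-domain@12
"""

# Captures the link flags, the supplier device name and the domain index.
LINK_FAILURE = re.compile(
    r"Failed to create device link \((0x[0-9a-f]+)\) "
    r"with supplier (\S+) for \S*power-domain@(\d+)"
)


def tally_link_failures(text):
    """Count device-link failures per (supplier, power-domain index)."""
    counts = Counter()
    for line in text.splitlines():
        match = LINK_FAILURE.search(line)
        if match:
            _flags, supplier, domain = match.groups()
            counts[(supplier, int(domain))] += 1
    return counts


failures = tally_link_failures(DMESG)
for (supplier, domain), count in sorted(failures.items()):
    print(f"power-domain@{domain} <- supplier {supplier}: {count}x")
```

For the excerpt above this reports two failures for supplier 2-0042 on
power-domain@8 and one for spi2.0 on power-domain@12; feeding it the
output of `dmesg` from another board gives a directly comparable summary.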

HTH,
  Diederik

> @Diederik — thanks; the DCDC_REG2 change and Jonas's USB-suspend
> series look like generally
> useful RK356x robustness fixes, though for this specific NPU
> device-link the need_regulator +
> Frank's pmdomain patches seem to be the relevant piece. I'll keep them
> in mind for suspend.
>
> The convolution-output / compute-completion issue is still separate
> and open (@Finley — that's
> the PVTPLL/NoC one); the power-domain side is in good shape for v4.
>
> Thanks y'all for your help :)
>
> Kind regards,
> Midgy
>
> Le mer. 10 juin 2026 à 12:05, Diederik de Haas
> <diederik@cknow-tech.com> a écrit :
>>
>> Hi,
>>
>> On Wed Jun 10, 2026 at 3:14 AM CEST, Chaoyi Chen wrote:
>> > Hi Midgy,
>> >
>> > On 6/9/2026 7:11 PM, Midgy Balon wrote:
>> >> Hello Chaoyi,
>> >>
>> >> You were right - building rocket as a module fixes it. Thanks for the pointer.
>> >>
>> >> I rebuilt with CONFIG_DRM_ACCEL_ROCKET=m (everything else the same:
>> >> need_regulator on
>> >> the RK3568 NPU power domain via a DOMAIN_M_R variant, domain-supply =
>> >> <&vdd_npu>, and the
>> >> regulator-always-on workaround dropped). The board now boots cleanly
>> >> and, more importantly,
>> >> an NPU job submit no longer hangs: I ran the test workload five times
>> >> with no RCU stall and
>> >> no freeze.
>> >>
>> >> So with rocket=m the need_regulator approach works on RK3568, and I'll
>> >> keep it for v4
>> >> (domain-supply + need_regulator, instead of marking vdd_npu
>> >> always-on). rocket=m is the
>> >> normal configuration anyway; my earlier hang came from building it =y
>> >> in a self-contained
>> >> image, so it probed in the initcalls (around 2 s) and the genpd ->
>> >> I2C-PMIC regulator
>> >> transition ran before the system was ready. As a module it loads from
>> >> udev much later
>> >> (~6.8 s here), after the I2C controller and regulator core are fully up.
>> >>
>> >> On your question of when the device-link error is printed - it is at
>> >> power-domain
>> >> controller probe, not at the rocket probe:
>> >>
>> >>   [    2.700618] vdd_npu: Bringing 500000uV into 825000-825000uV
>> >>   [    2.749637] rockchip-pm-domain fdd90000.power-management:power-controller:
>> >>                  Failed to create device link (0x180) with supplier 0-0020 for
>> >>                  /power-management@fdd90000/power-controller/power-domain@6
>> >>   [    2.945955] platform fde40000.npu: Adding to iommu group 3
>> >>   ...
>> >>   [    6.840374] rocket: loading out-of-tree module taints kernel.
>> >>   [    6.877647] [drm] Initialized rocket 0.0.0 for rknn on minor 0
>> >>   [    6.879950] rocket fde40000.npu: Rockchip NPU core 0 version: 0
>> >>
>> >> So the device-link to the rk809 PMIC (0-0020) fails to form at ~2.75
>> >> s, well before rocket
>> >> loads at ~6.8 s. It is non-fatal here - the vdd_npu rail is brought up
>> >> by the regulator core
>> >> and all jobs run - and there is no "failed to get ack on domain npu"
>> >> NoC warning this boot
>> >> (the always-on kernel had one). The complete boot log is attached.
>> >>
>> >> Two notes / one question:
>> >> - This boot used fw_devlink=permissive on the command line. Is the
>> >>   "Failed to create device link ... supplier 0-0020" at pmdomain probe
>> >>   expected/benign, or is there a clean way to make it order correctly
>> >>   (so it also works without permissive, and a =y build wouldn't
>> >>   deadlock in the initcalls)?
>> >
>> > We encountered the same issue on the RK3588 NPU before, and it was
>> > resolved at the time with the following patch:
>> >
>> > https://lore.kernel.org/all/20251216055247.13150-1-rmxpzlb@gmail.com/
>> >
>> > Please compare the differences in NPU pmdomain and DTS configuration
>> > between the RK3568 and RK3588.
>>
>> About a month ago on #linux-rockchip we were discussing PM 'stuff':
>> https://libera.catirclogs.org/linux-rockchip/2026-05-15#39939137;
>> which references this paste
>> https://paste.sr.ht/~diederik/89d9f84e22474e837b55286d213b67f03859ce2e
>> I've since removed the DCDC_REG2 for PineTab2 and the 'fix' should likely
>> be extended to cover all RK3566/RK3568 devices though.
>>
>> It's what I made at the time hoping to fix a suspend/resume issue when
>> trying upstream TF-A. It didn't fix the issue at the time, but may still
>> be useful/needed and I think it's what Chaoyi hinted at.
>>
>> Just yesterday, Jonas posted this patch which may be useful/needed too:
>> https://lore.kernel.org/linux-rockchip/20260609154124.445182-1-jonas@kwiboo.se/
>>
>> HTH,
>>   Diederik
>>
>> >> - (The convolution output is still uniform zero-point / the job times
>> >>   out - that is the separate NPU compute-completion issue, unrelated to
>> >>   the power-domain work. Finley, that is the one I flagged earlier re
>> >>   PVTPLL/NoC.)
>> >>
>> >> Kind regards,
>> >> Midgy
>> >>
>>
>




end of thread, other threads:[~2026-06-10 14:28 UTC | newest]

Thread overview: 26+ messages
2026-06-04 13:52 [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 1/9] accel: rocket: Introduce per-SoC rocket_soc_data Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 2/9] accel: rocket: Derive DMA width and core count from match data Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 3/9] accel: rocket: Add RK3568 SoC support Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 4/9] accel: rocket: Reset the NPU before detaching the IOMMU on timeout Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 5/9] accel: rocket: Keep the IOMMU domain attached across jobs Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 6/9] iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU Midgy BALON
2026-06-04 14:20   ` Tomeu Vizoso
2026-06-05  1:59   ` Chaoyi Chen
2026-06-07 21:05     ` Midgy Balon
2026-06-08  1:45       ` Chaoyi Chen
2026-06-08  3:40         ` Chaoyi Chen
2026-06-04 13:52 ` [RFC PATCH v3 7/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568 Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 8/9] arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU Midgy BALON
2026-06-04 13:52 ` [RFC PATCH v3 9/9] arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU Midgy BALON
2026-06-05  1:36 ` [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support Chaoyi Chen
2026-06-07 21:03   ` Midgy Balon
2026-06-08  1:40     ` Chaoyi Chen
2026-06-08  8:05       ` Midgy Balon
2026-06-08  9:14         ` Midgy Balon
2026-06-08  9:38           ` Chaoyi Chen
2026-06-09 11:11             ` Midgy Balon
2026-06-10  1:14               ` Chaoyi Chen
2026-06-10 10:05                 ` Diederik de Haas
2026-06-10 13:38                   ` Midgy Balon
2026-06-10 14:28                     ` Diederik de Haas
