From: "Diederik de Haas" <diederik@cknow-tech.com>
To: "Midgy Balon" <midgy971@gmail.com>,
	"Diederik de Haas" <diederik@cknow-tech.com>
Cc: "Chaoyi Chen" <chaoyi.chen@rock-chips.com>,
	<tomeu@tomeuvizoso.net>, <ogabbay@kernel.org>, <heiko@sntech.de>,
	<robh@kernel.org>, <krzk+dt@kernel.org>, <conor+dt@kernel.org>,
	<joro@8bytes.org>, <will@kernel.org>, <robin.murphy@arm.com>,
	<dri-devel@lists.freedesktop.org>,
	<linux-rockchip@lists.infradead.org>,
	<devicetree@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>, "Simon Xue" <xxm@rock-chips.com>,
	"Finley Xiao" <finley.xiao@rock-chips.com>,
	"Jonas Karlman" <jonas@kwiboo.se>
Subject: Re: [RFC PATCH v3 0/9] accel: rocket: Add RK3568 NPU support
Date: Wed, 10 Jun 2026 16:28:17 +0200
Message-ID: <DJ5FUW50YM2N.6ZTY4WK27ZP5@cknow-tech.com>
In-Reply-To: <CA+GS1Y1xAq-9eMyMmoVE6NG9KLG7XRxgPoSr5RkW=6fT5D820g@mail.gmail.com>

On Wed Jun 10, 2026 at 3:36 PM CEST, Midgy Balon wrote:
> Hello Chaoyi & Diederik,
>
> I compared the RK3568 and RK3588 NPU power-domain + DTS as you suggested,
> and it lines up exactly with what you described.
>
> The difference is the `need_regulator` capability. RK3588's NPU domain is
> `DOMAIN_RK3588("npu", …, false, true)` - the trailing `true` is
> `regulator`/`need_regulator`. The mainline RK3568 macro
> `DOMAIN_RK3568(name, pwr, req, wakeup)` has no regulator parameter at all,
> so `RK3568_PD_NPU` can't be marked need_regulator. My v4 adds that: a
> regulator-capable RK3568 NPU domain (need_regulator = true) plus
> `domain-supply = <&vdd_npu>` on the NPU node - i.e. the same shape as
> RK3588.
>
> And the fix you referenced (Frank Zhang's "pmdomain: rockchip: Fix init
> genpd as GENPD_STATE_ON before regulator ready", plus "quiet regulator
> error on -EPROBE_DEFER") is already in my base (v7.1-rc6), so the
> `if (need_regulator) rockchip_pd_power(pd, false)` default-off path is in
> effect. That's what resolves the actual problem for me: with rocket built
> as a module (the normal config), need_regulator on the NPU domain, and
> those pmdomain patches in place, the board boots cleanly and NPU jobs run
> with no RCU stall / no deadlock. My earlier hang was an artifact of a
> self-contained rocket=y image probing in the initcalls before the I2C
> regulator core was up - as a module it loads ~6.8 s in, well after, so
> it's gone.
>
> I also went back and checked the `fw_devlink=permissive` question myself -
> and good news, it turns out it is NOT needed. I rebooted the exact same
> kernel with permissive removed from the cmdline (strict fw_devlink, the
> default), and the board boots cleanly, the NPU probes
> (`rocket fde40000.npu: Rockchip NPU core 0 version: 0`), and NPU jobs
> submit and run five times in a row with no deadlock and no RCU stall. So
> strict fw_devlink resolves the NPU/PMIC ordering fine via deferred probe.
>
> The one remaining thing is cosmetic: at power-domain-controller probe
> (~2.94 s) I still get, in BOTH modes (with or without permissive):
>
>   rockchip-pm-domain …: Failed to create device link (0x180) with supplier 0-0020 …power-domain@6
>
> i.e. genpd can't form the link to the rk809 (the I2C PMIC supplying
> vdd_npu) because the PMIC isn't registered yet at that point. It's
> non-fatal - the domain defaults off (Frank's patch), the rail comes up via
> the regulator core, the NPU probes a few seconds later, and all jobs run.
>
> One question: on RK3588 with need_regulator, do you also see that
> "Failed to create device link … supplier <pmic>" line at pmdomain probe,
> or does it order cleanly? If RK3588 is clean, is there a DTS detail (e.g.
> the regulator's bus/probe order) I should mirror on RK3568 to make the
> link form in time - or is this line just expected/harmless and best left
> as-is?

[    2.110935] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.557459] sdhci-dwcmshc fe2e0000.mmc: Can't reduce the clock below 52MHz in HS200/HS400 mode
[    2.647174] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.945089] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier spi2.0 for /power-management@fd8d8000/power-controller/power-domain@12

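So yes, RK3588 shows the same message. In the lines above, power-domain@8 is
the NPU and power-domain@12 is the GPU. I see them on both nanopc-t6-lts and
nanopc-t6-plus (both RK3588). A 6.18 dmesg output I have for Rock 5B shows
roughly the same, but with supplier 1-0042 instead of 2-0042.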

I don't know if it's bad or harmless, but it is consistent.
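
To compare such logs across boards, a small script can tally the message
per power domain and supplier. This is only a sketch: the helper name and
the regex are illustrative assumptions (not existing kernel tooling), and it
assumes the exact dmesg wording quoted in this mail. The sample text is the
three RK3588 lines from the dmesg excerpt earlier in this message.

```python
import re
from collections import defaultdict

# Hypothetical helper for illustration only. It matches the
# "Failed to create device link" wording seen in the dmesg excerpt.
LINK_FAIL = re.compile(
    r"Failed to create device link \((?P<flags>0x[0-9a-fA-F]+)\) "
    r"with supplier (?P<supplier>\S+) for \S*power-domain@(?P<domain>\w+)"
)


def summarize_link_failures(dmesg_text):
    """Map (power-domain unit address, supplier) -> number of occurrences."""
    counts = defaultdict(int)
    for line in dmesg_text.splitlines():
        match = LINK_FAIL.search(line)
        if match:
            # Keep the unit address as a string; no radix is assumed.
            counts[(match.group("domain"), match.group("supplier"))] += 1
    return dict(counts)


SAMPLE = """\
[    2.110935] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.647174] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier 2-0042 for /power-management@fd8d8000/power-controller/power-domain@8
[    2.945089] rockchip-pm-domain fd8d8000.power-management:power-controller: Failed to create device link (0x180) with supplier spi2.0 for /power-management@fd8d8000/power-controller/power-domain@12
"""

if __name__ == "__main__":
    # Domain names as given in this mail for RK3588 (8 = NPU, 12 = GPU).
    names = {"8": "NPU", "12": "GPU"}
    for (domain, supplier), count in summarize_link_failures(SAMPLE).items():
        print(names.get(domain, domain), supplier, count)
```

Feeding it `dmesg` output from each board would show at a glance whether the
same domain/supplier pairs fail everywhere.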

HTH,
  Diederik

> @Diederik - thanks; the DCDC_REG2 change and Jonas's USB-suspend series
> look like generally useful RK356x robustness fixes, though for this
> specific NPU device-link the need_regulator + Frank's pmdomain patches
> seem to be the relevant piece. I'll keep them in mind for suspend.
>
> The convolution-output / compute-completion issue is still separate and
> open (@Finley - that's the PVTPLL/NoC one); the power-domain side is in
> good shape for v4.
>
> Thanks y'all for your help :)
>
> Kind regards,
> Midgy
>
> On Wed, 10 Jun 2026 at 12:05, Diederik de Haas
> <diederik@cknow-tech.com> wrote:
>>
>> Hi,
>>
>> On Wed Jun 10, 2026 at 3:14 AM CEST, Chaoyi Chen wrote:
>> > Hi Midgy,
>> >
>> > On 6/9/2026 7:11 PM, Midgy Balon wrote:
>> >> Hello Chaoyi,
>> >>
>> >> You were right - building rocket as a module fixes it. Thanks for the pointer.
>> >>
>> >> I rebuilt with CONFIG_DRM_ACCEL_ROCKET=m (everything else the same:
>> >> need_regulator on the RK3568 NPU power domain via a DOMAIN_M_R
>> >> variant, domain-supply = <&vdd_npu>, and the regulator-always-on
>> >> workaround dropped). The board now boots cleanly and, more
>> >> importantly, an NPU job submit no longer hangs: I ran the test
>> >> workload five times with no RCU stall and no freeze.
>> >>
>> >> So with rocket=m the need_regulator approach works on RK3568, and I'll
>> >> keep it for v4 (domain-supply + need_regulator, instead of marking
>> >> vdd_npu always-on). rocket=m is the normal configuration anyway; my
>> >> earlier hang came from building it =y in a self-contained image, so it
>> >> probed in the initcalls (around 2 s) and the genpd -> I2C-PMIC
>> >> regulator transition ran before the system was ready. As a module it
>> >> loads from udev much later (~6.8 s here), after the I2C controller and
>> >> regulator core are fully up.
>> >>
>> >> On your question of when the device-link error is printed - it is at
>> >> power-domain controller probe, not at the rocket probe:
>> >>
>> >>   [    2.700618] vdd_npu: Bringing 500000uV into 825000-825000uV
>> >>   [    2.749637] rockchip-pm-domain fdd90000.power-management:power-controller:
>> >>                  Failed to create device link (0x180) with supplier 0-0020 for
>> >>                  /power-management@fdd90000/power-controller/power-domain@6
>> >>   [    2.945955] platform fde40000.npu: Adding to iommu group 3
>> >>   ...
>> >>   [    6.840374] rocket: loading out-of-tree module taints kernel.
>> >>   [    6.877647] [drm] Initialized rocket 0.0.0 for rknn on minor 0
>> >>   [    6.879950] rocket fde40000.npu: Rockchip NPU core 0 version: 0
>> >>
>> >> So the device-link to the rk809 PMIC (0-0020) fails to form at ~2.75 s,
>> >> well before rocket loads at ~6.8 s. It is non-fatal here - the vdd_npu
>> >> rail is brought up by the regulator core and all jobs run - and there
>> >> is no "failed to get ack on domain npu" NoC warning this boot (the
>> >> always-on kernel had one). The complete boot log is attached.
>> >>
>> >> Two notes / one question:
>> >> - This boot used fw_devlink=permissive on the command line. Is the
>> >>   "Failed to create device link ... supplier 0-0020" at pmdomain probe
>> >>   expected/benign, or is there a clean way to make it order correctly
>> >>   (so it also works without permissive, and a =y build wouldn't
>> >>   deadlock in the initcalls)?
>> >
>> > We encountered the same issue on the RK3588 NPU before. And it was
>> > resolved with the following patch at that time.
>> >
>> > https://lore.kernel.org/all/20251216055247.13150-1-rmxpzlb@gmail.com/
>> >
>> > Please compare the differences in NPU pmdomain and DTS configuration
>> > between the RK3568 and RK3588.
>>
>> About a month ago on #linux-rockchip we were discussing PM 'stuff':
>> https://libera.catirclogs.org/linux-rockchip/2026-05-15#39939137
>> which references this paste:
>> https://paste.sr.ht/~diederik/89d9f84e22474e837b55286d213b67f03859ce2e
>> I've since removed the DCDC_REG2 for PineTab2, and the 'fix' should likely
>> be extended to cover all RK3566/RK3568 devices though.
>>
>> It's what I made at the time hoping to fix a suspend/resume issue when
>> trying upstream TF-A. It didn't fix the issue at the time, but may still
>> be useful/needed and I think it's what Chaoyi hinted at.
>>
>> Just yesterday, Jonas posted this patch which may be useful/needed too:
>> https://lore.kernel.org/linux-rockchip/20260609154124.445182-1-jonas@kwiboo.se/
>>
>> HTH,
>>   Diederik
>>
>> >> - (The convolution output is still uniform zero-point / the job times
>> >>   out - that is the separate NPU compute-completion issue, unrelated
>> >>   to the power-domain work. Finley, that is the one I flagged earlier
>> >>   re PVTPLL/NoC.)
>> >>
>> >> Kind regards,
>> >> Midgy
>> >>
>>
>

