* [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
@ 2025-09-05 10:22 ` Nicolas Frattaroli
2025-09-05 23:26 ` Rob Herring
2025-09-08 19:51 ` Liviu Dudau
2025-09-05 10:22 ` [PATCH RFC 02/10] dt-bindings: devfreq: add mt8196-gpufreq binding Nicolas Frattaroli
` (8 subsequent siblings)
9 siblings, 2 replies; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:22 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
The Mali-based GPU on the MediaTek MT8196 SoC is shackled to its concept
of "MFlexGraphics", which in this iteration includes an embedded MCU
that needs to be poked to power on the GPU, and is in charge of
controlling all the clocks and regulators.
In return, it lets us omit the OPP tables from the device tree, as those
can now be enumerated at runtime from the MCU.
Add the mediatek,mt8196-mali compatible, and a performance-controller
property which points to a node representing such setups. It's required
on mt8196 devices.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
.../bindings/gpu/arm,mali-valhall-csf.yaml | 36 +++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
index a5b4e00217587c5d1f889094e2fff7b76e6148eb..6df802e900b744d226395c29f8d87fb6d3282d26 100644
--- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
+++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
@@ -19,6 +19,7 @@ properties:
- items:
- enum:
- rockchip,rk3588-mali
+ - mediatek,mt8196-mali
- const: arm,mali-valhall-csf # Mali Valhall GPU model/revision is fully discoverable
reg:
@@ -53,6 +54,13 @@ properties:
opp-table:
type: object
+ performance-controller:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ A phandle of a device that controls this GPU's power and frequency,
+ if any. If present, this is usually in the form of some specialised
+ embedded MCU.
+
power-domains:
minItems: 1
maxItems: 5
@@ -91,7 +99,6 @@ required:
- interrupts
- interrupt-names
- clocks
- - mali-supply
additionalProperties: false
@@ -105,9 +112,24 @@ allOf:
properties:
clocks:
minItems: 3
+ performance-controller: false
power-domains:
maxItems: 1
power-domain-names: false
+ required:
+ - mali-supply
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: rockchip,mt8196-mali
+ then:
+ properties:
+ mali-supply: false
+ sram-supply: false
+ operating-points-v2: false
+ required:
+ - performance-controller
examples:
- |
@@ -143,5 +165,17 @@ examples:
};
};
};
+ - |
+ gpu2: gpu@48000000 {
+ compatible = "mediatek,mt8196-mali", "arm,mali-valhall-csf";
+ reg = <0x48000000 0x480000>;
+ clocks = <&mfgpll 0>;
+ clock-names = "core";
+ interrupts = <GIC_SPI 606 IRQ_TYPE_LEVEL_HIGH 0>,
+ <GIC_SPI 605 IRQ_TYPE_LEVEL_HIGH 0>,
+ <GIC_SPI 604 IRQ_TYPE_LEVEL_HIGH 0>;
+ interrupt-names = "job", "mmu", "gpu";
+ performance-controller = <&gpufreq>;
+ };
...
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant
2025-09-05 10:22 ` [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant Nicolas Frattaroli
@ 2025-09-05 23:26 ` Rob Herring
2025-09-06 17:54 ` Nicolas Frattaroli
2025-09-08 19:51 ` Liviu Dudau
1 sibling, 1 reply; 30+ messages in thread
From: Rob Herring @ 2025-09-05 23:26 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva, Chia-I Wu,
Chen-Yu Tsai, kernel, dri-devel, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek, linux-pm, linux-hardening
On Fri, Sep 05, 2025 at 12:22:57PM +0200, Nicolas Frattaroli wrote:
> The Mali-based GPU on the MediaTek MT8196 SoC is shackled to its concept
> of "MFlexGraphics", which in this iteration includes an embedded MCU
> that needs to be poked to power on the GPU, and is in charge of
> controlling all the clocks and regulators.
>
> In return, it lets us omit the OPP tables from the device tree, as those
> can now be enumerated at runtime from the MCU.
>
> Add the mediatek,mt8196-mali compatible, and a performance-controller
> property which points to a node representing such setups. It's required
> on mt8196 devices.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> .../bindings/gpu/arm,mali-valhall-csf.yaml | 36 +++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> index a5b4e00217587c5d1f889094e2fff7b76e6148eb..6df802e900b744d226395c29f8d87fb6d3282d26 100644
> --- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> @@ -19,6 +19,7 @@ properties:
> - items:
> - enum:
> - rockchip,rk3588-mali
> + - mediatek,mt8196-mali
> - const: arm,mali-valhall-csf # Mali Valhall GPU model/revision is fully discoverable
>
> reg:
> @@ -53,6 +54,13 @@ properties:
> opp-table:
> type: object
>
> + performance-controller:
> + $ref: /schemas/types.yaml#/definitions/phandle
> + description:
> + A phandle of a device that controls this GPU's power and frequency,
> + if any. If present, this is usually in the form of some specialised
> + embedded MCU.
We already abuse power-domains binding with both power and performance.
There's a performance-domain binding too, but only used on one platform
for CPUs (Mediatek too IIRC). Or perhaps you could just point to an
empty OPP table. I don't think you have anything new here, so don't
invent something new.
Rob
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant
2025-09-05 23:26 ` Rob Herring
@ 2025-09-06 17:54 ` Nicolas Frattaroli
0 siblings, 0 replies; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-06 17:54 UTC (permalink / raw)
To: Rob Herring
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva, Chia-I Wu,
Chen-Yu Tsai, kernel, dri-devel, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek, linux-pm, linux-hardening
Hi Rob,
On Saturday, 6 September 2025 01:26:57 Central European Summer Time Rob Herring wrote:
> On Fri, Sep 05, 2025 at 12:22:57PM +0200, Nicolas Frattaroli wrote:
> > The Mali-based GPU on the MediaTek MT8196 SoC is shackled to its concept
> > of "MFlexGraphics", which in this iteration includes an embedded MCU
> > that needs to be poked to power on the GPU, and is in charge of
> > controlling all the clocks and regulators.
> >
> > In return, it lets us omit the OPP tables from the device tree, as those
> > can now be enumerated at runtime from the MCU.
> >
> > Add the mediatek,mt8196-mali compatible, and a performance-controller
> > property which points to a node representing such setups. It's required
> > on mt8196 devices.
> >
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> > .../bindings/gpu/arm,mali-valhall-csf.yaml | 36 +++++++++++++++++++++-
> > 1 file changed, 35 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > index a5b4e00217587c5d1f889094e2fff7b76e6148eb..6df802e900b744d226395c29f8d87fb6d3282d26 100644
> > --- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > +++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > @@ -19,6 +19,7 @@ properties:
> > - items:
> > - enum:
> > - rockchip,rk3588-mali
> > + - mediatek,mt8196-mali
> > - const: arm,mali-valhall-csf # Mali Valhall GPU model/revision is fully discoverable
> >
> > reg:
> > @@ -53,6 +54,13 @@ properties:
> > opp-table:
> > type: object
> >
> > + performance-controller:
> > + $ref: /schemas/types.yaml#/definitions/phandle
> > + description:
> > + A phandle of a device that controls this GPU's power and frequency,
> > + if any. If present, this is usually in the form of some specialised
> > + embedded MCU.
>
> We already abuse power-domains binding with both power and performance.
> There's a performance-domain binding too, but only used on one platform
> for CPUs (Mediatek too IIRC). Or perhaps you could just point to an
> empty OPP table. I don't think you have anything new here, so don't
> invent something new.
Oops, yeah, I forgot about performance-domain already existing. I agree
that it looks like a good fit; iirc I initially disregarded it because
I thought it was an actual heterogenous core cpufreq-y thing I'd be
overloading with new meaning, but I see now that this is not so, and
aside from mediatek, Apple appears to be the only user.
Thanks for the fast review.
Kind regards,
Nicolas Frattaroli
>
> Rob
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant
2025-09-05 10:22 ` [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant Nicolas Frattaroli
2025-09-05 23:26 ` Rob Herring
@ 2025-09-08 19:51 ` Liviu Dudau
1 sibling, 0 replies; 30+ messages in thread
From: Liviu Dudau @ 2025-09-08 19:51 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva, Chia-I Wu,
Chen-Yu Tsai, kernel, dri-devel, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek, linux-pm, linux-hardening
On Fri, Sep 05, 2025 at 12:22:57PM +0200, Nicolas Frattaroli wrote:
> The Mali-based GPU on the MediaTek MT8196 SoC is shackled to its concept
> of "MFlexGraphics", which in this iteration includes an embedded MCU
> that needs to be poked to power on the GPU, and is in charge of
> controlling all the clocks and regulators.
>
> In return, it lets us omit the OPP tables from the device tree, as those
> can now be enumerated at runtime from the MCU.
>
> Add the mediatek,mt8196-mali compatible, and a performance-controller
> property which points to a node representing such setups. It's required
> on mt8196 devices.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> .../bindings/gpu/arm,mali-valhall-csf.yaml | 36 +++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> index a5b4e00217587c5d1f889094e2fff7b76e6148eb..6df802e900b744d226395c29f8d87fb6d3282d26 100644
> --- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> @@ -19,6 +19,7 @@ properties:
> - items:
> - enum:
> - rockchip,rk3588-mali
> + - mediatek,mt8196-mali
> - const: arm,mali-valhall-csf # Mali Valhall GPU model/revision is fully discoverable
>
> reg:
> @@ -53,6 +54,13 @@ properties:
> opp-table:
> type: object
>
> + performance-controller:
> + $ref: /schemas/types.yaml#/definitions/phandle
> + description:
> + A phandle of a device that controls this GPU's power and frequency,
> + if any. If present, this is usually in the form of some specialised
> + embedded MCU.
> +
> power-domains:
> minItems: 1
> maxItems: 5
> @@ -91,7 +99,6 @@ required:
> - interrupts
> - interrupt-names
> - clocks
> - - mali-supply
>
> additionalProperties: false
>
> @@ -105,9 +112,24 @@ allOf:
> properties:
> clocks:
> minItems: 3
> + performance-controller: false
> power-domains:
> maxItems: 1
> power-domain-names: false
> + required:
> + - mali-supply
> + - if:
> + properties:
> + compatible:
> + contains:
> + const: rockchip,mt8196-mali
s/rockchip/mediatek/
Best regards,
Liviu
> + then:
> + properties:
> + mali-supply: false
> + sram-supply: false
> + operating-points-v2: false
> + required:
> + - performance-controller
>
> examples:
> - |
> @@ -143,5 +165,17 @@ examples:
> };
> };
> };
> + - |
> + gpu2: gpu@48000000 {
> + compatible = "mediatek,mt8196-mali", "arm,mali-valhall-csf";
> + reg = <0x48000000 0x480000>;
> + clocks = <&mfgpll 0>;
> + clock-names = "core";
> + interrupts = <GIC_SPI 606 IRQ_TYPE_LEVEL_HIGH 0>,
> + <GIC_SPI 605 IRQ_TYPE_LEVEL_HIGH 0>,
> + <GIC_SPI 604 IRQ_TYPE_LEVEL_HIGH 0>;
> + interrupt-names = "job", "mmu", "gpu";
> + performance-controller = <&gpufreq>;
> + };
>
> ...
>
> --
> 2.51.0
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 02/10] dt-bindings: devfreq: add mt8196-gpufreq binding
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
2025-09-05 10:22 ` [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant Nicolas Frattaroli
@ 2025-09-05 10:22 ` Nicolas Frattaroli
2025-09-08 11:15 ` AngeloGioacchino Del Regno
2025-09-05 10:22 ` [PATCH RFC 03/10] dt-bindings: sram: Add compatible for mediatek,mt8196-gpufreq-sram Nicolas Frattaroli
` (7 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:22 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
On the MediaTek MT8196 SoC, the GPU has its power and frequency
dynamically controlled by an embedded special-purpose MCU. This MCU is
in charge of powering up the GPU silicon. It also provides us with a
list of available OPPs at runtime, and is fully in control of all the
regulator and clock fiddling it takes to reach a certain level of
performance. It's also in charge of enforcing limits on power draw or
temperature.
Add a binding for this device in the devfreq subdirectory, where it
seems to fit in best considering its tasks.
The functions of many of the mailbox channels are unknown. This is not
the fault of this binding's author; we've never received adequate
documentation for this hardware, and the downstream code does not make
use of them in a way that'd reveal their purpose. They are kept in the
binding as the binding should be complete.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
.../bindings/devfreq/mediatek,mt8196-gpufreq.yaml | 116 +++++++++++++++++++++
1 file changed, 116 insertions(+)
diff --git a/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..1fe43c9fc94bb603b1fb77e9a97a27e92fea1ae8
--- /dev/null
+++ b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
@@ -0,0 +1,116 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/devfreq/mediatek,mt8196-gpufreq.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: MediaTek MFlexGraphics Performance Controller
+
+maintainers:
+ - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
+
+properties:
+ $nodename:
+ pattern: '^performance-controller@[a-f0-9]+$'
+
+ compatible:
+ enum:
+ - mediatek,mt8196-gpufreq
+
+ reg:
+ items:
+ - description: GPR memory area
+ - description: RPC memory area
+ - description: SoC variant ID register
+
+ reg-names:
+ items:
+ - const: gpr
+ - const: rpc
+ - const: e2_id
+
+ clocks:
+ items:
+ - description: main clock of the embedded controller (EB)
+ - description: core PLL
+ - description: stack 0 PLL
+ - description: stack 1 PLL
+
+ clock-names:
+ items:
+ - const: eb
+ - const: mfgpll
+ - const: mfgpll_sc0
+ - const: mfgpll_sc1
+
+ mboxes:
+ items:
+ - description: FastDVFS events
+ - description: frequency control
+ - description: sleep control
+ - description: timer control
+ - description: frequency hopping control
+ - description: hardware voter control
+ - description: gpumpu (some type of memory control, unknown)
+ - description: FastDVFS control
+ - description: Unknown
+ - description: Unknown
+ - description: Unknown, but likely controls some boosting behaviour
+ - description: Unknown
+
+ mbox-names:
+ items:
+ - const: fast_dvfs_event
+ - const: gpufreq
+ - const: sleep
+ - const: timer
+ - const: fhctl
+ - const: ccf
+ - const: gpumpu
+ - const: fast_dvfs
+ - const: ipir_c_met
+ - const: ipis_c_met
+ - const: brisket
+ - const: ppb
+
+ shmem:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description: phandle to the shared memory region of the GPUEB MCU
+
+required:
+ - compatible
+ - reg
+ - reg-names
+ - clocks
+ - clock-names
+ - mboxes
+ - mbox-names
+ - shmem
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/mediatek,mt8196-clock.h>
+
+ gpufreq: performance-controller@4b09fd00 {
+ compatible = "mediatek,mt8196-gpufreq";
+ reg = <0x4b09fd00 0x80>,
+ <0x4b800000 0x1000>,
+ <0x4b860128 0x4>;
+ reg-names = "gpr", "rpc", "e2_id";
+ clocks = <&topckgen CLK_TOP_MFG_EB>,
+ <&mfgpll CLK_MFG_AO_MFGPLL>,
+ <&mfgpll_sc0 CLK_MFGSC0_AO_MFGPLL_SC0>,
+ <&mfgpll_sc1 CLK_MFGSC1_AO_MFGPLL_SC1>;
+ clock-names = "eb", "mfgpll", "mfgpll_sc0",
+ "mfgpll_sc1";
+ mboxes = <&gpueb_mbox 0>, <&gpueb_mbox 1>, <&gpueb_mbox 2>,
+ <&gpueb_mbox 3>, <&gpueb_mbox 4>, <&gpueb_mbox 5>,
+ <&gpueb_mbox 6>, <&gpueb_mbox 7>, <&gpueb_mbox 8>,
+ <&gpueb_mbox 9>, <&gpueb_mbox 10>, <&gpueb_mbox 11>;
+ mbox-names = "fast_dvfs_event", "gpufreq", "sleep", "timer", "fhctl",
+ "ccf", "gpumpu", "fast_dvfs", "ipir_c_met", "ipis_c_met",
+ "brisket", "ppb";
+ shmem = <&gpufreq_shmem>;
+ };
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 02/10] dt-bindings: devfreq: add mt8196-gpufreq binding
2025-09-05 10:22 ` [PATCH RFC 02/10] dt-bindings: devfreq: add mt8196-gpufreq binding Nicolas Frattaroli
@ 2025-09-08 11:15 ` AngeloGioacchino Del Regno
2025-09-08 11:39 ` Nicolas Frattaroli
0 siblings, 1 reply; 30+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-09-08 11:15 UTC (permalink / raw)
To: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
Il 05/09/25 12:22, Nicolas Frattaroli ha scritto:
> On the MediaTek MT8196 SoC, the GPU has its power and frequency
> dynamically controlled by an embedded special-purpose MCU. This MCU is
> in charge of powering up the GPU silicon. It also provides us with a
> list of available OPPs at runtime, and is fully in control of all the
> regulator and clock fiddling it takes to reach a certain level of
> performance. It's also in charge of enforcing limits on power draw or
> temperature.
>
> Add a binding for this device in the devfreq subdirectory, where it
> seems to fit in best considering its tasks.
>
> The functions of many of the mailbox channels are unknown. This is not
> the fault of this binding's author; we've never received adequate
> documentation for this hardware, and the downstream code does not make
> use of them in a way that'd reveal their purpose. They are kept in the
> binding as the binding should be complete.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> .../bindings/devfreq/mediatek,mt8196-gpufreq.yaml | 116 +++++++++++++++++++++
> 1 file changed, 116 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
> new file mode 100644
> index 0000000000000000000000000000000000000000..1fe43c9fc94bb603b1fb77e9a97a27e92fea1ae8
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
> @@ -0,0 +1,116 @@
> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/devfreq/mediatek,mt8196-gpufreq.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: MediaTek MFlexGraphics Performance Controller
Doesn't MFG stand for MediaTek Flexible Graphics? (or did they update the name?)
Perhaps it's a good idea to also add that reference... I think it's a little more
readable and understandable compared to "MFlexGraphics" :-)
> +
> +maintainers:
> + - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> +
> +properties:
> + $nodename:
> + pattern: '^performance-controller@[a-f0-9]+$'
> +
> + compatible:
> + enum:
> + - mediatek,mt8196-gpufreq
> +
> + reg:
> + items:
> + - description: GPR memory area
> + - description: RPC memory area
> + - description: SoC variant ID register
> +
> + reg-names:
> + items:
> + - const: gpr
> + - const: rpc
> + - const: e2_id
We should find a better name for that "e2_id".
> +
> + clocks:
> + items:
> + - description: main clock of the embedded controller (EB)
> + - description: core PLL
> + - description: stack 0 PLL
> + - description: stack 1 PLL
> +
> + clock-names:
> + items:
> + - const: eb
> + - const: mfgpll
> + - const: mfgpll_sc0
> + - const: mfgpll_sc1
> +
> + mboxes:
> + items:
> + - description: FastDVFS events
> + - description: frequency control
> + - description: sleep control
> + - description: timer control
> + - description: frequency hopping control
> + - description: hardware voter control
> + - description: gpumpu (some type of memory control, unknown)
> + - description: FastDVFS control
> + - description: Unknown
> + - description: Unknown
> + - description: Unknown, but likely controls some boosting behaviour
> + - description: Unknown
> +
> + mbox-names:
> + items:
> + - const: fast_dvfs_event
Any problem if we avoid underscores in names?
> + - const: gpufreq
> + - const: sleep
> + - const: timer
> + - const: fhctl
> + - const: ccf
> + - const: gpumpu
"some type of memory control" .. it's really a MPU. For memory protection. :-)
Besides, I don't think we have to touch anything in the gpumpu for freq control
via gpueb.
> + - const: fast_dvfs
> + - const: ipir_c_met
> + - const: ipis_c_met
MET is a hardware event tracer / profiler... and I'm fairly sure that we have no
real reason to support it (at least, not like that, and not in a first submission).
Ah btw: ipir ipis .. ipi-receive ipi-send
> + - const: brisket
Brisket is... something. There's one for the GPU, one for CPU, and one for APU.
Not sure what it exactly does, but seems to be or control a FLL (freq locked loop).
> + - const: ppb
PPB = Peak Power Budget
The PPB needs its own "big" driver (the PBM - Power Budget Manager) in order to do
anything - as in - this manages a SoC-global peak power setting based on the
available maximum deliverable instantaneous (and/or sustainable) power from the
board's power source and it is mainly used for smartphone usecase (battery!).
In order to work, the PPB HW (yet another mcu) needs to be initialized with tables
for CPU and GPU (and APU? and something else too?), and with other data explaining
the maximum instantaneous power that can delivered at a certain battery percentage.
Important point is... I doubt that PPB is being initialized by the bootloader, on
all of Genio, Kompanio and Dimensity chips, so this should be disabled by default.
You can keep it, especially now that you have a description for it - and because it
does indeed exist, but I doubt that we're using this anytime soon.
Cheers,
Angelo
> +
> + shmem:
> + $ref: /schemas/types.yaml#/definitions/phandle
> + description: phandle to the shared memory region of the GPUEB MCU
> +
> +required:
> + - compatible
> + - reg
> + - reg-names
> + - clocks
> + - clock-names
> + - mboxes
> + - mbox-names
> + - shmem
> +
> +additionalProperties: false
> +
> +examples:
> + - |
> + #include <dt-bindings/clock/mediatek,mt8196-clock.h>
> +
> + gpufreq: performance-controller@4b09fd00 {
> + compatible = "mediatek,mt8196-gpufreq";
> + reg = <0x4b09fd00 0x80>,
> + <0x4b800000 0x1000>,
> + <0x4b860128 0x4>;
> + reg-names = "gpr", "rpc", "e2_id";
> + clocks = <&topckgen CLK_TOP_MFG_EB>,
> + <&mfgpll CLK_MFG_AO_MFGPLL>,
> + <&mfgpll_sc0 CLK_MFGSC0_AO_MFGPLL_SC0>,
> + <&mfgpll_sc1 CLK_MFGSC1_AO_MFGPLL_SC1>;
> + clock-names = "eb", "mfgpll", "mfgpll_sc0",
> + "mfgpll_sc1";
> + mboxes = <&gpueb_mbox 0>, <&gpueb_mbox 1>, <&gpueb_mbox 2>,
> + <&gpueb_mbox 3>, <&gpueb_mbox 4>, <&gpueb_mbox 5>,
> + <&gpueb_mbox 6>, <&gpueb_mbox 7>, <&gpueb_mbox 8>,
> + <&gpueb_mbox 9>, <&gpueb_mbox 10>, <&gpueb_mbox 11>;
> + mbox-names = "fast_dvfs_event", "gpufreq", "sleep", "timer", "fhctl",
> + "ccf", "gpumpu", "fast_dvfs", "ipir_c_met", "ipis_c_met",
> + "brisket", "ppb";
> + shmem = <&gpufreq_shmem>;
> + };
>
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH RFC 02/10] dt-bindings: devfreq: add mt8196-gpufreq binding
2025-09-08 11:15 ` AngeloGioacchino Del Regno
@ 2025-09-08 11:39 ` Nicolas Frattaroli
2025-09-08 12:49 ` AngeloGioacchino Del Regno
0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-08 11:39 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Matthias Brugger,
MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Jassi Brar, Kees Cook,
Gustavo A. R. Silva, AngeloGioacchino Del Regno
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
On Monday, 8 September 2025 13:15:03 Central European Summer Time AngeloGioacchino Del Regno wrote:
> Il 05/09/25 12:22, Nicolas Frattaroli ha scritto:
> > On the MediaTek MT8196 SoC, the GPU has its power and frequency
> > dynamically controlled by an embedded special-purpose MCU. This MCU is
> > in charge of powering up the GPU silicon. It also provides us with a
> > list of available OPPs at runtime, and is fully in control of all the
> > regulator and clock fiddling it takes to reach a certain level of
> > performance. It's also in charge of enforcing limits on power draw or
> > temperature.
> >
> > Add a binding for this device in the devfreq subdirectory, where it
> > seems to fit in best considering its tasks.
> >
> > The functions of many of the mailbox channels are unknown. This is not
> > the fault of this binding's author; we've never received adequate
> > documentation for this hardware, and the downstream code does not make
> > use of them in a way that'd reveal their purpose. They are kept in the
> > binding as the binding should be complete.
> >
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> > .../bindings/devfreq/mediatek,mt8196-gpufreq.yaml | 116 +++++++++++++++++++++
> > 1 file changed, 116 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..1fe43c9fc94bb603b1fb77e9a97a27e92fea1ae8
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
> > @@ -0,0 +1,116 @@
> > +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/devfreq/mediatek,mt8196-gpufreq.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: MediaTek MFlexGraphics Performance Controller
>
> Doesn't MFG stand for MediaTek Flexible Graphics? (or did they update the name?)
>
> Perhaps it's a good idea to also add that reference... I think it's a little more
> readable and understandable compared to "MFlexGraphics" :-)
"MFlexGraphics" is what the abbreviation section in the datasheet calls "MFG".
I don't see "Flexible Graphics" at all in the datasheet, but it's an obvious
inference of what the name means.
I think keeping "MFlexGraphics" is better for people grepping for what
the datasheet calls it.
>
> > +
> > +maintainers:
> > + - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > +
> > +properties:
> > + $nodename:
> > + pattern: '^performance-controller@[a-f0-9]+$'
> > +
> > + compatible:
> > + enum:
> > + - mediatek,mt8196-gpufreq
> > +
> > + reg:
> > + items:
> > + - description: GPR memory area
> > + - description: RPC memory area
> > + - description: SoC variant ID register
> > +
> > + reg-names:
> > + items:
> > + - const: gpr
> > + - const: rpc
> > + - const: e2_id
>
> We should find a better name for that "e2_id".
Agreed, but we don't have a register map that includes this address
and would give us a different name. I think it's some sort of silicon
revision.
>
> > +
> > + clocks:
> > + items:
> > + - description: main clock of the embedded controller (EB)
> > + - description: core PLL
> > + - description: stack 0 PLL
> > + - description: stack 1 PLL
> > +
> > + clock-names:
> > + items:
> > + - const: eb
> > + - const: mfgpll
> > + - const: mfgpll_sc0
> > + - const: mfgpll_sc1
> > +
> > + mboxes:
> > + items:
> > + - description: FastDVFS events
> > + - description: frequency control
> > + - description: sleep control
> > + - description: timer control
> > + - description: frequency hopping control
> > + - description: hardware voter control
> > + - description: gpumpu (some type of memory control, unknown)
> > + - description: FastDVFS control
> > + - description: Unknown
> > + - description: Unknown
> > + - description: Unknown, but likely controls some boosting behaviour
> > + - description: Unknown
> > +
> > + mbox-names:
> > + items:
> > + - const: fast_dvfs_event
>
> Any problem if we avoid underscores in names?
>
No but I'm not sure what the canonical naming style is for mailbox
channels. "fastdvfsevent" is hard to read.
> > + - const: gpufreq
> > + - const: sleep
> > + - const: timer
> > + - const: fhctl
> > + - const: ccf
> > + - const: gpumpu
>
> "some type of memory control" .. it's really a MPU. For memory protection. :-)
> Besides, I don't think we have to touch anything in the gpumpu for freq control
> via gpueb.
>
Gotcha, so should I leave it out of the GPUFreq binding's used channels?
Would leave a gap, but that's probably fine.
> > + - const: fast_dvfs
> > + - const: ipir_c_met
> > + - const: ipis_c_met
>
> MET is a hardware event tracer / profiler... and I'm fairly sure that we have no
> real reason to support it (at least, not like that, and not in a first submission).
>
> Ah btw: ipir ipis .. ipi-receive ipi-send
>
Gotcha, will remove those as well.
> > + - const: brisket
>
> Brisket is... something. There's one for the GPU, one for CPU, and one for APU.
> Not sure what it exactly does, but seems to be or control a FLL (freq locked loop).
>
> > + - const: ppb
>
> PPB = Peak Power Budget
>
> The PPB needs its own "big" driver (the PBM - Power Budget Manager) in order to do
> anything - as in - this manages a SoC-global peak power setting based on the
> available maximum deliverable instantaneous (and/or sustainable) power from the
> board's power source and it is mainly used for smartphone usecase (battery!).
>
> In order to work, the PPB HW (yet another mcu) needs to be initialized with tables
> for CPU and GPU (and APU? and something else too?), and with other data explaining
> the maximum instantaneous power that can delivered at a certain battery percentage.
>
> Important point is... I doubt that PPB is being initialized by the bootloader, on
> all of Genio, Kompanio and Dimensity chips, so this should be disabled by default.
>
> You can keep it, especially now that you have a description for it - and because it
> does indeed exist, but I doubt that we're using this anytime soon.
If it's going to be used by a separate driver, wouldn't it be better if we don't make
this channel part of the channels the GPUFreq driver uses?
>
> Cheers,
> Angelo
>
Kind regards,
Nicolas Frattaroli
> > +
> > + shmem:
> > + $ref: /schemas/types.yaml#/definitions/phandle
> > + description: phandle to the shared memory region of the GPUEB MCU
> > +
> > +required:
> > + - compatible
> > + - reg
> > + - reg-names
> > + - clocks
> > + - clock-names
> > + - mboxes
> > + - mbox-names
> > + - shmem
> > +
> > +additionalProperties: false
> > +
> > +examples:
> > + - |
> > + #include <dt-bindings/clock/mediatek,mt8196-clock.h>
> > +
> > + gpufreq: performance-controller@4b09fd00 {
> > + compatible = "mediatek,mt8196-gpufreq";
> > + reg = <0x4b09fd00 0x80>,
> > + <0x4b800000 0x1000>,
> > + <0x4b860128 0x4>;
> > + reg-names = "gpr", "rpc", "e2_id";
> > + clocks = <&topckgen CLK_TOP_MFG_EB>,
> > + <&mfgpll CLK_MFG_AO_MFGPLL>,
> > + <&mfgpll_sc0 CLK_MFGSC0_AO_MFGPLL_SC0>,
> > + <&mfgpll_sc1 CLK_MFGSC1_AO_MFGPLL_SC1>;
> > + clock-names = "eb", "mfgpll", "mfgpll_sc0",
> > + "mfgpll_sc1";
> > + mboxes = <&gpueb_mbox 0>, <&gpueb_mbox 1>, <&gpueb_mbox 2>,
> > + <&gpueb_mbox 3>, <&gpueb_mbox 4>, <&gpueb_mbox 5>,
> > + <&gpueb_mbox 6>, <&gpueb_mbox 7>, <&gpueb_mbox 8>,
> > + <&gpueb_mbox 9>, <&gpueb_mbox 10>, <&gpueb_mbox 11>;
> > + mbox-names = "fast_dvfs_event", "gpufreq", "sleep", "timer", "fhctl",
> > + "ccf", "gpumpu", "fast_dvfs", "ipir_c_met", "ipis_c_met",
> > + "brisket", "ppb";
> > + shmem = <&gpufreq_shmem>;
> > + };
> >
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH RFC 02/10] dt-bindings: devfreq: add mt8196-gpufreq binding
2025-09-08 11:39 ` Nicolas Frattaroli
@ 2025-09-08 12:49 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 30+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-09-08 12:49 UTC (permalink / raw)
To: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
Il 08/09/25 13:39, Nicolas Frattaroli ha scritto:
> On Monday, 8 September 2025 13:15:03 Central European Summer Time AngeloGioacchino Del Regno wrote:
>> Il 05/09/25 12:22, Nicolas Frattaroli ha scritto:
>>> On the MediaTek MT8196 SoC, the GPU has its power and frequency
>>> dynamically controlled by an embedded special-purpose MCU. This MCU is
>>> in charge of powering up the GPU silicon. It also provides us with a
>>> list of available OPPs at runtime, and is fully in control of all the
>>> regulator and clock fiddling it takes to reach a certain level of
>>> performance. It's also in charge of enforcing limits on power draw or
>>> temperature.
>>>
>>> Add a binding for this device in the devfreq subdirectory, where it
>>> seems to fit in best considering its tasks.
>>>
>>> The functions of many of the mailbox channels are unknown. This is not
>>> the fault of this binding's author; we've never received adequate
>>> documentation for this hardware, and the downstream code does not make
>>> use of them in a way that'd reveal their purpose. They are kept in the
>>> binding as the binding should be complete.
>>>
>>> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
>>> ---
>>> .../bindings/devfreq/mediatek,mt8196-gpufreq.yaml | 116 +++++++++++++++++++++
>>> 1 file changed, 116 insertions(+)
>>>
>>> diff --git a/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
>>> new file mode 100644
>>> index 0000000000000000000000000000000000000000..1fe43c9fc94bb603b1fb77e9a97a27e92fea1ae8
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/devfreq/mediatek,mt8196-gpufreq.yaml
>>> @@ -0,0 +1,116 @@
>>> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
>>> +%YAML 1.2
>>> +---
>>> +$id: http://devicetree.org/schemas/devfreq/mediatek,mt8196-gpufreq.yaml#
>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>> +
>>> +title: MediaTek MFlexGraphics Performance Controller
>>
>> Doesn't MFG stand for MediaTek Flexible Graphics? (or did they update the name?)
>>
>> Perhaps it's a good idea to also add that reference... I think it's a little more
>> readable and understandable compared to "MFlexGraphics" :-)
>
> "MFlexGraphics" is what the abbreviation section in the datasheet calls "MFG".
> I don't see "Flexible Graphics" at all in the datasheet, but it's an obvious
> inference of what the name means.
>
> I think keeping "MFlexGraphics" is better for people grepping for what
> the datasheet calls it.
>
Okay in MT8196 that was updated then.
On any other SoC previous to MT8196, the datasheet name is
"MediaTek Flexible Graphics (MFG)".
If you want to keep "MFlexGraphics" in MT8196-style, it's still a good idea to also
reference somewhere the old "MediaTek Flexible Graphics" name, as that was used for
more than 10 years. Really.
>>
>>> +
>>> +maintainers:
>>> + - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
>>> +
>>> +properties:
>>> + $nodename:
>>> + pattern: '^performance-controller@[a-f0-9]+$'
>>> +
>>> + compatible:
>>> + enum:
>>> + - mediatek,mt8196-gpufreq
>>> +
>>> + reg:
>>> + items:
>>> + - description: GPR memory area
>>> + - description: RPC memory area
>>> + - description: SoC variant ID register
>>> +
>>> + reg-names:
>>> + items:
>>> + - const: gpr
>>> + - const: rpc
>>> + - const: e2_id
>>
>> We should find a better name for that "e2_id".
>
> Agreed, but we don't have a register map that includes this address
> and would give us a different name.
Yeah but still, it feels like this naming is MT8196-specific, and this driver is
not entirely specific to this SoC (this version is, but with minor modifications
this can work on other chips as well).
> I think it's some sort of silicon revision.
It is. And there's a broad range of names that you can use in place of "e2_id"...
If there's no precise name, always go with something that is generic enough but
that resembles what can be found inside of the mmio that you're specifying.
>>
>>> +
>>> + clocks:
>>> + items:
>>> + - description: main clock of the embedded controller (EB)
>>> + - description: core PLL
>>> + - description: stack 0 PLL
>>> + - description: stack 1 PLL
>>> +
>>> + clock-names:
>>> + items:
>>> + - const: eb
>>> + - const: mfgpll
>>> + - const: mfgpll_sc0
>>> + - const: mfgpll_sc1
>>> +
>>> + mboxes:
>>> + items:
>>> + - description: FastDVFS events
>>> + - description: frequency control
>>> + - description: sleep control
>>> + - description: timer control
>>> + - description: frequency hopping control
>>> + - description: hardware voter control
>>> + - description: gpumpu (some type of memory control, unknown)
>>> + - description: FastDVFS control
>>> + - description: Unknown
>>> + - description: Unknown
>>> + - description: Unknown, but likely controls some boosting behaviour
>>> + - description: Unknown
>>> +
>>> + mbox-names:
>>> + items:
>>> + - const: fast_dvfs_event
>>
>> Any problem if we avoid underscores in names?
>>
>
> No but I'm not sure what the canonical naming style is for mailbox
> channels. "fastdvfsevent" is hard to read.
>
"fast-dvfs-event" would be good, wouldn't it? :-)
>>> + - const: gpufreq
>>> + - const: sleep
>>> + - const: timer
>>> + - const: fhctl
>>> + - const: ccf
>>> + - const: gpumpu
>>
>> "some type of memory control" .. it's really a MPU. For memory protection. :-)
>> Besides, I don't think we have to touch anything in the gpumpu for freq control
>> via gpueb.
>>
>
> Gotcha, so should I leave it out of the GPUFreq binding's used channels?
>
> Would leave a gap, but that's probably fine.
>
I really doubt that this is ever getting used at all for GPUFreq, so yes, leave it
out.
>>> + - const: fast_dvfs
>>> + - const: ipir_c_met
>>> + - const: ipis_c_met
>>
>> MET is a hardware event tracer / profiler... and I'm fairly sure that we have no
>> real reason to support it (at least, not like that, and not in a first submission).
>>
>> Ah btw: ipir ipis .. ipi-receive ipi-send
>>
>
> Gotcha, will remove those as well.
>
P.S.: of course I was implying that if we ever need to support those, we can always
add them later (but I still really doubt that we're ever going to use MET at all,
even though it would be *really* nice to).
>>> + - const: brisket
>>
>> Brisket is... something. There's one for the GPU, one for CPU, and one for APU.
>> Not sure what it exactly does, but seems to be or control a FLL (freq locked loop).
>>
>>> + - const: ppb
>>
>> PPB = Peak Power Budget
>>
>> The PPB needs its own "big" driver (the PBM - Power Budget Manager) in order to do
>> anything - as in - this manages a SoC-global peak power setting based on the
>> available maximum deliverable instantaneous (and/or sustainable) power from the
>> board's power source and it is mainly used for smartphone usecase (battery!).
>>
>> In order to work, the PPB HW (yet another mcu) needs to be initialized with tables
>> for CPU and GPU (and APU? and something else too?), and with other data explaining
>> the maximum instantaneous power that can delivered at a certain battery percentage.
>>
>> Important point is... I doubt that PPB is being initialized by the bootloader, on
>> all of Genio, Kompanio and Dimensity chips, so this should be disabled by default.
>>
>> You can keep it, especially now that you have a description for it - and because it
>> does indeed exist, but I doubt that we're using this anytime soon.
>
> If it's going to be used by a separate driver, wouldn't it be better if we don't make
> this channel part of the channels the GPUFreq driver uses?
>
Not sure if gpufreq needs to poke that channel to tell to the ppb "I'm setting this
frequency here", or if the MCUs can communicate that stuff on their own.
I didn't do any research about that.
Since adding stuff is kinda easier than removing, I guess avoiding to add this for
now is a sensible option. Let's do just that then.
>>
>> Cheers,
>> Angelo
>>
>
> Kind regards,
> Nicolas Frattaroli
>
>>> +
>>> + shmem:
>>> + $ref: /schemas/types.yaml#/definitions/phandle
>>> + description: phandle to the shared memory region of the GPUEB MCU
>>> +
>>> +required:
>>> + - compatible
>>> + - reg
>>> + - reg-names
>>> + - clocks
>>> + - clock-names
>>> + - mboxes
>>> + - mbox-names
>>> + - shmem
>>> +
>>> +additionalProperties: false
>>> +
>>> +examples:
>>> + - |
>>> + #include <dt-bindings/clock/mediatek,mt8196-clock.h>
>>> +
>>> + gpufreq: performance-controller@4b09fd00 {
>>> + compatible = "mediatek,mt8196-gpufreq";
>>> + reg = <0x4b09fd00 0x80>,
>>> + <0x4b800000 0x1000>,
>>> + <0x4b860128 0x4>;
>>> + reg-names = "gpr", "rpc", "e2_id";
>>> + clocks = <&topckgen CLK_TOP_MFG_EB>,
>>> + <&mfgpll CLK_MFG_AO_MFGPLL>,
>>> + <&mfgpll_sc0 CLK_MFGSC0_AO_MFGPLL_SC0>,
>>> + <&mfgpll_sc1 CLK_MFGSC1_AO_MFGPLL_SC1>;
>>> + clock-names = "eb", "mfgpll", "mfgpll_sc0",
>>> + "mfgpll_sc1";
>>> + mboxes = <&gpueb_mbox 0>, <&gpueb_mbox 1>, <&gpueb_mbox 2>,
>>> + <&gpueb_mbox 3>, <&gpueb_mbox 4>, <&gpueb_mbox 5>,
>>> + <&gpueb_mbox 6>, <&gpueb_mbox 7>, <&gpueb_mbox 8>,
>>> + <&gpueb_mbox 9>, <&gpueb_mbox 10>, <&gpueb_mbox 11>;
>>> + mbox-names = "fast_dvfs_event", "gpufreq", "sleep", "timer", "fhctl",
>>> + "ccf", "gpumpu", "fast_dvfs", "ipir_c_met", "ipis_c_met",
>>> + "brisket", "ppb";
>>> + shmem = <&gpufreq_shmem>;
>>> + };
>>>
>>
>>
>
>
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 03/10] dt-bindings: sram: Add compatible for mediatek,mt8196-gpufreq-sram
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
2025-09-05 10:22 ` [PATCH RFC 01/10] dt-bindings: gpu: mali-valhall-csf: add mediatek,mt8196-mali variant Nicolas Frattaroli
2025-09-05 10:22 ` [PATCH RFC 02/10] dt-bindings: devfreq: add mt8196-gpufreq binding Nicolas Frattaroli
@ 2025-09-05 10:22 ` Nicolas Frattaroli
2025-09-05 23:28 ` Rob Herring (Arm)
2025-09-05 10:23 ` [PATCH RFC 04/10] dt-bindings: mailbox: Add MT8196 GPUEB Mailbox Nicolas Frattaroli
` (6 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:22 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
This compatible is used for an SRAM section that's shared between the
MT8196's application processor cores and the embedded GPUEB MCU that
controls the GPU frequency.
Through this SRAM section, things about the GPU frequency controller
like the OPP table can be read.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
Documentation/devicetree/bindings/sram/sram.yaml | 1 +
1 file changed, 1 insertion(+)
diff --git a/Documentation/devicetree/bindings/sram/sram.yaml b/Documentation/devicetree/bindings/sram/sram.yaml
index 7c1337e159f2371401ae99313375656fff014ed4..6ba0dd6a66def11f56a1d5276d7397b655bff11e 100644
--- a/Documentation/devicetree/bindings/sram/sram.yaml
+++ b/Documentation/devicetree/bindings/sram/sram.yaml
@@ -89,6 +89,7 @@ patternProperties:
- arm,juno-scp-shmem
- arm,scmi-shmem
- arm,scp-shmem
+ - mediatek,mt8196-gpufreq-sram
- renesas,smp-sram
- rockchip,rk3066-smp-sram
- samsung,exynos4210-sysram
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 03/10] dt-bindings: sram: Add compatible for mediatek,mt8196-gpufreq-sram
2025-09-05 10:22 ` [PATCH RFC 03/10] dt-bindings: sram: Add compatible for mediatek,mt8196-gpufreq-sram Nicolas Frattaroli
@ 2025-09-05 23:28 ` Rob Herring (Arm)
0 siblings, 0 replies; 30+ messages in thread
From: Rob Herring (Arm) @ 2025-09-05 23:28 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: Liviu Dudau, MyungJoo Ham, Simona Vetter, Kees Cook,
Matthias Brugger, linux-mediatek, Maarten Lankhorst, Conor Dooley,
Maxime Ripard, David Airlie, Chia-I Wu, dri-devel, Jassi Brar,
Boris Brezillon, linux-pm, Kyungmin Park, Chanwoo Choi,
Thomas Zimmermann, linux-hardening, linux-arm-kernel, devicetree,
Krzysztof Kozlowski, linux-kernel, AngeloGioacchino Del Regno,
Steven Price, Gustavo A. R. Silva, kernel, Chen-Yu Tsai
On Fri, 05 Sep 2025 12:22:59 +0200, Nicolas Frattaroli wrote:
> This compatible is used for an SRAM section that's shared between the
> MT8196's application processor cores and the embedded GPUEB MCU that
> controls the GPU frequency.
>
> Through this SRAM section, things about the GPU frequency controller
> like the OPP table can be read.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> Documentation/devicetree/bindings/sram/sram.yaml | 1 +
> 1 file changed, 1 insertion(+)
>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 04/10] dt-bindings: mailbox: Add MT8196 GPUEB Mailbox
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
` (2 preceding siblings ...)
2025-09-05 10:22 ` [PATCH RFC 03/10] dt-bindings: sram: Add compatible for mediatek,mt8196-gpufreq-sram Nicolas Frattaroli
@ 2025-09-05 10:23 ` Nicolas Frattaroli
2025-09-05 23:31 ` Rob Herring
2025-09-05 10:23 ` [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox Nicolas Frattaroli
` (5 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:23 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
The MediaTek MT8196 SoC includes an embedded MCU referred to as "GPUEB",
acting as glue logic to control power and frequency of the Mali GPU.
This MCU runs proprietary firmware for this purpose, and the main
application processor communicates with it through a mailbox.
Add a binding that describes this mailbox.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
.../mailbox/mediatek,mt8196-gpueb-mbox.yaml | 64 ++++++++++++++++++++++
1 file changed, 64 insertions(+)
diff --git a/Documentation/devicetree/bindings/mailbox/mediatek,mt8196-gpueb-mbox.yaml b/Documentation/devicetree/bindings/mailbox/mediatek,mt8196-gpueb-mbox.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..506723d54ae0a429b462914f3666184c24c4fc5a
--- /dev/null
+++ b/Documentation/devicetree/bindings/mailbox/mediatek,mt8196-gpueb-mbox.yaml
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/mailbox/mediatek,mt8196-gpueb-mbox.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: MediaTek MFlexGraphics GPUEB Mailbox Controller
+
+maintainers:
+ - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
+
+properties:
+ compatible:
+ enum:
+ - mediatek,mt8196-gpueb-mbox
+
+ reg:
+ items:
+ - description: mailbox data registers
+ - description: mailbox control registers
+
+ reg-names:
+ items:
+ - const: mbox
+ - const: mbox_ctl
+
+ clocks:
+ items:
+ - description: main clock of the GPUEB MCU
+
+ interrupts:
+ items:
+ - description: fires when a new message is received
+
+ "#mbox-cells":
+ const: 1
+ description:
+ The number of the mailbox channel.
+
+required:
+ - compatible
+ - reg
+ - reg-names
+ - clocks
+ - interrupts
+ - "#mbox-cells"
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/mediatek,mt8196-clock.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+
+ gpueb_mbox: mailbox@4b09fd80 {
+ compatible = "mediatek,mt8196-gpueb-mbox";
+ reg = <0x4b09fd80 0x280>,
+ <0x4b170000 0x7c>;
+ reg-names = "mbox", "mbox_ctl";
+ clocks = <&topckgen CLK_TOP_MFG_EB>;
+ interrupts = <GIC_SPI 608 IRQ_TYPE_LEVEL_HIGH 0>;
+ #mbox-cells = <1>;
+ };
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 04/10] dt-bindings: mailbox: Add MT8196 GPUEB Mailbox
2025-09-05 10:23 ` [PATCH RFC 04/10] dt-bindings: mailbox: Add MT8196 GPUEB Mailbox Nicolas Frattaroli
@ 2025-09-05 23:31 ` Rob Herring
0 siblings, 0 replies; 30+ messages in thread
From: Rob Herring @ 2025-09-05 23:31 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva, Chia-I Wu,
Chen-Yu Tsai, kernel, dri-devel, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek, linux-pm, linux-hardening
On Fri, Sep 05, 2025 at 12:23:00PM +0200, Nicolas Frattaroli wrote:
> The MediaTek MT8196 SoC includes an embedded MCU referred to as "GPUEB",
> acting as glue logic to control power and frequency of the Mali GPU.
> This MCU runs proprietary firmware for this purpose, and the main
> application processor communicates with it through a mailbox.
>
> Add a binding that describes this mailbox.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> .../mailbox/mediatek,mt8196-gpueb-mbox.yaml | 64 ++++++++++++++++++++++
> 1 file changed, 64 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/mailbox/mediatek,mt8196-gpueb-mbox.yaml b/Documentation/devicetree/bindings/mailbox/mediatek,mt8196-gpueb-mbox.yaml
> new file mode 100644
> index 0000000000000000000000000000000000000000..506723d54ae0a429b462914f3666184c24c4fc5a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mailbox/mediatek,mt8196-gpueb-mbox.yaml
> @@ -0,0 +1,64 @@
> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/mailbox/mediatek,mt8196-gpueb-mbox.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: MediaTek MFlexGraphics GPUEB Mailbox Controller
> +
> +maintainers:
> + - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> +
> +properties:
> + compatible:
> + enum:
> + - mediatek,mt8196-gpueb-mbox
> +
> + reg:
> + items:
> + - description: mailbox data registers
> + - description: mailbox control registers
> +
> + reg-names:
> + items:
> + - const: mbox
> + - const: mbox_ctl
'mbox' is redundant. So 'data' and 'ctl'?
> +
> + clocks:
> + items:
> + - description: main clock of the GPUEB MCU
> +
> + interrupts:
> + items:
> + - description: fires when a new message is received
> +
> + "#mbox-cells":
> + const: 1
> + description:
> + The number of the mailbox channel.
> +
> +required:
> + - compatible
> + - reg
> + - reg-names
> + - clocks
> + - interrupts
> + - "#mbox-cells"
> +
> +additionalProperties: false
> +
> +examples:
> + - |
> + #include <dt-bindings/clock/mediatek,mt8196-clock.h>
> + #include <dt-bindings/interrupt-controller/arm-gic.h>
> + #include <dt-bindings/interrupt-controller/irq.h>
> +
> + gpueb_mbox: mailbox@4b09fd80 {
Drop unused labels.
> + compatible = "mediatek,mt8196-gpueb-mbox";
> + reg = <0x4b09fd80 0x280>,
> + <0x4b170000 0x7c>;
> + reg-names = "mbox", "mbox_ctl";
> + clocks = <&topckgen CLK_TOP_MFG_EB>;
> + interrupts = <GIC_SPI 608 IRQ_TYPE_LEVEL_HIGH 0>;
> + #mbox-cells = <1>;
> + };
>
> --
> 2.51.0
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
` (3 preceding siblings ...)
2025-09-05 10:23 ` [PATCH RFC 04/10] dt-bindings: mailbox: Add MT8196 GPUEB Mailbox Nicolas Frattaroli
@ 2025-09-05 10:23 ` Nicolas Frattaroli
2025-09-08 10:06 ` AngeloGioacchino Del Regno
2025-09-12 4:48 ` Chia-I Wu
2025-09-05 10:23 ` [PATCH RFC 06/10] drm/panthor: call into devfreq for current frequency Nicolas Frattaroli
` (4 subsequent siblings)
9 siblings, 2 replies; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:23 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
The MT8196 SoC uses an embedded MCU to control frequencies and power of
the GPU. This controller is referred to as "GPUEB".
It communicates to the application processor, among other ways, through
a mailbox.
The mailbox exposes one interrupt, which appears to only be fired when a
response is received, rather than a transaction is completed. For us,
this means we unfortunately need to poll for txdone.
The mailbox also requires the EB clock to be on when touching any of the
mailbox registers.
Add a simple driver for it based on the common mailbox framework.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/mailbox/Kconfig | 10 ++
drivers/mailbox/Makefile | 2 +
drivers/mailbox/mtk-gpueb-mailbox.c | 330 ++++++++++++++++++++++++++++++++++++
3 files changed, 342 insertions(+)
diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
index 02432d4a5ccd46a16156a09c7f277fb03e4013f5..2016defda1fabb5c0fcc8078f84a52d4e4e00167 100644
--- a/drivers/mailbox/Kconfig
+++ b/drivers/mailbox/Kconfig
@@ -294,6 +294,16 @@ config MTK_CMDQ_MBOX
critical time limitation, such as updating display configuration
during the vblank.
+config MTK_GPUEB_MBOX
+ tristate "MediaTek GPUEB Mailbox Support"
+ depends on ARCH_MEDIATEK || COMPILE_TEST
+ help
+ The MediaTek GPUEB mailbox is used to communicate with the embedded
+ controller in charge of GPU frequency and power management on some
+ MediaTek SoCs, such as the MT8196.
+ Say Y or m here if you want to support the MT8196 SoC in your kernel
+ build.
+
config ZYNQMP_IPI_MBOX
tristate "Xilinx ZynqMP IPI Mailbox"
depends on ARCH_ZYNQMP && OF
diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
index 98a68f838486eed117d24296138bf59fda3f92e4..564d06e71313e6d1972e4a6036e1e78c8c7ec450 100644
--- a/drivers/mailbox/Makefile
+++ b/drivers/mailbox/Makefile
@@ -63,6 +63,8 @@ obj-$(CONFIG_MTK_ADSP_MBOX) += mtk-adsp-mailbox.o
obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
+obj-$(CONFIG_MTK_GPUEB_MBOX) += mtk-gpueb-mailbox.o
+
obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
diff --git a/drivers/mailbox/mtk-gpueb-mailbox.c b/drivers/mailbox/mtk-gpueb-mailbox.c
new file mode 100644
index 0000000000000000000000000000000000000000..0236fb358136e434a09a21ef293fe949ced94123
--- /dev/null
+++ b/drivers/mailbox/mtk-gpueb-mailbox.c
@@ -0,0 +1,330 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * MediaTek GPUEB mailbox driver for SoCs such as the MT8196
+ *
+ * Copyright (C) 2025, Collabora Ltd.
+ *
+ * Developers harmed in the making of this driver:
+ * - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
+ */
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/mailbox_controller.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+
+#define MBOX_CTL_TX_STS 0x0000
+#define MBOX_CTL_IRQ_SET 0x0004
+#define MBOX_CTL_IRQ_CLR 0x0074
+#define MBOX_CTL_RX_STS 0x0078
+
+#define MBOX_FULL BIT(0) /* i.e. we've received data */
+#define MBOX_CLOGGED BIT(1) /* i.e. the channel is shutdown */
+
+struct mtk_gpueb_mbox {
+ struct device *dev;
+ struct clk *clk;
+ void __iomem *mbox_mmio;
+ void __iomem *mbox_ctl;
+ void **rx_buf;
+ atomic_t *rx_status;
+ struct mbox_controller mbox;
+ unsigned int *chn;
+ int irq;
+ const struct mtk_gpueb_mbox_variant *v;
+};
+
+struct mtk_gpueb_mbox_ch {
+ const char *name;
+ const int num;
+ const unsigned int tx_offset;
+ const unsigned int tx_len;
+ const unsigned int rx_offset;
+ const unsigned int rx_len;
+ const bool no_response;
+};
+
+struct mtk_gpueb_mbox_variant {
+ unsigned int num_channels;
+ const struct mtk_gpueb_mbox_ch channels[] __counted_by(num_channels);
+};
+
+/**
+ * mtk_gpueb_mbox_read_rx - read RX buffer from MMIO into ebm's RX buffer
+ * @ebm: pointer to &struct mtk_gpueb_mbox instance
+ * @channel: number of channel to read
+ */
+static void mtk_gpueb_mbox_read_rx(struct mtk_gpueb_mbox *ebm,
+ unsigned int channel)
+{
+ const struct mtk_gpueb_mbox_ch *ch;
+
+ ch = &ebm->v->channels[channel];
+
+ memcpy_fromio(ebm->rx_buf[channel], ebm->mbox_mmio + ch->rx_offset,
+ ch->rx_len);
+
+}
+
+static irqreturn_t mtk_gpueb_mbox_isr(int irq, void *data)
+{
+ struct mtk_gpueb_mbox *ebm = data;
+ u32 rx_handled = 0;
+ u32 rx_sts;
+ int i;
+
+ rx_sts = readl(ebm->mbox_ctl + MBOX_CTL_RX_STS);
+
+ for (i = 0; i < ebm->v->num_channels; i++) {
+ if (rx_sts & BIT(i)) {
+ if (!atomic_cmpxchg(&ebm->rx_status[i], 0,
+ MBOX_FULL | MBOX_CLOGGED))
+ rx_handled |= BIT(i);
+ }
+ }
+
+ writel(rx_handled, ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
+
+ if (!(rx_sts ^ rx_handled))
+ return IRQ_WAKE_THREAD;
+
+ dev_warn_ratelimited(ebm->dev, "spurious interrupts on 0x%04X\n",
+ rx_sts ^ rx_handled);
+ return IRQ_NONE;
+}
+
+static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
+{
+ struct mtk_gpueb_mbox *ebm = data;
+ irqreturn_t ret = IRQ_NONE;
+ int status;
+ int i;
+
+ for (i = 0; i < ebm->v->num_channels; i++) {
+ status = atomic_cmpxchg(&ebm->rx_status[i],
+ MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
+ if (status == (MBOX_FULL | MBOX_CLOGGED)) {
+ mtk_gpueb_mbox_read_rx(ebm, i);
+ mbox_chan_received_data(&ebm->mbox.chans[i],
+ ebm->rx_buf[i]);
+ /* FIXME: When does MBOX_FULL get cleared? Here? */
+ atomic_set(&ebm->rx_status[i], 0);
+ ret = IRQ_HANDLED;
+ }
+ }
+
+ return ret;
+}
+
+static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
+{
+ struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
+ unsigned int *num = chan->con_priv;
+ int i;
+ u32 *values = data;
+
+ if (*num >= ebm->v->num_channels)
+ return -ECHRNG;
+
+ if (!ebm->v->channels[*num].no_response &&
+ atomic_read(&ebm->rx_status[*num]))
+ return -EBUSY;
+
+ writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
+
+ /*
+ * We don't want any fancy nonsense, just write the 32-bit values in
+ * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
+ */
+ for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
+ writel(values[i / 4],
+ ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
+ }
+
+ writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
+
+ return 0;
+}
+
+static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
+{
+ struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
+ unsigned int *num = chan->con_priv;
+
+ if (*num >= ebm->v->num_channels)
+ return -ECHRNG;
+
+ atomic_set(&ebm->rx_status[*num], 0);
+
+ return 0;
+}
+
+static void mtk_gpueb_mbox_shutdown(struct mbox_chan *chan)
+{
+ struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
+ unsigned int *num = chan->con_priv;
+
+ atomic_set(&ebm->rx_status[*num], MBOX_CLOGGED);
+}
+
+static bool mtk_gpueb_mbox_last_tx_done(struct mbox_chan *chan)
+{
+ struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
+ unsigned int *num = chan->con_priv;
+
+ return !(readl(ebm->mbox_ctl + MBOX_CTL_TX_STS) & BIT(*num));
+}
+
+const struct mbox_chan_ops mtk_gpueb_mbox_ops = {
+ .send_data = mtk_gpueb_mbox_send_data,
+ .startup = mtk_gpueb_mbox_startup,
+ .shutdown = mtk_gpueb_mbox_shutdown,
+ .last_tx_done = mtk_gpueb_mbox_last_tx_done,
+};
+
+static struct mbox_chan *
+mtk_gpueb_mbox_of_xlate(struct mbox_controller *mbox,
+ const struct of_phandle_args *sp)
+{
+ struct mtk_gpueb_mbox *ebm = dev_get_drvdata(mbox->dev);
+
+ if (!sp->args_count)
+ return ERR_PTR(-EINVAL);
+
+ if (sp->args[0] >= ebm->v->num_channels)
+ return ERR_PTR(-ECHRNG);
+
+ return &mbox->chans[sp->args[0]];
+}
+
+static int mtk_gpueb_mbox_probe(struct platform_device *pdev)
+{
+ struct mtk_gpueb_mbox *ebm;
+ unsigned int rx_buf_sz;
+ void *buf;
+ unsigned int i;
+ int ret;
+
+ ebm = devm_kzalloc(&pdev->dev, sizeof(*ebm), GFP_KERNEL);
+ if (!ebm)
+ return -ENOMEM;
+
+ ebm->dev = &pdev->dev;
+ ebm->v = of_device_get_match_data(ebm->dev);
+
+ dev_set_drvdata(ebm->dev, ebm);
+
+ ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
+ if (IS_ERR(ebm->clk))
+ return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
+ "Failed to get 'eb' clock\n");
+
+ ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
+ if (IS_ERR(ebm->mbox_mmio))
+ return dev_err_probe(ebm->dev, PTR_ERR(ebm->mbox_mmio),
+ "Couldn't map mailbox registers\n");
+
+ ebm->mbox_ctl = devm_platform_ioremap_resource_byname(pdev, "mbox_ctl");
+ if (IS_ERR(ebm->mbox_ctl))
+ return dev_err_probe(
+ ebm->dev, PTR_ERR(ebm->mbox_ctl),
+ "Couldn't map mailbox control registers\n");
+
+ rx_buf_sz = (ebm->v->channels[ebm->v->num_channels - 1].rx_offset +
+ ebm->v->channels[ebm->v->num_channels - 1].rx_len);
+
+ buf = devm_kzalloc(ebm->dev, rx_buf_sz, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ ebm->rx_buf = devm_kmalloc_array(ebm->dev, ebm->v->num_channels,
+ sizeof(*ebm->rx_buf), GFP_KERNEL);
+ if (!ebm->rx_buf)
+ return -ENOMEM;
+
+ ebm->mbox.chans = devm_kcalloc(ebm->dev, ebm->v->num_channels,
+ sizeof(struct mbox_chan), GFP_KERNEL);
+ if (!ebm->mbox.chans)
+ return -ENOMEM;
+
+ ebm->rx_status = devm_kcalloc(ebm->dev, ebm->v->num_channels,
+ sizeof(atomic_t), GFP_KERNEL);
+ if (!ebm->rx_status)
+ return -ENOMEM;
+
+ ebm->chn = devm_kcalloc(ebm->dev, ebm->v->num_channels,
+ sizeof(*ebm->chn), GFP_KERNEL);
+
+ for (i = 0; i < ebm->v->num_channels; i++) {
+ ebm->rx_buf[i] = buf + ebm->v->channels[i].rx_offset;
+ spin_lock_init(&ebm->mbox.chans[i].lock);
+ /* the things you do to avoid explicit casting void* */
+ ebm->chn[i] = i;
+ ebm->mbox.chans[i].con_priv = &ebm->chn[i];
+ atomic_set(&ebm->rx_status[i], MBOX_CLOGGED);
+ }
+
+ ebm->mbox.dev = ebm->dev;
+ ebm->mbox.num_chans = ebm->v->num_channels;
+ ebm->mbox.txdone_poll = true;
+ ebm->mbox.txpoll_period = 0; /* minimum hrtimer interval */
+ ebm->mbox.of_xlate = mtk_gpueb_mbox_of_xlate;
+ ebm->mbox.ops = &mtk_gpueb_mbox_ops;
+
+ ebm->irq = platform_get_irq(pdev, 0);
+ if (ebm->irq < 0)
+ return ebm->irq;
+
+ ret = devm_request_threaded_irq(ebm->dev, ebm->irq, mtk_gpueb_mbox_isr,
+ mtk_gpueb_mbox_thread, 0, NULL, ebm);
+ if (ret)
+ return dev_err_probe(ebm->dev, ret, "failed to request IRQ\n");
+
+ ret = devm_mbox_controller_register(ebm->dev, &ebm->mbox);
+
+ return 0;
+}
+
+static const struct mtk_gpueb_mbox_variant mtk_gpueb_mbox_mt8196 = {
+ .num_channels = 12,
+ .channels = {
+ { "fast_dvfs_event", 0, 0x0000, 16, 0x00e0, 16, false },
+ { "gpufreq", 1, 0x0010, 32, 0x00f0, 32, false },
+ { "sleep", 2, 0x0030, 12, 0x0110, 4, true },
+ { "timer", 3, 0x003c, 24, 0x0114, 4, false },
+ { "fhctl", 4, 0x0054, 36, 0x0118, 4, false },
+ { "ccf", 5, 0x0078, 16, 0x011c, 16, false },
+ { "gpumpu", 6, 0x0088, 24, 0x012c, 4, false },
+ { "fast_dvfs", 7, 0x00a0, 24, 0x0130, 24, false },
+ { "ipir_c_met", 8, 0x00b8, 4, 0x0148, 16, false },
+ { "ipis_c_met", 9, 0x00bc, 16, 0x0158, 4, false },
+ { "brisket", 10, 0x00cc, 16, 0x015c, 16, false },
+ { "ppb", 11, 0x00dc, 4, 0x016c, 4, false },
+ },
+};
+
+static const struct of_device_id mtk_gpueb_mbox_of_ids[] = {
+ { .compatible = "mediatek,mt8196-gpueb-mbox",
+ .data = &mtk_gpueb_mbox_mt8196 },
+ { /* Sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, mtk_gpueb_mbox_of_ids);
+
+static struct platform_driver mtk_gpueb_mbox_drv = {
+ .probe = mtk_gpueb_mbox_probe,
+ .driver = {
+ .name = "mtk-gpueb-mbox",
+ .of_match_table = mtk_gpueb_mbox_of_ids,
+ }
+};
+module_platform_driver(mtk_gpueb_mbox_drv);
+
+MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
+MODULE_DESCRIPTION("MediaTek GPUEB mailbox driver for SoCs such as the MT8196");
+MODULE_LICENSE("GPL");
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-05 10:23 ` [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox Nicolas Frattaroli
@ 2025-09-08 10:06 ` AngeloGioacchino Del Regno
2025-09-08 12:05 ` Nicolas Frattaroli
2025-09-12 4:48 ` Chia-I Wu
1 sibling, 1 reply; 30+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-09-08 10:06 UTC (permalink / raw)
To: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
Il 05/09/25 12:23, Nicolas Frattaroli ha scritto:
> The MT8196 SoC uses an embedded MCU to control frequencies and power of
> the GPU. This controller is referred to as "GPUEB".
>
> It communicates to the application processor, among other ways, through
> a mailbox.
>
> The mailbox exposes one interrupt, which appears to only be fired when a
> response is received, rather than a transaction is completed. For us,
> this means we unfortunately need to poll for txdone.
>
> The mailbox also requires the EB clock to be on when touching any of the
> mailbox registers.
>
> Add a simple driver for it based on the common mailbox framework.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Only a few nits in this, check below.
> ---
> drivers/mailbox/Kconfig | 10 ++
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/mtk-gpueb-mailbox.c | 330 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 342 insertions(+)
>
> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> index 02432d4a5ccd46a16156a09c7f277fb03e4013f5..2016defda1fabb5c0fcc8078f84a52d4e4e00167 100644
> --- a/drivers/mailbox/Kconfig
> +++ b/drivers/mailbox/Kconfig
> @@ -294,6 +294,16 @@ config MTK_CMDQ_MBOX
> critical time limitation, such as updating display configuration
> during the vblank.
>
> +config MTK_GPUEB_MBOX
> + tristate "MediaTek GPUEB Mailbox Support"
> + depends on ARCH_MEDIATEK || COMPILE_TEST
> + help
> + The MediaTek GPUEB mailbox is used to communicate with the embedded
> + controller in charge of GPU frequency and power management on some
> + MediaTek SoCs, such as the MT8196.
> + Say Y or m here if you want to support the MT8196 SoC in your kernel
> + build.
> +
> config ZYNQMP_IPI_MBOX
> tristate "Xilinx ZynqMP IPI Mailbox"
> depends on ARCH_ZYNQMP && OF
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index 98a68f838486eed117d24296138bf59fda3f92e4..564d06e71313e6d1972e4a6036e1e78c8c7ec450 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -63,6 +63,8 @@ obj-$(CONFIG_MTK_ADSP_MBOX) += mtk-adsp-mailbox.o
>
> obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>
> +obj-$(CONFIG_MTK_GPUEB_MBOX) += mtk-gpueb-mailbox.o
> +
> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>
> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
> diff --git a/drivers/mailbox/mtk-gpueb-mailbox.c b/drivers/mailbox/mtk-gpueb-mailbox.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..0236fb358136e434a09a21ef293fe949ced94123
> --- /dev/null
> +++ b/drivers/mailbox/mtk-gpueb-mailbox.c
> @@ -0,0 +1,330 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * MediaTek GPUEB mailbox driver for SoCs such as the MT8196
> + *
> + * Copyright (C) 2025, Collabora Ltd.
> + *
> + * Developers harmed in the making of this driver:
lol
> + * - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> + */
> +
> +#include <linux/atomic.h>
> +#include <linux/clk.h>
> +#include <linux/device.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +
> +#define MBOX_CTL_TX_STS 0x0000
> +#define MBOX_CTL_IRQ_SET 0x0004
> +#define MBOX_CTL_IRQ_CLR 0x0074
> +#define MBOX_CTL_RX_STS 0x0078
> +
> +#define MBOX_FULL BIT(0) /* i.e. we've received data */
> +#define MBOX_CLOGGED BIT(1) /* i.e. the channel is shutdown */
> +
> +struct mtk_gpueb_mbox {
> + struct device *dev;
> + struct clk *clk;
> + void __iomem *mbox_mmio;
> + void __iomem *mbox_ctl;
> + void **rx_buf;
> + atomic_t *rx_status;
> + struct mbox_controller mbox;
> + unsigned int *chn;
> + int irq;
> + const struct mtk_gpueb_mbox_variant *v;
> +};
> +
> +struct mtk_gpueb_mbox_ch {
> + const char *name;
> + const int num;
8 bits are enough for `num`....
> + const unsigned int tx_offset;
You don't need more than 16 bits for the tx/rx offset, do you?
> + const unsigned int tx_len;
...and I'm sure that it's unrealistic to think that we could ever see any future
gpueb mailbox with tx/rx len that is more than 255 bytes, so you don't need more
than 8 bits for the length.
> + const unsigned int rx_offset;
> + const unsigned int rx_len;
> + const bool no_response;
> +};
> +
> +struct mtk_gpueb_mbox_variant {
> + unsigned int num_channels;
> + const struct mtk_gpueb_mbox_ch channels[] __counted_by(num_channels);
> +};
> +
> +/**
> + * mtk_gpueb_mbox_read_rx - read RX buffer from MMIO into ebm's RX buffer
> + * @ebm: pointer to &struct mtk_gpueb_mbox instance
> + * @channel: number of channel to read
> + */
> +static void mtk_gpueb_mbox_read_rx(struct mtk_gpueb_mbox *ebm,
> + unsigned int channel)
> +{
> + const struct mtk_gpueb_mbox_ch *ch;
Just initialize it during declaration, it's shorter.
> +
> + ch = &ebm->v->channels[channel];
> +
> + memcpy_fromio(ebm->rx_buf[channel], ebm->mbox_mmio + ch->rx_offset,
> + ch->rx_len);
Please use a single line, 89 columns is fine :-)
> +
> +}
> +
> +static irqreturn_t mtk_gpueb_mbox_isr(int irq, void *data)
> +{
> + struct mtk_gpueb_mbox *ebm = data;
> + u32 rx_handled = 0;
> + u32 rx_sts;
> + int i;
> +
> + rx_sts = readl(ebm->mbox_ctl + MBOX_CTL_RX_STS);
> +
> + for (i = 0; i < ebm->v->num_channels; i++) {
> + if (rx_sts & BIT(i)) {
> + if (!atomic_cmpxchg(&ebm->rx_status[i], 0,
> + MBOX_FULL | MBOX_CLOGGED))
> + rx_handled |= BIT(i);
> + }
> + }
> +
> + writel(rx_handled, ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> +
> + if (!(rx_sts ^ rx_handled))
> + return IRQ_WAKE_THREAD;
> +
> + dev_warn_ratelimited(ebm->dev, "spurious interrupts on 0x%04X\n",
> + rx_sts ^ rx_handled);
> + return IRQ_NONE;
> +}
> +
> +static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
> +{
> + struct mtk_gpueb_mbox *ebm = data;
> + irqreturn_t ret = IRQ_NONE;
> + int status;
> + int i;
> +
> + for (i = 0; i < ebm->v->num_channels; i++) {
> + status = atomic_cmpxchg(&ebm->rx_status[i],
> + MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
> + if (status == (MBOX_FULL | MBOX_CLOGGED)) {
> + mtk_gpueb_mbox_read_rx(ebm, i);
> + mbox_chan_received_data(&ebm->mbox.chans[i],
> + ebm->rx_buf[i]);
> + /* FIXME: When does MBOX_FULL get cleared? Here? */
> + atomic_set(&ebm->rx_status[i], 0);
I say that you're right and it has to get cleared here.
You got (and handled) the message, so the info in the mbox is not needed anymore.
> + ret = IRQ_HANDLED;
> + }
> + }
> +
> + return ret;
> +}
> +
> +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> + int i;
int i, j;
> + u32 *values = data;
> +
> + if (*num >= ebm->v->num_channels)
> + return -ECHRNG;
> +
> + if (!ebm->v->channels[*num].no_response &&
> + atomic_read(&ebm->rx_status[*num]))
> + return -EBUSY;
> +
> + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> +
> + /*
> + * We don't want any fancy nonsense, just write the 32-bit values in
> + * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
> + */
> + for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
Just use an additional `j` index, so that you can avoid division.
> + writel(values[i / 4],
> + ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
> + }
> +
> + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
> +
> + return 0;
> +}
> +
> +static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> +
> + if (*num >= ebm->v->num_channels)
> + return -ECHRNG;
> +
> + atomic_set(&ebm->rx_status[*num], 0);
> +
> + return 0;
> +}
> +
> +static void mtk_gpueb_mbox_shutdown(struct mbox_chan *chan)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> +
> + atomic_set(&ebm->rx_status[*num], MBOX_CLOGGED);
> +}
> +
> +static bool mtk_gpueb_mbox_last_tx_done(struct mbox_chan *chan)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> +
> + return !(readl(ebm->mbox_ctl + MBOX_CTL_TX_STS) & BIT(*num));
> +}
> +
> +const struct mbox_chan_ops mtk_gpueb_mbox_ops = {
> + .send_data = mtk_gpueb_mbox_send_data,
> + .startup = mtk_gpueb_mbox_startup,
> + .shutdown = mtk_gpueb_mbox_shutdown,
> + .last_tx_done = mtk_gpueb_mbox_last_tx_done,
> +};
> +
> +static struct mbox_chan *
> +mtk_gpueb_mbox_of_xlate(struct mbox_controller *mbox,
> + const struct of_phandle_args *sp)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(mbox->dev);
> +
> + if (!sp->args_count)
> + return ERR_PTR(-EINVAL);
> +
> + if (sp->args[0] >= ebm->v->num_channels)
> + return ERR_PTR(-ECHRNG);
> +
> + return &mbox->chans[sp->args[0]];
> +}
> +
> +static int mtk_gpueb_mbox_probe(struct platform_device *pdev)
> +{
> + struct mtk_gpueb_mbox *ebm;
> + unsigned int rx_buf_sz;
> + void *buf;
> + unsigned int i;
> + int ret;
> +
> + ebm = devm_kzalloc(&pdev->dev, sizeof(*ebm), GFP_KERNEL);
> + if (!ebm)
> + return -ENOMEM;
> +
> + ebm->dev = &pdev->dev;
> + ebm->v = of_device_get_match_data(ebm->dev);
> +
> + dev_set_drvdata(ebm->dev, ebm);
My preference is to set drvdata after all of the paths that can fail (or anyway
before anything that uses it), but it's your own choice.
> +
> + ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
> + if (IS_ERR(ebm->clk))
> + return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
> + "Failed to get 'eb' clock\n");
> +
> + ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
I'd say that "chan" and "ctl" are more descriptive as resource names, but then,
do we really need to search by name?
Doing that by index is also an option, as you can write the MMIO names and their
full description in the bindings instead.
> + if (IS_ERR(ebm->mbox_mmio))
> + return dev_err_probe(ebm->dev, PTR_ERR(ebm->mbox_mmio),
> + "Couldn't map mailbox registers\n");
> +
> + ebm->mbox_ctl = devm_platform_ioremap_resource_byname(pdev, "mbox_ctl");
> + if (IS_ERR(ebm->mbox_ctl))
> + return dev_err_probe(
> + ebm->dev, PTR_ERR(ebm->mbox_ctl),
> + "Couldn't map mailbox control registers\n");
> +
> + rx_buf_sz = (ebm->v->channels[ebm->v->num_channels - 1].rx_offset +
> + ebm->v->channels[ebm->v->num_channels - 1].rx_len);
> +
> + buf = devm_kzalloc(ebm->dev, rx_buf_sz, GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;
> +
> + ebm->rx_buf = devm_kmalloc_array(ebm->dev, ebm->v->num_channels,
> + sizeof(*ebm->rx_buf), GFP_KERNEL);
> + if (!ebm->rx_buf)
> + return -ENOMEM;
> +
> + ebm->mbox.chans = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> + sizeof(struct mbox_chan), GFP_KERNEL);
> + if (!ebm->mbox.chans)
> + return -ENOMEM;
> +
> + ebm->rx_status = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> + sizeof(atomic_t), GFP_KERNEL);
> + if (!ebm->rx_status)
> + return -ENOMEM;
> +
> + ebm->chn = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> + sizeof(*ebm->chn), GFP_KERNEL);
Two things:
1. You missed error checking, oops!
2. chn is fully initialized later, you can devm_kmalloc_array() here instead,
so that you can avoid double initialization (you don't need zeroing).
> +
> + for (i = 0; i < ebm->v->num_channels; i++) {
> + ebm->rx_buf[i] = buf + ebm->v->channels[i].rx_offset;
> + spin_lock_init(&ebm->mbox.chans[i].lock);
> + /* the things you do to avoid explicit casting void* */
> + ebm->chn[i] = i;
> + ebm->mbox.chans[i].con_priv = &ebm->chn[i];
> + atomic_set(&ebm->rx_status[i], MBOX_CLOGGED);
> + }
> +
> + ebm->mbox.dev = ebm->dev;
> + ebm->mbox.num_chans = ebm->v->num_channels;
> + ebm->mbox.txdone_poll = true;
> + ebm->mbox.txpoll_period = 0; /* minimum hrtimer interval */
> + ebm->mbox.of_xlate = mtk_gpueb_mbox_of_xlate;
> + ebm->mbox.ops = &mtk_gpueb_mbox_ops;
> +
> + ebm->irq = platform_get_irq(pdev, 0);
> + if (ebm->irq < 0)
> + return ebm->irq;
Please move the platform_get_irq() up a bit, before all those expensive memory
allocations - if it fails, at least it fails faster in that case.
> +
> + ret = devm_request_threaded_irq(ebm->dev, ebm->irq, mtk_gpueb_mbox_isr,
> + mtk_gpueb_mbox_thread, 0, NULL, ebm);
> + if (ret)
> + return dev_err_probe(ebm->dev, ret, "failed to request IRQ\n");
> +
> + ret = devm_mbox_controller_register(ebm->dev, &ebm->mbox);
error check
> +
> + return 0;
> +}
> +
> +static const struct mtk_gpueb_mbox_variant mtk_gpueb_mbox_mt8196 = {
> + .num_channels = 12,
> + .channels = {
> + { "fast_dvfs_event", 0, 0x0000, 16, 0x00e0, 16, false },
> + { "gpufreq", 1, 0x0010, 32, 0x00f0, 32, false },
> + { "sleep", 2, 0x0030, 12, 0x0110, 4, true },
> + { "timer", 3, 0x003c, 24, 0x0114, 4, false },
> + { "fhctl", 4, 0x0054, 36, 0x0118, 4, false },
> + { "ccf", 5, 0x0078, 16, 0x011c, 16, false },
> + { "gpumpu", 6, 0x0088, 24, 0x012c, 4, false },
> + { "fast_dvfs", 7, 0x00a0, 24, 0x0130, 24, false },
> + { "ipir_c_met", 8, 0x00b8, 4, 0x0148, 16, false },
> + { "ipis_c_met", 9, 0x00bc, 16, 0x0158, 4, false },
> + { "brisket", 10, 0x00cc, 16, 0x015c, 16, false },
> + { "ppb", 11, 0x00dc, 4, 0x016c, 4, false },
> + },
> +};
> +
> +static const struct of_device_id mtk_gpueb_mbox_of_ids[] = {
> + { .compatible = "mediatek,mt8196-gpueb-mbox",
> + .data = &mtk_gpueb_mbox_mt8196 },
single line please
> + { /* Sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, mtk_gpueb_mbox_of_ids);
> +
> +static struct platform_driver mtk_gpueb_mbox_drv = {
> + .probe = mtk_gpueb_mbox_probe,
> + .driver = {
> + .name = "mtk-gpueb-mbox",
> + .of_match_table = mtk_gpueb_mbox_of_ids,
> + }
> +};
> +module_platform_driver(mtk_gpueb_mbox_drv);
> +
> +MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
> +MODULE_DESCRIPTION("MediaTek GPUEB mailbox driver for SoCs such as the MT8196");
Just "MediaTek GPUEB Mailbox driver" is fine - there are other SoCs that can make
use of this driver (whether you're implementing support for those in this moment
is not important), and the others are very-very-very-very similar to how it's done
in 8196.
> +MODULE_LICENSE("GPL");
>
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-08 10:06 ` AngeloGioacchino Del Regno
@ 2025-09-08 12:05 ` Nicolas Frattaroli
2025-09-08 12:34 ` AngeloGioacchino Del Regno
0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-08 12:05 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Matthias Brugger,
MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Jassi Brar, Kees Cook,
Gustavo A. R. Silva, AngeloGioacchino Del Regno
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
On Monday, 8 September 2025 12:06:01 Central European Summer Time AngeloGioacchino Del Regno wrote:
> Il 05/09/25 12:23, Nicolas Frattaroli ha scritto:
> > The MT8196 SoC uses an embedded MCU to control frequencies and power of
> > the GPU. This controller is referred to as "GPUEB".
> >
> > It communicates to the application processor, among other ways, through
> > a mailbox.
> >
> > The mailbox exposes one interrupt, which appears to only be fired when a
> > response is received, rather than a transaction is completed. For us,
> > this means we unfortunately need to poll for txdone.
> >
> > The mailbox also requires the EB clock to be on when touching any of the
> > mailbox registers.
> >
> > Add a simple driver for it based on the common mailbox framework.
> >
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
>
> Only a few nits in this, check below.
>
> [...]
> > +
> > +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > + int i;
>
> int i, j;
>
> > + u32 *values = data;
> > +
> > + if (*num >= ebm->v->num_channels)
> > + return -ECHRNG;
> > +
> > + if (!ebm->v->channels[*num].no_response &&
> > + atomic_read(&ebm->rx_status[*num]))
> > + return -EBUSY;
> > +
> > + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > +
> > + /*
> > + * We don't want any fancy nonsense, just write the 32-bit values in
> > + * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
> > + */
> > + for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
>
> Just use an additional `j` index, so that you can avoid division.
The `/ 4` division here is equivalent to a `>> 2` which comes free with
almost every instruction on arm64, I don't think having two separate
indices makes the code any clearer? Unless I misunderstand how you'd
want me to use j here.
Like this?
j = 0;
for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
writel(values[j++], ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
}
This makes the relationship between the values index and i less clear. (And
in my rendition, assumes the reader knows how postincrement works, but I
think assuming people know C is fine.)
> [...]
>
> > +
> > + ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
> > + if (IS_ERR(ebm->clk))
> > + return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
> > + "Failed to get 'eb' clock\n");
> > +
> > + ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
>
> I'd say that "chan" and "ctl" are more descriptive as resource names, but then,
> do we really need to search by name?
In the binding, it was proposed to change "mbox" to something like "data",
which is fine by me, and to drop the "mbox" prefix of "ctl".
>
> Doing that by index is also an option, as you can write the MMIO names and their
> full description in the bindings instead.
Yeah in the driver I think I'll switch to doing indices until some second
compatible forces us to actually rely on names because it adds a bunch of
other ranges.
> [...]
thanks for the feedback, assume that anything I didn't directly respond
to will be fixed in the next revision.
Kind regards,
Nicolas Frattaroli
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-08 12:05 ` Nicolas Frattaroli
@ 2025-09-08 12:34 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 30+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-09-08 12:34 UTC (permalink / raw)
To: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Matthias Brugger, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
Il 08/09/25 14:05, Nicolas Frattaroli ha scritto:
> On Monday, 8 September 2025 12:06:01 Central European Summer Time AngeloGioacchino Del Regno wrote:
>> Il 05/09/25 12:23, Nicolas Frattaroli ha scritto:
>>> The MT8196 SoC uses an embedded MCU to control frequencies and power of
>>> the GPU. This controller is referred to as "GPUEB".
>>>
>>> It communicates to the application processor, among other ways, through
>>> a mailbox.
>>>
>>> The mailbox exposes one interrupt, which appears to only be fired when a
>>> response is received, rather than a transaction is completed. For us,
>>> this means we unfortunately need to poll for txdone.
>>>
>>> The mailbox also requires the EB clock to be on when touching any of the
>>> mailbox registers.
>>>
>>> Add a simple driver for it based on the common mailbox framework.
>>>
>>> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
>>
>> Only a few nits in this, check below.
>>
>> [...]
>>> +
>>> +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
>>> +{
>>> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
>>> + unsigned int *num = chan->con_priv;
>>> + int i;
>>
>> int i, j;
>>
>>> + u32 *values = data;
>>> +
>>> + if (*num >= ebm->v->num_channels)
>>> + return -ECHRNG;
>>> +
>>> + if (!ebm->v->channels[*num].no_response &&
>>> + atomic_read(&ebm->rx_status[*num]))
>>> + return -EBUSY;
>>> +
>>> + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
>>> +
>>> + /*
>>> + * We don't want any fancy nonsense, just write the 32-bit values in
>>> + * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
>>> + */
>>> + for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
>>
>> Just use an additional `j` index, so that you can avoid division.
>
> The `/ 4` division here is equivalent to a `>> 2` which comes free with
> almost every instruction on arm64, I don't think having two separate
> indices makes the code any clearer?
> Unless I misunderstand how you'd
> want me to use j here.
>
> Like this?
>
> j = 0;
> for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
> writel(values[j++], ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
> }
>
> This makes the relationship between the values index and i less clear. (And
> in my rendition, assumes the reader knows how postincrement works, but I
> think assuming people know C is fine.)
I guess assuming people know that compilers will auto-optimize that is okay then.
>
>> [...]
>>
>>> +
>>> + ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
>>> + if (IS_ERR(ebm->clk))
>>> + return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
>>> + "Failed to get 'eb' clock\n");
>>> +
>>> + ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
>>
>> I'd say that "chan" and "ctl" are more descriptive as resource names, but then,
>> do we really need to search by name?
>
> In the binding, it was proposed to change "mbox" to something like "data",
> which is fine by me, and to drop the "mbox" prefix of "ctl".
>
Heh, didn't see that comment. data and ctl are also fine for me, go with those :-)
>>
>> Doing that by index is also an option, as you can write the MMIO names and their
>> full description in the bindings instead.
>
> Yeah in the driver I think I'll switch to doing indices until some second
> compatible forces us to actually rely on names because it adds a bunch of
> other ranges.
>
>> [...]
>
> thanks for the feedback, assume that anything I didn't directly respond
> to will be fixed in the next revision.
Perfect.
Cheers,
Angelo
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-05 10:23 ` [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox Nicolas Frattaroli
2025-09-08 10:06 ` AngeloGioacchino Del Regno
@ 2025-09-12 4:48 ` Chia-I Wu
2025-09-12 6:11 ` Chen-Yu Tsai
2025-09-12 10:59 ` Nicolas Frattaroli
1 sibling, 2 replies; 30+ messages in thread
From: Chia-I Wu @ 2025-09-12 4:48 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva,
Chen-Yu Tsai, kernel, dri-devel, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek, linux-pm, linux-hardening
On Fri, Sep 5, 2025 at 3:24 AM Nicolas Frattaroli
<nicolas.frattaroli@collabora.com> wrote:
>
> The MT8196 SoC uses an embedded MCU to control frequencies and power of
> the GPU. This controller is referred to as "GPUEB".
>
> It communicates to the application processor, among other ways, through
> a mailbox.
>
> The mailbox exposes one interrupt, which appears to only be fired when a
> response is received, rather than a transaction is completed. For us,
> this means we unfortunately need to poll for txdone.
>
> The mailbox also requires the EB clock to be on when touching any of the
> mailbox registers.
>
> Add a simple driver for it based on the common mailbox framework.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> drivers/mailbox/Kconfig | 10 ++
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/mtk-gpueb-mailbox.c | 330 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 342 insertions(+)
>
> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> index 02432d4a5ccd46a16156a09c7f277fb03e4013f5..2016defda1fabb5c0fcc8078f84a52d4e4e00167 100644
> --- a/drivers/mailbox/Kconfig
> +++ b/drivers/mailbox/Kconfig
> @@ -294,6 +294,16 @@ config MTK_CMDQ_MBOX
> critical time limitation, such as updating display configuration
> during the vblank.
>
> +config MTK_GPUEB_MBOX
> + tristate "MediaTek GPUEB Mailbox Support"
> + depends on ARCH_MEDIATEK || COMPILE_TEST
> + help
> + The MediaTek GPUEB mailbox is used to communicate with the embedded
> + controller in charge of GPU frequency and power management on some
> + MediaTek SoCs, such as the MT8196.
> + Say Y or m here if you want to support the MT8196 SoC in your kernel
> + build.
> +
> config ZYNQMP_IPI_MBOX
> tristate "Xilinx ZynqMP IPI Mailbox"
> depends on ARCH_ZYNQMP && OF
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index 98a68f838486eed117d24296138bf59fda3f92e4..564d06e71313e6d1972e4a6036e1e78c8c7ec450 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -63,6 +63,8 @@ obj-$(CONFIG_MTK_ADSP_MBOX) += mtk-adsp-mailbox.o
>
> obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>
> +obj-$(CONFIG_MTK_GPUEB_MBOX) += mtk-gpueb-mailbox.o
> +
> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>
> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
> diff --git a/drivers/mailbox/mtk-gpueb-mailbox.c b/drivers/mailbox/mtk-gpueb-mailbox.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..0236fb358136e434a09a21ef293fe949ced94123
> --- /dev/null
> +++ b/drivers/mailbox/mtk-gpueb-mailbox.c
> @@ -0,0 +1,330 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * MediaTek GPUEB mailbox driver for SoCs such as the MT8196
> + *
> + * Copyright (C) 2025, Collabora Ltd.
> + *
> + * Developers harmed in the making of this driver:
> + * - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> + */
> +
> +#include <linux/atomic.h>
> +#include <linux/clk.h>
> +#include <linux/device.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +
> +#define MBOX_CTL_TX_STS 0x0000
> +#define MBOX_CTL_IRQ_SET 0x0004
> +#define MBOX_CTL_IRQ_CLR 0x0074
> +#define MBOX_CTL_RX_STS 0x0078
> +
> +#define MBOX_FULL BIT(0) /* i.e. we've received data */
> +#define MBOX_CLOGGED BIT(1) /* i.e. the channel is shutdown */
> +
> +struct mtk_gpueb_mbox {
> + struct device *dev;
> + struct clk *clk;
> + void __iomem *mbox_mmio;
> + void __iomem *mbox_ctl;
> + void **rx_buf;
> + atomic_t *rx_status;
> + struct mbox_controller mbox;
> + unsigned int *chn;
> + int irq;
> + const struct mtk_gpueb_mbox_variant *v;
> +};
> +
> +struct mtk_gpueb_mbox_ch {
> + const char *name;
> + const int num;
> + const unsigned int tx_offset;
> + const unsigned int tx_len;
> + const unsigned int rx_offset;
> + const unsigned int rx_len;
> + const bool no_response;
> +};
> +
> +struct mtk_gpueb_mbox_variant {
> + unsigned int num_channels;
> + const struct mtk_gpueb_mbox_ch channels[] __counted_by(num_channels);
> +};
> +
> +/**
> + * mtk_gpueb_mbox_read_rx - read RX buffer from MMIO into ebm's RX buffer
> + * @ebm: pointer to &struct mtk_gpueb_mbox instance
> + * @channel: number of channel to read
> + */
> +static void mtk_gpueb_mbox_read_rx(struct mtk_gpueb_mbox *ebm,
> + unsigned int channel)
> +{
> + const struct mtk_gpueb_mbox_ch *ch;
> +
> + ch = &ebm->v->channels[channel];
> +
> + memcpy_fromio(ebm->rx_buf[channel], ebm->mbox_mmio + ch->rx_offset,
> + ch->rx_len);
> +
> +}
> +
> +static irqreturn_t mtk_gpueb_mbox_isr(int irq, void *data)
> +{
> + struct mtk_gpueb_mbox *ebm = data;
> + u32 rx_handled = 0;
> + u32 rx_sts;
> + int i;
> +
> + rx_sts = readl(ebm->mbox_ctl + MBOX_CTL_RX_STS);
> +
> + for (i = 0; i < ebm->v->num_channels; i++) {
> + if (rx_sts & BIT(i)) {
> + if (!atomic_cmpxchg(&ebm->rx_status[i], 0,
> + MBOX_FULL | MBOX_CLOGGED))
> + rx_handled |= BIT(i);
> + }
> + }
We can loop over bits that are set in rx_sts, if we expect that only a
few bits are set most of the time.
> +
> + writel(rx_handled, ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> +
> + if (!(rx_sts ^ rx_handled))
"rx_sts == rx_handled" should be more direct.
> + return IRQ_WAKE_THREAD;
> +
> + dev_warn_ratelimited(ebm->dev, "spurious interrupts on 0x%04X\n",
> + rx_sts ^ rx_handled);
> + return IRQ_NONE;
It seems a bit too punishing when there are spurious interrupts. I
wonder if we should warn, but return IRQ_WAKE_THREAD as long as
rx_handled != 0.
Also, if another interrupt can fire before mtk_gpueb_mbox_thread runs,
that's data dropping rather than spurious interrupts.
> +}
> +
> +static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
> +{
> + struct mtk_gpueb_mbox *ebm = data;
> + irqreturn_t ret = IRQ_NONE;
> + int status;
> + int i;
> +
> + for (i = 0; i < ebm->v->num_channels; i++) {
> + status = atomic_cmpxchg(&ebm->rx_status[i],
> + MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
> + if (status == (MBOX_FULL | MBOX_CLOGGED)) {
We could also save rx_handled from mtk_gpueb_mbox_isr and loop over
bits that are set. If we do that, ebm->rx_status[i] is guaranteed to
be MBOX_FULL | MBOX_CLOGGED.
> + mtk_gpueb_mbox_read_rx(ebm, i);
> + mbox_chan_received_data(&ebm->mbox.chans[i],
> + ebm->rx_buf[i]);
It looks like we read the data and pass it on to the client
immediately. Does each channel need its own rx_buf?
> + /* FIXME: When does MBOX_FULL get cleared? Here? */
> + atomic_set(&ebm->rx_status[i], 0);
> + ret = IRQ_HANDLED;
> + }
> + }
> +
> + return ret;
> +}
> +
> +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> + int i;
> + u32 *values = data;
> +
> + if (*num >= ebm->v->num_channels)
> + return -ECHRNG;
Can this ever happen? (I am not familiar with the mbox subsystem)
> +
> + if (!ebm->v->channels[*num].no_response &&
> + atomic_read(&ebm->rx_status[*num]))
> + return -EBUSY;
When no_response is true, rx_status is 0. We probably don't need to
check no_response.
> +
> + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> +
> + /*
> + * We don't want any fancy nonsense, just write the 32-bit values in
> + * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
> + */
> + for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
> + writel(values[i / 4],
> + ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
> + }
> +
> + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
> +
> + return 0;
> +}
> +
> +static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> +
> + if (*num >= ebm->v->num_channels)
> + return -ECHRNG;
> +
> + atomic_set(&ebm->rx_status[*num], 0);
> +
> + return 0;
> +}
> +
> +static void mtk_gpueb_mbox_shutdown(struct mbox_chan *chan)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> +
> + atomic_set(&ebm->rx_status[*num], MBOX_CLOGGED);
> +}
> +
> +static bool mtk_gpueb_mbox_last_tx_done(struct mbox_chan *chan)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> + unsigned int *num = chan->con_priv;
> +
> + return !(readl(ebm->mbox_ctl + MBOX_CTL_TX_STS) & BIT(*num));
> +}
> +
> +const struct mbox_chan_ops mtk_gpueb_mbox_ops = {
> + .send_data = mtk_gpueb_mbox_send_data,
> + .startup = mtk_gpueb_mbox_startup,
> + .shutdown = mtk_gpueb_mbox_shutdown,
> + .last_tx_done = mtk_gpueb_mbox_last_tx_done,
> +};
> +
> +static struct mbox_chan *
> +mtk_gpueb_mbox_of_xlate(struct mbox_controller *mbox,
> + const struct of_phandle_args *sp)
> +{
> + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(mbox->dev);
> +
> + if (!sp->args_count)
> + return ERR_PTR(-EINVAL);
> +
> + if (sp->args[0] >= ebm->v->num_channels)
> + return ERR_PTR(-ECHRNG);
> +
> + return &mbox->chans[sp->args[0]];
> +}
> +
> +static int mtk_gpueb_mbox_probe(struct platform_device *pdev)
> +{
> + struct mtk_gpueb_mbox *ebm;
> + unsigned int rx_buf_sz;
> + void *buf;
> + unsigned int i;
> + int ret;
> +
> + ebm = devm_kzalloc(&pdev->dev, sizeof(*ebm), GFP_KERNEL);
> + if (!ebm)
> + return -ENOMEM;
> +
> + ebm->dev = &pdev->dev;
> + ebm->v = of_device_get_match_data(ebm->dev);
> +
> + dev_set_drvdata(ebm->dev, ebm);
> +
> + ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
> + if (IS_ERR(ebm->clk))
> + return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
> + "Failed to get 'eb' clock\n");
> +
> + ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
> + if (IS_ERR(ebm->mbox_mmio))
> + return dev_err_probe(ebm->dev, PTR_ERR(ebm->mbox_mmio),
> + "Couldn't map mailbox registers\n");
> +
> + ebm->mbox_ctl = devm_platform_ioremap_resource_byname(pdev, "mbox_ctl");
> + if (IS_ERR(ebm->mbox_ctl))
> + return dev_err_probe(
> + ebm->dev, PTR_ERR(ebm->mbox_ctl),
> + "Couldn't map mailbox control registers\n");
> +
> + rx_buf_sz = (ebm->v->channels[ebm->v->num_channels - 1].rx_offset +
> + ebm->v->channels[ebm->v->num_channels - 1].rx_len);
rx is after tx in mmio. The first half of the space is wasted.
We follow mtk_gpueb_mbox_read_rx by mbox_chan_received_data. It seems
we only need max of rx_len's.
> +
> + buf = devm_kzalloc(ebm->dev, rx_buf_sz, GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;
> +
> + ebm->rx_buf = devm_kmalloc_array(ebm->dev, ebm->v->num_channels,
> + sizeof(*ebm->rx_buf), GFP_KERNEL);
> + if (!ebm->rx_buf)
> + return -ENOMEM;
> +
> + ebm->mbox.chans = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> + sizeof(struct mbox_chan), GFP_KERNEL);
> + if (!ebm->mbox.chans)
> + return -ENOMEM;
> +
> + ebm->rx_status = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> + sizeof(atomic_t), GFP_KERNEL);
> + if (!ebm->rx_status)
> + return -ENOMEM;
> +
> + ebm->chn = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> + sizeof(*ebm->chn), GFP_KERNEL);
> +
> + for (i = 0; i < ebm->v->num_channels; i++) {
> + ebm->rx_buf[i] = buf + ebm->v->channels[i].rx_offset;
> + spin_lock_init(&ebm->mbox.chans[i].lock);
> + /* the things you do to avoid explicit casting void* */
I actually prefer an inline helper that casts chan->con_priv to the
channel number. Another option is "chan - ebm->mox.chans".
> + ebm->chn[i] = i;
> + ebm->mbox.chans[i].con_priv = &ebm->chn[i];
> + atomic_set(&ebm->rx_status[i], MBOX_CLOGGED);
> + }
> +
> + ebm->mbox.dev = ebm->dev;
> + ebm->mbox.num_chans = ebm->v->num_channels;
> + ebm->mbox.txdone_poll = true;
> + ebm->mbox.txpoll_period = 0; /* minimum hrtimer interval */
> + ebm->mbox.of_xlate = mtk_gpueb_mbox_of_xlate;
> + ebm->mbox.ops = &mtk_gpueb_mbox_ops;
> +
> + ebm->irq = platform_get_irq(pdev, 0);
> + if (ebm->irq < 0)
> + return ebm->irq;
> +
> + ret = devm_request_threaded_irq(ebm->dev, ebm->irq, mtk_gpueb_mbox_isr,
> + mtk_gpueb_mbox_thread, 0, NULL, ebm);
> + if (ret)
> + return dev_err_probe(ebm->dev, ret, "failed to request IRQ\n");
> +
> + ret = devm_mbox_controller_register(ebm->dev, &ebm->mbox);
> +
> + return 0;
> +}
> +
> +static const struct mtk_gpueb_mbox_variant mtk_gpueb_mbox_mt8196 = {
> + .num_channels = 12,
> + .channels = {
> + { "fast_dvfs_event", 0, 0x0000, 16, 0x00e0, 16, false },
> + { "gpufreq", 1, 0x0010, 32, 0x00f0, 32, false },
> + { "sleep", 2, 0x0030, 12, 0x0110, 4, true },
> + { "timer", 3, 0x003c, 24, 0x0114, 4, false },
> + { "fhctl", 4, 0x0054, 36, 0x0118, 4, false },
> + { "ccf", 5, 0x0078, 16, 0x011c, 16, false },
> + { "gpumpu", 6, 0x0088, 24, 0x012c, 4, false },
> + { "fast_dvfs", 7, 0x00a0, 24, 0x0130, 24, false },
> + { "ipir_c_met", 8, 0x00b8, 4, 0x0148, 16, false },
> + { "ipis_c_met", 9, 0x00bc, 16, 0x0158, 4, false },
> + { "brisket", 10, 0x00cc, 16, 0x015c, 16, false },
> + { "ppb", 11, 0x00dc, 4, 0x016c, 4, false },
> + },
> +};
> +
> +static const struct of_device_id mtk_gpueb_mbox_of_ids[] = {
> + { .compatible = "mediatek,mt8196-gpueb-mbox",
> + .data = &mtk_gpueb_mbox_mt8196 },
> + { /* Sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, mtk_gpueb_mbox_of_ids);
> +
> +static struct platform_driver mtk_gpueb_mbox_drv = {
> + .probe = mtk_gpueb_mbox_probe,
> + .driver = {
> + .name = "mtk-gpueb-mbox",
> + .of_match_table = mtk_gpueb_mbox_of_ids,
> + }
> +};
> +module_platform_driver(mtk_gpueb_mbox_drv);
> +
> +MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
> +MODULE_DESCRIPTION("MediaTek GPUEB mailbox driver for SoCs such as the MT8196");
> +MODULE_LICENSE("GPL");
>
> --
> 2.51.0
>
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-12 4:48 ` Chia-I Wu
@ 2025-09-12 6:11 ` Chen-Yu Tsai
2025-09-12 10:59 ` Nicolas Frattaroli
1 sibling, 0 replies; 30+ messages in thread
From: Chen-Yu Tsai @ 2025-09-12 6:11 UTC (permalink / raw)
To: Chia-I Wu, Nicolas Frattaroli
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva, kernel,
dri-devel, devicetree, linux-kernel, linux-arm-kernel,
linux-mediatek, linux-pm, linux-hardening
On Fri, Sep 12, 2025 at 12:48 PM Chia-I Wu <olvaffe@gmail.com> wrote:
>
> On Fri, Sep 5, 2025 at 3:24 AM Nicolas Frattaroli
> <nicolas.frattaroli@collabora.com> wrote:
> >
> > The MT8196 SoC uses an embedded MCU to control frequencies and power of
> > the GPU. This controller is referred to as "GPUEB".
> >
> > It communicates to the application processor, among other ways, through
> > a mailbox.
> >
> > The mailbox exposes one interrupt, which appears to only be fired when a
> > response is received, rather than a transaction is completed. For us,
> > this means we unfortunately need to poll for txdone.
> >
> > The mailbox also requires the EB clock to be on when touching any of the
> > mailbox registers.
> >
> > Add a simple driver for it based on the common mailbox framework.
> >
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> > drivers/mailbox/Kconfig | 10 ++
> > drivers/mailbox/Makefile | 2 +
> > drivers/mailbox/mtk-gpueb-mailbox.c | 330 ++++++++++++++++++++++++++++++++++++
> > 3 files changed, 342 insertions(+)
> >
> > diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> > index 02432d4a5ccd46a16156a09c7f277fb03e4013f5..2016defda1fabb5c0fcc8078f84a52d4e4e00167 100644
> > --- a/drivers/mailbox/Kconfig
> > +++ b/drivers/mailbox/Kconfig
> > @@ -294,6 +294,16 @@ config MTK_CMDQ_MBOX
> > critical time limitation, such as updating display configuration
> > during the vblank.
> >
> > +config MTK_GPUEB_MBOX
> > + tristate "MediaTek GPUEB Mailbox Support"
> > + depends on ARCH_MEDIATEK || COMPILE_TEST
> > + help
> > + The MediaTek GPUEB mailbox is used to communicate with the embedded
> > + controller in charge of GPU frequency and power management on some
> > + MediaTek SoCs, such as the MT8196.
> > + Say Y or m here if you want to support the MT8196 SoC in your kernel
> > + build.
> > +
> > config ZYNQMP_IPI_MBOX
> > tristate "Xilinx ZynqMP IPI Mailbox"
> > depends on ARCH_ZYNQMP && OF
> > diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> > index 98a68f838486eed117d24296138bf59fda3f92e4..564d06e71313e6d1972e4a6036e1e78c8c7ec450 100644
> > --- a/drivers/mailbox/Makefile
> > +++ b/drivers/mailbox/Makefile
> > @@ -63,6 +63,8 @@ obj-$(CONFIG_MTK_ADSP_MBOX) += mtk-adsp-mailbox.o
> >
> > obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
> >
> > +obj-$(CONFIG_MTK_GPUEB_MBOX) += mtk-gpueb-mailbox.o
> > +
> > obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
> >
> > obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
> > diff --git a/drivers/mailbox/mtk-gpueb-mailbox.c b/drivers/mailbox/mtk-gpueb-mailbox.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0236fb358136e434a09a21ef293fe949ced94123
> > --- /dev/null
> > +++ b/drivers/mailbox/mtk-gpueb-mailbox.c
> > @@ -0,0 +1,330 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * MediaTek GPUEB mailbox driver for SoCs such as the MT8196
> > + *
> > + * Copyright (C) 2025, Collabora Ltd.
> > + *
> > + * Developers harmed in the making of this driver:
> > + * - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > + */
> > +
> > +#include <linux/atomic.h>
> > +#include <linux/clk.h>
> > +#include <linux/device.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iopoll.h>
> > +#include <linux/mailbox_controller.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/platform_device.h>
> > +
> > +#define MBOX_CTL_TX_STS 0x0000
> > +#define MBOX_CTL_IRQ_SET 0x0004
> > +#define MBOX_CTL_IRQ_CLR 0x0074
> > +#define MBOX_CTL_RX_STS 0x0078
> > +
> > +#define MBOX_FULL BIT(0) /* i.e. we've received data */
> > +#define MBOX_CLOGGED BIT(1) /* i.e. the channel is shutdown */
> > +
> > +struct mtk_gpueb_mbox {
> > + struct device *dev;
> > + struct clk *clk;
> > + void __iomem *mbox_mmio;
> > + void __iomem *mbox_ctl;
> > + void **rx_buf;
> > + atomic_t *rx_status;
> > + struct mbox_controller mbox;
> > + unsigned int *chn;
> > + int irq;
> > + const struct mtk_gpueb_mbox_variant *v;
> > +};
> > +
> > +struct mtk_gpueb_mbox_ch {
> > + const char *name;
> > + const int num;
> > + const unsigned int tx_offset;
> > + const unsigned int tx_len;
> > + const unsigned int rx_offset;
> > + const unsigned int rx_len;
> > + const bool no_response;
> > +};
> > +
> > +struct mtk_gpueb_mbox_variant {
> > + unsigned int num_channels;
> > + const struct mtk_gpueb_mbox_ch channels[] __counted_by(num_channels);
> > +};
> > +
> > +/**
> > + * mtk_gpueb_mbox_read_rx - read RX buffer from MMIO into ebm's RX buffer
> > + * @ebm: pointer to &struct mtk_gpueb_mbox instance
> > + * @channel: number of channel to read
> > + */
> > +static void mtk_gpueb_mbox_read_rx(struct mtk_gpueb_mbox *ebm,
> > + unsigned int channel)
> > +{
> > + const struct mtk_gpueb_mbox_ch *ch;
> > +
> > + ch = &ebm->v->channels[channel];
> > +
> > + memcpy_fromio(ebm->rx_buf[channel], ebm->mbox_mmio + ch->rx_offset,
> > + ch->rx_len);
> > +
> > +}
> > +
> > +static irqreturn_t mtk_gpueb_mbox_isr(int irq, void *data)
> > +{
> > + struct mtk_gpueb_mbox *ebm = data;
> > + u32 rx_handled = 0;
> > + u32 rx_sts;
> > + int i;
> > +
> > + rx_sts = readl(ebm->mbox_ctl + MBOX_CTL_RX_STS);
> > +
> > + for (i = 0; i < ebm->v->num_channels; i++) {
> > + if (rx_sts & BIT(i)) {
> > + if (!atomic_cmpxchg(&ebm->rx_status[i], 0,
> > + MBOX_FULL | MBOX_CLOGGED))
> > + rx_handled |= BIT(i);
> > + }
> > + }
> We can loop over bits that are set in rx_sts, if we expect that only a
> few bits are set most of the time.
>
> > +
> > + writel(rx_handled, ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
With the VCP, clearing the interrupt flag doubles as an ack to the
coprocessor. I hope that's not the case here.
> > +
> > + if (!(rx_sts ^ rx_handled))
> "rx_sts == rx_handled" should be more direct.
> > + return IRQ_WAKE_THREAD;
> > +
> > + dev_warn_ratelimited(ebm->dev, "spurious interrupts on 0x%04X\n",
> > + rx_sts ^ rx_handled);
> > + return IRQ_NONE;
> It seems a bit too punishing when there are spurious interrupts. I
> wonder if we should warn, but return IRQ_WAKE_THREAD as long as
> rx_handled != 0.
>
> Also, if another interrupt can fire before mtk_gpueb_mbox_thread runs,
> that's data dropping rather than spurious interrupts.
+1
Either the channel is free and a message can be sent over, or some
message was received but hasn't been processed yet. Acking the
reception without processing the message invites data losses and/or
uncertain latency if you queue messages within the kernel driver.
> > +}
> > +
> > +static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
> > +{
> > + struct mtk_gpueb_mbox *ebm = data;
> > + irqreturn_t ret = IRQ_NONE;
> > + int status;
> > + int i;
> > +
> > + for (i = 0; i < ebm->v->num_channels; i++) {
> > + status = atomic_cmpxchg(&ebm->rx_status[i],
> > + MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
> > + if (status == (MBOX_FULL | MBOX_CLOGGED)) {
> We could also save rx_handled from mtk_gpueb_mbox_isr and loop over
> bits that are set. If we do that, ebm->rx_status[i] is guaranteed to
> be MBOX_FULL | MBOX_CLOGGED.
>
> > + mtk_gpueb_mbox_read_rx(ebm, i);
This needs to be done _before_ you ack the interrupt. Once you ack the
channel interrupt, presumably it is free for the other side to send
again.
> > + mbox_chan_received_data(&ebm->mbox.chans[i],
> > + ebm->rx_buf[i]);
> It looks like we read the data and pass it on to the client
> immediately. Does each channel need its own rx_buf?
AFAICT rx_buf is already split by channel?
And if the client has to defer processing, it should make a copy.
> > + /* FIXME: When does MBOX_FULL get cleared? Here? */
> > + atomic_set(&ebm->rx_status[i], 0);
> > + ret = IRQ_HANDLED;
> > + }
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > + int i;
> > + u32 *values = data;
> > +
> > + if (*num >= ebm->v->num_channels)
> > + return -ECHRNG;
> Can this ever happen? (I am not familiar with the mbox subsystem)
AFAIK it can't. The client has to first request the channel. The provider
should check and error out then.
> > +
> > + if (!ebm->v->channels[*num].no_response &&
> > + atomic_read(&ebm->rx_status[*num]))
> > + return -EBUSY;
> When no_response is true, rx_status is 0. We probably don't need to
> check no_response.
>
> > +
> > + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > +
> > + /*
> > + * We don't want any fancy nonsense, just write the 32-bit values in
> > + * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
> > + */
> > + for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
> > + writel(values[i / 4],
> > + ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
> > + }
> > +
> > + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
> > +
> > + return 0;
> > +}
> > +
> > +static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > +
> > + if (*num >= ebm->v->num_channels)
> > + return -ECHRNG;
> > +
> > + atomic_set(&ebm->rx_status[*num], 0);
> > +
> > + return 0;
> > +}
> > +
> > +static void mtk_gpueb_mbox_shutdown(struct mbox_chan *chan)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > +
> > + atomic_set(&ebm->rx_status[*num], MBOX_CLOGGED);
> > +}
> > +
> > +static bool mtk_gpueb_mbox_last_tx_done(struct mbox_chan *chan)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > +
> > + return !(readl(ebm->mbox_ctl + MBOX_CTL_TX_STS) & BIT(*num));
> > +}
> > +
> > +const struct mbox_chan_ops mtk_gpueb_mbox_ops = {
> > + .send_data = mtk_gpueb_mbox_send_data,
> > + .startup = mtk_gpueb_mbox_startup,
> > + .shutdown = mtk_gpueb_mbox_shutdown,
> > + .last_tx_done = mtk_gpueb_mbox_last_tx_done,
> > +};
> > +
> > +static struct mbox_chan *
> > +mtk_gpueb_mbox_of_xlate(struct mbox_controller *mbox,
> > + const struct of_phandle_args *sp)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(mbox->dev);
> > +
> > + if (!sp->args_count)
> > + return ERR_PTR(-EINVAL);
> > +
> > + if (sp->args[0] >= ebm->v->num_channels)
> > + return ERR_PTR(-ECHRNG);
> > +
> > + return &mbox->chans[sp->args[0]];
> > +}
> > +
> > +static int mtk_gpueb_mbox_probe(struct platform_device *pdev)
> > +{
> > + struct mtk_gpueb_mbox *ebm;
> > + unsigned int rx_buf_sz;
> > + void *buf;
> > + unsigned int i;
> > + int ret;
> > +
> > + ebm = devm_kzalloc(&pdev->dev, sizeof(*ebm), GFP_KERNEL);
> > + if (!ebm)
> > + return -ENOMEM;
> > +
> > + ebm->dev = &pdev->dev;
> > + ebm->v = of_device_get_match_data(ebm->dev);
> > +
> > + dev_set_drvdata(ebm->dev, ebm);
> > +
> > + ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
> > + if (IS_ERR(ebm->clk))
> > + return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
> > + "Failed to get 'eb' clock\n");
> > +
> > + ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
> > + if (IS_ERR(ebm->mbox_mmio))
> > + return dev_err_probe(ebm->dev, PTR_ERR(ebm->mbox_mmio),
> > + "Couldn't map mailbox registers\n");
> > +
> > + ebm->mbox_ctl = devm_platform_ioremap_resource_byname(pdev, "mbox_ctl");
> > + if (IS_ERR(ebm->mbox_ctl))
> > + return dev_err_probe(
> > + ebm->dev, PTR_ERR(ebm->mbox_ctl),
> > + "Couldn't map mailbox control registers\n");
> > +
> > + rx_buf_sz = (ebm->v->channels[ebm->v->num_channels - 1].rx_offset +
> > + ebm->v->channels[ebm->v->num_channels - 1].rx_len);
> rx is after tx in mmio. The first half of the space is wasted.
>
> We follow mtk_gpueb_mbox_read_rx by mbox_chan_received_data. It seems
> we only need max of rx_len's.
>
> > +
> > + buf = devm_kzalloc(ebm->dev, rx_buf_sz, GFP_KERNEL);
> > + if (!buf)
> > + return -ENOMEM;
> > +
> > + ebm->rx_buf = devm_kmalloc_array(ebm->dev, ebm->v->num_channels,
> > + sizeof(*ebm->rx_buf), GFP_KERNEL);
> > + if (!ebm->rx_buf)
> > + return -ENOMEM;
> > +
> > + ebm->mbox.chans = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > + sizeof(struct mbox_chan), GFP_KERNEL);
> > + if (!ebm->mbox.chans)
> > + return -ENOMEM;
> > +
> > + ebm->rx_status = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > + sizeof(atomic_t), GFP_KERNEL);
> > + if (!ebm->rx_status)
> > + return -ENOMEM;
> > +
> > + ebm->chn = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > + sizeof(*ebm->chn), GFP_KERNEL);
> > +
> > + for (i = 0; i < ebm->v->num_channels; i++) {
> > + ebm->rx_buf[i] = buf + ebm->v->channels[i].rx_offset;
> > + spin_lock_init(&ebm->mbox.chans[i].lock);
> > + /* the things you do to avoid explicit casting void* */
> I actually prefer an inline helper that casts chan->con_priv to the
> channel number. Another option is "chan - ebm->mox.chans".
> > + ebm->chn[i] = i;
> > + ebm->mbox.chans[i].con_priv = &ebm->chn[i];
> > + atomic_set(&ebm->rx_status[i], MBOX_CLOGGED);
> > + }
> > +
> > + ebm->mbox.dev = ebm->dev;
> > + ebm->mbox.num_chans = ebm->v->num_channels;
> > + ebm->mbox.txdone_poll = true;
> > + ebm->mbox.txpoll_period = 0; /* minimum hrtimer interval */
> > + ebm->mbox.of_xlate = mtk_gpueb_mbox_of_xlate;
> > + ebm->mbox.ops = &mtk_gpueb_mbox_ops;
> > +
> > + ebm->irq = platform_get_irq(pdev, 0);
> > + if (ebm->irq < 0)
> > + return ebm->irq;
> > +
> > + ret = devm_request_threaded_irq(ebm->dev, ebm->irq, mtk_gpueb_mbox_isr,
> > + mtk_gpueb_mbox_thread, 0, NULL, ebm);
> > + if (ret)
> > + return dev_err_probe(ebm->dev, ret, "failed to request IRQ\n");
> > +
> > + ret = devm_mbox_controller_register(ebm->dev, &ebm->mbox);
> > +
> > + return 0;
> > +}
> > +
> > +static const struct mtk_gpueb_mbox_variant mtk_gpueb_mbox_mt8196 = {
> > + .num_channels = 12,
> > + .channels = {
> > + { "fast_dvfs_event", 0, 0x0000, 16, 0x00e0, 16, false },
> > + { "gpufreq", 1, 0x0010, 32, 0x00f0, 32, false },
> > + { "sleep", 2, 0x0030, 12, 0x0110, 4, true },
> > + { "timer", 3, 0x003c, 24, 0x0114, 4, false },
> > + { "fhctl", 4, 0x0054, 36, 0x0118, 4, false },
> > + { "ccf", 5, 0x0078, 16, 0x011c, 16, false },
> > + { "gpumpu", 6, 0x0088, 24, 0x012c, 4, false },
> > + { "fast_dvfs", 7, 0x00a0, 24, 0x0130, 24, false },
> > + { "ipir_c_met", 8, 0x00b8, 4, 0x0148, 16, false },
> > + { "ipis_c_met", 9, 0x00bc, 16, 0x0158, 4, false },
> > + { "brisket", 10, 0x00cc, 16, 0x015c, 16, false },
> > + { "ppb", 11, 0x00dc, 4, 0x016c, 4, false },
> > + },
> > +};
> > +
> > +static const struct of_device_id mtk_gpueb_mbox_of_ids[] = {
> > + { .compatible = "mediatek,mt8196-gpueb-mbox",
> > + .data = &mtk_gpueb_mbox_mt8196 },
> > + { /* Sentinel */ }
> > +};
> > +MODULE_DEVICE_TABLE(of, mtk_gpueb_mbox_of_ids);
> > +
> > +static struct platform_driver mtk_gpueb_mbox_drv = {
> > + .probe = mtk_gpueb_mbox_probe,
> > + .driver = {
> > + .name = "mtk-gpueb-mbox",
> > + .of_match_table = mtk_gpueb_mbox_of_ids,
> > + }
> > +};
> > +module_platform_driver(mtk_gpueb_mbox_drv);
> > +
> > +MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
> > +MODULE_DESCRIPTION("MediaTek GPUEB mailbox driver for SoCs such as the MT8196");
> > +MODULE_LICENSE("GPL");
> >
> > --
> > 2.51.0
> >
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-12 4:48 ` Chia-I Wu
2025-09-12 6:11 ` Chen-Yu Tsai
@ 2025-09-12 10:59 ` Nicolas Frattaroli
2025-09-12 17:30 ` Chia-I Wu
1 sibling, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-12 10:59 UTC (permalink / raw)
To: Chia-I Wu
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva,
Chen-Yu Tsai, kernel, dri-devel, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek, linux-pm, linux-hardening
On Friday, 12 September 2025 06:48:17 Central European Summer Time Chia-I Wu wrote:
> On Fri, Sep 5, 2025 at 3:24 AM Nicolas Frattaroli
> <nicolas.frattaroli@collabora.com> wrote:
> >
> > The MT8196 SoC uses an embedded MCU to control frequencies and power of
> > the GPU. This controller is referred to as "GPUEB".
> >
> > It communicates to the application processor, among other ways, through
> > a mailbox.
> >
> > The mailbox exposes one interrupt, which appears to only be fired when a
> > response is received, rather than a transaction is completed. For us,
> > this means we unfortunately need to poll for txdone.
> >
> > The mailbox also requires the EB clock to be on when touching any of the
> > mailbox registers.
> >
> > Add a simple driver for it based on the common mailbox framework.
> >
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> > drivers/mailbox/Kconfig | 10 ++
> > drivers/mailbox/Makefile | 2 +
> > drivers/mailbox/mtk-gpueb-mailbox.c | 330 ++++++++++++++++++++++++++++++++++++
> > 3 files changed, 342 insertions(+)
> >
> > diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> > index 02432d4a5ccd46a16156a09c7f277fb03e4013f5..2016defda1fabb5c0fcc8078f84a52d4e4e00167 100644
> > --- a/drivers/mailbox/Kconfig
> > +++ b/drivers/mailbox/Kconfig
> > @@ -294,6 +294,16 @@ config MTK_CMDQ_MBOX
> > critical time limitation, such as updating display configuration
> > during the vblank.
> >
> > +config MTK_GPUEB_MBOX
> > + tristate "MediaTek GPUEB Mailbox Support"
> > + depends on ARCH_MEDIATEK || COMPILE_TEST
> > + help
> > + The MediaTek GPUEB mailbox is used to communicate with the embedded
> > + controller in charge of GPU frequency and power management on some
> > + MediaTek SoCs, such as the MT8196.
> > + Say Y or m here if you want to support the MT8196 SoC in your kernel
> > + build.
> > +
> > config ZYNQMP_IPI_MBOX
> > tristate "Xilinx ZynqMP IPI Mailbox"
> > depends on ARCH_ZYNQMP && OF
> > diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> > index 98a68f838486eed117d24296138bf59fda3f92e4..564d06e71313e6d1972e4a6036e1e78c8c7ec450 100644
> > --- a/drivers/mailbox/Makefile
> > +++ b/drivers/mailbox/Makefile
> > @@ -63,6 +63,8 @@ obj-$(CONFIG_MTK_ADSP_MBOX) += mtk-adsp-mailbox.o
> >
> > obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
> >
> > +obj-$(CONFIG_MTK_GPUEB_MBOX) += mtk-gpueb-mailbox.o
> > +
> > obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
> >
> > obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
> > diff --git a/drivers/mailbox/mtk-gpueb-mailbox.c b/drivers/mailbox/mtk-gpueb-mailbox.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0236fb358136e434a09a21ef293fe949ced94123
> > --- /dev/null
> > +++ b/drivers/mailbox/mtk-gpueb-mailbox.c
> > @@ -0,0 +1,330 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * MediaTek GPUEB mailbox driver for SoCs such as the MT8196
> > + *
> > + * Copyright (C) 2025, Collabora Ltd.
> > + *
> > + * Developers harmed in the making of this driver:
> > + * - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > + */
> > +
> > +#include <linux/atomic.h>
> > +#include <linux/clk.h>
> > +#include <linux/device.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iopoll.h>
> > +#include <linux/mailbox_controller.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/platform_device.h>
> > +
> > +#define MBOX_CTL_TX_STS 0x0000
> > +#define MBOX_CTL_IRQ_SET 0x0004
> > +#define MBOX_CTL_IRQ_CLR 0x0074
> > +#define MBOX_CTL_RX_STS 0x0078
> > +
> > +#define MBOX_FULL BIT(0) /* i.e. we've received data */
> > +#define MBOX_CLOGGED BIT(1) /* i.e. the channel is shutdown */
> > +
> > +struct mtk_gpueb_mbox {
> > + struct device *dev;
> > + struct clk *clk;
> > + void __iomem *mbox_mmio;
> > + void __iomem *mbox_ctl;
> > + void **rx_buf;
> > + atomic_t *rx_status;
> > + struct mbox_controller mbox;
> > + unsigned int *chn;
> > + int irq;
> > + const struct mtk_gpueb_mbox_variant *v;
> > +};
> > +
> > +struct mtk_gpueb_mbox_ch {
> > + const char *name;
> > + const int num;
> > + const unsigned int tx_offset;
> > + const unsigned int tx_len;
> > + const unsigned int rx_offset;
> > + const unsigned int rx_len;
> > + const bool no_response;
> > +};
> > +
> > +struct mtk_gpueb_mbox_variant {
> > + unsigned int num_channels;
> > + const struct mtk_gpueb_mbox_ch channels[] __counted_by(num_channels);
> > +};
> > +
> > +/**
> > + * mtk_gpueb_mbox_read_rx - read RX buffer from MMIO into ebm's RX buffer
> > + * @ebm: pointer to &struct mtk_gpueb_mbox instance
> > + * @channel: number of channel to read
> > + */
> > +static void mtk_gpueb_mbox_read_rx(struct mtk_gpueb_mbox *ebm,
> > + unsigned int channel)
> > +{
> > + const struct mtk_gpueb_mbox_ch *ch;
> > +
> > + ch = &ebm->v->channels[channel];
> > +
> > + memcpy_fromio(ebm->rx_buf[channel], ebm->mbox_mmio + ch->rx_offset,
> > + ch->rx_len);
> > +
> > +}
> > +
> > +static irqreturn_t mtk_gpueb_mbox_isr(int irq, void *data)
> > +{
> > + struct mtk_gpueb_mbox *ebm = data;
> > + u32 rx_handled = 0;
> > + u32 rx_sts;
> > + int i;
> > +
> > + rx_sts = readl(ebm->mbox_ctl + MBOX_CTL_RX_STS);
> > +
> > + for (i = 0; i < ebm->v->num_channels; i++) {
> > + if (rx_sts & BIT(i)) {
> > + if (!atomic_cmpxchg(&ebm->rx_status[i], 0,
> > + MBOX_FULL | MBOX_CLOGGED))
> > + rx_handled |= BIT(i);
> > + }
> > + }
> We can loop over bits that are set in rx_sts, if we expect that only a
> few bits are set most of the time.
Could you elaborate on your preferred approach? num_channels will be
smaller than the bit width of rx_sts. I could loop from __fls down to
0 checking for (!rx_sts ^ rx_handled) for an early exit, but I'm not
sure if this microoptimisation is what you meant or if that even makes
much sense instruction cost wise.
Alternatively, I could loop from __fss to num_channels.
Is there some commonly used hyper-optimised "loop over set bits" macro
that I'm unfamiliar with?
> > +
> > + writel(rx_handled, ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > +
> > + if (!(rx_sts ^ rx_handled))
> "rx_sts == rx_handled" should be more direct.
Good point, will change.
> > + return IRQ_WAKE_THREAD;
> > +
> > + dev_warn_ratelimited(ebm->dev, "spurious interrupts on 0x%04X\n",
> > + rx_sts ^ rx_handled);
> > + return IRQ_NONE;
> It seems a bit too punishing when there are spurious interrupts. I
> wonder if we should warn, but return IRQ_WAKE_THREAD as long as
> rx_handled != 0.
>
> Also, if another interrupt can fire before mtk_gpueb_mbox_thread runs,
> that's data dropping rather than spurious interrupts.
>
Yeah, I agree that this is bad. As wens pointed out, my IRQ clearing
earlier would most certainly open us up to losing data. I'll
definitely look into changing this; iirc the current code resulted
from me being unsure whether partially handled IRQs should still
be counted as a handled IRQ or not.
> > +}
> > +
> > +static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
> > +{
> > + struct mtk_gpueb_mbox *ebm = data;
> > + irqreturn_t ret = IRQ_NONE;
> > + int status;
> > + int i;
> > +
> > + for (i = 0; i < ebm->v->num_channels; i++) {
> > + status = atomic_cmpxchg(&ebm->rx_status[i],
> > + MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
> > + if (status == (MBOX_FULL | MBOX_CLOGGED)) {
> We could also save rx_handled from mtk_gpueb_mbox_isr and loop over
> bits that are set. If we do that, ebm->rx_status[i] is guaranteed to
> be MBOX_FULL | MBOX_CLOGGED.
>
Wouldn't storing rx_handled open us up to data races with unpleasant
code to try and prevent them? As far as I understand, the ISR is allowed
to fire again before the thread has completed. Having the rx_status bits
as individual atomics gives us finer granularity than needing to protect
the entire set of bits.
If there is some way to pass data from the ISR to the awoken thread
directly as a per-invocation parameter rather than a squirreled away
member of the driver private struct, then I agree doing it your way
seems better.
> > + mtk_gpueb_mbox_read_rx(ebm, i);
> > + mbox_chan_received_data(&ebm->mbox.chans[i],
> > + ebm->rx_buf[i]);
> It looks like we read the data and pass it on to the client
> immediately. Does each channel need its own rx_buf?
>
Yeah, common mailbox framework does not do a copy here, it just passes
the pointer. Which is good, because nobody likes copies, and it
prevents the mailbox core from needing to know the size of received
data.
So the individual rx_bufs are needed to prevent channel cross-talk.
> > + /* FIXME: When does MBOX_FULL get cleared? Here? */
> > + atomic_set(&ebm->rx_status[i], 0);
> > + ret = IRQ_HANDLED;
> > + }
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > + int i;
> > + u32 *values = data;
> > +
> > + if (*num >= ebm->v->num_channels)
> > + return -ECHRNG;
> Can this ever happen? (I am not familiar with the mbox subsystem)
Only if someone with a pointer to chan modifies con_priv outside of
what the mailbox driver set. It's an unlikely scenario, so I do think
I can remove this check, but the downside is that if someone ever does
manage to do this then the next line is a convenient read primitive.
> > +
> > + if (!ebm->v->channels[*num].no_response &&
> > + atomic_read(&ebm->rx_status[*num]))
> > + return -EBUSY;
> When no_response is true, rx_status is 0. We probably don't need to
> check no_response.
Now that I'm thinking about it, I can probably get rid of no_response
altogether. It was originally added because I conflated txdone with
response received, and one particular command on the sleep channel,
namely powering off the MFG, is unable to ever cause a response
interrupt to fire.
Since we now poll for txdone anyway using the TX status register, I
don't think this is needed at all anymore, so I'll try and get
rid of it.
> > +
> > + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > +
> > + /*
> > + * We don't want any fancy nonsense, just write the 32-bit values in
> > + * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
> > + */
> > + for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
> > + writel(values[i / 4],
> > + ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
> > + }
> > +
> > + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
> > +
> > + return 0;
> > +}
> > +
> > +static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > +
> > + if (*num >= ebm->v->num_channels)
> > + return -ECHRNG;
> > +
> > + atomic_set(&ebm->rx_status[*num], 0);
> > +
> > + return 0;
> > +}
> > +
> > +static void mtk_gpueb_mbox_shutdown(struct mbox_chan *chan)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > +
> > + atomic_set(&ebm->rx_status[*num], MBOX_CLOGGED);
> > +}
> > +
> > +static bool mtk_gpueb_mbox_last_tx_done(struct mbox_chan *chan)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > + unsigned int *num = chan->con_priv;
> > +
> > + return !(readl(ebm->mbox_ctl + MBOX_CTL_TX_STS) & BIT(*num));
> > +}
> > +
> > +const struct mbox_chan_ops mtk_gpueb_mbox_ops = {
> > + .send_data = mtk_gpueb_mbox_send_data,
> > + .startup = mtk_gpueb_mbox_startup,
> > + .shutdown = mtk_gpueb_mbox_shutdown,
> > + .last_tx_done = mtk_gpueb_mbox_last_tx_done,
> > +};
> > +
> > +static struct mbox_chan *
> > +mtk_gpueb_mbox_of_xlate(struct mbox_controller *mbox,
> > + const struct of_phandle_args *sp)
> > +{
> > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(mbox->dev);
> > +
> > + if (!sp->args_count)
> > + return ERR_PTR(-EINVAL);
> > +
> > + if (sp->args[0] >= ebm->v->num_channels)
> > + return ERR_PTR(-ECHRNG);
> > +
> > + return &mbox->chans[sp->args[0]];
> > +}
> > +
> > +static int mtk_gpueb_mbox_probe(struct platform_device *pdev)
> > +{
> > + struct mtk_gpueb_mbox *ebm;
> > + unsigned int rx_buf_sz;
> > + void *buf;
> > + unsigned int i;
> > + int ret;
> > +
> > + ebm = devm_kzalloc(&pdev->dev, sizeof(*ebm), GFP_KERNEL);
> > + if (!ebm)
> > + return -ENOMEM;
> > +
> > + ebm->dev = &pdev->dev;
> > + ebm->v = of_device_get_match_data(ebm->dev);
> > +
> > + dev_set_drvdata(ebm->dev, ebm);
> > +
> > + ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
> > + if (IS_ERR(ebm->clk))
> > + return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
> > + "Failed to get 'eb' clock\n");
> > +
> > + ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
> > + if (IS_ERR(ebm->mbox_mmio))
> > + return dev_err_probe(ebm->dev, PTR_ERR(ebm->mbox_mmio),
> > + "Couldn't map mailbox registers\n");
> > +
> > + ebm->mbox_ctl = devm_platform_ioremap_resource_byname(pdev, "mbox_ctl");
> > + if (IS_ERR(ebm->mbox_ctl))
> > + return dev_err_probe(
> > + ebm->dev, PTR_ERR(ebm->mbox_ctl),
> > + "Couldn't map mailbox control registers\n");
> > +
> > + rx_buf_sz = (ebm->v->channels[ebm->v->num_channels - 1].rx_offset +
> > + ebm->v->channels[ebm->v->num_channels - 1].rx_len);
> rx is after tx in mmio. The first half of the space is wasted.
>
> We follow mtk_gpueb_mbox_read_rx by mbox_chan_received_data. It seems
> we only need max of rx_len's.
We do need the sum of all rx_len's but you're correct that the current code
allocates too much memory. What I think it should do is
last rx_offset + last rx_len - first rx_offset
to get the total amount of MMIO occupied by RX.
> > +
> > + buf = devm_kzalloc(ebm->dev, rx_buf_sz, GFP_KERNEL);
> > + if (!buf)
> > + return -ENOMEM;
> > +
> > + ebm->rx_buf = devm_kmalloc_array(ebm->dev, ebm->v->num_channels,
> > + sizeof(*ebm->rx_buf), GFP_KERNEL);
> > + if (!ebm->rx_buf)
> > + return -ENOMEM;
> > +
> > + ebm->mbox.chans = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > + sizeof(struct mbox_chan), GFP_KERNEL);
> > + if (!ebm->mbox.chans)
> > + return -ENOMEM;
> > +
> > + ebm->rx_status = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > + sizeof(atomic_t), GFP_KERNEL);
> > + if (!ebm->rx_status)
> > + return -ENOMEM;
> > +
> > + ebm->chn = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > + sizeof(*ebm->chn), GFP_KERNEL);
> > +
> > + for (i = 0; i < ebm->v->num_channels; i++) {
> > + ebm->rx_buf[i] = buf + ebm->v->channels[i].rx_offset;
> > + spin_lock_init(&ebm->mbox.chans[i].lock);
> > + /* the things you do to avoid explicit casting void* */
> I actually prefer an inline helper that casts chan->con_priv to the
> channel number. Another option is "chan - ebm->mox.chans".
Hmmm, yeah. Though, I don't enjoy chan - ebm->mbox.chans because now we're
in the woods regarding pointer arithmetics.
I think the inline helper makes more sense, even if explicitly using a
pointer as an integer gives me the heebie-jeebies.
> > + ebm->chn[i] = i;
> > + ebm->mbox.chans[i].con_priv = &ebm->chn[i];
> > + atomic_set(&ebm->rx_status[i], MBOX_CLOGGED);
> > + }
> > +
> > + ebm->mbox.dev = ebm->dev;
> > + ebm->mbox.num_chans = ebm->v->num_channels;
> > + ebm->mbox.txdone_poll = true;
> > + ebm->mbox.txpoll_period = 0; /* minimum hrtimer interval */
> > + ebm->mbox.of_xlate = mtk_gpueb_mbox_of_xlate;
> > + ebm->mbox.ops = &mtk_gpueb_mbox_ops;
> > +
> > + ebm->irq = platform_get_irq(pdev, 0);
> > + if (ebm->irq < 0)
> > + return ebm->irq;
> > +
> > + ret = devm_request_threaded_irq(ebm->dev, ebm->irq, mtk_gpueb_mbox_isr,
> > + mtk_gpueb_mbox_thread, 0, NULL, ebm);
> > + if (ret)
> > + return dev_err_probe(ebm->dev, ret, "failed to request IRQ\n");
> > +
> > + ret = devm_mbox_controller_register(ebm->dev, &ebm->mbox);
> > +
> > + return 0;
> > +}
> > +
> > +static const struct mtk_gpueb_mbox_variant mtk_gpueb_mbox_mt8196 = {
> > + .num_channels = 12,
> > + .channels = {
> > + { "fast_dvfs_event", 0, 0x0000, 16, 0x00e0, 16, false },
> > + { "gpufreq", 1, 0x0010, 32, 0x00f0, 32, false },
> > + { "sleep", 2, 0x0030, 12, 0x0110, 4, true },
> > + { "timer", 3, 0x003c, 24, 0x0114, 4, false },
> > + { "fhctl", 4, 0x0054, 36, 0x0118, 4, false },
> > + { "ccf", 5, 0x0078, 16, 0x011c, 16, false },
> > + { "gpumpu", 6, 0x0088, 24, 0x012c, 4, false },
> > + { "fast_dvfs", 7, 0x00a0, 24, 0x0130, 24, false },
> > + { "ipir_c_met", 8, 0x00b8, 4, 0x0148, 16, false },
> > + { "ipis_c_met", 9, 0x00bc, 16, 0x0158, 4, false },
> > + { "brisket", 10, 0x00cc, 16, 0x015c, 16, false },
> > + { "ppb", 11, 0x00dc, 4, 0x016c, 4, false },
> > + },
> > +};
> > +
> > +static const struct of_device_id mtk_gpueb_mbox_of_ids[] = {
> > + { .compatible = "mediatek,mt8196-gpueb-mbox",
> > + .data = &mtk_gpueb_mbox_mt8196 },
> > + { /* Sentinel */ }
> > +};
> > +MODULE_DEVICE_TABLE(of, mtk_gpueb_mbox_of_ids);
> > +
> > +static struct platform_driver mtk_gpueb_mbox_drv = {
> > + .probe = mtk_gpueb_mbox_probe,
> > + .driver = {
> > + .name = "mtk-gpueb-mbox",
> > + .of_match_table = mtk_gpueb_mbox_of_ids,
> > + }
> > +};
> > +module_platform_driver(mtk_gpueb_mbox_drv);
> > +
> > +MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
> > +MODULE_DESCRIPTION("MediaTek GPUEB mailbox driver for SoCs such as the MT8196");
> > +MODULE_LICENSE("GPL");
> >
> > --
> > 2.51.0
> >
>
Thank you for your review!
Kind regards,
Nicolas Frattaroli
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox
2025-09-12 10:59 ` Nicolas Frattaroli
@ 2025-09-12 17:30 ` Chia-I Wu
0 siblings, 0 replies; 30+ messages in thread
From: Chia-I Wu @ 2025-09-12 17:30 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva,
Chen-Yu Tsai, kernel, dri-devel, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek, linux-pm, linux-hardening
On Fri, Sep 12, 2025 at 3:59 AM Nicolas Frattaroli
<nicolas.frattaroli@collabora.com> wrote:
>
> On Friday, 12 September 2025 06:48:17 Central European Summer Time Chia-I Wu wrote:
> > On Fri, Sep 5, 2025 at 3:24 AM Nicolas Frattaroli
> > <nicolas.frattaroli@collabora.com> wrote:
> > >
> > > The MT8196 SoC uses an embedded MCU to control frequencies and power of
> > > the GPU. This controller is referred to as "GPUEB".
> > >
> > > It communicates to the application processor, among other ways, through
> > > a mailbox.
> > >
> > > The mailbox exposes one interrupt, which appears to only be fired when a
> > > response is received, rather than a transaction is completed. For us,
> > > this means we unfortunately need to poll for txdone.
> > >
> > > The mailbox also requires the EB clock to be on when touching any of the
> > > mailbox registers.
> > >
> > > Add a simple driver for it based on the common mailbox framework.
> > >
> > > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > > ---
> > > drivers/mailbox/Kconfig | 10 ++
> > > drivers/mailbox/Makefile | 2 +
> > > drivers/mailbox/mtk-gpueb-mailbox.c | 330 ++++++++++++++++++++++++++++++++++++
> > > 3 files changed, 342 insertions(+)
> > >
> > > diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> > > index 02432d4a5ccd46a16156a09c7f277fb03e4013f5..2016defda1fabb5c0fcc8078f84a52d4e4e00167 100644
> > > --- a/drivers/mailbox/Kconfig
> > > +++ b/drivers/mailbox/Kconfig
> > > @@ -294,6 +294,16 @@ config MTK_CMDQ_MBOX
> > > critical time limitation, such as updating display configuration
> > > during the vblank.
> > >
> > > +config MTK_GPUEB_MBOX
> > > + tristate "MediaTek GPUEB Mailbox Support"
> > > + depends on ARCH_MEDIATEK || COMPILE_TEST
> > > + help
> > > + The MediaTek GPUEB mailbox is used to communicate with the embedded
> > > + controller in charge of GPU frequency and power management on some
> > > + MediaTek SoCs, such as the MT8196.
> > > + Say Y or m here if you want to support the MT8196 SoC in your kernel
> > > + build.
> > > +
> > > config ZYNQMP_IPI_MBOX
> > > tristate "Xilinx ZynqMP IPI Mailbox"
> > > depends on ARCH_ZYNQMP && OF
> > > diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> > > index 98a68f838486eed117d24296138bf59fda3f92e4..564d06e71313e6d1972e4a6036e1e78c8c7ec450 100644
> > > --- a/drivers/mailbox/Makefile
> > > +++ b/drivers/mailbox/Makefile
> > > @@ -63,6 +63,8 @@ obj-$(CONFIG_MTK_ADSP_MBOX) += mtk-adsp-mailbox.o
> > >
> > > obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
> > >
> > > +obj-$(CONFIG_MTK_GPUEB_MBOX) += mtk-gpueb-mailbox.o
> > > +
> > > obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
> > >
> > > obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
> > > diff --git a/drivers/mailbox/mtk-gpueb-mailbox.c b/drivers/mailbox/mtk-gpueb-mailbox.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..0236fb358136e434a09a21ef293fe949ced94123
> > > --- /dev/null
> > > +++ b/drivers/mailbox/mtk-gpueb-mailbox.c
> > > @@ -0,0 +1,330 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/*
> > > + * MediaTek GPUEB mailbox driver for SoCs such as the MT8196
> > > + *
> > > + * Copyright (C) 2025, Collabora Ltd.
> > > + *
> > > + * Developers harmed in the making of this driver:
> > > + * - Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > > + */
> > > +
> > > +#include <linux/atomic.h>
> > > +#include <linux/clk.h>
> > > +#include <linux/device.h>
> > > +#include <linux/interrupt.h>
> > > +#include <linux/io.h>
> > > +#include <linux/iopoll.h>
> > > +#include <linux/mailbox_controller.h>
> > > +#include <linux/module.h>
> > > +#include <linux/of.h>
> > > +#include <linux/platform_device.h>
> > > +
> > > +#define MBOX_CTL_TX_STS 0x0000
> > > +#define MBOX_CTL_IRQ_SET 0x0004
> > > +#define MBOX_CTL_IRQ_CLR 0x0074
> > > +#define MBOX_CTL_RX_STS 0x0078
> > > +
> > > +#define MBOX_FULL BIT(0) /* i.e. we've received data */
> > > +#define MBOX_CLOGGED BIT(1) /* i.e. the channel is shutdown */
> > > +
> > > +struct mtk_gpueb_mbox {
> > > + struct device *dev;
> > > + struct clk *clk;
> > > + void __iomem *mbox_mmio;
> > > + void __iomem *mbox_ctl;
> > > + void **rx_buf;
> > > + atomic_t *rx_status;
> > > + struct mbox_controller mbox;
> > > + unsigned int *chn;
> > > + int irq;
> > > + const struct mtk_gpueb_mbox_variant *v;
> > > +};
> > > +
> > > +struct mtk_gpueb_mbox_ch {
> > > + const char *name;
> > > + const int num;
> > > + const unsigned int tx_offset;
> > > + const unsigned int tx_len;
> > > + const unsigned int rx_offset;
> > > + const unsigned int rx_len;
> > > + const bool no_response;
> > > +};
> > > +
> > > +struct mtk_gpueb_mbox_variant {
> > > + unsigned int num_channels;
> > > + const struct mtk_gpueb_mbox_ch channels[] __counted_by(num_channels);
> > > +};
> > > +
> > > +/**
> > > + * mtk_gpueb_mbox_read_rx - read RX buffer from MMIO into ebm's RX buffer
> > > + * @ebm: pointer to &struct mtk_gpueb_mbox instance
> > > + * @channel: number of channel to read
> > > + */
> > > +static void mtk_gpueb_mbox_read_rx(struct mtk_gpueb_mbox *ebm,
> > > + unsigned int channel)
> > > +{
> > > + const struct mtk_gpueb_mbox_ch *ch;
> > > +
> > > + ch = &ebm->v->channels[channel];
> > > +
> > > + memcpy_fromio(ebm->rx_buf[channel], ebm->mbox_mmio + ch->rx_offset,
> > > + ch->rx_len);
> > > +
> > > +}
> > > +
> > > +static irqreturn_t mtk_gpueb_mbox_isr(int irq, void *data)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm = data;
> > > + u32 rx_handled = 0;
> > > + u32 rx_sts;
> > > + int i;
> > > +
> > > + rx_sts = readl(ebm->mbox_ctl + MBOX_CTL_RX_STS);
> > > +
> > > + for (i = 0; i < ebm->v->num_channels; i++) {
> > > + if (rx_sts & BIT(i)) {
> > > + if (!atomic_cmpxchg(&ebm->rx_status[i], 0,
> > > + MBOX_FULL | MBOX_CLOGGED))
> > > + rx_handled |= BIT(i);
> > > + }
> > > + }
> > We can loop over bits that are set in rx_sts, if we expect that only a
> > few bits are set most of the time.
>
> Could you elaborate on your preferred approach? num_channels will be
> smaller than the bit width of rx_sts. I could loop from __fls down to
> 0 checking for (!rx_sts ^ rx_handled) for an early exit, but I'm not
> sure if this microoptimisation is what you meant or if that even makes
> much sense instruction cost wise.
>
> Alternatively, I could loop from __fss to num_channels.
>
> Is there some commonly used hyper-optimised "loop over set bits" macro
> that I'm unfamiliar with?
Not sure if there is a macro, but panthor open-codes something like
while (rx_sts) {
int chan = ffs(rx_sts) - 1;
rx_sts &= ~BIT(chan);
// do away with chan
}
>
> > > +
> > > + writel(rx_handled, ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > > +
> > > + if (!(rx_sts ^ rx_handled))
> > "rx_sts == rx_handled" should be more direct.
>
> Good point, will change.
>
> > > + return IRQ_WAKE_THREAD;
> > > +
> > > + dev_warn_ratelimited(ebm->dev, "spurious interrupts on 0x%04X\n",
> > > + rx_sts ^ rx_handled);
> > > + return IRQ_NONE;
> > It seems a bit too punishing when there are spurious interrupts. I
> > wonder if we should warn, but return IRQ_WAKE_THREAD as long as
> > rx_handled != 0.
> >
> > Also, if another interrupt can fire before mtk_gpueb_mbox_thread runs,
> > that's data dropping rather than spurious interrupts.
> >
>
> Yeah, I agree that this is bad. As wens pointed out, my IRQ clearing
> earlier would most certainly open us up to losing data. I'll
> definitely look into changing this; iirc the current code resulted
> from me being unsure whether partially handled IRQs should still
> be counted as a handled IRQ or not.
If we only get rx data (response) as a result of tx data (request),
this would be fine. But if there can be unsolicited tx data (looking
at fast_dvfs_event), yeah, we are clearing irq too early.
>
> > > +}
> > > +
> > > +static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm = data;
> > > + irqreturn_t ret = IRQ_NONE;
> > > + int status;
> > > + int i;
> > > +
> > > + for (i = 0; i < ebm->v->num_channels; i++) {
> > > + status = atomic_cmpxchg(&ebm->rx_status[i],
> > > + MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
> > > + if (status == (MBOX_FULL | MBOX_CLOGGED)) {
> > We could also save rx_handled from mtk_gpueb_mbox_isr and loop over
> > bits that are set. If we do that, ebm->rx_status[i] is guaranteed to
> > be MBOX_FULL | MBOX_CLOGGED.
> >
>
> Wouldn't storing rx_handled open us up to data races with unpleasant
> code to try and prevent them? As far as I understand, the ISR is allowed
> to fire again before the thread has completed. Having the rx_status bits
> as individual atomics gives us finer granularity than needing to protect
> the entire set of bits.
>
> If there is some way to pass data from the ISR to the awoken thread
> directly as a per-invocation parameter rather than a squirreled away
> member of the driver private struct, then I agree doing it your way
> seems better.
>
> > > + mtk_gpueb_mbox_read_rx(ebm, i);
> > > + mbox_chan_received_data(&ebm->mbox.chans[i],
> > > + ebm->rx_buf[i]);
> > It looks like we read the data and pass it on to the client
> > immediately. Does each channel need its own rx_buf?
> >
>
> Yeah, common mailbox framework does not do a copy here, it just passes
> the pointer. Which is good, because nobody likes copies, and it
> prevents the mailbox core from needing to know the size of received
> data.
>
> So the individual rx_bufs are needed to prevent channel cross-talk.
I checked a few other mailbox drivers. The data pointer could even
point to local variables in some cases. It seems, in general,
mbox_client cannot expect the data to be valid after
mbox_chan_received_data returns.
>
> > > + /* FIXME: When does MBOX_FULL get cleared? Here? */
> > > + atomic_set(&ebm->rx_status[i], 0);
> > > + ret = IRQ_HANDLED;
> > > + }
> > > + }
> > > +
> > > + return ret;
> > > +}
> > > +
> > > +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > > + unsigned int *num = chan->con_priv;
> > > + int i;
> > > + u32 *values = data;
> > > +
> > > + if (*num >= ebm->v->num_channels)
> > > + return -ECHRNG;
> > Can this ever happen? (I am not familiar with the mbox subsystem)
>
> Only if someone with a pointer to chan modifies con_priv outside of
> what the mailbox driver set. It's an unlikely scenario, so I do think
> I can remove this check, but the downside is that if someone ever does
> manage to do this then the next line is a convenient read primitive.
That should be an invalid usage that we hopefully don't need to worry
about. Otherwise, as things are, *num can also be modified after the
check so the check is not sufficient.
>
> > > +
> > > + if (!ebm->v->channels[*num].no_response &&
> > > + atomic_read(&ebm->rx_status[*num]))
> > > + return -EBUSY;
> > When no_response is true, rx_status is 0. We probably don't need to
> > check no_response.
>
> Now that I'm thinking about it, I can probably get rid of no_response
> altogether. It was originally added because I conflated txdone with
> response received, and one particular command on the sleep channel,
> namely powering off the MFG, is unable to ever cause a response
> interrupt to fire.
>
> Since we now poll for txdone anyway using the TX status register, I
> don't think this is needed at all anymore, so I'll try and get
> rid of it.
>
> > > +
> > > + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > > +
> > > + /*
> > > + * We don't want any fancy nonsense, just write the 32-bit values in
> > > + * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
> > > + */
> > > + for (i = 0; i < ebm->v->channels[*num].tx_len; i += 4) {
> > > + writel(values[i / 4],
> > > + ebm->mbox_mmio + ebm->v->channels[*num].tx_offset + i);
> > > + }
> > > +
> > > + writel(BIT(*num), ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > > + unsigned int *num = chan->con_priv;
> > > +
> > > + if (*num >= ebm->v->num_channels)
> > > + return -ECHRNG;
> > > +
> > > + atomic_set(&ebm->rx_status[*num], 0);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static void mtk_gpueb_mbox_shutdown(struct mbox_chan *chan)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > > + unsigned int *num = chan->con_priv;
> > > +
> > > + atomic_set(&ebm->rx_status[*num], MBOX_CLOGGED);
> > > +}
> > > +
> > > +static bool mtk_gpueb_mbox_last_tx_done(struct mbox_chan *chan)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(chan->mbox->dev);
> > > + unsigned int *num = chan->con_priv;
> > > +
> > > + return !(readl(ebm->mbox_ctl + MBOX_CTL_TX_STS) & BIT(*num));
> > > +}
> > > +
> > > +const struct mbox_chan_ops mtk_gpueb_mbox_ops = {
> > > + .send_data = mtk_gpueb_mbox_send_data,
> > > + .startup = mtk_gpueb_mbox_startup,
> > > + .shutdown = mtk_gpueb_mbox_shutdown,
> > > + .last_tx_done = mtk_gpueb_mbox_last_tx_done,
> > > +};
> > > +
> > > +static struct mbox_chan *
> > > +mtk_gpueb_mbox_of_xlate(struct mbox_controller *mbox,
> > > + const struct of_phandle_args *sp)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm = dev_get_drvdata(mbox->dev);
> > > +
> > > + if (!sp->args_count)
> > > + return ERR_PTR(-EINVAL);
> > > +
> > > + if (sp->args[0] >= ebm->v->num_channels)
> > > + return ERR_PTR(-ECHRNG);
> > > +
> > > + return &mbox->chans[sp->args[0]];
> > > +}
> > > +
> > > +static int mtk_gpueb_mbox_probe(struct platform_device *pdev)
> > > +{
> > > + struct mtk_gpueb_mbox *ebm;
> > > + unsigned int rx_buf_sz;
> > > + void *buf;
> > > + unsigned int i;
> > > + int ret;
> > > +
> > > + ebm = devm_kzalloc(&pdev->dev, sizeof(*ebm), GFP_KERNEL);
> > > + if (!ebm)
> > > + return -ENOMEM;
> > > +
> > > + ebm->dev = &pdev->dev;
> > > + ebm->v = of_device_get_match_data(ebm->dev);
> > > +
> > > + dev_set_drvdata(ebm->dev, ebm);
> > > +
> > > + ebm->clk = devm_clk_get_enabled(ebm->dev, NULL);
> > > + if (IS_ERR(ebm->clk))
> > > + return dev_err_probe(ebm->dev, PTR_ERR(ebm->clk),
> > > + "Failed to get 'eb' clock\n");
> > > +
> > > + ebm->mbox_mmio = devm_platform_ioremap_resource_byname(pdev, "mbox");
> > > + if (IS_ERR(ebm->mbox_mmio))
> > > + return dev_err_probe(ebm->dev, PTR_ERR(ebm->mbox_mmio),
> > > + "Couldn't map mailbox registers\n");
> > > +
> > > + ebm->mbox_ctl = devm_platform_ioremap_resource_byname(pdev, "mbox_ctl");
> > > + if (IS_ERR(ebm->mbox_ctl))
> > > + return dev_err_probe(
> > > + ebm->dev, PTR_ERR(ebm->mbox_ctl),
> > > + "Couldn't map mailbox control registers\n");
> > > +
> > > + rx_buf_sz = (ebm->v->channels[ebm->v->num_channels - 1].rx_offset +
> > > + ebm->v->channels[ebm->v->num_channels - 1].rx_len);
> > rx is after tx in mmio. The first half of the space is wasted.
> >
> > We follow mtk_gpueb_mbox_read_rx by mbox_chan_received_data. It seems
> > we only need max of rx_len's.
>
> We do need the sum of all rx_len's but you're correct that the current code
> allocates too much memory. What I think it should do is
>
> last rx_offset + last rx_len - first rx_offset
>
> to get the total amount of MMIO occupied by RX.
>
> > > +
> > > + buf = devm_kzalloc(ebm->dev, rx_buf_sz, GFP_KERNEL);
> > > + if (!buf)
> > > + return -ENOMEM;
> > > +
> > > + ebm->rx_buf = devm_kmalloc_array(ebm->dev, ebm->v->num_channels,
> > > + sizeof(*ebm->rx_buf), GFP_KERNEL);
> > > + if (!ebm->rx_buf)
> > > + return -ENOMEM;
> > > +
> > > + ebm->mbox.chans = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > > + sizeof(struct mbox_chan), GFP_KERNEL);
> > > + if (!ebm->mbox.chans)
> > > + return -ENOMEM;
> > > +
> > > + ebm->rx_status = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > > + sizeof(atomic_t), GFP_KERNEL);
> > > + if (!ebm->rx_status)
> > > + return -ENOMEM;
> > > +
> > > + ebm->chn = devm_kcalloc(ebm->dev, ebm->v->num_channels,
> > > + sizeof(*ebm->chn), GFP_KERNEL);
> > > +
> > > + for (i = 0; i < ebm->v->num_channels; i++) {
> > > + ebm->rx_buf[i] = buf + ebm->v->channels[i].rx_offset;
> > > + spin_lock_init(&ebm->mbox.chans[i].lock);
> > > + /* the things you do to avoid explicit casting void* */
> > I actually prefer an inline helper that casts chan->con_priv to the
> > channel number. Another option is "chan - ebm->mox.chans".
>
> Hmmm, yeah. Though, I don't enjoy chan - ebm->mbox.chans because now we're
> in the woods regarding pointer arithmetics.
>
> I think the inline helper makes more sense, even if explicitly using a
> pointer as an integer gives me the heebie-jeebies.
>
> > > + ebm->chn[i] = i;
> > > + ebm->mbox.chans[i].con_priv = &ebm->chn[i];
> > > + atomic_set(&ebm->rx_status[i], MBOX_CLOGGED);
> > > + }
> > > +
> > > + ebm->mbox.dev = ebm->dev;
> > > + ebm->mbox.num_chans = ebm->v->num_channels;
> > > + ebm->mbox.txdone_poll = true;
> > > + ebm->mbox.txpoll_period = 0; /* minimum hrtimer interval */
> > > + ebm->mbox.of_xlate = mtk_gpueb_mbox_of_xlate;
> > > + ebm->mbox.ops = &mtk_gpueb_mbox_ops;
> > > +
> > > + ebm->irq = platform_get_irq(pdev, 0);
> > > + if (ebm->irq < 0)
> > > + return ebm->irq;
> > > +
> > > + ret = devm_request_threaded_irq(ebm->dev, ebm->irq, mtk_gpueb_mbox_isr,
> > > + mtk_gpueb_mbox_thread, 0, NULL, ebm);
> > > + if (ret)
> > > + return dev_err_probe(ebm->dev, ret, "failed to request IRQ\n");
> > > +
> > > + ret = devm_mbox_controller_register(ebm->dev, &ebm->mbox);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static const struct mtk_gpueb_mbox_variant mtk_gpueb_mbox_mt8196 = {
> > > + .num_channels = 12,
> > > + .channels = {
> > > + { "fast_dvfs_event", 0, 0x0000, 16, 0x00e0, 16, false },
> > > + { "gpufreq", 1, 0x0010, 32, 0x00f0, 32, false },
> > > + { "sleep", 2, 0x0030, 12, 0x0110, 4, true },
> > > + { "timer", 3, 0x003c, 24, 0x0114, 4, false },
> > > + { "fhctl", 4, 0x0054, 36, 0x0118, 4, false },
> > > + { "ccf", 5, 0x0078, 16, 0x011c, 16, false },
> > > + { "gpumpu", 6, 0x0088, 24, 0x012c, 4, false },
> > > + { "fast_dvfs", 7, 0x00a0, 24, 0x0130, 24, false },
> > > + { "ipir_c_met", 8, 0x00b8, 4, 0x0148, 16, false },
> > > + { "ipis_c_met", 9, 0x00bc, 16, 0x0158, 4, false },
> > > + { "brisket", 10, 0x00cc, 16, 0x015c, 16, false },
> > > + { "ppb", 11, 0x00dc, 4, 0x016c, 4, false },
> > > + },
> > > +};
> > > +
> > > +static const struct of_device_id mtk_gpueb_mbox_of_ids[] = {
> > > + { .compatible = "mediatek,mt8196-gpueb-mbox",
> > > + .data = &mtk_gpueb_mbox_mt8196 },
> > > + { /* Sentinel */ }
> > > +};
> > > +MODULE_DEVICE_TABLE(of, mtk_gpueb_mbox_of_ids);
> > > +
> > > +static struct platform_driver mtk_gpueb_mbox_drv = {
> > > + .probe = mtk_gpueb_mbox_probe,
> > > + .driver = {
> > > + .name = "mtk-gpueb-mbox",
> > > + .of_match_table = mtk_gpueb_mbox_of_ids,
> > > + }
> > > +};
> > > +module_platform_driver(mtk_gpueb_mbox_drv);
> > > +
> > > +MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
> > > +MODULE_DESCRIPTION("MediaTek GPUEB mailbox driver for SoCs such as the MT8196");
> > > +MODULE_LICENSE("GPL");
> > >
> > > --
> > > 2.51.0
> > >
> >
>
> Thank you for your review!
>
> Kind regards,
> Nicolas Frattaroli
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 06/10] drm/panthor: call into devfreq for current frequency
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
` (4 preceding siblings ...)
2025-09-05 10:23 ` [PATCH RFC 05/10] mailbox: add MediaTek GPUEB IPI mailbox Nicolas Frattaroli
@ 2025-09-05 10:23 ` Nicolas Frattaroli
2025-09-10 13:59 ` Steven Price
2025-09-05 10:23 ` [PATCH RFC 07/10] drm/panthor: move panthor_devfreq struct to header Nicolas Frattaroli
` (3 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:23 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
As it stands, panthor keeps a cached current frequency value for when it
wants to retrieve it. This doesn't work well for when things might
switch frequency without panthor's knowledge.
Instead, implement the get_cur_freq operation, and expose it through a
helper function to the rest of panthor.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/panthor/panthor_devfreq.c | 33 +++++++++++++++++++++++++++----
drivers/gpu/drm/panthor/panthor_devfreq.h | 2 ++
drivers/gpu/drm/panthor/panthor_device.h | 3 ---
drivers/gpu/drm/panthor/panthor_drv.c | 4 +++-
4 files changed, 34 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
index 3686515d368db5bb329f4858d4a7247a4957cc24..8903f60c0a3f06313ac2008791c210ff32b6bd52 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.c
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
@@ -62,7 +62,6 @@ static void panthor_devfreq_update_utilization(struct panthor_devfreq *pdevfreq)
static int panthor_devfreq_target(struct device *dev, unsigned long *freq,
u32 flags)
{
- struct panthor_device *ptdev = dev_get_drvdata(dev);
struct dev_pm_opp *opp;
int err;
@@ -72,8 +71,6 @@ static int panthor_devfreq_target(struct device *dev, unsigned long *freq,
dev_pm_opp_put(opp);
err = dev_pm_opp_set_rate(dev, *freq);
- if (!err)
- ptdev->current_frequency = *freq;
return err;
}
@@ -115,11 +112,21 @@ static int panthor_devfreq_get_dev_status(struct device *dev,
return 0;
}
+static int panthor_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
+{
+ struct panthor_device *ptdev = dev_get_drvdata(dev);
+
+ *freq = clk_get_rate(ptdev->clks.core);
+
+ return 0;
+}
+
static struct devfreq_dev_profile panthor_devfreq_profile = {
.timer = DEVFREQ_TIMER_DELAYED,
.polling_ms = 50, /* ~3 frames */
.target = panthor_devfreq_target,
.get_dev_status = panthor_devfreq_get_dev_status,
+ .get_cur_freq = panthor_devfreq_get_cur_freq,
};
int panthor_devfreq_init(struct panthor_device *ptdev)
@@ -198,7 +205,6 @@ int panthor_devfreq_init(struct panthor_device *ptdev)
return PTR_ERR(opp);
panthor_devfreq_profile.initial_freq = cur_freq;
- ptdev->current_frequency = cur_freq;
/*
* Set the recommend OPP this will enable and configure the regulator
@@ -296,3 +302,22 @@ void panthor_devfreq_record_idle(struct panthor_device *ptdev)
spin_unlock_irqrestore(&pdevfreq->lock, irqflags);
}
+
+unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev)
+{
+ struct panthor_devfreq *pdevfreq = ptdev->devfreq;
+ unsigned long freq = 0;
+ int ret;
+
+ if (!pdevfreq || !pdevfreq->devfreq)
+ return 0;
+
+ if (pdevfreq->devfreq->profile->get_cur_freq) {
+ ret = pdevfreq->devfreq->profile->get_cur_freq(ptdev->base.dev,
+ &freq);
+ if (ret)
+ return 0;
+ }
+
+ return freq;
+}
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
index b7631de695f7d79456478c87e8af5dc47673cd1d..f8e29e02f66cb3281ed4bb4c75cda9bd4df82b92 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.h
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
@@ -18,4 +18,6 @@ void panthor_devfreq_suspend(struct panthor_device *ptdev);
void panthor_devfreq_record_busy(struct panthor_device *ptdev);
void panthor_devfreq_record_idle(struct panthor_device *ptdev);
+unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev);
+
#endif /* __PANTHOR_DEVFREQ_H__ */
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 4fc7cf2aeed577f623aac73ed287d6327645ecaa..a14239c8f9ca9229d8d6d36d327e6fd6d05f8f2f 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -200,9 +200,6 @@ struct panthor_device {
/** @profile_mask: User-set profiling flags for job accounting. */
u32 profile_mask;
- /** @current_frequency: Device clock frequency at present. Set by DVFS*/
- unsigned long current_frequency;
-
/** @fast_rate: Maximum device clock frequency. Set by DVFS */
unsigned long fast_rate;
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index 4c202fc5ce0504e3f08bf6c8f18a314890eb88ec..c85a16e1339eaa164a8719ffecf5214cafff1a55 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -25,6 +25,7 @@
#include <drm/gpu_scheduler.h>
#include <drm/panthor_drm.h>
+#include "panthor_devfreq.h"
#include "panthor_device.h"
#include "panthor_fw.h"
#include "panthor_gem.h"
@@ -1519,7 +1520,8 @@ static void panthor_gpu_show_fdinfo(struct panthor_device *ptdev,
drm_printf(p, "drm-cycles-panthor:\t%llu\n", pfile->stats.cycles);
drm_printf(p, "drm-maxfreq-panthor:\t%lu Hz\n", ptdev->fast_rate);
- drm_printf(p, "drm-curfreq-panthor:\t%lu Hz\n", ptdev->current_frequency);
+ drm_printf(p, "drm-curfreq-panthor:\t%lu Hz\n",
+ panthor_devfreq_get_freq(ptdev));
}
static void panthor_show_internal_memory_stats(struct drm_printer *p, struct drm_file *file)
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 06/10] drm/panthor: call into devfreq for current frequency
2025-09-05 10:23 ` [PATCH RFC 06/10] drm/panthor: call into devfreq for current frequency Nicolas Frattaroli
@ 2025-09-10 13:59 ` Steven Price
0 siblings, 0 replies; 30+ messages in thread
From: Steven Price @ 2025-09-10 13:59 UTC (permalink / raw)
To: Nicolas Frattaroli, AngeloGioacchino Del Regno, Boris Brezillon,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
On 05/09/2025 11:23, Nicolas Frattaroli wrote:
> As it stands, panthor keeps a cached current frequency value for when it
> wants to retrieve it. This doesn't work well for when things might
> switch frequency without panthor's knowledge.
>
> Instead, implement the get_cur_freq operation, and expose it through a
> helper function to the rest of panthor.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
> ---
> drivers/gpu/drm/panthor/panthor_devfreq.c | 33 +++++++++++++++++++++++++++----
> drivers/gpu/drm/panthor/panthor_devfreq.h | 2 ++
> drivers/gpu/drm/panthor/panthor_device.h | 3 ---
> drivers/gpu/drm/panthor/panthor_drv.c | 4 +++-
> 4 files changed, 34 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
> index 3686515d368db5bb329f4858d4a7247a4957cc24..8903f60c0a3f06313ac2008791c210ff32b6bd52 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.c
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
> @@ -62,7 +62,6 @@ static void panthor_devfreq_update_utilization(struct panthor_devfreq *pdevfreq)
> static int panthor_devfreq_target(struct device *dev, unsigned long *freq,
> u32 flags)
> {
> - struct panthor_device *ptdev = dev_get_drvdata(dev);
> struct dev_pm_opp *opp;
> int err;
>
> @@ -72,8 +71,6 @@ static int panthor_devfreq_target(struct device *dev, unsigned long *freq,
> dev_pm_opp_put(opp);
>
> err = dev_pm_opp_set_rate(dev, *freq);
> - if (!err)
> - ptdev->current_frequency = *freq;
>
> return err;
> }
> @@ -115,11 +112,21 @@ static int panthor_devfreq_get_dev_status(struct device *dev,
> return 0;
> }
>
> +static int panthor_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
> +{
> + struct panthor_device *ptdev = dev_get_drvdata(dev);
> +
> + *freq = clk_get_rate(ptdev->clks.core);
> +
> + return 0;
> +}
> +
> static struct devfreq_dev_profile panthor_devfreq_profile = {
> .timer = DEVFREQ_TIMER_DELAYED,
> .polling_ms = 50, /* ~3 frames */
> .target = panthor_devfreq_target,
> .get_dev_status = panthor_devfreq_get_dev_status,
> + .get_cur_freq = panthor_devfreq_get_cur_freq,
> };
>
> int panthor_devfreq_init(struct panthor_device *ptdev)
> @@ -198,7 +205,6 @@ int panthor_devfreq_init(struct panthor_device *ptdev)
> return PTR_ERR(opp);
>
> panthor_devfreq_profile.initial_freq = cur_freq;
> - ptdev->current_frequency = cur_freq;
>
> /*
> * Set the recommend OPP this will enable and configure the regulator
> @@ -296,3 +302,22 @@ void panthor_devfreq_record_idle(struct panthor_device *ptdev)
>
> spin_unlock_irqrestore(&pdevfreq->lock, irqflags);
> }
> +
> +unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev)
> +{
> + struct panthor_devfreq *pdevfreq = ptdev->devfreq;
> + unsigned long freq = 0;
> + int ret;
> +
> + if (!pdevfreq || !pdevfreq->devfreq)
> + return 0;
> +
> + if (pdevfreq->devfreq->profile->get_cur_freq) {
> + ret = pdevfreq->devfreq->profile->get_cur_freq(ptdev->base.dev,
> + &freq);
> + if (ret)
> + return 0;
> + }
> +
> + return freq;
> +}
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
> index b7631de695f7d79456478c87e8af5dc47673cd1d..f8e29e02f66cb3281ed4bb4c75cda9bd4df82b92 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.h
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
> @@ -18,4 +18,6 @@ void panthor_devfreq_suspend(struct panthor_device *ptdev);
> void panthor_devfreq_record_busy(struct panthor_device *ptdev);
> void panthor_devfreq_record_idle(struct panthor_device *ptdev);
>
> +unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev);
> +
> #endif /* __PANTHOR_DEVFREQ_H__ */
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 4fc7cf2aeed577f623aac73ed287d6327645ecaa..a14239c8f9ca9229d8d6d36d327e6fd6d05f8f2f 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -200,9 +200,6 @@ struct panthor_device {
> /** @profile_mask: User-set profiling flags for job accounting. */
> u32 profile_mask;
>
> - /** @current_frequency: Device clock frequency at present. Set by DVFS*/
> - unsigned long current_frequency;
> -
> /** @fast_rate: Maximum device clock frequency. Set by DVFS */
> unsigned long fast_rate;
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index 4c202fc5ce0504e3f08bf6c8f18a314890eb88ec..c85a16e1339eaa164a8719ffecf5214cafff1a55 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -25,6 +25,7 @@
> #include <drm/gpu_scheduler.h>
> #include <drm/panthor_drm.h>
>
> +#include "panthor_devfreq.h"
> #include "panthor_device.h"
> #include "panthor_fw.h"
> #include "panthor_gem.h"
> @@ -1519,7 +1520,8 @@ static void panthor_gpu_show_fdinfo(struct panthor_device *ptdev,
> drm_printf(p, "drm-cycles-panthor:\t%llu\n", pfile->stats.cycles);
>
> drm_printf(p, "drm-maxfreq-panthor:\t%lu Hz\n", ptdev->fast_rate);
> - drm_printf(p, "drm-curfreq-panthor:\t%lu Hz\n", ptdev->current_frequency);
> + drm_printf(p, "drm-curfreq-panthor:\t%lu Hz\n",
> + panthor_devfreq_get_freq(ptdev));
> }
>
> static void panthor_show_internal_memory_stats(struct drm_printer *p, struct drm_file *file)
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 07/10] drm/panthor: move panthor_devfreq struct to header
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
` (5 preceding siblings ...)
2025-09-05 10:23 ` [PATCH RFC 06/10] drm/panthor: call into devfreq for current frequency Nicolas Frattaroli
@ 2025-09-05 10:23 ` Nicolas Frattaroli
2025-09-10 13:59 ` Steven Price
2025-09-05 10:23 ` [PATCH RFC 08/10] drm/panthor: devfreq: expose get_dev_status and make it more generic Nicolas Frattaroli
` (2 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:23 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
In order to make files other than panthor_devfreq.c be able to touch the
members of a panthor_devfreq instance, it needs to live somewhere other
than the .c file.
Move it into the panthor_devfreq.h header, so that the upcoming MediaTek
MFG devfreq can use it as well.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/panthor/panthor_devfreq.c | 32 ---------------------------
drivers/gpu/drm/panthor/panthor_devfreq.h | 36 ++++++++++++++++++++++++++++++-
2 files changed, 35 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
index 8903f60c0a3f06313ac2008791c210ff32b6bd52..02eb3ca15d1874e1cbafc6b614b196c5cc75b6a1 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.c
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
@@ -12,38 +12,6 @@
#include "panthor_devfreq.h"
#include "panthor_device.h"
-/**
- * struct panthor_devfreq - Device frequency management
- */
-struct panthor_devfreq {
- /** @devfreq: devfreq device. */
- struct devfreq *devfreq;
-
- /** @gov_data: Governor data. */
- struct devfreq_simple_ondemand_data gov_data;
-
- /** @busy_time: Busy time. */
- ktime_t busy_time;
-
- /** @idle_time: Idle time. */
- ktime_t idle_time;
-
- /** @time_last_update: Last update time. */
- ktime_t time_last_update;
-
- /** @last_busy_state: True if the GPU was busy last time we updated the state. */
- bool last_busy_state;
-
- /**
- * @lock: Lock used to protect busy_time, idle_time, time_last_update and
- * last_busy_state.
- *
- * These fields can be accessed concurrently by panthor_devfreq_get_dev_status()
- * and panthor_devfreq_record_{busy,idle}().
- */
- spinlock_t lock;
-};
-
static void panthor_devfreq_update_utilization(struct panthor_devfreq *pdevfreq)
{
ktime_t now, last;
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
index f8e29e02f66cb3281ed4bb4c75cda9bd4df82b92..e8b5ccddd45c52ee3215e9c84c6ebd9109640282 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.h
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
@@ -4,11 +4,45 @@
#ifndef __PANTHOR_DEVFREQ_H__
#define __PANTHOR_DEVFREQ_H__
+#include <linux/devfreq.h>
+
struct devfreq;
struct thermal_cooling_device;
struct panthor_device;
-struct panthor_devfreq;
+
+/**
+ * struct panthor_devfreq - Device frequency management
+ */
+struct panthor_devfreq {
+ /** @devfreq: devfreq device. */
+ struct devfreq *devfreq;
+
+ /** @gov_data: Governor data. */
+ struct devfreq_simple_ondemand_data gov_data;
+
+ /** @busy_time: Busy time. */
+ ktime_t busy_time;
+
+ /** @idle_time: Idle time. */
+ ktime_t idle_time;
+
+ /** @time_last_update: Last update time. */
+ ktime_t time_last_update;
+
+ /** @last_busy_state: True if the GPU was busy last time we updated the state. */
+ bool last_busy_state;
+
+ /**
+ * @lock: Lock used to protect busy_time, idle_time, time_last_update and
+ * last_busy_state.
+ *
+ * These fields can be accessed concurrently by panthor_devfreq_get_dev_status()
+ * and panthor_devfreq_record_{busy,idle}().
+ */
+ spinlock_t lock;
+};
+
int panthor_devfreq_init(struct panthor_device *ptdev);
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 07/10] drm/panthor: move panthor_devfreq struct to header
2025-09-05 10:23 ` [PATCH RFC 07/10] drm/panthor: move panthor_devfreq struct to header Nicolas Frattaroli
@ 2025-09-10 13:59 ` Steven Price
0 siblings, 0 replies; 30+ messages in thread
From: Steven Price @ 2025-09-10 13:59 UTC (permalink / raw)
To: Nicolas Frattaroli, AngeloGioacchino Del Regno, Boris Brezillon,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
On 05/09/2025 11:23, Nicolas Frattaroli wrote:
> In order to make files other than panthor_devfreq.c be able to touch the
> members of a panthor_devfreq instance, it needs to live somewhere other
> than the .c file.
>
> Move it into the panthor_devfreq.h header, so that the upcoming MediaTek
> MFG devfreq can use it as well.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
> ---
> drivers/gpu/drm/panthor/panthor_devfreq.c | 32 ---------------------------
> drivers/gpu/drm/panthor/panthor_devfreq.h | 36 ++++++++++++++++++++++++++++++-
> 2 files changed, 35 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
> index 8903f60c0a3f06313ac2008791c210ff32b6bd52..02eb3ca15d1874e1cbafc6b614b196c5cc75b6a1 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.c
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
> @@ -12,38 +12,6 @@
> #include "panthor_devfreq.h"
> #include "panthor_device.h"
>
> -/**
> - * struct panthor_devfreq - Device frequency management
> - */
> -struct panthor_devfreq {
> - /** @devfreq: devfreq device. */
> - struct devfreq *devfreq;
> -
> - /** @gov_data: Governor data. */
> - struct devfreq_simple_ondemand_data gov_data;
> -
> - /** @busy_time: Busy time. */
> - ktime_t busy_time;
> -
> - /** @idle_time: Idle time. */
> - ktime_t idle_time;
> -
> - /** @time_last_update: Last update time. */
> - ktime_t time_last_update;
> -
> - /** @last_busy_state: True if the GPU was busy last time we updated the state. */
> - bool last_busy_state;
> -
> - /**
> - * @lock: Lock used to protect busy_time, idle_time, time_last_update and
> - * last_busy_state.
> - *
> - * These fields can be accessed concurrently by panthor_devfreq_get_dev_status()
> - * and panthor_devfreq_record_{busy,idle}().
> - */
> - spinlock_t lock;
> -};
> -
> static void panthor_devfreq_update_utilization(struct panthor_devfreq *pdevfreq)
> {
> ktime_t now, last;
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
> index f8e29e02f66cb3281ed4bb4c75cda9bd4df82b92..e8b5ccddd45c52ee3215e9c84c6ebd9109640282 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.h
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
> @@ -4,11 +4,45 @@
> #ifndef __PANTHOR_DEVFREQ_H__
> #define __PANTHOR_DEVFREQ_H__
>
> +#include <linux/devfreq.h>
> +
> struct devfreq;
> struct thermal_cooling_device;
>
> struct panthor_device;
> -struct panthor_devfreq;
> +
> +/**
> + * struct panthor_devfreq - Device frequency management
> + */
> +struct panthor_devfreq {
> + /** @devfreq: devfreq device. */
> + struct devfreq *devfreq;
> +
> + /** @gov_data: Governor data. */
> + struct devfreq_simple_ondemand_data gov_data;
> +
> + /** @busy_time: Busy time. */
> + ktime_t busy_time;
> +
> + /** @idle_time: Idle time. */
> + ktime_t idle_time;
> +
> + /** @time_last_update: Last update time. */
> + ktime_t time_last_update;
> +
> + /** @last_busy_state: True if the GPU was busy last time we updated the state. */
> + bool last_busy_state;
> +
> + /**
> + * @lock: Lock used to protect busy_time, idle_time, time_last_update and
> + * last_busy_state.
> + *
> + * These fields can be accessed concurrently by panthor_devfreq_get_dev_status()
> + * and panthor_devfreq_record_{busy,idle}().
> + */
> + spinlock_t lock;
> +};
> +
>
> int panthor_devfreq_init(struct panthor_device *ptdev);
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 08/10] drm/panthor: devfreq: expose get_dev_status and make it more generic
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
` (6 preceding siblings ...)
2025-09-05 10:23 ` [PATCH RFC 07/10] drm/panthor: move panthor_devfreq struct to header Nicolas Frattaroli
@ 2025-09-05 10:23 ` Nicolas Frattaroli
2025-09-10 13:59 ` Steven Price
2025-09-05 10:23 ` [PATCH RFC 09/10] drm/panthor: devfreq: add pluggable devfreq providers Nicolas Frattaroli
2025-09-05 10:23 ` [PATCH RFC 10/10] drm/panthor: add support for MediaTek MFlexGraphics Nicolas Frattaroli
9 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:23 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
The only device-specific part of panthor's get_dev_status devfreq
callback is getting the clock frequency. All the other logic surrounding
what it does may be useful for other devfreq implementations however.
Expose it in the panthor_devfreq.h header file, and make it call back
into get_cur_freq instead of poking the common clock framework directly.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/panthor/panthor_devfreq.c | 29 +++++++++++++++++------------
drivers/gpu/drm/panthor/panthor_devfreq.h | 3 +++
2 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
index 02eb3ca15d1874e1cbafc6b614b196c5cc75b6a1..d495dd856299826c4bddc205087d8914d01d7fc4 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.c
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
@@ -43,6 +43,15 @@ static int panthor_devfreq_target(struct device *dev, unsigned long *freq,
return err;
}
+static int panthor_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
+{
+ struct panthor_device *ptdev = dev_get_drvdata(dev);
+
+ *freq = clk_get_rate(ptdev->clks.core);
+
+ return 0;
+}
+
static void panthor_devfreq_reset(struct panthor_devfreq *pdevfreq)
{
pdevfreq->busy_time = 0;
@@ -50,14 +59,18 @@ static void panthor_devfreq_reset(struct panthor_devfreq *pdevfreq)
pdevfreq->time_last_update = ktime_get();
}
-static int panthor_devfreq_get_dev_status(struct device *dev,
- struct devfreq_dev_status *status)
+int panthor_devfreq_get_dev_status(struct device *dev,
+ struct devfreq_dev_status *status)
{
struct panthor_device *ptdev = dev_get_drvdata(dev);
struct panthor_devfreq *pdevfreq = ptdev->devfreq;
+ struct devfreq_dev_profile *p = pdevfreq->devfreq->profile;
unsigned long irqflags;
+ int ret;
- status->current_frequency = clk_get_rate(ptdev->clks.core);
+ ret = p->get_cur_freq(dev, &status->current_frequency);
+ if (ret)
+ return ret;
spin_lock_irqsave(&pdevfreq->lock, irqflags);
@@ -79,15 +92,7 @@ static int panthor_devfreq_get_dev_status(struct device *dev,
return 0;
}
-
-static int panthor_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
-{
- struct panthor_device *ptdev = dev_get_drvdata(dev);
-
- *freq = clk_get_rate(ptdev->clks.core);
-
- return 0;
-}
+EXPORT_SYMBOL(panthor_devfreq_get_dev_status);
static struct devfreq_dev_profile panthor_devfreq_profile = {
.timer = DEVFREQ_TIMER_DELAYED,
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
index e8b5ccddd45c52ee3215e9c84c6ebd9109640282..a891cb5fdc34636444f141e10f5d45828fc35b51 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.h
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
@@ -52,6 +52,9 @@ void panthor_devfreq_suspend(struct panthor_device *ptdev);
void panthor_devfreq_record_busy(struct panthor_device *ptdev);
void panthor_devfreq_record_idle(struct panthor_device *ptdev);
+int panthor_devfreq_get_dev_status(struct device *dev,
+ struct devfreq_dev_status *status);
+
unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev);
#endif /* __PANTHOR_DEVFREQ_H__ */
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 08/10] drm/panthor: devfreq: expose get_dev_status and make it more generic
2025-09-05 10:23 ` [PATCH RFC 08/10] drm/panthor: devfreq: expose get_dev_status and make it more generic Nicolas Frattaroli
@ 2025-09-10 13:59 ` Steven Price
0 siblings, 0 replies; 30+ messages in thread
From: Steven Price @ 2025-09-10 13:59 UTC (permalink / raw)
To: Nicolas Frattaroli, AngeloGioacchino Del Regno, Boris Brezillon,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
On 05/09/2025 11:23, Nicolas Frattaroli wrote:
> The only device-specific part of panthor's get_dev_status devfreq
> callback is getting the clock frequency. All the other logic surrounding
> what it does may be useful for other devfreq implementations however.
>
> Expose it in the panthor_devfreq.h header file, and make it call back
> into get_cur_freq instead of poking the common clock framework directly.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_devfreq.c | 29 +++++++++++++++++------------
> drivers/gpu/drm/panthor/panthor_devfreq.h | 3 +++
> 2 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
> index 02eb3ca15d1874e1cbafc6b614b196c5cc75b6a1..d495dd856299826c4bddc205087d8914d01d7fc4 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.c
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
> @@ -43,6 +43,15 @@ static int panthor_devfreq_target(struct device *dev, unsigned long *freq,
> return err;
> }
>
> +static int panthor_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
> +{
> + struct panthor_device *ptdev = dev_get_drvdata(dev);
> +
> + *freq = clk_get_rate(ptdev->clks.core);
> +
> + return 0;
> +}
> +
I don't see any reason why this was moved up here as part of this patch?
Otherwise this looks fine to me.
Thanks,
Steve
> static void panthor_devfreq_reset(struct panthor_devfreq *pdevfreq)
> {
> pdevfreq->busy_time = 0;
> @@ -50,14 +59,18 @@ static void panthor_devfreq_reset(struct panthor_devfreq *pdevfreq)
> pdevfreq->time_last_update = ktime_get();
> }
>
> -static int panthor_devfreq_get_dev_status(struct device *dev,
> - struct devfreq_dev_status *status)
> +int panthor_devfreq_get_dev_status(struct device *dev,
> + struct devfreq_dev_status *status)
> {
> struct panthor_device *ptdev = dev_get_drvdata(dev);
> struct panthor_devfreq *pdevfreq = ptdev->devfreq;
> + struct devfreq_dev_profile *p = pdevfreq->devfreq->profile;
> unsigned long irqflags;
> + int ret;
>
> - status->current_frequency = clk_get_rate(ptdev->clks.core);
> + ret = p->get_cur_freq(dev, &status->current_frequency);
> + if (ret)
> + return ret;
>
> spin_lock_irqsave(&pdevfreq->lock, irqflags);
>
> @@ -79,15 +92,7 @@ static int panthor_devfreq_get_dev_status(struct device *dev,
>
> return 0;
> }
> -
> -static int panthor_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
> -{
> - struct panthor_device *ptdev = dev_get_drvdata(dev);
> -
> - *freq = clk_get_rate(ptdev->clks.core);
> -
> - return 0;
> -}
> +EXPORT_SYMBOL(panthor_devfreq_get_dev_status);
>
> static struct devfreq_dev_profile panthor_devfreq_profile = {
> .timer = DEVFREQ_TIMER_DELAYED,
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
> index e8b5ccddd45c52ee3215e9c84c6ebd9109640282..a891cb5fdc34636444f141e10f5d45828fc35b51 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.h
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
> @@ -52,6 +52,9 @@ void panthor_devfreq_suspend(struct panthor_device *ptdev);
> void panthor_devfreq_record_busy(struct panthor_device *ptdev);
> void panthor_devfreq_record_idle(struct panthor_device *ptdev);
>
> +int panthor_devfreq_get_dev_status(struct device *dev,
> + struct devfreq_dev_status *status);
> +
> unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev);
>
> #endif /* __PANTHOR_DEVFREQ_H__ */
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 09/10] drm/panthor: devfreq: add pluggable devfreq providers
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
` (7 preceding siblings ...)
2025-09-05 10:23 ` [PATCH RFC 08/10] drm/panthor: devfreq: expose get_dev_status and make it more generic Nicolas Frattaroli
@ 2025-09-05 10:23 ` Nicolas Frattaroli
2025-09-10 13:59 ` Steven Price
2025-09-05 10:23 ` [PATCH RFC 10/10] drm/panthor: add support for MediaTek MFlexGraphics Nicolas Frattaroli
9 siblings, 1 reply; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:23 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
On some devices, devfreq is not controlled directly by DT OPPs and the
common clock framework, but through an external devfreq driver. To
permit this type of usage, add the concept of devfreq providers.
Devfreq providers for panthor register themselves with panthor as a
provider. panthor then gets whatever driver is suckling on its
performance-node property, finds the registered devfreq provider init
function for it, and calls it.
Should the probe order work out such that panthor probes before the
devfreq provider is finished probing and registering itself, then we
just defer the probe after adding a device link.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/panthor/panthor_devfreq.c | 74 ++++++++++++++++++++++++++++++-
drivers/gpu/drm/panthor/panthor_devfreq.h | 16 +++++++
2 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
index d495dd856299826c4bddc205087d8914d01d7fc4..1b0c21f6f069b059b8b0e412a79556c602c5f1e7 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.c
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
@@ -4,6 +4,7 @@
#include <linux/clk.h>
#include <linux/devfreq.h>
#include <linux/devfreq_cooling.h>
+#include <linux/of_platform.h>
#include <linux/platform_device.h>
#include <linux/pm_opp.h>
@@ -12,6 +13,34 @@
#include "panthor_devfreq.h"
#include "panthor_device.h"
+static LIST_HEAD(panthor_devfreq_providers);
+static DEFINE_MUTEX(panthor_devfreq_providers_lock);
+
+int panthor_devfreq_register_provider(struct panthor_devfreq_provider *prov)
+{
+ guard(mutex)(&panthor_devfreq_providers_lock);
+
+ list_add(&prov->node, &panthor_devfreq_providers);
+
+ return 0;
+}
+EXPORT_SYMBOL(panthor_devfreq_register_provider);
+
+static int panthor_devfreq_init_provider(struct panthor_device *ptdev,
+ struct device *provider_dev)
+{
+ struct panthor_devfreq_provider *prov;
+
+ guard(mutex)(&panthor_devfreq_providers_lock);
+
+ list_for_each_entry(prov, &panthor_devfreq_providers, node) {
+ if (prov->dev == provider_dev)
+ return prov->init(ptdev, provider_dev);
+ }
+
+ return -EPROBE_DEFER;
+}
+
static void panthor_devfreq_update_utilization(struct panthor_devfreq *pdevfreq)
{
ktime_t now, last;
@@ -102,7 +131,7 @@ static struct devfreq_dev_profile panthor_devfreq_profile = {
.get_cur_freq = panthor_devfreq_get_cur_freq,
};
-int panthor_devfreq_init(struct panthor_device *ptdev)
+static int panthor_devfreq_init_of(struct panthor_device *ptdev)
{
/* There's actually 2 regulators (mali and sram), but the OPP core only
* supports one.
@@ -222,6 +251,49 @@ int panthor_devfreq_init(struct panthor_device *ptdev)
return 0;
}
+static int panthor_devfreq_init_platform(struct panthor_device *ptdev)
+{
+ struct device_node *pcnode;
+ struct platform_device *pdev;
+ struct device_link *link;
+ int ret;
+
+ pcnode = of_parse_phandle(ptdev->base.dev->of_node,
+ "performance-controller", 0);
+ if (!pcnode)
+ return -EINVAL;
+
+ pdev = of_find_device_by_node(pcnode);
+ of_node_put(pcnode);
+ if (!pdev)
+ return -ENODEV;
+
+ link = device_link_add(ptdev->base.dev, &pdev->dev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+ if (!link) {
+ dev_err(ptdev->base.dev, "failed to add device link\n");
+ return -ENODEV;
+ }
+
+ ret = panthor_devfreq_init_provider(ptdev, &pdev->dev);
+ if (ret)
+ return dev_err_probe(ptdev->base.dev, ret,
+ "failed to initialize devfreq provider\n");
+
+ DRM_DEV_INFO(ptdev->base.dev, "initialized devfreq provider %s\n",
+ dev_name(&pdev->dev));
+
+ return 0;
+}
+
+int panthor_devfreq_init(struct panthor_device *ptdev)
+{
+ if (!of_property_present(ptdev->base.dev->of_node, "performance-controller"))
+ return panthor_devfreq_init_of(ptdev);
+
+ return panthor_devfreq_init_platform(ptdev);
+}
+
void panthor_devfreq_resume(struct panthor_device *ptdev)
{
struct panthor_devfreq *pdevfreq = ptdev->devfreq;
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
index a891cb5fdc34636444f141e10f5d45828fc35b51..94c9768d5d038c4ba8516929edb565a1f13443fb 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.h
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
@@ -8,6 +8,7 @@
struct devfreq;
struct thermal_cooling_device;
+struct platform_device;
struct panthor_device;
@@ -43,6 +44,19 @@ struct panthor_devfreq {
spinlock_t lock;
};
+struct panthor_devfreq_provider {
+ /** @dev: device pointer to the provider device */
+ struct device *dev;
+ /**
+ * @init: the provider's init callback that allocates a
+ * &struct panthor_devfreq, adds it to panthor, and adds a devfreq
+ * device to panthor. Will be called during panthor's probe.
+ */
+ int (*init)(struct panthor_device *ptdev, struct device *dev);
+
+ struct list_head node;
+};
+
int panthor_devfreq_init(struct panthor_device *ptdev);
@@ -57,4 +71,6 @@ int panthor_devfreq_get_dev_status(struct device *dev,
unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev);
+int panthor_devfreq_register_provider(struct panthor_devfreq_provider *prov);
+
#endif /* __PANTHOR_DEVFREQ_H__ */
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH RFC 09/10] drm/panthor: devfreq: add pluggable devfreq providers
2025-09-05 10:23 ` [PATCH RFC 09/10] drm/panthor: devfreq: add pluggable devfreq providers Nicolas Frattaroli
@ 2025-09-10 13:59 ` Steven Price
0 siblings, 0 replies; 30+ messages in thread
From: Steven Price @ 2025-09-10 13:59 UTC (permalink / raw)
To: Nicolas Frattaroli, AngeloGioacchino Del Regno, Boris Brezillon,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening
On 05/09/2025 11:23, Nicolas Frattaroli wrote:
> On some devices, devfreq is not controlled directly by DT OPPs and the
> common clock framework, but through an external devfreq driver. To
> permit this type of usage, add the concept of devfreq providers.
>
> Devfreq providers for panthor register themselves with panthor as a
> provider. panthor then gets whatever driver is suckling on its
> performance-node property, finds the registered devfreq provider init
> function for it, and calls it.
>
> Should the probe order work out such that panthor probes before the
> devfreq provider is finished probing and registering itself, then we
> just defer the probe after adding a device link.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Looks fine to me, but I think there are some updates needed due to the
DT review.
Thanks,
Steve
> ---
> drivers/gpu/drm/panthor/panthor_devfreq.c | 74 ++++++++++++++++++++++++++++++-
> drivers/gpu/drm/panthor/panthor_devfreq.h | 16 +++++++
> 2 files changed, 89 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
> index d495dd856299826c4bddc205087d8914d01d7fc4..1b0c21f6f069b059b8b0e412a79556c602c5f1e7 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.c
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
> @@ -4,6 +4,7 @@
> #include <linux/clk.h>
> #include <linux/devfreq.h>
> #include <linux/devfreq_cooling.h>
> +#include <linux/of_platform.h>
> #include <linux/platform_device.h>
> #include <linux/pm_opp.h>
>
> @@ -12,6 +13,34 @@
> #include "panthor_devfreq.h"
> #include "panthor_device.h"
>
> +static LIST_HEAD(panthor_devfreq_providers);
> +static DEFINE_MUTEX(panthor_devfreq_providers_lock);
> +
> +int panthor_devfreq_register_provider(struct panthor_devfreq_provider *prov)
> +{
> + guard(mutex)(&panthor_devfreq_providers_lock);
> +
> + list_add(&prov->node, &panthor_devfreq_providers);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(panthor_devfreq_register_provider);
> +
> +static int panthor_devfreq_init_provider(struct panthor_device *ptdev,
> + struct device *provider_dev)
> +{
> + struct panthor_devfreq_provider *prov;
> +
> + guard(mutex)(&panthor_devfreq_providers_lock);
> +
> + list_for_each_entry(prov, &panthor_devfreq_providers, node) {
> + if (prov->dev == provider_dev)
> + return prov->init(ptdev, provider_dev);
> + }
> +
> + return -EPROBE_DEFER;
> +}
> +
> static void panthor_devfreq_update_utilization(struct panthor_devfreq *pdevfreq)
> {
> ktime_t now, last;
> @@ -102,7 +131,7 @@ static struct devfreq_dev_profile panthor_devfreq_profile = {
> .get_cur_freq = panthor_devfreq_get_cur_freq,
> };
>
> -int panthor_devfreq_init(struct panthor_device *ptdev)
> +static int panthor_devfreq_init_of(struct panthor_device *ptdev)
> {
> /* There's actually 2 regulators (mali and sram), but the OPP core only
> * supports one.
> @@ -222,6 +251,49 @@ int panthor_devfreq_init(struct panthor_device *ptdev)
> return 0;
> }
>
> +static int panthor_devfreq_init_platform(struct panthor_device *ptdev)
> +{
> + struct device_node *pcnode;
> + struct platform_device *pdev;
> + struct device_link *link;
> + int ret;
> +
> + pcnode = of_parse_phandle(ptdev->base.dev->of_node,
> + "performance-controller", 0);
> + if (!pcnode)
> + return -EINVAL;
> +
> + pdev = of_find_device_by_node(pcnode);
> + of_node_put(pcnode);
> + if (!pdev)
> + return -ENODEV;
> +
> + link = device_link_add(ptdev->base.dev, &pdev->dev,
> + DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
> + if (!link) {
> + dev_err(ptdev->base.dev, "failed to add device link\n");
> + return -ENODEV;
> + }
> +
> + ret = panthor_devfreq_init_provider(ptdev, &pdev->dev);
> + if (ret)
> + return dev_err_probe(ptdev->base.dev, ret,
> + "failed to initialize devfreq provider\n");
> +
> + DRM_DEV_INFO(ptdev->base.dev, "initialized devfreq provider %s\n",
> + dev_name(&pdev->dev));
> +
> + return 0;
> +}
> +
> +int panthor_devfreq_init(struct panthor_device *ptdev)
> +{
> + if (!of_property_present(ptdev->base.dev->of_node, "performance-controller"))
> + return panthor_devfreq_init_of(ptdev);
> +
> + return panthor_devfreq_init_platform(ptdev);
> +}
> +
> void panthor_devfreq_resume(struct panthor_device *ptdev)
> {
> struct panthor_devfreq *pdevfreq = ptdev->devfreq;
> diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
> index a891cb5fdc34636444f141e10f5d45828fc35b51..94c9768d5d038c4ba8516929edb565a1f13443fb 100644
> --- a/drivers/gpu/drm/panthor/panthor_devfreq.h
> +++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
> @@ -8,6 +8,7 @@
>
> struct devfreq;
> struct thermal_cooling_device;
> +struct platform_device;
>
> struct panthor_device;
>
> @@ -43,6 +44,19 @@ struct panthor_devfreq {
> spinlock_t lock;
> };
>
> +struct panthor_devfreq_provider {
> + /** @dev: device pointer to the provider device */
> + struct device *dev;
> + /**
> + * @init: the provider's init callback that allocates a
> + * &struct panthor_devfreq, adds it to panthor, and adds a devfreq
> + * device to panthor. Will be called during panthor's probe.
> + */
> + int (*init)(struct panthor_device *ptdev, struct device *dev);
> +
> + struct list_head node;
> +};
> +
>
> int panthor_devfreq_init(struct panthor_device *ptdev);
>
> @@ -57,4 +71,6 @@ int panthor_devfreq_get_dev_status(struct device *dev,
>
> unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev);
>
> +int panthor_devfreq_register_provider(struct panthor_devfreq_provider *prov);
> +
> #endif /* __PANTHOR_DEVFREQ_H__ */
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH RFC 10/10] drm/panthor: add support for MediaTek MFlexGraphics
2025-09-05 10:22 [PATCH RFC 00/10] MT8196 GPU Frequency/Power Control Support Nicolas Frattaroli
` (8 preceding siblings ...)
2025-09-05 10:23 ` [PATCH RFC 09/10] drm/panthor: devfreq: add pluggable devfreq providers Nicolas Frattaroli
@ 2025-09-05 10:23 ` Nicolas Frattaroli
9 siblings, 0 replies; 30+ messages in thread
From: Nicolas Frattaroli @ 2025-09-05 10:23 UTC (permalink / raw)
To: AngeloGioacchino Del Regno, Boris Brezillon, Steven Price,
Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, Jassi Brar, Kees Cook, Gustavo A. R. Silva
Cc: Chia-I Wu, Chen-Yu Tsai, kernel, dri-devel, devicetree,
linux-kernel, linux-arm-kernel, linux-mediatek, linux-pm,
linux-hardening, Nicolas Frattaroli
MediaTek uses some glue logic to control frequency and power on some of
their GPUs. This is best exposed as a devfreq driver, as it saves us
from having to hardcode OPPs into the device tree, and can be extended
with additional devfreq-y logic like more clever governors that use the
hardware's GPUEB MCU to set frame time targets and power limits.
Add this driver to the panthor subdirectory. It needs to live here as it
needs to call into panthor's devfreq layer, and panthor for its part
also needs to call into this driver during probe to get a devfreq device
registered. Solving the cyclical dependency by having mediatek_mfg live
without knowledge of what a panthor is would require moving the devfreq
provider stuff into a generic devfreq subsystem solution, which I didn't
want to do.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/panthor/Kconfig | 13 +
drivers/gpu/drm/panthor/Makefile | 2 +
drivers/gpu/drm/panthor/mediatek_mfg.c | 1053 ++++++++++++++++++++++++++++++++
3 files changed, 1068 insertions(+)
diff --git a/drivers/gpu/drm/panthor/Kconfig b/drivers/gpu/drm/panthor/Kconfig
index 55b40ad07f3b0779e0c434469ddc874ff74fde27..c4d2599c05df9e0e009b8e99b3d29c220269ca0d 100644
--- a/drivers/gpu/drm/panthor/Kconfig
+++ b/drivers/gpu/drm/panthor/Kconfig
@@ -21,3 +21,16 @@ config DRM_PANTHOR
Note that the Mali-G68 and Mali-G78, while Valhall architecture, will
be supported with the panfrost driver as they are not CSF GPUs.
+
+config DRM_PANTHOR_MEDIATEK_MFG
+ tristate "MediaTek MFlexGraphics Extensions for Panthor"
+ depends on (DRM_PANTHOR && ARCH_MEDIATEK) || COMPILE_TEST
+ select DEVFREQ_GOV_SIMPLE_ONDEMAND
+ select PM_DEVFREQ
+ select MTK_GPUEB_MBOX
+ help
+ Support for power and clock control in Panthor for MediaTek
+ MFlexGraphics devices, such as the GPU on the MT8196 or MT6991 SoCs.
+
+ These extensions are required for the GPU to work on these platforms,
+ as they control the glue logic that powers on the GPU.
diff --git a/drivers/gpu/drm/panthor/Makefile b/drivers/gpu/drm/panthor/Makefile
index 02db21748c125688d2ef20ed254b5ebd7ff642e4..e0ebfdfb20bd78e0003c860c86c040746248fb89 100644
--- a/drivers/gpu/drm/panthor/Makefile
+++ b/drivers/gpu/drm/panthor/Makefile
@@ -12,4 +12,6 @@ panthor-y := \
panthor_mmu.o \
panthor_sched.o
+obj-$(CONFIG_DRM_PANTHOR_MEDIATEK_MFG) += mediatek_mfg.o
+
obj-$(CONFIG_DRM_PANTHOR) += panthor.o
diff --git a/drivers/gpu/drm/panthor/mediatek_mfg.c b/drivers/gpu/drm/panthor/mediatek_mfg.c
new file mode 100644
index 0000000000000000000000000000000000000000..bc2bf6539b28197b3cf347a28274e1618ff10d61
--- /dev/null
+++ b/drivers/gpu/drm/panthor/mediatek_mfg.c
@@ -0,0 +1,1053 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Driver for MediaTek MFlexGraphics Devices
+ *
+ * Copyright (C) 2025, Collabora Ltd.
+ */
+
+#include <linux/atomic.h>
+#include <linux/completion.h>
+#include <linux/clk.h>
+#include <linux/container_of.h>
+#include <linux/devfreq.h>
+#include <linux/iopoll.h>
+#include <linux/mailbox_client.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/pm_opp.h>
+#include <linux/regulator/consumer.h>
+
+#include <drm/drm_managed.h>
+
+#include "panthor_devfreq.h"
+#include "panthor_device.h"
+
+#define GPR_LP_STATE 0x0028
+#define EB_ON_SUSPEND 0x0
+#define EB_ON_RESUME 0x1
+#define GPR_IPI_MAGIC 0x34
+
+#define RPC_PWR_CON 0x0504
+#define PWR_ACK_M GENMASK(31, 30)
+#define RPC_DUMMY_REG_2 0x0658
+#define RPC_GHPM_CFG0_CON 0x0800
+#define GHPM_ENABLE_M BIT(0)
+#define GHPM_ON_SEQ_M BIT(2)
+#define RPC_GHPM_RO0_CON 0x09A4
+#define GHPM_STATE_M GENMASK(7, 0)
+#define GHPM_PWR_STATE_M BIT(16)
+
+#define GF_REG_MAGIC 0x0000
+#define GF_REG_GPU_OPP_IDX 0x0004
+#define GF_REG_STK_OPP_IDX 0x0008
+#define GF_REG_GPU_OPP_NUM 0x000c
+#define GF_REG_STK_OPP_NUM 0x0010
+#define GF_REG_GPU_OPP_SNUM 0x0014
+#define GF_REG_STK_OPP_SNUM 0x0018
+#define GF_REG_POWER_COUNT 0x001c
+#define GF_REG_BUCK_COUNT 0x0020
+#define GF_REG_MTCMOS_COUNT 0x0024
+#define GF_REG_CG_COUNT 0x0028 /* CG = Clock Gate? */
+#define GF_REG_ACTIVE_COUNT 0x002C
+#define GF_REG_TEMP_RAW 0x0030
+#define GF_REG_TEMP_NORM_GPU 0x0034
+#define GF_REG_TEMP_HIGH_GPU 0x0038
+#define GF_REG_TEMP_NORM_STK 0x003C
+#define GF_REG_TEMP_HIGH_STK 0x0040
+#define GF_REG_FREQ_CUR_GPU 0x0044
+#define GF_REG_FREQ_CUR_STK 0x0048
+#define GF_REG_FREQ_OUT_GPU 0x004C /* Guess: actual achieved freq */
+#define GF_REG_FREQ_OUT_STK 0x0050 /* Guess: actual achieved freq */
+#define GF_REG_FREQ_METER_GPU 0x0054 /* Seems unused, always 0 */
+#define GF_REG_FREQ_METER_STK 0x0058 /* Seems unused, always 0 */
+#define GF_REG_VOLT_CUR_GPU 0x005C /* in tens of microvolts */
+#define GF_REG_VOLT_CUR_STK 0x0060 /* in tens of microvolts */
+#define GF_REG_VOLT_CUR_GPU_SRAM 0x0064
+#define GF_REG_VOLT_CUR_STK_SRAM 0x0068
+#define GF_REG_VOLT_CUR_GPU_REG 0x006C /* Seems unused, always 0 */
+#define GF_REG_VOLT_CUR_STK_REG 0x0070 /* Seems unused, always 0 */
+#define GF_REG_VOLT_CUR_GPU_REG_SRAM 0x0074
+#define GF_REG_VOLT_CUR_STK_REG_SRAM 0x0078
+#define GF_REG_PWR_CUR_GPU 0x007C /* in milliwatts */
+#define GF_REG_PWR_CUR_STK 0x0080 /* in milliwatts */
+#define GF_REG_PWR_MAX_GPU 0x0084 /* in milliwatts */
+#define GF_REG_PWR_MAX_STK 0x0088 /* in milliwatts */
+#define GF_REG_PWR_MIN_GPU 0x008C /* in milliwatts */
+#define GF_REG_PWR_MIN_STK 0x0090 /* in milliwatts */
+#define GF_REG_LEAKAGE_RT_GPU 0x0094 /* Unknown */
+#define GF_REG_LEAKAGE_RT_STK 0x0098 /* Unknown */
+#define GF_REG_LEAKAGE_RT_SRAM 0x009C /* Unknown */
+#define GF_REG_LEAKAGE_HT_GPU 0x00A0 /* Unknown */
+#define GF_REG_LEAKAGE_HT_STK 0x00A4 /* Unknown */
+#define GF_REG_LEAKAGE_HT_SRAM 0x00A8 /* Unknown */
+#define GF_REG_VOLT_DAC_LOW_GPU 0x00AC /* Seems unused, always 0 */
+#define GF_REG_VOLT_DAC_LOW_STK 0x00B0 /* Seems unused, always 0 */
+#define GF_REG_OPP_CUR_CEIL 0x00B4
+#define GF_REG_OPP_CUR_FLOOR 0x00B8
+#define GF_REG_OPP_CUR_LIMITER_CEIL 0x00BC
+#define GF_REG_OPP_CUR_LIMITER_FLOOR 0x00C0
+#define GF_REG_OPP_PRIORITY_CEIL 0x00C4
+#define GF_REG_OPP_PRIORITY_FLOOR 0x00C8
+#define GF_REG_PWR_CTL 0x00CC
+#define GF_REG_ACTIVE_SLEEP_CTL 0x00D0
+#define GF_REG_DVFS_STATE 0x00D4
+#define GF_REG_SHADER_PRESENT 0x00D8
+#define GF_REG_ASENSOR_ENABLE 0x00DC
+#define GF_REG_AGING_LOAD 0x00E0
+#define GF_REG_AGING_MARGIN 0x00E4
+#define GF_REG_AVS_ENABLE 0x00E8
+#define GF_REG_AVS_MARGIN 0x00EC
+#define GF_REG_CHIP_TYPE 0x00F0
+#define GF_REG_SB_VERSION 0x00F4
+#define GF_REG_PTP_VERSION 0x00F8
+#define GF_REG_DBG_VERSION 0x00FC
+#define GF_REG_KDBG_VERSION 0x0100
+#define GF_REG_GPM1_MODE 0x0104
+#define GF_REG_GPM3_MODE 0x0108
+#define GF_REG_DFD_MODE 0x010C
+#define GF_REG_DUAL_BUCK 0x0110
+#define GF_REG_SEGMENT_ID 0x0114
+#define GF_REG_POWER_TIME_H 0x0118
+#define GF_REG_POWER_TIME_L 0x011C
+#define GF_REG_PWR_STATUS 0x0120
+#define GF_REG_STRESS_TEST 0x0124
+#define GF_REG_TEST_MODE 0x0128
+#define GF_REG_IPS_MODE 0x012C
+#define GF_REG_TEMP_COMP_MODE 0x0130
+#define GF_REG_HT_TEMP_COMP_MODE 0x0134
+#define GF_REG_PWR_TRACKER_MODE 0x0138
+#define GF_REG_OPP_TABLE_GPU 0x0314
+#define GF_REG_OPP_TABLE_STK 0x09A4
+#define GF_REG_OPP_TABLE_GPU_S 0x1034
+#define GF_REG_OPP_TABLE_STK_S 0x16c4
+#define GF_REG_LIMIT_TABLE 0x1d54
+#define GF_REG_GPM3_TABLE 0x223C
+
+#define MFG_MT8196_E2_ID 0x101
+#define GPUEB_SLEEP_MAGIC 0x55667788UL
+
+#define GPUEB_TIMEOUT_US 10000UL
+#define GPUEB_POLL_US 50
+
+#define MAX_OPP_NUM 70
+
+/*
+ * This enum is part of the ABI of the GPUEB firmware. Don't change the
+ * numbering, as you would wreak havoc.
+ */
+enum mtk_mfg_ipi_cmd {
+ CMD_INIT_SHARED_MEM = 0,
+ CMD_GET_FREQ_BY_IDX = 1,
+ CMD_GET_POWER_BY_IDX = 2,
+ CMD_GET_OPPIDX_BY_FREQ = 3,
+ CMD_GET_LEAKAGE_POWER = 4,
+ CMD_SET_LIMIT = 5,
+ CMD_POWER_CONTROL = 6,
+ CMD_ACTIVE_SLEEP_CONTROL = 7,
+ CMD_COMMIT = 8,
+ CMD_DUAL_COMMIT = 9,
+ CMD_PDCA_CONFIG = 10,
+ CMD_UPDATE_DEBUG_OPP_INFO = 11,
+ CMD_SWITCH_LIMIT = 12,
+ CMD_FIX_TARGET_OPPIDX = 13,
+ CMD_FIX_DUAL_TARGET_OPPIDX = 14,
+ CMD_FIX_CUSTOM_FREQ_VOLT = 15,
+ CMD_FIX_DUAL_CUSTOM_FREQ_VOLT = 16,
+ CMD_SET_MFGSYS_CONFIG = 17,
+ CMD_MSSV_COMMIT = 18,
+ CMD_NUM = 19,
+};
+
+/*
+ * This struct is part of the ABI of the GPUEB firmware. Changing it, or
+ * reordering fields in it, will break things, so don't do it. Thank you.
+ */
+struct __packed mtk_mfg_ipi_msg {
+ __le32 magic;
+ __le32 cmd;
+ __le32 target;
+ /*
+ * Downstream relies on the compiler to implicitly add the following
+ * padding, as it declares the struct as non-packed.
+ */
+ __le32 padding_lol;
+ union {
+ s32 __bitwise oppidx;
+ s32 __bitwise return_value;
+ __le32 freq;
+ __le32 volt;
+ __le32 power;
+ __le32 power_state;
+ __le32 mode;
+ __le32 value;
+ struct {
+ __le64 base;
+ __le32 size;
+ } shared_mem;
+ struct {
+ __le32 freq;
+ __le32 volt;
+ } custom;
+ struct {
+ __le32 limiter;
+ s32 __bitwise ceiling_info;
+ s32 __bitwise floor_info;
+ } set_limit;
+ struct {
+ __le32 target;
+ __le32 val;
+ } mfg_cfg;
+ struct {
+ __le32 target;
+ __le32 val;
+ } mssv;
+ struct {
+ s32 __bitwise gpu_oppidx;
+ s32 __bitwise stack_oppidx;
+ } dual_commit;
+ struct {
+ __le32 fgpu;
+ __le32 vgpu;
+ __le32 fstack;
+ __le32 vstack;
+ } dual_custom;
+ } u;
+};
+
+struct __packed mtk_mfg_ipi_sleep_msg {
+ __le32 event;
+ __le32 state;
+ __le32 magic;
+};
+
+/**
+ * struct mtk_mfg_opp_entry - OPP table entry from firmware
+ * @freq_khz: The operating point's frequency in kilohertz
+ * @voltage_core: The operating point's core voltage in tens of microvolts
+ * @voltage_sram: The operating point's SRAM voltage in tens of microvolts
+ * @posdiv: exponent of base 2 for PLL frequency divisor used for this OPP
+ * @voltage_margin: Number of tens of microvolts the voltage can be undershot
+ * @power_mw: estimate of power usage at this operating point, in milliwatts
+ *
+ * This struct is part of the ABI with the EB firmware. Do not change it.
+ */
+struct __packed mtk_mfg_opp_entry {
+ __le32 freq_khz;
+ __le32 voltage_core;
+ __le32 voltage_sram;
+ __le32 posdiv;
+ __le32 voltage_margin;
+ __le32 power_mw;
+};
+
+struct mtk_mfg_mbox {
+ struct mbox_client cl;
+ struct completion rx_done;
+ struct mtk_mfg *mfg;
+ struct mbox_chan *ch;
+ void *rx_data;
+};
+
+struct mtk_mfg {
+ struct platform_device *pdev;
+ struct panthor_device *ptdev;
+ struct clk *clk_eb;
+ struct clk_bulk_data *gpu_clks;
+ struct regulator_bulk_data *gpu_regs;
+ void __iomem *rpc;
+ void __iomem *gpr;
+ void __iomem *sram;
+ phys_addr_t sram_phys;
+ unsigned int sram_size;
+ unsigned int ghpm_en_reg;
+ u32 ipi_magic;
+ unsigned int num_opps;
+ unsigned int num_unique_gpu_opps;
+ struct dev_pm_opp_data *gpu_opps;
+ struct dev_pm_opp_data *stack_opps;
+ struct mtk_mfg_mbox *gf_mbox;
+ struct mtk_mfg_mbox *slp_mbox;
+ int last_opp;
+ const struct mtk_mfg_variant *variant;
+};
+
+struct mtk_mfg_variant {
+ const char *const *clk_names;
+ unsigned int num_clks;
+ const char *const *regulator_names;
+ unsigned int num_regulators;
+ int (*init)(struct mtk_mfg *mfg);
+};
+
+static inline void mtk_mfg_update_reg_bits(void __iomem *addr, u32 mask, u32 val)
+{
+ writel((readl(addr) & ~mask) | (val & mask), addr);
+}
+
+static inline unsigned long mtk_mfg_read_gpu_freq(struct mtk_mfg *mfg)
+{
+ return readl(mfg->sram + GF_REG_FREQ_CUR_GPU) * 1000UL;
+}
+
+static int mtk_mfg_eb_on(struct mtk_mfg *mfg)
+{
+ struct device *dev = &mfg->pdev->dev;
+ u32 val;
+ int ret;
+
+ /*
+ * If MFG is already on from e.g. the bootloader, we should skip doing
+ * the power-on sequence, as it wouldn't work without powering it off
+ * first.
+ */
+ if ((readl(mfg->rpc + RPC_PWR_CON) & PWR_ACK_M) == PWR_ACK_M)
+ return 0;
+
+ ret = readl_poll_timeout(mfg->rpc + RPC_GHPM_RO0_CON, val,
+ !(val & (GHPM_PWR_STATE_M | GHPM_STATE_M)),
+ GPUEB_POLL_US, GPUEB_TIMEOUT_US);
+ if (ret) {
+ dev_err(dev, "timed out waiting for EB to power on\n");
+ return ret;
+ }
+
+ mtk_mfg_update_reg_bits(mfg->rpc + mfg->ghpm_en_reg, GHPM_ENABLE_M,
+ GHPM_ENABLE_M);
+
+ mtk_mfg_update_reg_bits(mfg->rpc + RPC_GHPM_CFG0_CON, GHPM_ON_SEQ_M, 0);
+ mtk_mfg_update_reg_bits(mfg->rpc + RPC_GHPM_CFG0_CON, GHPM_ON_SEQ_M,
+ GHPM_ON_SEQ_M);
+
+ mtk_mfg_update_reg_bits(mfg->rpc + mfg->ghpm_en_reg, GHPM_ENABLE_M, 0);
+
+
+ ret = readl_poll_timeout(mfg->rpc + RPC_PWR_CON, val,
+ (val & PWR_ACK_M) == PWR_ACK_M,
+ GPUEB_POLL_US, GPUEB_TIMEOUT_US);
+ if (ret) {
+ dev_err(dev, "timed out waiting for EB power ack, val = 0x%X\n",
+ val);
+ return ret;
+ }
+
+ ret = readl_poll_timeout(mfg->gpr + GPR_LP_STATE, val,
+ (val == EB_ON_RESUME),
+ GPUEB_POLL_US, GPUEB_TIMEOUT_US);
+ if (ret) {
+ dev_err(dev, "timed out waiting for EB to resume, status = 0x%X\n", val);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int mtk_mfg_eb_off(struct mtk_mfg *mfg)
+{
+ struct mtk_mfg_ipi_sleep_msg msg = {
+ .event = 0,
+ .state = 0,
+ .magic = GPUEB_SLEEP_MAGIC
+ };
+ u32 val;
+ int ret;
+
+ ret = mbox_send_message(mfg->slp_mbox->ch, &msg);
+ if (ret < 0) {
+ dev_err(&mfg->pdev->dev, "failure in mbox comms: %pe\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+
+ ret = readl_poll_timeout(mfg->rpc + RPC_PWR_CON, val,
+ !(val & PWR_ACK_M), GPUEB_POLL_US,
+ GPUEB_TIMEOUT_US);
+
+ if (ret) {
+ dev_err(&mfg->pdev->dev,
+ "timed out waiting for EB to power off, val=0x%08X\n",
+ val);
+ }
+
+ return ret;
+}
+
+static int mtk_mfg_send_ipi(struct mtk_mfg *mfg, struct mtk_mfg_ipi_msg *msg)
+{
+ int ret;
+
+ msg->magic = mfg->ipi_magic;
+
+ ret = mbox_send_message(mfg->gf_mbox->ch, msg);
+ if (ret < 0) {
+ dev_err(&mfg->pdev->dev, "error from mailbox subsystem: %pe\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+
+ wait_for_completion(&mfg->gf_mbox->rx_done);
+
+ memcpy(msg, mfg->gf_mbox->rx_data, sizeof(*msg));
+
+ if (msg->u.return_value < 0) {
+ dev_err(&mfg->pdev->dev, "IPI return: %d\n",
+ msg->u.return_value);
+ return -EPROTO;
+ }
+
+ return 0;
+}
+
+static int mtk_mfg_init_shared_mem(struct mtk_mfg *mfg)
+{
+ struct device *dev = &mfg->pdev->dev;
+ struct mtk_mfg_ipi_msg msg = {};
+ int ret;
+
+ dev_info(dev, "clearing GPUEB sram, 0x%X bytes\n", mfg->sram_size);
+ memset_io(mfg->sram, 0, mfg->sram_size);
+
+ msg.cmd = CMD_INIT_SHARED_MEM;
+ msg.u.shared_mem.base = mfg->sram_phys;
+ msg.u.shared_mem.size = mfg->sram_size;
+
+ ret = mtk_mfg_send_ipi(mfg, &msg);
+ if (ret)
+ return ret;
+
+ if (readl(mfg->sram) != 0xBABADADA) {
+ dev_err(dev, "EB did not initialise SRAM correctly\n");
+ return -EIO;
+ }
+
+ dev_info(dev, "initialised mem at phys 0x%016llX\n", mfg->sram_phys);
+
+ return 0;
+}
+
+static int mtk_mfg_power_control(struct mtk_mfg *mfg, bool enabled)
+{
+ struct mtk_mfg_ipi_msg msg = {};
+
+ msg.cmd = CMD_POWER_CONTROL;
+ msg.u.power_state = enabled ? 1 : 0;
+
+ return mtk_mfg_send_ipi(mfg, &msg);
+}
+
+static int mtk_mfg_set_oppidx(struct mtk_mfg *mfg, int opp_idx)
+{
+ struct mtk_mfg_ipi_msg msg = {};
+ int ret;
+
+ if (!mfg->gpu_opps || !mfg->stack_opps)
+ return 0;
+
+ if (opp_idx >= mfg->num_opps)
+ return -EINVAL;
+
+ if (mfg->last_opp == opp_idx)
+ return 0;
+
+ msg.cmd = CMD_FIX_TARGET_OPPIDX;
+ msg.u.oppidx = opp_idx;
+
+ ret = mtk_mfg_send_ipi(mfg, &msg);
+ if (ret)
+ return ret;
+
+ mfg->last_opp = opp_idx;
+
+ return 0;
+}
+
+static int mtk_mfg_read_opp_tables(struct mtk_mfg *mfg)
+{
+ struct device *dev = &mfg->pdev->dev;
+ struct mtk_mfg_opp_entry e = {};
+ unsigned int i;
+ unsigned long long last_freq;
+
+ mfg->num_opps = readl(mfg->sram + GF_REG_GPU_OPP_NUM);
+ if (mfg->num_opps != readl(mfg->sram + GF_REG_STK_OPP_NUM)) {
+ dev_err(dev, "OPP count of GPU and Stack differ, %u vs. %u\n",
+ mfg->num_opps, readl(mfg->sram + GF_REG_STK_OPP_NUM));
+ return -EINVAL;
+ }
+
+ if (mfg->num_opps > MAX_OPP_NUM || mfg->num_opps == 0) {
+ dev_err(dev, "OPP count (%u) out of range %u >= count > 0\n",
+ mfg->num_opps, MAX_OPP_NUM);
+ return -EINVAL;
+ }
+
+ mfg->gpu_opps = devm_kcalloc(dev, mfg->num_opps,
+ sizeof(struct dev_pm_opp_data), GFP_KERNEL);
+ if (!mfg->gpu_opps)
+ return -ENOMEM;
+
+ mfg->stack_opps = devm_kcalloc(dev, mfg->num_opps,
+ sizeof(struct dev_pm_opp_data), GFP_KERNEL);
+ if (!mfg->stack_opps)
+ return -ENOMEM;
+
+ for (i = 0; i < mfg->num_opps; i++) {
+ memcpy_fromio(&e, mfg->sram + GF_REG_OPP_TABLE_GPU + i * sizeof(e),
+ sizeof(e));
+ if (mem_is_zero(&e, sizeof(e))) {
+ dev_err(dev, "ran into an empty GPU OPP at index %u\n",
+ i);
+ return -EINVAL;
+ }
+ mfg->gpu_opps[i].freq = e.freq_khz * 1000ULL;
+ mfg->gpu_opps[i].u_volt = e.voltage_core * 10;
+
+ if (!last_freq || mfg->gpu_opps[i].freq != last_freq)
+ mfg->num_unique_gpu_opps++;
+
+ last_freq = mfg->gpu_opps[i].freq;
+ }
+
+ /* Why do I even bother? */
+ for (i = 0; i < mfg->num_opps; i++) {
+ memcpy_fromio(&e, mfg->sram + GF_REG_OPP_TABLE_STK + i * sizeof(e),
+ sizeof(e));
+ if (mem_is_zero(&e, sizeof(e))) {
+ dev_err(dev, "ran into an empty GPU OPP at index %u\n",
+ i);
+ return -EINVAL;
+ }
+ mfg->stack_opps[i].freq = e.freq_khz * 1000ULL;
+ mfg->stack_opps[i].u_volt = e.voltage_core * 10;
+ }
+
+ return 0;
+}
+
+static inline bool mtk_mfg_opp_idx_match(struct mtk_mfg *mfg, int idx,
+ unsigned long rate)
+{
+ if ((idx == mfg->num_opps - 1) && mfg->gpu_opps[idx].freq >= rate)
+ return true;
+
+ if (mfg->gpu_opps[idx].freq >= rate && mfg->gpu_opps[idx + 1].freq < rate)
+ return true;
+
+ return false;
+}
+
+/**
+ * mtk_mfg_get_closest_opp_idx - find OPP index for desired GPU frequency
+ * @mfg: pointer to a &struct mtk_mfg driver instance
+ * @gpu_rate: desired rate of the GPU core in Hz
+ *
+ * Given a desired target frequency for the GPU core, find the index of a
+ * matching OPP, or the next higher OPP if no exact match is found, or the
+ * maximum OPP for frequencies exceeding the maximum OPP's frequency.
+ *
+ * For duplicate OPP entries, chooses the highest OPP index, as in, the one
+ * closest to the next lower frequency OPP.
+ *
+ * Returns -EINVAL on error, or the OPP index on success.
+ */
+static int mtk_mfg_get_closest_opp_idx(struct mtk_mfg *mfg, unsigned long gpu_rate)
+{
+ int l = 0;
+ int r = mfg->num_opps - 1;
+ int m;
+
+ if (!mfg->gpu_opps)
+ return -EINVAL;
+
+ if (mfg->gpu_opps[0].freq <= gpu_rate)
+ return 0;
+
+ while (l <= r) {
+ m = l + (r - l) / 2;
+ if (mtk_mfg_opp_idx_match(mfg, m, gpu_rate)) {
+ return m;
+ } else if (mfg->gpu_opps[m].freq >= gpu_rate) /* >= for dupes */
+ l = m + 1;
+ else
+ r = m - 1;
+ }
+
+ return -EINVAL;
+}
+
+static int mtk_mfg_devfreq_target(struct device *dev, unsigned long *freq,
+ u32 flags)
+{
+ struct panthor_device *ptdev = dev_get_drvdata(dev);
+ struct panthor_devfreq *pdevfreq = ptdev->devfreq;
+ struct mtk_mfg *mfg = dev_get_drvdata(&pdevfreq->devfreq->dev);
+ int ret, opp;
+
+ opp = mtk_mfg_get_closest_opp_idx(mfg, *freq);
+ if (opp < 0) {
+ dev_err(dev, "Couldn't get closest OPP: %pe\n", ERR_PTR(opp));
+ return opp;
+ }
+
+ ret = pm_runtime_resume_and_get(&mfg->pdev->dev);
+ if (ret)
+ return ret;
+
+ ret = mtk_mfg_set_oppidx(mfg, opp);
+ if (!ret)
+ *freq = mtk_mfg_read_gpu_freq(mfg);
+ else
+ dev_err(dev, "Couldn't set OPP: %pe\n", ERR_PTR(ret));
+
+ pm_runtime_put(&mfg->pdev->dev);
+
+ return ret;
+}
+
+static int mtk_mfg_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
+{
+ struct panthor_device *ptdev = dev_get_drvdata(dev);
+ struct panthor_devfreq *pdevfreq = ptdev->devfreq;
+ struct mtk_mfg *mfg;
+ int ret;
+
+ if (IS_ERR_OR_NULL(pdevfreq->devfreq))
+ return -ENODEV;
+
+ mfg = dev_get_drvdata(&pdevfreq->devfreq->dev);
+ if (IS_ERR_OR_NULL(mfg)) {
+ *freq = pdevfreq->devfreq->profile->initial_freq;
+ return 0;
+ }
+
+ ret = pm_runtime_resume_and_get(&mfg->pdev->dev);
+ if (ret)
+ return ret;
+
+ *freq = mtk_mfg_read_gpu_freq(mfg);
+
+ pm_runtime_put(&mfg->pdev->dev);
+
+ return 0;
+}
+
+static const struct devfreq_dev_profile mtk_mfg_devfreq_profile = {
+ .timer = DEVFREQ_TIMER_DELAYED,
+ .polling_ms = 50, /* ~3 frames */
+ .target = mtk_mfg_devfreq_target,
+ .get_dev_status = panthor_devfreq_get_dev_status,
+ .get_cur_freq = mtk_mfg_devfreq_get_cur_freq,
+};
+
+static const char *const mtk_mfg_mt8196_clk_names[] = {
+ "mfgpll",
+ "mfgpll_sc0",
+ "mfgpll_sc1",
+};
+
+static const char *const mtk_mfg_mt8196_regulators[] = {
+ "core",
+ "stack",
+ "sram",
+};
+
+static int mtk_mfg_mt8196_init(struct mtk_mfg *mfg)
+{
+ void __iomem *e2_base;
+
+ e2_base = devm_platform_ioremap_resource_byname(mfg->pdev, "e2_id");
+ if (IS_ERR(e2_base))
+ return dev_err_probe(&mfg->pdev->dev, PTR_ERR(e2_base),
+ "Couldn't get e2_id register\n");
+
+ if (readl(e2_base) == MFG_MT8196_E2_ID)
+ mfg->ghpm_en_reg = RPC_DUMMY_REG_2;
+ else
+ mfg->ghpm_en_reg = RPC_GHPM_CFG0_CON;
+
+ return 0;
+};
+
+static const struct mtk_mfg_variant mtk_mfg_mt8196_variant = {
+ .clk_names = mtk_mfg_mt8196_clk_names,
+ .num_clks = ARRAY_SIZE(mtk_mfg_mt8196_clk_names),
+ .regulator_names = mtk_mfg_mt8196_regulators,
+ .num_regulators = ARRAY_SIZE(mtk_mfg_mt8196_regulators),
+ .init = mtk_mfg_mt8196_init,
+};
+
+static int mtk_mfg_init_devfreq(struct panthor_device *ptdev,
+ struct device *gpufreq_dev)
+{
+ struct device *gpu = ptdev->base.dev;
+ struct panthor_devfreq *pdevfreq;
+ struct mtk_mfg *mfg;
+ struct devfreq_dev_profile *profile;
+ struct dev_pm_opp *opp;
+ int j = 0;
+ int i, ret;
+
+ mfg = dev_get_drvdata(gpufreq_dev);
+
+ if (!mfg)
+ return -EPROBE_DEFER;
+
+ if (!mfg->gpu_opps)
+ return -EPROBE_DEFER;
+
+ pdevfreq = drmm_kzalloc(&ptdev->base, sizeof(*ptdev->devfreq), GFP_KERNEL);
+ if (!pdevfreq)
+ return -ENOMEM;
+
+ ptdev->devfreq = pdevfreq;
+ mfg->ptdev = ptdev;
+
+ spin_lock_init(&pdevfreq->lock);
+
+ profile = devm_kmemdup(gpu, &mtk_mfg_devfreq_profile, sizeof(*profile),
+ GFP_KERNEL);
+
+ if (!profile)
+ return -ENOMEM;
+
+ ret = pm_runtime_resume_and_get(&mfg->pdev->dev);
+ if (ret) {
+ dev_err(gpufreq_dev, "could not resume runtime pm: %pe\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+ profile->initial_freq = mtk_mfg_read_gpu_freq(mfg);
+ pm_runtime_put(&mfg->pdev->dev);
+
+ profile->freq_table = devm_kcalloc(gpu, mfg->num_unique_gpu_opps,
+ sizeof(*profile->freq_table),
+ GFP_KERNEL);
+ if (!profile->freq_table)
+ return -ENOMEM;
+
+ profile->max_state = mfg->num_unique_gpu_opps;
+
+ for (i = mfg->num_opps - 1; i >= 0 && j < profile->max_state; i--) {
+ if ((j == 0) || (mfg->gpu_opps[i].freq != profile->freq_table[j - 1])) {
+ profile->freq_table[j] = mfg->gpu_opps[i].freq;
+ ret = dev_pm_opp_add_dynamic(gpu, &mfg->gpu_opps[i]);
+ if (ret) {
+ dev_err(gpu, "couldn't add OPP %d: %pe\n",
+ i, ERR_PTR(ret));
+ return ret;
+ }
+ j++;
+ }
+ }
+
+ opp = devfreq_recommended_opp(gpu, &profile->initial_freq, 0);
+ if (IS_ERR(opp)) {
+ dev_err(gpufreq_dev, "failed getting OPP recommendation: %pe\n",
+ opp);
+ return PTR_ERR(opp);
+ }
+
+ ret = dev_pm_opp_set_opp(gpu, opp);
+ dev_pm_opp_put(opp);
+ if (ret) {
+ dev_err(gpu, "Couldn't set recommended OPP\n");
+ return ret;
+ }
+
+ ptdev->fast_rate = profile->freq_table[profile->max_state - 1];
+
+ pdevfreq->time_last_update = ktime_get();
+ pdevfreq->gov_data.upthreshold = 45;
+ pdevfreq->gov_data.downdifferential = 5;
+
+ pdevfreq->devfreq = devm_devfreq_add_device(ptdev->base.dev, profile,
+ DEVFREQ_GOV_SIMPLE_ONDEMAND,
+ &pdevfreq->gov_data);
+ if (IS_ERR(pdevfreq->devfreq)) {
+ dev_err(gpu, "Failed to register devfreq device: %pe\n",
+ pdevfreq->devfreq);
+ return PTR_ERR(pdevfreq->devfreq);
+ }
+
+ /*
+ * There are some ways devfreq_add_device can fail without setting an
+ * error pointer, but setting a NULL pointer instead. We really don't
+ * want to return 0 in the above check, so return -EINVAL here.
+ */
+ if (!pdevfreq->devfreq) {
+ dev_err(gpu, "Failed to register devfreq device\n");
+ return -EINVAL;
+ }
+
+ dev_set_drvdata(&pdevfreq->devfreq->dev, mfg);
+
+ return 0;
+}
+
+static void mtk_mfg_mbox_rx_callback(struct mbox_client *cl, void *mssg)
+{
+ struct mtk_mfg_mbox *mb = container_of(cl, struct mtk_mfg_mbox, cl);
+
+ mb->rx_data = mssg;
+ complete(&mb->rx_done);
+}
+
+static int mtk_mfg_init_mbox(struct mtk_mfg *mfg)
+{
+ struct device *dev = &mfg->pdev->dev;
+ struct mtk_mfg_mbox *gf;
+ struct mtk_mfg_mbox *slp;
+
+ gf = devm_kzalloc(dev, sizeof(*gf), GFP_KERNEL);
+ if (!gf)
+ return -ENOMEM;
+
+ slp = devm_kzalloc(dev, sizeof(*slp), GFP_KERNEL);
+ if (!slp)
+ return -ENOMEM;
+
+ gf->mfg = mfg;
+ init_completion(&gf->rx_done);
+ gf->cl.dev = dev;
+ gf->cl.rx_callback = mtk_mfg_mbox_rx_callback;
+ gf->cl.tx_tout = GPUEB_TIMEOUT_US / USEC_PER_MSEC;
+ gf->ch = mbox_request_channel_byname(&gf->cl, "gpufreq");
+ if (IS_ERR(gf->ch))
+ return PTR_ERR(gf->ch);
+
+ mfg->gf_mbox = gf;
+
+ slp->mfg = mfg;
+ init_completion(&slp->rx_done);
+ slp->cl.dev = dev;
+ slp->cl.tx_tout = GPUEB_TIMEOUT_US / USEC_PER_MSEC;
+ slp->cl.tx_block = true;
+ slp->ch = mbox_request_channel_byname(&slp->cl, "sleep");
+ if (IS_ERR(slp->ch))
+ return PTR_ERR(slp->ch);
+
+ mfg->slp_mbox = slp;
+
+ return 0;
+}
+
+static const struct of_device_id mtk_mfg_of_match[] = {
+ { .compatible = "mediatek,mt8196-gpufreq", .data = &mtk_mfg_mt8196_variant },
+ {}
+};
+MODULE_DEVICE_TABLE(of, mtk_mfg_of_match);
+
+static int mtk_mfg_probe(struct platform_device *pdev)
+{
+ struct device_node *shmem __free(device_node);
+ struct panthor_devfreq_provider *prov;
+ struct mtk_mfg *mfg;
+ struct device *dev = &pdev->dev;
+ const struct mtk_mfg_variant *data = of_device_get_match_data(dev);
+ struct resource res;
+ int ret;
+ int i;
+
+ mfg = devm_kzalloc(dev, sizeof(*mfg), GFP_KERNEL);
+ if (!mfg)
+ return -ENOMEM;
+
+ prov = devm_kzalloc(dev, sizeof(*prov), GFP_KERNEL);
+ if (!prov)
+ return -ENOMEM;
+
+ prov->dev = dev;
+ prov->init = mtk_mfg_init_devfreq;
+
+ mfg->pdev = pdev;
+ mfg->variant = data;
+
+ dev_set_drvdata(dev, mfg);
+
+ mfg->gpr = devm_platform_ioremap_resource_byname(pdev, "gpr");
+ if (IS_ERR(mfg->gpr))
+ return dev_err_probe(dev, PTR_ERR(mfg->gpr),
+ "Could not retrieve GPR MMIO registers\n");
+
+ mfg->rpc = devm_platform_ioremap_resource_byname(pdev, "rpc");
+ if (IS_ERR(mfg->rpc))
+ return dev_err_probe(dev, PTR_ERR(mfg->rpc),
+ "Could not retrieve RPC MMIO registers\n");
+
+ mfg->clk_eb = devm_clk_get(dev, "eb");
+ if (IS_ERR(mfg->clk_eb))
+ return dev_err_probe(dev, PTR_ERR(mfg->clk_eb),
+ "Could not get 'eb' clock\n");
+
+ mfg->gpu_clks = devm_kcalloc(dev, data->num_clks, sizeof(*mfg->gpu_clks),
+ GFP_KERNEL);
+ if (!mfg->gpu_clks)
+ return -ENOMEM;
+
+ for (i = 0; i < data->num_clks; i++)
+ mfg->gpu_clks[i].id = data->clk_names[i];
+
+ ret = devm_clk_bulk_get(dev, data->num_clks, mfg->gpu_clks);
+ if (ret)
+ return dev_err_probe(dev, ret, "couldn't get GPU clocks\n");
+
+ mfg->gpu_regs = devm_kcalloc(dev, data->num_regulators,
+ sizeof(*mfg->gpu_regs), GFP_KERNEL);
+ if (!mfg->gpu_regs)
+ return -ENOMEM;
+
+ for (i = 0; i < data->num_regulators; i++)
+ mfg->gpu_regs[i].supply = data->regulator_names[i];
+
+ ret = devm_regulator_bulk_get(dev, data->num_regulators, mfg->gpu_regs);
+ if (ret)
+ return dev_err_probe(dev, ret, "couldn't get GPU regulators\n");
+
+ shmem = of_parse_phandle(dev->of_node, "shmem", 0);
+ if (!shmem)
+ return dev_err_probe(dev, -ENODEV, "Could not get 'shmem'\n");
+
+ ret = of_address_to_resource(shmem, 0, &res);
+ if (ret)
+ return dev_err_probe(dev, ret,
+ "failed to get GPUEB shared memory\n");
+
+ mfg->sram = devm_ioremap(dev, res.start, resource_size(&res));
+ if (!mfg->sram)
+ return dev_err_probe(dev, -EADDRNOTAVAIL,
+ "failed to ioremap GPUEB sram\n");
+ mfg->sram_size = resource_size(&res);
+ mfg->sram_phys = res.start;
+
+ if (data->init) {
+ ret = data->init(mfg);
+ if (ret)
+ return dev_err_probe(dev, ret, "Variant init failed\n");
+ }
+
+ ret = clk_prepare_enable(mfg->clk_eb);
+ if (ret)
+ return dev_err_probe(dev, ret, "failed to turn on EB clock\n");
+ mfg->ipi_magic = readl(mfg->gpr + GPR_IPI_MAGIC);
+ writel(0x0, mfg->gpr + GPR_IPI_MAGIC); /* why. */
+
+ ret = mtk_mfg_init_mbox(mfg);
+ if (ret) {
+ ret = dev_err_probe(dev, ret, "Couldn't initialise mailbox\n");
+ goto out;
+ }
+
+ clk_disable_unprepare(mfg->clk_eb);
+
+ mfg->last_opp = -1;
+
+ devm_pm_runtime_enable(dev);
+
+ ret = pm_runtime_resume_and_get(dev);
+ if (ret)
+ return dev_err_probe(dev, ret, "Failed to power on MFG\n");
+
+ ret = mtk_mfg_init_shared_mem(mfg);
+ if (ret) {
+ dev_err(dev, "Couldn't initialize EB SRAM: %pe\n", ERR_PTR(ret));
+ goto out;
+ }
+
+ ret = mtk_mfg_read_opp_tables(mfg);
+ if (ret) {
+ dev_err(dev, "Error reading OPP tables from EB: %pe\n",
+ ERR_PTR(ret));
+ goto out;
+ }
+
+ panthor_devfreq_register_provider(prov);
+
+out:
+ pm_runtime_put(dev);
+ return ret;
+}
+
+static int mtk_mfg_suspend(struct device *dev)
+{
+ struct mtk_mfg *mfg = dev_get_drvdata(dev);
+ int ret;
+
+ ret = mtk_mfg_power_control(mfg, false);
+ if (ret) {
+ dev_err(dev, "power_control failed: %pe\n", ERR_PTR(ret));
+ return ret;
+ }
+ ret = mtk_mfg_eb_off(mfg);
+ if (ret) {
+ dev_err(dev, "eb_off failed: %pe\n", ERR_PTR(ret));
+ return ret;
+ }
+ clk_bulk_disable_unprepare(mfg->variant->num_clks, mfg->gpu_clks);
+ clk_disable_unprepare(mfg->clk_eb);
+ ret = regulator_bulk_disable(mfg->variant->num_regulators, mfg->gpu_regs);
+
+ return ret;
+}
+
+static int mtk_mfg_resume(struct device *dev)
+{
+ struct mtk_mfg *mfg = dev_get_drvdata(dev);
+ int ret;
+
+ ret = regulator_bulk_enable(mfg->variant->num_regulators,
+ mfg->gpu_regs);
+ if (ret)
+ return ret;
+
+ ret = clk_prepare_enable(mfg->clk_eb);
+ if (ret)
+ goto err_disable_regulators;
+
+ ret = clk_bulk_prepare_enable(mfg->variant->num_clks, mfg->gpu_clks);
+ if (ret)
+ goto err_disable_eb_clk;
+
+ ret = mtk_mfg_eb_on(mfg);
+ if (ret)
+ goto err_disable_clks;
+
+ ret = mtk_mfg_power_control(mfg, true);
+ if (ret)
+ goto err_eb_off;
+
+ return 0;
+
+err_eb_off:
+ mtk_mfg_eb_off(mfg);
+err_disable_clks:
+ clk_bulk_disable_unprepare(mfg->variant->num_clks, mfg->gpu_clks);
+err_disable_eb_clk:
+ clk_disable_unprepare(mfg->clk_eb);
+err_disable_regulators:
+ regulator_bulk_disable(mfg->variant->num_regulators, mfg->gpu_regs);
+
+ return ret;
+}
+
+
+static DEFINE_RUNTIME_DEV_PM_OPS(mtk_mfg_pm_ops,
+ mtk_mfg_suspend,
+ mtk_mfg_resume,
+ NULL);
+
+static struct platform_driver mtk_mfg_driver = {
+ .driver = {
+ .name = "panthor-mtk-mfg",
+ .of_match_table = mtk_mfg_of_match,
+ .pm = pm_ptr(&mtk_mfg_pm_ops),
+ },
+ .probe = mtk_mfg_probe,
+};
+module_platform_driver(mtk_mfg_driver);
+
+MODULE_SOFTDEP("pre: panthor");
+MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
+MODULE_DESCRIPTION("MediaTek MFlexGraphics Extension for Panthor");
+MODULE_LICENSE("GPL");
--
2.51.0
^ permalink raw reply related [flat|nested] 30+ messages in thread