* [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
@ 2026-05-27 11:53 Zhan Wei
2026-05-27 12:17 ` sashiko-bot
2026-05-27 13:53 ` Raag Jadav
0 siblings, 2 replies; 5+ messages in thread
From: Zhan Wei @ 2026-05-27 11:53 UTC (permalink / raw)
To: Matthew Brost, Thomas Hellström, Rodrigo Vivi
Cc: Raag Jadav, Andi Shyti, David Airlie, Simona Vetter,
Guenter Roeck, intel-xe, dri-devel, linux-hwmon, linux-kernel,
Zhan Wei
xe_hwmon_pcode_read_fan_control() currently hardcodes *uval = 2 when
queried with FSC_READ_NUM_FANS on DG2. This causes fan2_input to be
exposed via sysfs, but on the tested Arc A750 LE (DG2 G10, PCI ID
0x56a1) fan2_input reads 0 RPM permanently while fan1_input correctly
reports ~800 RPM with both physical fan physically spinning.
The RPM is calculated delta-based from a tach pulse counter:
rotations = (reg_val - fi->reg_val_prev) / 2;
so a constant-zero RPM means the register at offset 0x138170
(BMG_FAN_2_SPEED) simply does not accumulate pulses on DG2 silicon.
The i915 driver does not expose fan2 on DG2 at all -- it only maps
PCU_PWM_FAN_SPEED (0x138140, identical to BMG_FAN_1_SPEED), consistent
with the observation that only one fan tach register is wired on DG2.
Report a single fan for DG2 to keep the phantom fan2_input out of
sysfs. Battlemage paths are unchanged.
Tested on Arc A750 LE (DG2 G10): with this patch applied, fan2_input
no longer appears in /sys/class/hwmon/hwmonX/ and `sensors xe-pci-0300`
shows fan1 only.
Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")
Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
---
Open questions for reviewers: this is verified only on DG2 G10. Owners
of G11 (e.g. ASRock Challenger A750) and G12 (e.g. Sparkle Titan A750
with three physical fans) -- does fan2_input or fan3_input ever read
non-zero in your setup? If so, the right fix is a per-subplatform
table rather than a flat 1.
drivers/gpu/drm/xe/xe_hwmon.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index de3f2aeffc3f..2a60a76b1971 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -860,9 +860,15 @@ static int xe_hwmon_pcode_read_fan_control(const struct xe_hwmon *hwmon, u32 sub
{
struct xe_tile *root_tile = xe_device_get_root_tile(hwmon->xe);
- /* Platforms that don't return correct value */
+ /*
+ * The PCODE FAN_SPEED_CONTROL subcommands return an error on DG2, so we
+ * answer the FSC_READ_NUM_FANS query here. DG2 only wires a single fan
+ * tachometer register (BMG_FAN_1_SPEED == 0x138140, shared with i915's
+ * PCU_PWM_FAN_SPEED); BMG_FAN_2/3_SPEED read 0 on DG2 silicon. Reporting
+ * one fan keeps a phantom fan2_input that always reads 0 out of sysfs.
+ */
if (hwmon->xe->info.platform == XE_DG2 && subcmd == FSC_READ_NUM_FANS) {
- *uval = 2;
+ *uval = 1;
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
@ 2026-05-27 12:17 ` sashiko-bot
2026-05-27 13:53 ` Raag Jadav
1 sibling, 0 replies; 5+ messages in thread
From: sashiko-bot @ 2026-05-27 12:17 UTC (permalink / raw)
To: Zhan Wei; +Cc: linux-hwmon
> xe_hwmon_pcode_read_fan_control() currently hardcodes *uval = 2 when
> queried with FSC_READ_NUM_FANS on DG2. This causes fan2_input to be
> exposed via sysfs, but on the tested Arc A750 LE (DG2 G10, PCI ID
> 0x56a1) fan2_input reads 0 RPM permanently while fan1_input correctly
> reports ~800 RPM with both physical fan physically spinning.
>
> The RPM is calculated delta-based from a tach pulse counter:
>
> rotations = (reg_val - fi->reg_val_prev) / 2;
>
> so a constant-zero RPM means the register at offset 0x138170
> (BMG_FAN_2_SPEED) simply does not accumulate pulses on DG2 silicon.
> The i915 driver does not expose fan2 on DG2 at all -- it only maps
> PCU_PWM_FAN_SPEED (0x138140, identical to BMG_FAN_1_SPEED), consistent
> with the observation that only one fan tach register is wired on DG2.
>
> Report a single fan for DG2 to keep the phantom fan2_input out of
> sysfs. Battlemage paths are unchanged.
>
> Tested on Arc A750 LE (DG2 G10): with this patch applied, fan2_input
> no longer appears in /sys/class/hwmon/hwmonX/ and `sensors xe-pci-0300`
> shows fan1 only.
>
> Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")
> Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
Sashiko has reviewed this patch and found no issues. It looks great!
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260527115311.13398-1-zhanwei919@gmail.com?part=1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
2026-05-27 12:17 ` sashiko-bot
@ 2026-05-27 13:53 ` Raag Jadav
2026-05-27 15:18 ` 占wei
1 sibling, 1 reply; 5+ messages in thread
From: Raag Jadav @ 2026-05-27 13:53 UTC (permalink / raw)
To: Zhan Wei
Cc: Matthew Brost, Thomas Hellström, Rodrigo Vivi, Andi Shyti,
David Airlie, Simona Vetter, Guenter Roeck, intel-xe, dri-devel,
linux-hwmon, linux-kernel
On Wed, May 27, 2026 at 07:53:11PM +0800, Zhan Wei wrote:
> xe_hwmon_pcode_read_fan_control() currently hardcodes *uval = 2 when
> queried with FSC_READ_NUM_FANS on DG2. This causes fan2_input to be
> exposed via sysfs, but on the tested Arc A750 LE (DG2 G10, PCI ID
> 0x56a1) fan2_input reads 0 RPM permanently while fan1_input correctly
> reports ~800 RPM with both physical fan physically spinning.
>
> The RPM is calculated delta-based from a tach pulse counter:
>
> rotations = (reg_val - fi->reg_val_prev) / 2;
>
> so a constant-zero RPM means the register at offset 0x138170
> (BMG_FAN_2_SPEED) simply does not accumulate pulses on DG2 silicon.
> The i915 driver does not expose fan2 on DG2 at all -- it only maps
> PCU_PWM_FAN_SPEED (0x138140, identical to BMG_FAN_1_SPEED), consistent
> with the observation that only one fan tach register is wired on DG2.
i915 is for legacy cards (like DG1) which only has a single channel
in hardware. I just happen to extend the support to DG2 for the folks
that might be using it.
> Report a single fan for DG2 to keep the phantom fan2_input out of
> sysfs. Battlemage paths are unchanged.
>
> Tested on Arc A750 LE (DG2 G10): with this patch applied, fan2_input
> no longer appears in /sys/class/hwmon/hwmonX/ and `sensors xe-pci-0300`
> shows fan1 only.
>
> Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")
> Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
> ---
> Open questions for reviewers: this is verified only on DG2 G10. Owners
> of G11 (e.g. ASRock Challenger A750) and G12 (e.g. Sparkle Titan A750
> with three physical fans) -- does fan2_input or fan3_input ever read
> non-zero in your setup? If so, the right fix is a per-subplatform
> table rather than a flat 1.
There's no straight answer here :)
root@DUT2147DG2FRD:/home/gta# cat /sys/class/drm/card0/device/device
0x56a1
root@DUT2147DG2FRD:/home/gta# sensors xe-pci-0300
xe-pci-0300
Adapter: PCI adapter
pkg: 758.00 mV
fan1: 636 RPM
fan2: 652 RPM
pkg: +47.0°C
vram: +50.0°C
pkg: N/A (max = 190.00 W)
pkg: 14.37 kJ
The way this works is upto the OEMs how they design their cards. Some reuse
a single channel for multiple physical fans while some use 1:1 mapped multiple
channels for each fan.
This is unfortunately not possible to figure out from the driver without
FSC_READ_NUM_FANS command (which has been found to be not working on some
cards and hence the hardcoded value).
Raag
> drivers/gpu/drm/xe/xe_hwmon.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
> index de3f2aeffc3f..2a60a76b1971 100644
> --- a/drivers/gpu/drm/xe/xe_hwmon.c
> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
> @@ -860,9 +860,15 @@ static int xe_hwmon_pcode_read_fan_control(const struct xe_hwmon *hwmon, u32 sub
> {
> struct xe_tile *root_tile = xe_device_get_root_tile(hwmon->xe);
>
> - /* Platforms that don't return correct value */
> + /*
> + * The PCODE FAN_SPEED_CONTROL subcommands return an error on DG2, so we
> + * answer the FSC_READ_NUM_FANS query here. DG2 only wires a single fan
> + * tachometer register (BMG_FAN_1_SPEED == 0x138140, shared with i915's
> + * PCU_PWM_FAN_SPEED); BMG_FAN_2/3_SPEED read 0 on DG2 silicon. Reporting
> + * one fan keeps a phantom fan2_input that always reads 0 out of sysfs.
> + */
> if (hwmon->xe->info.platform == XE_DG2 && subcmd == FSC_READ_NUM_FANS) {
> - *uval = 2;
> + *uval = 1;
> return 0;
> }
>
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
2026-05-27 13:53 ` Raag Jadav
@ 2026-05-27 15:18 ` 占wei
2026-05-28 16:49 ` Raag Jadav
0 siblings, 1 reply; 5+ messages in thread
From: 占wei @ 2026-05-27 15:18 UTC (permalink / raw)
To: Raag Jadav
Cc: Matthew Brost, Thomas Hellström, Rodrigo Vivi, Andi Shyti,
David Airlie, Simona Vetter, Guenter Roeck, intel-xe, dri-devel,
linux-hwmon, linux-kernel
Thanks for the detailed explanation -- that make sense
I can think of two paths forward:
1) Have fan_input_read() return -ENODATA if one channel has started
reporting pulses but another remains silent for, say, 30 seconds.
This way the phantom entry still appears in sysfs but userspace
tools like `sensors` can handle the "no data" case gracefully
instead of showing a misleading 0 RPM.
2) Drop the code change entirely and instead add a short note in
Documentation/gpu/xe/xe_hwmon.rst explaining that on DG2 boards
where the OEM routes multiple physical fans through a shared tach
line, fan{2,3}_input may read 0, so future contributors don't end
up re-attempting the same v1 patch I just sent.
What do you think?
Raag Jadav <raag.jadav@intel.com> 于2026年5月27日周三 21:53写道:
>
> On Wed, May 27, 2026 at 07:53:11PM +0800, Zhan Wei wrote:
> > xe_hwmon_pcode_read_fan_control() currently hardcodes *uval = 2 when
> > queried with FSC_READ_NUM_FANS on DG2. This causes fan2_input to be
> > exposed via sysfs, but on the tested Arc A750 LE (DG2 G10, PCI ID
> > 0x56a1) fan2_input reads 0 RPM permanently while fan1_input correctly
> > reports ~800 RPM with both physical fan physically spinning.
> >
> > The RPM is calculated delta-based from a tach pulse counter:
> >
> > rotations = (reg_val - fi->reg_val_prev) / 2;
> >
> > so a constant-zero RPM means the register at offset 0x138170
> > (BMG_FAN_2_SPEED) simply does not accumulate pulses on DG2 silicon.
> > The i915 driver does not expose fan2 on DG2 at all -- it only maps
> > PCU_PWM_FAN_SPEED (0x138140, identical to BMG_FAN_1_SPEED), consistent
> > with the observation that only one fan tach register is wired on DG2.
>
> i915 is for legacy cards (like DG1) which only has a single channel
> in hardware. I just happen to extend the support to DG2 for the folks
> that might be using it.
>
> > Report a single fan for DG2 to keep the phantom fan2_input out of
> > sysfs. Battlemage paths are unchanged.
> >
> > Tested on Arc A750 LE (DG2 G10): with this patch applied, fan2_input
> > no longer appears in /sys/class/hwmon/hwmonX/ and `sensors xe-pci-0300`
> > shows fan1 only.
> >
> > Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")
> > Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
> > ---
> > Open questions for reviewers: this is verified only on DG2 G10. Owners
> > of G11 (e.g. ASRock Challenger A750) and G12 (e.g. Sparkle Titan A750
> > with three physical fans) -- does fan2_input or fan3_input ever read
> > non-zero in your setup? If so, the right fix is a per-subplatform
> > table rather than a flat 1.
>
> There's no straight answer here :)
>
> root@DUT2147DG2FRD:/home/gta# cat /sys/class/drm/card0/device/device
> 0x56a1
>
> root@DUT2147DG2FRD:/home/gta# sensors xe-pci-0300
> xe-pci-0300
> Adapter: PCI adapter
> pkg: 758.00 mV
> fan1: 636 RPM
> fan2: 652 RPM
> pkg: +47.0°C
> vram: +50.0°C
> pkg: N/A (max = 190.00 W)
> pkg: 14.37 kJ
>
>
> The way this works is upto the OEMs how they design their cards. Some reuse
> a single channel for multiple physical fans while some use 1:1 mapped multiple
> channels for each fan.
>
> This is unfortunately not possible to figure out from the driver without
> FSC_READ_NUM_FANS command (which has been found to be not working on some
> cards and hence the hardcoded value).
>
> Raag
>
> > drivers/gpu/drm/xe/xe_hwmon.c | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
> > index de3f2aeffc3f..2a60a76b1971 100644
> > --- a/drivers/gpu/drm/xe/xe_hwmon.c
> > +++ b/drivers/gpu/drm/xe/xe_hwmon.c
> > @@ -860,9 +860,15 @@ static int xe_hwmon_pcode_read_fan_control(const struct xe_hwmon *hwmon, u32 sub
> > {
> > struct xe_tile *root_tile = xe_device_get_root_tile(hwmon->xe);
> >
> > - /* Platforms that don't return correct value */
> > + /*
> > + * The PCODE FAN_SPEED_CONTROL subcommands return an error on DG2, so we
> > + * answer the FSC_READ_NUM_FANS query here. DG2 only wires a single fan
> > + * tachometer register (BMG_FAN_1_SPEED == 0x138140, shared with i915's
> > + * PCU_PWM_FAN_SPEED); BMG_FAN_2/3_SPEED read 0 on DG2 silicon. Reporting
> > + * one fan keeps a phantom fan2_input that always reads 0 out of sysfs.
> > + */
> > if (hwmon->xe->info.platform == XE_DG2 && subcmd == FSC_READ_NUM_FANS) {
> > - *uval = 2;
> > + *uval = 1;
> > return 0;
> > }
> >
> > --
> > 2.43.0
> >
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
2026-05-27 15:18 ` 占wei
@ 2026-05-28 16:49 ` Raag Jadav
0 siblings, 0 replies; 5+ messages in thread
From: Raag Jadav @ 2026-05-28 16:49 UTC (permalink / raw)
To: 占wei
Cc: Matthew Brost, Thomas Hellström, Rodrigo Vivi, Andi Shyti,
David Airlie, Simona Vetter, Guenter Roeck, intel-xe, dri-devel,
linux-hwmon, linux-kernel
On Wed, May 27, 2026 at 11:18:52PM +0800, 占wei wrote:
> Thanks for the detailed explanation -- that make sense
>
> I can think of two paths forward:
>
> 1) Have fan_input_read() return -ENODATA if one channel has started
> reporting pulses but another remains silent for, say, 30 seconds.
> This way the phantom entry still appears in sysfs but userspace
> tools like `sensors` can handle the "no data" case gracefully
> instead of showing a misleading 0 RPM.
Sounds a bit over engineered solution with its own caveats because
a) We assume that both channels are monitored simultaneously and first
channel actually reports non-zero value for 30 seconds (or whatever
trivial value we device) continuously, which is not guaranteed.
b) This means the output of one channel depends on another and I'm
doubtful if maintainers would be okay with such hacks.
> 2) Drop the code change entirely and instead add a short note in
> Documentation/gpu/xe/xe_hwmon.rst explaining that on DG2 boards
> where the OEM routes multiple physical fans through a shared tach
> line, fan{2,3}_input may read 0, so future contributors don't end
> up re-attempting the same v1 patch I just sent.
This one makes more sense to me though.
Raag
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-05-28 16:49 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
2026-05-27 12:17 ` sashiko-bot
2026-05-27 13:53 ` Raag Jadav
2026-05-27 15:18 ` 占wei
2026-05-28 16:49 ` Raag Jadav
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox