* [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
@ 2026-05-27 11:53 Zhan Wei
  2026-05-27 12:17 ` sashiko-bot
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Zhan Wei @ 2026-05-27 11:53 UTC (permalink / raw)
  To: Matthew Brost, Thomas Hellström, Rodrigo Vivi
  Cc: Raag Jadav, Andi Shyti, David Airlie, Simona Vetter,
	Guenter Roeck, intel-xe, dri-devel, linux-hwmon, linux-kernel,
	Zhan Wei

xe_hwmon_pcode_read_fan_control() currently hardcodes *uval = 2 when
queried with FSC_READ_NUM_FANS on DG2. This causes fan2_input to be
exposed via sysfs, but on the tested Arc A750 LE (DG2 G10, PCI ID
0x56a1) fan2_input reads 0 RPM permanently while fan1_input correctly
reports ~800 RPM with both physical fans spinning.

The RPM is calculated delta-based from a tach pulse counter:

    rotations = (reg_val - fi->reg_val_prev) / 2;

so a constant-zero RPM means the register at offset 0x138170
(BMG_FAN_2_SPEED) simply does not accumulate pulses on DG2 silicon.
The i915 driver does not expose fan2 on DG2 at all -- it only maps
PCU_PWM_FAN_SPEED (0x138140, identical to BMG_FAN_1_SPEED), consistent
with the observation that only one fan tach register is wired on DG2.
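
As a rough illustration of that calculation, here is a minimal userspace
sketch in C. It is not the driver's code: struct fan_sample,
fan_rpm_from_samples() and the millisecond timestamps are hypothetical
names picked for this example, and rounding the result up is an
assumption. Only the divide-by-two step comes from the line quoted above.

```c
#include <stdint.h>

#define PULSES_PER_ROTATION 2u      /* scale used in the quoted line */
#define MSEC_PER_MINUTE     60000u

/* Hypothetical snapshot of one tach channel (not a driver structure). */
struct fan_sample {
    uint32_t pulses;   /* accumulated tach pulse count */
    uint64_t time_ms;  /* time the count was read, in milliseconds */
};

/*
 * Compute RPM from two consecutive samples.
 * Returns 0 and stores the result in *rpm, or -1 if no time has elapsed.
 */
static int fan_rpm_from_samples(const struct fan_sample *prev,
                                const struct fan_sample *now,
                                uint64_t *rpm)
{
    uint64_t elapsed_ms = now->time_ms - prev->time_ms;
    uint32_t rotations;

    if (elapsed_ms == 0)
        return -1;

    /* Unsigned subtraction stays correct across a 32-bit counter wrap. */
    rotations = (uint32_t)(now->pulses - prev->pulses) / PULSES_PER_ROTATION;

    /* rotations per minute, rounded up (assumption) */
    *rpm = ((uint64_t)rotations * MSEC_PER_MINUTE + elapsed_ms - 1) / elapsed_ms;
    return 0;
}
```

With this sketch, 40 pulses counted over one second work out to 20
rotations, i.e. 1200 RPM, while a counter that never advances gives 0
rotations and therefore 0 RPM regardless of the interval.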

Report a single fan for DG2 to keep the phantom fan2_input out of
sysfs.  Battlemage paths are unchanged.

Tested on Arc A750 LE (DG2 G10): with this patch applied, fan2_input
no longer appears in /sys/class/hwmon/hwmonX/ and `sensors xe-pci-0300`
shows fan1 only.

Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")
Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
---
Open questions for reviewers: this is verified only on DG2 G10. Owners
of G11 (e.g. ASRock Challenger A750) and G12 (e.g. Sparkle Titan A750
with three physical fans) -- does fan2_input or fan3_input ever read
non-zero in your setup? If so, the right fix is a per-subplatform
table rather than a flat 1.

 drivers/gpu/drm/xe/xe_hwmon.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index de3f2aeffc3f..2a60a76b1971 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -860,9 +860,15 @@ static int xe_hwmon_pcode_read_fan_control(const struct xe_hwmon *hwmon, u32 sub
 {
 	struct xe_tile *root_tile = xe_device_get_root_tile(hwmon->xe);
 
-	/* Platforms that don't return correct value */
+	/*
+	 * The PCODE FAN_SPEED_CONTROL subcommands return an error on DG2, so we
+	 * answer the FSC_READ_NUM_FANS query here. DG2 only wires a single fan
+	 * tachometer register (BMG_FAN_1_SPEED == 0x138140, shared with i915's
+	 * PCU_PWM_FAN_SPEED); BMG_FAN_2/3_SPEED read 0 on DG2 silicon. Reporting
+	 * one fan keeps a phantom fan2_input that always reads 0 out of sysfs.
+	 */
 	if (hwmon->xe->info.platform == XE_DG2 && subcmd == FSC_READ_NUM_FANS) {
-		*uval = 2;
+		*uval = 1;
 		return 0;
 	}
 
-- 
2.43.0



* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
  2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
@ 2026-05-27 12:17 ` sashiko-bot
  2026-05-27 13:53 ` Raag Jadav
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-05-27 12:17 UTC (permalink / raw)
  To: Zhan Wei; +Cc: linux-hwmon


Sashiko has reviewed this patch and found no issues. It looks great!

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260527115311.13398-1-zhanwei919@gmail.com?part=1



* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
  2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
  2026-05-27 12:17 ` sashiko-bot
@ 2026-05-27 13:53 ` Raag Jadav
  2026-05-27 15:18   ` 占wei
  2026-05-28 14:37 ` ✗ LGCI.VerificationFailed: failure for " Patchwork
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Raag Jadav @ 2026-05-27 13:53 UTC (permalink / raw)
  To: Zhan Wei
  Cc: Matthew Brost, Thomas Hellström, Rodrigo Vivi, Andi Shyti,
	David Airlie, Simona Vetter, Guenter Roeck, intel-xe, dri-devel,
	linux-hwmon, linux-kernel

On Wed, May 27, 2026 at 07:53:11PM +0800, Zhan Wei wrote:
> xe_hwmon_pcode_read_fan_control() currently hardcodes *uval = 2 when
> queried with FSC_READ_NUM_FANS on DG2. This causes fan2_input to be
> exposed via sysfs, but on the tested Arc A750 LE (DG2 G10, PCI ID
> 0x56a1) fan2_input reads 0 RPM permanently while fan1_input correctly
> reports ~800 RPM with both physical fans spinning.
> 
> The RPM is calculated delta-based from a tach pulse counter:
> 
>     rotations = (reg_val - fi->reg_val_prev) / 2;
> 
> so a constant-zero RPM means the register at offset 0x138170
> (BMG_FAN_2_SPEED) simply does not accumulate pulses on DG2 silicon.
> The i915 driver does not expose fan2 on DG2 at all -- it only maps
> PCU_PWM_FAN_SPEED (0x138140, identical to BMG_FAN_1_SPEED), consistent
> with the observation that only one fan tach register is wired on DG2.

i915 is for legacy cards (like DG1), which only have a single channel
in hardware. I just happened to extend the support to DG2 for folks
who might be using it.

> Report a single fan for DG2 to keep the phantom fan2_input out of
> sysfs.  Battlemage paths are unchanged.
> 
> Tested on Arc A750 LE (DG2 G10): with this patch applied, fan2_input
> no longer appears in /sys/class/hwmon/hwmonX/ and `sensors xe-pci-0300`
> shows fan1 only.
> 
> Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")
> Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
> ---
> Open questions for reviewers: this is verified only on DG2 G10. Owners
> of G11 (e.g. ASRock Challenger A750) and G12 (e.g. Sparkle Titan A750
> with three physical fans) -- does fan2_input or fan3_input ever read
> non-zero in your setup? If so, the right fix is a per-subplatform
> table rather than a flat 1.

There's no straight answer here :)

root@DUT2147DG2FRD:/home/gta# cat /sys/class/drm/card0/device/device
0x56a1

root@DUT2147DG2FRD:/home/gta# sensors xe-pci-0300
xe-pci-0300
Adapter: PCI adapter
pkg:         758.00 mV
fan1:         636 RPM
fan2:         652 RPM
pkg:          +47.0°C
vram:         +50.0°C
pkg:              N/A  (max = 190.00 W)
pkg:          14.37 kJ


How this works is up to the OEMs and how they design their cards. Some reuse
a single channel for multiple physical fans, while others use multiple channels
mapped 1:1 to each fan.

Unfortunately, this is not possible to figure out from the driver without the
FSC_READ_NUM_FANS command (which has been found not to work on some cards,
hence the hardcoded value).

Raag


* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
  2026-05-27 13:53 ` Raag Jadav
@ 2026-05-27 15:18   ` 占wei
  2026-05-28 16:49     ` Raag Jadav
  0 siblings, 1 reply; 14+ messages in thread
From: 占wei @ 2026-05-27 15:18 UTC (permalink / raw)
  To: Raag Jadav
  Cc: Matthew Brost, Thomas Hellström, Rodrigo Vivi, Andi Shyti,
	David Airlie, Simona Vetter, Guenter Roeck, intel-xe, dri-devel,
	linux-hwmon, linux-kernel

Thanks for the detailed explanation -- that makes sense.

I can think of two paths forward:

1) Have fan_input_read() return -ENODATA if one channel has started
 reporting pulses but another remains silent for, say, 30 seconds.
 This way the phantom entry still appears in sysfs but userspace
 tools like `sensors` can handle the "no data" case gracefully
 instead of showing a misleading 0 RPM.

2) Drop the code change entirely and instead add a short note in
 Documentation/gpu/xe/xe_hwmon.rst explaining that on DG2 boards
 where the OEM routes multiple physical fans through a shared tach
 line, fan{2,3}_input may read 0, so future contributors don't end
 up re-attempting the same v1 patch I just sent.
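
To make option 1 concrete, here is a minimal userspace sketch in C of the
heuristic. Everything in it is hypothetical and only illustrates the idea:
struct fan_channel, fan_channel_sample(), fan_channel_check() and
SILENT_TIMEOUT_MS are names invented for this example, not existing xe
driver code, and the 30 second value is just the figure suggested above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SILENT_TIMEOUT_MS 30000u  /* the "say, 30 seconds" from option 1 */

/* Hypothetical per-channel bookkeeping (not a driver structure). */
struct fan_channel {
    uint32_t last_pulses;     /* pulse count at the previous sample */
    bool     primed;          /* at least one sample has been taken */
    bool     seen_pulse;      /* the counter has advanced at least once */
    uint64_t first_pulse_ms;  /* when the counter first advanced */
};

/* Record a new pulse-counter reading for one channel. */
static void fan_channel_sample(struct fan_channel *ch, uint32_t pulses,
                               uint64_t now_ms)
{
    if (ch->primed && !ch->seen_pulse && pulses != ch->last_pulses) {
        ch->seen_pulse = true;
        ch->first_pulse_ms = now_ms;
    }
    ch->last_pulses = pulses;
    ch->primed = true;
}

/*
 * Returns -ENODATA when channel idx has never counted a pulse while some
 * other channel has been counting for at least SILENT_TIMEOUT_MS.
 * Returns 0 otherwise, meaning the reading can be reported as usual.
 */
static int fan_channel_check(const struct fan_channel *chans, size_t count,
                             size_t idx, uint64_t now_ms)
{
    size_t i;

    if (chans[idx].seen_pulse)
        return 0;

    for (i = 0; i < count; i++) {
        if (i == idx || !chans[i].seen_pulse)
            continue;
        if (now_ms - chans[i].first_pulse_ms >= SILENT_TIMEOUT_MS)
            return -ENODATA;
    }
    return 0;
}
```

Note that in this sketch the result for one channel depends on the state
of the others, and every channel has to be sampled regularly for the
timeout to mean anything.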


What do you think?



* ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two
  2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
  2026-05-27 12:17 ` sashiko-bot
  2026-05-27 13:53 ` Raag Jadav
@ 2026-05-28 14:37 ` Patchwork
  2026-05-29 13:50 ` [PATCH v2] drm/xe/hwmon: document DG2 fan speed reporting quirk Zhan Wei
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Patchwork @ 2026-05-28 14:37 UTC (permalink / raw)
  To: 占wei; +Cc: intel-xe

== Series Details ==

Series: drm/xe/hwmon: report a single fan for DG2 instead of two
URL   : https://patchwork.freedesktop.org/series/167465/
State : failure

== Summary ==

Address 'zhanwei919@gmail.com' is not on the allowlist, which prevents CI from being triggered for this patch.
If you want Intel GFX CI to accept this address, please contact the script maintainers at i915-ci-infra@lists.freedesktop.org.
Exception occurred during validation, bailing out!
Build URL: http://intel-gfx-ci-public.igk.intel.com:8080/job/xe_pw_trigger/1172273/ (on master)




* Re: [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two
  2026-05-27 15:18   ` 占wei
@ 2026-05-28 16:49     ` Raag Jadav
  0 siblings, 0 replies; 14+ messages in thread
From: Raag Jadav @ 2026-05-28 16:49 UTC (permalink / raw)
  To: 占wei
  Cc: Matthew Brost, Thomas Hellström, Rodrigo Vivi, Andi Shyti,
	David Airlie, Simona Vetter, Guenter Roeck, intel-xe, dri-devel,
	linux-hwmon, linux-kernel

On Wed, May 27, 2026 at 11:18:52PM +0800, 占wei wrote:
> Thanks for the detailed explanation -- that makes sense.
> 
> I can think of two paths forward:
> 
> 1) Have fan_input_read() return -ENODATA if one channel has started
>  reporting pulses but another remains silent for, say, 30 seconds.
>  This way the phantom entry still appears in sysfs but userspace
>  tools like `sensors` can handle the "no data" case gracefully
>  instead of showing a misleading 0 RPM.

Sounds like a somewhat over-engineered solution with its own caveats, because

a) We assume that both channels are monitored simultaneously and that the
   first channel actually reports a non-zero value continuously for 30
   seconds (or whatever value we decide on), which is not guaranteed.

b) This makes the output of one channel depend on another, and I doubt
   maintainers would be okay with such hacks.

> 2) Drop the code change entirely and instead add a short note in
>  Documentation/gpu/xe/xe_hwmon.rst explaining that on DG2 boards
>  where the OEM routes multiple physical fans through a shared tach
>  line, fan{2,3}_input may read 0, so future contributors don't end
>  up re-attempting the same v1 patch I just sent.

This one makes more sense to me though.

Raag


* [PATCH v2] drm/xe/hwmon: document DG2 fan speed reporting quirk
  2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
                   ` (2 preceding siblings ...)
  2026-05-28 14:37 ` ✗ LGCI.VerificationFailed: failure for " Patchwork
@ 2026-05-29 13:50 ` Zhan Wei
  2026-05-29 14:05   ` 占wei
  2026-06-01 15:25 ` ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev3) Patchwork
  2026-06-03 11:13 ` ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev4) Patchwork
  5 siblings, 1 reply; 14+ messages in thread
From: Zhan Wei @ 2026-05-29 13:50 UTC (permalink / raw)
  To: Matthew Brost, Thomas Hellström, Rodrigo Vivi
  Cc: Jonathan Corbet, Shuah Khan, intel-xe, dri-devel, linux-doc,
	linux-kernel, Zhan Wei

The number of fanN_input attributes on DG2 is hardcoded to two because
FSC_READ_NUM_FANS returns an incorrect value on some boards. How the
physical fans map onto the tach channels is left to the board vendor:
some OEMs route multiple physical fans through a single shared tach
line, in which case the unwired channel's pulse counter never
accumulates and fanN_input reads a constant 0 RPM.

This is expected behaviour for such boards rather than a driver fault,
and the driver has no reliable way to distinguish a shared-tach layout
from a genuinely silent fan. Document this so the flat DG2 fan count is
not mistaken for a bug and "fixed" by lowering it, which would hide a
working fan2 on boards that do wire two tach lines.

Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
---
v1 -> v2: Drop the code change. As pointed out in review, the same PCI
  device ID ships with both shared-tach (multiple physical fans on one
  channel) and 1:1 fan wiring, and FSC_READ_NUM_FANS is unreliable on
  some boards, so the DG2 fan count cannot be lowered without hiding a
  working fan2 on boards that do wire two tach lines. Document the
  behaviour instead of changing the reported fan count.

v1: https://lore.kernel.org/intel-xe/20260527115311.13398-1-zhanwei919@gmail.com/

 Documentation/gpu/xe/index.rst    |  1 +
 Documentation/gpu/xe/xe_hwmon.rst | 48 +++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)
 create mode 100644 Documentation/gpu/xe/xe_hwmon.rst

diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
index 874ffcb6da3a..3c14cdcaa8a6 100644
--- a/Documentation/gpu/xe/index.rst
+++ b/Documentation/gpu/xe/index.rst
@@ -30,3 +30,4 @@ DG2, etc is provided to prototype the driver.
    xe-drm-usage-stats.rst
    xe_configfs
    xe_gt_stats
+   xe_hwmon
diff --git a/Documentation/gpu/xe/xe_hwmon.rst b/Documentation/gpu/xe/xe_hwmon.rst
new file mode 100644
index 000000000000..8cd48df59386
--- /dev/null
+++ b/Documentation/gpu/xe/xe_hwmon.rst
@@ -0,0 +1,48 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+=================
+Xe HWMON support
+=================
+
+The xe driver exposes hardware monitoring sensors (power, energy,
+temperature, voltage and fan speed) through the kernel hwmon subsystem,
+typically consumed via ``/sys/class/hwmon/hwmonX/`` or tools such as
+``sensors``.
+
+Fan speed reporting
+===================
+
+Fan speed (``fanN_input``) is reported in RPM and computed from a tach
+pulse counter: the driver reads an accumulating pulse register, divides
+the delta between two subsequent readings by two pulses per rotation,
+and time-averages the result.
+
+Number of fan channels
+-----------------------
+
+The number of ``fanN_input`` attributes exposed in sysfs is the fan
+count returned by the ``FSC_READ_NUM_FANS`` pcode command. On DG2 this
+command has been found to return an incorrect value on some boards, so
+the driver hardcodes a fan count of two there. As a result, both
+``fan1_input`` and ``fan2_input`` are always exposed on DG2 regardless
+of how many tach lines are actually wired.
+
+Zero RPM on DG2 is not necessarily a bug
+----------------------------------------
+
+How physical fans map onto the tach channels is left to the board
+vendor. Some OEMs route several physical fans through a single shared
+tach line, while others wire each fan to its own channel 1:1. The
+driver has no reliable way to tell these layouts apart, and the same PCI
+device ID can ship in either configuration.
+
+When a channel has no tach line driving it, its pulse counter never
+accumulates, so the corresponding ``fanN_input`` reads a constant 0 RPM.
+On DG2 this is most often seen on ``fan2_input`` for boards that drive
+both physical fans from a single tach line. This is expected behaviour
+for such boards, not a driver fault, and reflects the board wiring
+rather than a missing or stalled fan.
+
+For this reason the fan count on DG2 is intentionally left at a flat
+value rather than tracked per board: there is no driver-visible signal
+that distinguishes a shared-tach layout from a genuinely silent fan.
-- 
2.43.0



* Re: [PATCH v2] drm/xe/hwmon: document DG2 fan speed reporting quirk
  2026-05-29 13:50 ` [PATCH v2] drm/xe/hwmon: document DG2 fan speed reporting quirk Zhan Wei
@ 2026-05-29 14:05   ` 占wei
  2026-05-29 16:12     ` Raag Jadav
  0 siblings, 1 reply; 14+ messages in thread
From: 占wei @ 2026-05-29 14:05 UTC (permalink / raw)
  To: Matthew Brost, Thomas Hellström, Rodrigo Vivi
  Cc: Jonathan Corbet, Shuah Khan, intel-xe, dri-devel, linux-doc,
	linux-kernel, Raag Jadav

+Cc Raag, who authored the fan support and reviewed v1.

Thanks for your help. This v2 drops the code change and documents the
DG2 shared-tach behaviour instead, per your feedback on v1.


* Re: [PATCH v2] drm/xe/hwmon: document DG2 fan speed reporting quirk
  2026-05-29 14:05   ` 占wei
@ 2026-05-29 16:12     ` Raag Jadav
  2026-05-29 17:24       ` [PATCH v3] " Zhan Wei
  0 siblings, 1 reply; 14+ messages in thread
From: Raag Jadav @ 2026-05-29 16:12 UTC (permalink / raw)
  To: 占wei
  Cc: Matthew Brost, Thomas Hellström, Rodrigo Vivi,
	Jonathan Corbet, Shuah Khan, intel-xe, dri-devel, linux-doc,
	linux-kernel

On Fri, May 29, 2026 at 10:05:58PM +0800, 占wei wrote:
> +Cc Raag, who authored the fan support and reviewed v1.
> 
> Thanks for your help, this v2 drops the code change and documents the
> DG2 shared-tach behaviour instead, per your feedback on v1.

IMO it's a bit verbose to have a dedicated doc for this. Just add a small
comment in the existing ABI doc[1] under the fan channel description.

[1] Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon

Raag

> Zhan Wei <zhanwei919@gmail.com> 于2026年5月29日周五 21:50写道:
> >
> > The number of fanN_input attributes on DG2 is hardcoded to two because
> > FSC_READ_NUM_FANS returns an incorrect value on some boards. How the
> > physical fans map onto the tach channels is left to the board vendor:
> > some OEMs route multiple physical fans through a single shared tach
> > line, in which case the unwired channel's pulse counter never
> > accumulates and fanN_input reads a constant 0 RPM.
> >
> > This is expected behaviour for such boards rather than a driver fault,
> > and the driver has no reliable way to distinguish a shared-tach layout
> > from a genuinely silent fan. Document this so the flat DG2 fan count is
> > not mistaken for a bug and "fixed" by lowering it, which would hide a
> > working fan2 on boards that do wire two tach lines.
> >
> > Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
> > ---
> > v1 -> v2: Drop the code change. As pointed out in review, the same PCI
> >   device ID ships with both shared-tach (multiple physical fans on one
> >   channel) and 1:1 fan wiring, and FSC_READ_NUM_FANS is unreliable on
> >   some boards, so the DG2 fan count cannot be lowered without hiding a
> >   working fan2 on boards that do wire two tach lines. Document the
> >   behaviour instead of changing the reported fan count.
> >
> > v1: https://lore.kernel.org/intel-xe/20260527115311.13398-1-zhanwei919@gmail.com/
> >
> >  Documentation/gpu/xe/index.rst    |  1 +
> >  Documentation/gpu/xe/xe_hwmon.rst | 48 +++++++++++++++++++++++++++++++
> >  2 files changed, 49 insertions(+)
> >  create mode 100644 Documentation/gpu/xe/xe_hwmon.rst
> >
> > diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
> > index 874ffcb6da3a..3c14cdcaa8a6 100644
> > --- a/Documentation/gpu/xe/index.rst
> > +++ b/Documentation/gpu/xe/index.rst
> > @@ -30,3 +30,4 @@ DG2, etc is provided to prototype the driver.
> >     xe-drm-usage-stats.rst
> >     xe_configfs
> >     xe_gt_stats
> > +   xe_hwmon
> > diff --git a/Documentation/gpu/xe/xe_hwmon.rst b/Documentation/gpu/xe/xe_hwmon.rst
> > new file mode 100644
> > index 000000000000..8cd48df59386
> > --- /dev/null
> > +++ b/Documentation/gpu/xe/xe_hwmon.rst
> > @@ -0,0 +1,48 @@
> > +.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
> > +
> > +=================
> > +Xe HWMON support
> > +=================
> > +
> > +The xe driver exposes hardware monitoring sensors (power, energy,
> > +temperature, voltage and fan speed) through the kernel hwmon subsystem,
> > +typically consumed via ``/sys/class/hwmon/hwmonX/`` or tools such as
> > +``sensors``.
> > +
> > +Fan speed reporting
> > +===================
> > +
> > +Fan speed (``fanN_input``) is reported in RPM and computed from a tach
> > +pulse counter: the driver reads an accumulating pulse register, divides
> > +the delta between two subsequent readings by two pulses per rotation,
> > +and time-averages the result.
> > +
> > +Number of fan channels
> > +-----------------------
> > +
> > +The number of ``fanN_input`` attributes exposed in sysfs is the fan
> > +count returned by the ``FSC_READ_NUM_FANS`` pcode command. On DG2 this
> > +command has been found to return an incorrect value on some boards, so
> > +the driver hardcodes a fan count of two there. As a result up to
> > +``fan1_input`` and ``fan2_input`` are always exposed on DG2 regardless
> > +of how many tach lines are actually wired.
> > +
> > +Zero RPM on DG2 is not necessarily a bug
> > +----------------------------------------
> > +
> > +How physical fans map onto the tach channels is left to the board
> > +vendor. Some OEMs route several physical fans through a single shared
> > +tach line, while others wire each fan to its own channel 1:1. The
> > +driver has no reliable way to tell these layouts apart, and the same PCI
> > +device ID can ship in either configuration.
> > +
> > +When a channel has no tach line driving it, its pulse counter never
> > +accumulates, so the corresponding ``fanN_input`` reads a constant 0 RPM.
> > +On DG2 this is most often seen on ``fan2_input`` for boards that drive
> > +both physical fans from a single tach line. This is expected behaviour
> > +for such boards, not a driver fault, and reflects the board wiring
> > +rather than a missing or stalled fan.
> > +
> > +For this reason the fan count on DG2 is intentionally left at a flat
> > +value rather than tracked per board: there is no driver-visible signal
> > +that distinguishes a shared-tach layout from a genuinely silent fan.
> > --
> > 2.43.0
> >
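
As a rough illustration of the calculation described in the quoted
documentation above, here is a minimal userspace sketch in C. It is not
the driver's code: the function name, the 32-bit counter width and the
millisecond time base are assumptions made for the example. Only the
delta between two readings and the divisor of two pulses per rotation
come from the text.

```c
#include <stdint.h>

/* From the text: two tach pulses per fan rotation. */
#define PULSES_PER_ROTATION 2u
#define MSEC_PER_MINUTE     60000u

/*
 * Hypothetical helper (not a kernel function): turn two samples of an
 * accumulating pulse counter into RPM. The 32-bit counter width is an
 * assumption for this sketch.
 */
static uint32_t fan_rpm_from_counter(uint32_t prev, uint32_t now,
				     uint32_t elapsed_ms)
{
	uint32_t pulses, rotations;

	if (elapsed_ms == 0)
		return 0; /* no interval to average over */

	/* Unsigned subtraction stays correct across counter wraparound. */
	pulses = now - prev;
	rotations = pulses / PULSES_PER_ROTATION;

	/* Scale rotations per elapsed_ms up to rotations per minute. */
	return (uint32_t)(((uint64_t)rotations * MSEC_PER_MINUTE) / elapsed_ms);
}
```

Under these assumptions, 80 pulses counted over 3000 ms come out at
800 RPM, while a counter that never advances yields 0 RPM no matter how
long the interval is, which is the symptom reported for fan2_input.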


* [PATCH v3] drm/xe/hwmon: document DG2 fan speed reporting quirk
  2026-05-29 16:12     ` Raag Jadav
@ 2026-05-29 17:24       ` Zhan Wei
  2026-05-30  7:12         ` Raag Jadav
  0 siblings, 1 reply; 14+ messages in thread
From: Zhan Wei @ 2026-05-29 17:24 UTC (permalink / raw)
  To: matthew.brost, thomas.hellstrom, rodrigo.vivi
  Cc: raag.jadav, corbet, skhan, intel-xe, dri-devel, linux-doc,
	linux-kernel, Zhan Wei

On DG2 the driver always shows two fan channels, because the
FSC_READ_NUM_FANS command does not work on some cards. OEMs decide how
the fans map to tach channels, so two fans can share one tach line.
When that happens, the second channel reads 0 RPM even though the fan
is spinning.

Note this on the fan2_input ABI entry so the steady 0 RPM is not
mistaken for a driver bug.

Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
---
v3:
- Drop the dedicated Documentation/gpu/xe/xe_hwmon.rst doc and the
  index.rst hunk; add a short note under the fan2_input entry in the
  existing ABI doc instead, per Raag's feedback.
v2: https://lore.kernel.org/intel-xe/20260529135028.20763-1-zhanwei919@gmail.com/
- Drop the code change that reported a single fan on DG2; document the
  shared-tach behaviour instead, per review feedback on v1.
v1: https://lore.kernel.org/intel-xe/20260527115311.13398-1-zhanwei919@gmail.com/

 Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
index 55ab45f669ac..0da739d9a816 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
@@ -251,6 +251,13 @@ Description:	RO. Fan 2 speed in RPM.
 
 		Only supported for particular Intel Xe graphics platforms.
 
+		On DG2 the driver always shows two fan channels, because the
+		FSC_READ_NUM_FANS command does not work on some cards. OEMs
+		decide how the fans map to tach channels, so two fans can share
+		one tach line. When that happens, the second channel
+		reads 0 RPM even though the fan is spinning. This is normal, not
+		a bug.
+
 What:		/sys/bus/pci/drivers/xe/.../hwmon/hwmon<i>/fan3_input
 Date:		March 2025
 KernelVersion:	6.16
-- 
2.43.0
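
For readers writing monitoring tools, the note in the patch above could
be applied roughly as in the following C sketch. Everything in it is
hypothetical: the enum, its labels and the function name are invented
for illustration and are not part of any kernel ABI. Note also that, as
discussed earlier in the thread, a shared-tach layout cannot be told
apart from a genuinely stopped second fan, so the middle case is only a
hint to soften an alert, not a diagnosis.

```c
/* Hypothetical labels for a monitoring tool; not part of any kernel ABI. */
enum fan2_reading {
	FAN2_OK,            /* fan2_input reports a non-zero speed */
	FAN2_MAYBE_SHARED,  /* fan2 reads 0 RPM while fan1 is spinning */
	FAN2_ALL_ZERO,      /* neither channel reports rotation */
};

/*
 * Classify a DG2 fan2_input value next to fan1_input. Both arguments
 * are RPM values as read from the hwmon sysfs files.
 */
static enum fan2_reading classify_fan2(long fan1_rpm, long fan2_rpm)
{
	if (fan2_rpm > 0)
		return FAN2_OK;

	/* Possibly two fans on one tach line; a stalled fan looks the same. */
	if (fan1_rpm > 0)
		return FAN2_MAYBE_SHARED;

	return FAN2_ALL_ZERO;
}
```

A tool might downgrade FAN2_MAYBE_SHARED from an error to an
informational message on DG2 and keep alerting on FAN2_ALL_ZERO.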



* Re: [PATCH v3] drm/xe/hwmon: document DG2 fan speed reporting quirk
  2026-05-29 17:24       ` [PATCH v3] " Zhan Wei
@ 2026-05-30  7:12         ` Raag Jadav
  2026-06-02 16:17           ` [PATCH v4] " Zhan Wei
  0 siblings, 1 reply; 14+ messages in thread
From: Raag Jadav @ 2026-05-30  7:12 UTC (permalink / raw)
  To: Zhan Wei
  Cc: matthew.brost, thomas.hellstrom, rodrigo.vivi, corbet, skhan,
	intel-xe, dri-devel, linux-doc, linux-kernel

On Sat, May 30, 2026 at 01:24:49AM +0800, Zhan Wei wrote:
> On DG2 the driver always shows two fan channels, because the
> FSC_READ_NUM_FANS command does not work on some cards. OEMs decide how
> the fans map to tach channels, so two fans can share one tach line.
> When that happens, the second channel reads 0 RPM even though the fan
> is spinning.
> 
> Note this on the fan2_input ABI entry so the steady 0 RPM is not
> mistaken for a driver bug.

Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")

> Signed-off-by: Zhan Wei <zhanwei919@gmail.com>

Reviewed-by: Raag Jadav <raag.jadav@intel.com>


* ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev3)
  2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
                   ` (3 preceding siblings ...)
  2026-05-29 13:50 ` [PATCH v2] drm/xe/hwmon: document DG2 fan speed reporting quirk Zhan Wei
@ 2026-06-01 15:25 ` Patchwork
  2026-06-03 11:13 ` ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev4) Patchwork
  5 siblings, 0 replies; 14+ messages in thread
From: Patchwork @ 2026-06-01 15:25 UTC (permalink / raw)
  To: Zhan Wei; +Cc: intel-xe

== Series Details ==

Series: drm/xe/hwmon: report a single fan for DG2 instead of two (rev3)
URL   : https://patchwork.freedesktop.org/series/167465/
State : failure

== Summary ==

Address 'zhanwei919@gmail.com' is not on the allowlist, which prevents CI from being triggered for this patch.
If you want Intel GFX CI to accept this address, please contact the script maintainers at i915-ci-infra@lists.freedesktop.org.
Exception occurred during validation, bailing out!
Build URL: http://intel-gfx-ci-public.igk.intel.com:8080/job/xe_pw_trigger/1177062/ (on master)




* [PATCH v4] drm/xe/hwmon: document DG2 fan speed reporting quirk
  2026-05-30  7:12         ` Raag Jadav
@ 2026-06-02 16:17           ` Zhan Wei
  0 siblings, 0 replies; 14+ messages in thread
From: Zhan Wei @ 2026-06-02 16:17 UTC (permalink / raw)
  To: matthew.brost, thomas.hellstrom, rodrigo.vivi
  Cc: raag.jadav, corbet, skhan, intel-xe, dri-devel, linux-doc,
	linux-kernel, Zhan Wei

On DG2 the driver always shows two fan channels, because the
FSC_READ_NUM_FANS command does not work on some cards. OEMs decide how
the fans map to tach channels, so two fans can share one tach line.
When that happens, the second channel reads 0 RPM even though the fan
is spinning.

Note this on the fan2_input ABI entry so the steady 0 RPM is not
mistaken for a driver bug.

Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed")
Signed-off-by: Zhan Wei <zhanwei919@gmail.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
---
v4:
- Add Fixes: tag and collect Reviewed-by from Raag.
v3: https://lore.kernel.org/intel-xe/20260529172449.41504-1-zhanwei919@gmail.com/
- Drop the dedicated Documentation/gpu/xe/xe_hwmon.rst doc and the
  index.rst hunk; add a short note under the fan2_input entry in the
  existing ABI doc instead, per Raag's feedback.
v2: https://lore.kernel.org/intel-xe/20260529135028.20763-1-zhanwei919@gmail.com/
- Drop the code change that reported a single fan on DG2; document the
  shared-tach behaviour instead, per review feedback on v1.
v1: https://lore.kernel.org/intel-xe/20260527115311.13398-1-zhanwei919@gmail.com/

 Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
index 55ab45f669ac..0da739d9a816 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
@@ -251,6 +251,13 @@ Description:	RO. Fan 2 speed in RPM.
 
 		Only supported for particular Intel Xe graphics platforms.
 
+		On DG2 the driver always shows two fan channels, because the
+		FSC_READ_NUM_FANS command does not work on some cards. OEMs
+		decide how the fans map to tach channels, so two fans can share
+		one tach line. When that happens, the second channel
+		reads 0 RPM even though the fan is spinning. This is normal, not
+		a bug.
+
 What:		/sys/bus/pci/drivers/xe/.../hwmon/hwmon<i>/fan3_input
 Date:		March 2025
 KernelVersion:	6.16
-- 
2.43.0



* ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev4)
  2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
                   ` (4 preceding siblings ...)
  2026-06-01 15:25 ` ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev3) Patchwork
@ 2026-06-03 11:13 ` Patchwork
  5 siblings, 0 replies; 14+ messages in thread
From: Patchwork @ 2026-06-03 11:13 UTC (permalink / raw)
  To: Zhan Wei; +Cc: intel-xe

== Series Details ==

Series: drm/xe/hwmon: report a single fan for DG2 instead of two (rev4)
URL   : https://patchwork.freedesktop.org/series/167465/
State : failure

== Summary ==

Address 'zhanwei919@gmail.com' is not on the allowlist, which prevents CI from being triggered for this patch.
If you want Intel GFX CI to accept this address, please contact the script maintainers at i915-ci-infra@lists.freedesktop.org.
Exception occurred during validation, bailing out!
Build URL: http://intel-gfx-ci-public.igk.intel.com:8080/job/xe_pw_trigger/1178476/ (on master)




Thread overview: 14+ messages
2026-05-27 11:53 [RFC PATCH] drm/xe/hwmon: report a single fan for DG2 instead of two Zhan Wei
2026-05-27 12:17 ` sashiko-bot
2026-05-27 13:53 ` Raag Jadav
2026-05-27 15:18   ` 占wei
2026-05-28 16:49     ` Raag Jadav
2026-05-28 14:37 ` ✗ LGCI.VerificationFailed: failure for " Patchwork
2026-05-29 13:50 ` [PATCH v2] drm/xe/hwmon: document DG2 fan speed reporting quirk Zhan Wei
2026-05-29 14:05   ` 占wei
2026-05-29 16:12     ` Raag Jadav
2026-05-29 17:24       ` [PATCH v3] " Zhan Wei
2026-05-30  7:12         ` Raag Jadav
2026-06-02 16:17           ` [PATCH v4] " Zhan Wei
2026-06-01 15:25 ` ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev3) Patchwork
2026-06-03 11:13 ` ✗ LGCI.VerificationFailed: failure for drm/xe/hwmon: report a single fan for DG2 instead of two (rev4) Patchwork