* Weird Dell SMM bug since 6.18
From: Jan Claußen @ 2026-03-13 12:39 UTC
To: linux-hwmon
Hi,
I've been experiencing a very weird bug since kernel 6.18 and had been
staying on 6.12 LTS because of it over the last months. I am using the
application coolercontrol to control the case fans on my old Dell
Precision 5810. Here is some background:
https://gitlab.com/coolercontrol/coolercontrol/-/work_items/557
To be clear, I am not 100% sure if this is a bug in the kernel or in the
application. I am not the only one experiencing it though, and the
maintainers of coolercontrol don't know what caused it either, so I am
hoping for some help/advice here.
The issue:
Everything was fine on 6.17 and when 6.18 was released coolercontrol
said it couldn't write the pwm attributes anymore. They were writable
using echo though. After downgrading to 6.17 everything was fine again.
I took the time to bisect the kernel from 6.17 to 6.18 and got the
following result:
c050daf69f3edf72e274eaa321f663b1779c4391 is the first bad commit
commit c050daf69f3edf72e274eaa321f663b1779c4391
Merge: 989253cc46ff 8f2689f194b8
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed Oct 1 10:33:17 2025 -0700
    Merge tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
    Pull pwm updates from Uwe Kleine-König:
     "The core highlights for this cycle are:
       - The pca9586 driver was converted to the waveform API
       - Waveform drivers automatically provide a gpio chip to make PWMs
         usable as GPIOs (The pca9586 driver did that in a driver specific
         implementation before)
      Otherwise it's the usual mix of fixes and device tree and driver
      changes to support new hardware variants"
    * tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: (30 commits)
      pwm: cros-ec: Avoid -Wflex-array-member-not-at-end warnings
      dt-bindings: pwm: samsung: add exynos8890 compatible
      dt-bindings: pwm: apple,s5l-fpwm: Add t6020-fpwm compatible
      dt-bindings: pwm: nxp,lpc1850-sct-pwm: Minor whitespace cleanup in example
      pwm: pca9586: Convert to waveform API
      pwm: pca9685: Drop GPIO support
      pwm: pca9685: Make use of register caching in regmap
      pwm: pca9685: Use bulk write to atomicially update registers
      pwm: pca9685: Don't disable hardware in .free()
      pwm: Add the S32G support in the Freescale FTM driver
      dt-bindings: pwm: fsl,vf610-ftm-pwm: Add compatible for s32g2 and s32g3
      pwm: mediatek: Lock and cache clock rate
      pwm: mediatek: Fix various issues in the .apply() callback
      pwm: mediatek: Implement .get_state() callback
      pwm: mediatek: Initialize clks when the hardware is enabled at probe time
      pwm: mediatek: Rework parameters for clk helper function
      pwm: mediatek: Introduce and use a few more register defines
      pwm: mediatek: Simplify representation of channel offsets
      pwm: tiecap: Document behaviour of hardware disable
      pwm: Provide a gpio device for waveform drivers
      ...
 Documentation/devicetree/bindings/pwm/apple,s5l-fpwm.yaml | 3 +-
 Documentation/devicetree/bindings/pwm/fsl,vf610-ftm-pwm.yaml | 11 ++--
 Documentation/devicetree/bindings/pwm/nxp,lpc1850-sct-pwm.yaml | 2 +-
 Documentation/devicetree/bindings/pwm/pwm-samsung.yaml | 1 +
 Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml | 7 ++-
 drivers/pwm/Kconfig | 9 ++++
 drivers/pwm/core.c | 108 +++++++++++++++++++++++++++++++++-----
 drivers/pwm/pwm-berlin.c | 4 +-
 drivers/pwm/pwm-cros-ec.c | 10 ++--
 drivers/pwm/pwm-fsl-ftm.c | 35 ++++++++++++-
 drivers/pwm/pwm-loongson.c | 2 +-
 drivers/pwm/pwm-mediatek.c | 308 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------
 drivers/pwm/pwm-pca9685.c | 515 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------------------------
 drivers/pwm/pwm-tiecap.c | 4 ++
 drivers/pwm/pwm-tiehrpwm.c | 154 ++++++++++++++++++++++--------------------------------
 include/linux/pwm.h | 3 ++
 16 files changed, 661 insertions(+), 515 deletions(-)
This looked like a plausible culprit since it is pwm-related, but it contains
nothing Dell-specific. One merge earlier, though, there was
1c1658058c99 hwmon: (dell-smm) Add support for automatic fan mode
which could be related. Since the pwm_enable attribute was introduced in
6.18, I suspect it has something to do with it.
Now the weird part:
git bisect start
# status: waiting for both good and bad commits
# good: [e5f0a698b34ed76002dc5cff3804a61c80233a7a] Linux 6.17
git bisect good e5f0a698b34ed76002dc5cff3804a61c80233a7a
# status: waiting for bad commit, 1 good commit known
# bad: [7d0a66e4bb9081d75c82ec4957c50034cb0ea449] Linux 6.18
git bisect bad 7d0a66e4bb9081d75c82ec4957c50034cb0ea449
# bad: [f79e772258df311c2cb21594ca0996318e720d28] Merge tag 'media/v6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect bad f79e772258df311c2cb21594ca0996318e720d28
# bad: [0f048c878ee32a4259dbf28e0ad8fd0b71ee0085] Merge tag 'soc-dt-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad 0f048c878ee32a4259dbf28e0ad8fd0b71ee0085
# bad: [c050daf69f3edf72e274eaa321f663b1779c4391] Merge tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
git bisect bad c050daf69f3edf72e274eaa321f663b1779c4391
# good: [a23cd25baed2316e50597f8b67192bdc904f955b] Merge tag 'sched_ext-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
git bisect good a23cd25baed2316e50597f8b67192bdc904f955b
# good: [4b81e2eb9e4db8f6094c077d0c8b27c264901c1b] Merge tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 4b81e2eb9e4db8f6094c077d0c8b27c264901c1b
# good: [ae28ed4578e6d5a481e39c5a9827f27048661fdd] Merge tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
git bisect good ae28ed4578e6d5a481e39c5a9827f27048661fdd
# good: [eb3289fc474f74105e0627bf508e3f9742fd3b63] Merge tag 'driver-core-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
git bisect good eb3289fc474f74105e0627bf508e3f9742fd3b63
# good: [eb3289fc474f74105e0627bf508e3f9742fd3b63] Merge tag 'driver-core-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
git bisect good eb3289fc474f74105e0627bf508e3f9742fd3b63
# good: [7e5969a4d3e794993c9ca8d4026cf31a34b32b30] dt-bindings: trivial-devices: Add sht2x sensors
git bisect good 7e5969a4d3e794993c9ca8d4026cf31a34b32b30
# good: [7e5969a4d3e794993c9ca8d4026cf31a34b32b30] dt-bindings: trivial-devices: Add sht2x sensors
git bisect good 7e5969a4d3e794993c9ca8d4026cf31a34b32b30
# good: [989253cc46ff3f4973495b58e02c7fcb1ffb713e] Merge tag 'hwmon-for-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
git bisect good 989253cc46ff3f4973495b58e02c7fcb1ffb713e
# good: [f43e1280731c2a6bbd2d9517fd6b726d6ebe6641] pwm: mediatek: Rework parameters for clk helper function
git bisect good f43e1280731c2a6bbd2d9517fd6b726d6ebe6641
# good: [de5855613263b426ee697dd30224322f2e634dec] pwm: pca9685: Use bulk write to atomicially update registers
git bisect good de5855613263b426ee697dd30224322f2e634dec
# good: [efedb508591e231b47b23ce6b353c81eeb3b9a84] dt-bindings: pwm: nxp,lpc1850-sct-pwm: Minor whitespace cleanup in example
git bisect good efedb508591e231b47b23ce6b353c81eeb3b9a84
# good: [ebd524a3ac3a172aa26f99d20d4d00d57da9a875] dt-bindings: pwm: samsung: add exynos8890 compatible
git bisect good ebd524a3ac3a172aa26f99d20d4d00d57da9a875
# good: [8f2689f194b8d1bff41150ae316abdfccf191309] pwm: cros-ec: Avoid -Wflex-array-member-not-at-end warnings
git bisect good 8f2689f194b8d1bff41150ae316abdfccf191309
# first bad commit: [c050daf69f3edf72e274eaa321f663b1779c4391] Merge tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
# good: [c050daf69f3edf72e274eaa321f663b1779c4391] Merge tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
git bisect good c050daf69f3edf72e274eaa321f663b1779c4391
The first time I hit pwm/for-6.18-rc1 it was bad; now it is suddenly
good. I am writing this from 6.19 and the issue is suddenly gone there
as well. The only way I can explain this is that some configuration has
changed. It might be that a jump from 6.17 to 6.18 causes the
configuration not to be applied correctly, but traversing the commits
step by step fixed it.
Does anyone here have a possible explanation for this? Since I am not
the only one affected, as stated in the coolercontrol issue, it might be
worth looking into this further.
Regards,
Jan
* Re: Weird Dell SMM bug since 6.18
From: Guenter Roeck @ 2026-03-13 16:43 UTC
To: Jan Claußen, linux-hwmon
On 3/13/26 05:39, Jan Claußen wrote:
> Hi,
>
> I've been experiencing a very weird bug since kernel 6.18 and had been staying on 6.12 LTS because of it over the last months. I am using the application coolercontrol to control the case fans on my old Dell Precision 5810. Here is some background https://gitlab.com/coolercontrol/coolercontrol/-/work_items/557
>
> To be clear, I am not 100% sure if this is a bug in the kernel or in the in the application. I am not the only one experiencing it though and the maintainers of coolercontrol don't know what caused it either, so I am hoping for some help/advice here.
>
> The issue:
>
> Everything was fine on 6.17 and when 6.18 was released coolercontrol said it couldn't write the pwm attributes anymore. They were writable using echo though. After downgrading to 6.17 everything was fine again. I took the time to bisect the kernel from 6.17 to 6.18 and got the following result:
>
> c050daf69f3edf72e274eaa321f663b1779c4391 is the first bad commit
> commit c050daf69f3edf72e274eaa321f663b1779c4391
> Merge: 989253cc46ff 8f2689f194b8
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Wed Oct 1 10:33:17 2025 -0700
>
> Merge tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
>
> Pull pwm updates from Uwe Kleine-König:
> "The core highlights for this cycle are:
>
> - The pca9586 driver was converted to the waveform API
>
> - Waveform drivers automatically provide a gpio chip to make PWMs
> usable as GPIOs (The pca9586 driver did that in a driver specific
> implementation before)
>
> Otherwise it's the usual mix of fixes and device tree and driver
> changes to support new hardware variants"
>
> * tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: (30 commits)
> pwm: cros-ec: Avoid -Wflex-array-member-not-at-end warnings
> dt-bindings: pwm: samsung: add exynos8890 compatible
> dt-bindings: pwm: apple,s5l-fpwm: Add t6020-fpwm compatible
> dt-bindings: pwm: nxp,lpc1850-sct-pwm: Minor whitespace cleanup in example
> pwm: pca9586: Convert to waveform API
> pwm: pca9685: Drop GPIO support
> pwm: pca9685: Make use of register caching in regmap
> pwm: pca9685: Use bulk write to atomicially update registers
> pwm: pca9685: Don't disable hardware in .free()
> pwm: Add the S32G support in the Freescale FTM driver
> dt-bindings: pwm: fsl,vf610-ftm-pwm: Add compatible for s32g2 and s32g3
> pwm: mediatek: Lock and cache clock rate
> pwm: mediatek: Fix various issues in the .apply() callback
> pwm: mediatek: Implement .get_state() callback
> pwm: mediatek: Initialize clks when the hardware is enabled at probe time
> pwm: mediatek: Rework parameters for clk helper function
> pwm: mediatek: Introduce and use a few more register defines
> pwm: mediatek: Simplify representation of channel offsets
> pwm: tiecap: Document behaviour of hardware disable
> pwm: Provide a gpio device for waveform drivers
> ...
>
> Documentation/devicetree/bindings/pwm/apple,s5l-fpwm.yaml | 3 +-
> Documentation/devicetree/bindings/pwm/fsl,vf610-ftm-pwm.yaml | 11 ++--
> Documentation/devicetree/bindings/pwm/nxp,lpc1850-sct-pwm.yaml | 2 +-
> Documentation/devicetree/bindings/pwm/pwm-samsung.yaml | 1 +
> Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml | 7 ++-
> drivers/pwm/Kconfig | 9 ++++
> drivers/pwm/core.c | 108 +++++++++++++++++++++++++++++++++-----
> drivers/pwm/pwm-berlin.c | 4 +-
> drivers/pwm/pwm-cros-ec.c | 10 ++--
> drivers/pwm/pwm-fsl-ftm.c | 35 ++++++++++++-
> drivers/pwm/pwm-loongson.c | 2 +-
> drivers/pwm/pwm-mediatek.c | 308 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------
> drivers/pwm/pwm-pca9685.c | 515 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------------------------
> drivers/pwm/pwm-tiecap.c | 4 ++
> drivers/pwm/pwm-tiehrpwm.c | 154 ++++++++++++++++++++++--------------------------------
> include/linux/pwm.h | 3 ++
> 16 files changed, 661 insertions(+), 515 deletions(-)
>
> This seemed like it could be it, as it's pwm-related, but nothing Dell-specific. One merge before though there was
>
> 1c1658058c99 hwmon: (dell-smm) Add support for automatic fan mode
>
> which could be related. Since the pwm_enable attribute was introduced in 6.18, I am suspecting it has something to do with it.
>
> Now the weird part:
>
I strongly suspect that the bisect is a red herring. I'd suggest reverting
commits fae00a7186cec and 1c1658058c99b to see if that makes a difference.
Thanks,
Guenter
* Re: Weird Dell SMM bug since 6.18
From: Jan Claußen @ 2026-03-13 19:06 UTC
To: Guenter Roeck, Jan Claußen, linux-hwmon
I was actually in a fixed state for some reason and everything worked. I then tried to get back to the broken state, and it seems that echoing 2 into the pwmX_enable endpoints followed by a reboot did the trick.
I then reverted the commits you suggested and it indeed fixed the issue. What to do now? Should coolercontrol adjust its code to this change, or should the commit be reverted upstream?
My two cents regarding cranking the fans up to max until the control software kicks in: I think medium fan speed should be sufficient. I would actually always default to BIOS control and make the control software in userspace responsible for setting pwmX_enable.
Regards,
Jan
* Re: Weird Dell SMM bug since 6.18
From: Armin Wolf @ 2026-03-13 23:10 UTC
To: Jan Claußen, Guenter Roeck, linux-hwmon
On 13.03.26 at 20:06, Jan Claußen wrote:
> I was actually in a fixed state for some reason and everything worked. I then tried to get back to the broken state and seems that echoing 2 into the pwmX_enable endpoints followed by reboot did the trick.
The developers of CoolerControl said that:
"As mentioned in my previous response above, CC reads the pwmX values first, and if that fails, then it doesn't bother
with checking the pwmX_enable attributes. So, according to the logs, it's never getting to the point of reading the
pwmX_enable files at all, ..."
Reading pwmX when the associated PWM channel is still in automatic control mode will return -ENODATA because
the driver cannot retrieve the current PWM value when the BIOS controls the fan. This likely causes CoolerControl
to ignore the dell-smm hwmon device.
There are two solutions for this:
1. CoolerControl does not ignore PWM channels that cannot be read when in automatic control mode (or adds some special handling for this driver).
2. The dell-smm-hwmon driver stops returning -ENODATA and instead returns a dummy value (like 0).
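As an illustration of option 1, here is a minimal userspace sketch in Python (not CoolerControl's actual code). It treats -ENODATA from pwmX as "value currently unknown" and still goes on to read pwmX_enable instead of discarding the channel. The function names, the returned dictionary and the injectable `read` parameter are assumptions made for this example.

```python
import errno
from pathlib import Path
from typing import Callable, Optional


def read_int(path: Path) -> int:
    """Read a single integer from a sysfs attribute file."""
    return int(path.read_text().strip())


def probe_pwm_channel(hwmon_dir: Path, channel: int,
                      read: Callable[[Path], int] = read_int) -> dict:
    """Probe pwm<channel> without giving up when the duty value is unreadable."""
    duty: Optional[int]
    try:
        duty = read(hwmon_dir / f"pwm{channel}")
    except OSError as exc:
        # ENODATA: the driver cannot report a value (e.g. BIOS controls the fan).
        # Any other error is still treated as a real failure.
        if exc.errno != errno.ENODATA:
            raise
        duty = None

    enable: Optional[int]
    try:
        enable = read(hwmon_dir / f"pwm{channel}_enable")
    except FileNotFoundError:
        # Not every driver exposes pwmX_enable.
        enable = None

    return {"channel": channel, "duty": duty, "enable": enable}
```

With this shape, a channel whose duty is None but whose enable value is 2 can be shown as "automatic, value not available" rather than being dropped.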
Guenter, would you be OK with the second approach? I get the feeling that this issue might affect more
fan control daemons.
Thanks,
Armin Wolf
>
> I then reverted the commits you suggested and it indeed fixed the issue. What to do now? Should coolercontrol adjust its code to this change or the commit be get reverted upstream?
>
> My two cents regarding cranking the fans up to max until the control software kicks in: I think medium fan speed should be sufficent. I actually would always default to BIOS control and make the control software in userspace responsible for setting pwmX_enable.
The problem is that on some devices, the "medium" setting is veeeeeery low, so the device might overheat should the fan control daemon somehow fail
to start. Normally this maximum speed is quickly overwritten by the fan control daemon after entering manual control mode, so this should not be a problem
in this case.
Thanks,
Armin Wolf
> Regards,
> Jan
>
* Re: Weird Dell SMM bug since 6.18
From: Guenter Roeck @ 2026-03-16 15:52 UTC
To: Armin Wolf; +Cc: Jan Claußen, linux-hwmon
On Sat, Mar 14, 2026 at 12:10:26AM +0100, Armin Wolf wrote:
> On 13.03.26 at 20:06, Jan Claußen wrote:
>
> > I was actually in a fixed state for some reason and everything worked. I then tried to get back to the broken state and seems that echoing 2 into the pwmX_enable endpoints followed by reboot did the trick.
>
> The developers of CoolerControl said that:
>
> "As mentioned in my previous response above, CC reads the pwmX values first, and if that fails, then it doesn't bother
> with checking the pwmX_enable attributes. So, according to the logs, it's never getting to the point of reading the
> pwmX_enable files at all, ..."
>
> Reading pwmX when the associated PWM channel is still in automatic control mode will return -ENODATA because
> the driver cannot retrieve the current PWM value when the BIOS controls the fan. This likely causes CoolerControl
> to ignore the dell-smm hwmon device.
>
> There a two solutions for this:
> 1. CoolerControl does not ignore PWM channels that cannot be read when in automatic control mode (or adds some special handling for this driver).
> 2. The dell-smm-hwmon driver stops returning -ENODATA and instead returns a dummy value (like 0).
>
> Guenter, would you be OK with the second approach? I get the feeling that this issue might affect more
> fan control daemons.
>
Not really. -ENODATA seems to be the correct response if the current pwm value is not readable.
Returning 0 or any other number would be misleading and trigger other problems (such as some
userspace code believing that it can write the value back with no impact, which would be worse).
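The write-back hazard can be made concrete with a small Python sketch (hypothetical helper names, not taken from any existing tool). A tool that saves the duty value at startup and restores it on exit would, given a dummy 0, later write that 0 back as if it were real; with -ENODATA it can tell that there is nothing valid to restore.

```python
import errno
from typing import Callable, Optional


def snapshot_pwm(read_pwm: Callable[[], int]) -> Optional[int]:
    """Remember the current duty value, or None if the driver reports -ENODATA."""
    try:
        return read_pwm()
    except OSError as exc:
        if exc.errno == errno.ENODATA:
            return None
        raise


def restore_pwm(saved: Optional[int], write_pwm: Callable[[int], None]) -> bool:
    """Write the saved duty back only if it came from a real reading."""
    if saved is None:
        # Nothing trustworthy was read, so nothing is written.
        return False
    write_pwm(saved)
    return True
```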
Guenter
* Re: Weird Dell SMM bug since 6.18
From: Jan Claußen @ 2026-03-16 20:10 UTC
To: Guenter Roeck, Armin Wolf; +Cc: linux-hwmon
I notified the coolercontrol developers about this thread. Maybe they can
find a solution from userspace, but a solution in kernelspace is always
preferred of course.
* Re: Weird Dell SMM bug since 6.18
From: Guenter Roeck @ 2026-03-17 0:55 UTC
To: Jan Claußen, Armin Wolf; +Cc: linux-hwmon
On 3/16/26 13:10, Jan Claußen wrote:
> I notified the coolercontrol developers about this thread. Maybe he can
> find a solution from userspace, but a solution in kernelspace is always
> preferred of course.
The information on https://gitlab.com/coolercontrol/coolercontrol/-/work_items/557
seems to suggest that reading pwmX sometimes works and sometimes doesn't, which
is a bit suspicious.
I would suggest to add some debugging code into the kernel to determine return
values from i8k_get_fan_status() and the actual value of data->i8k_fan_max.
It might be useful to add some dev_dbg() into dell_smm_read() so we can do this
in the future without having to hack the kernel.
The described condition sounds like the returned value is >= 3 and
data->i8k_fan_max == 2. I'd suggest to monitor the returned value over time
and under varying load conditions to see if/how it changes on its own.
Then set pwmX_enable to "1" and try again.
Based on that we might be able to determine if i8k_get_fan_status()==3
means "turbo" fan speed or if it means "automatic fan control".
In either case, overloading I8K_FAN_TURBO and I8K_FAN_AUTO _is_ quite fragile.
I strongly suspect that Dell Latitude D520 (and any other system where
I8K_FAN_TURBO means a fan speed and not automatic fan control) has a problem
with the current code.
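Short of instrumenting the kernel, the userspace-visible side of this can be watched with a small polling script. The Python sketch below only records what a read of a sysfs attribute returns (the value, or the errno name on failure); it cannot see the raw i8k_get_fan_status() result. The helper names and the injectable `sample` and `sleep` parameters are assumptions made for the example.

```python
import errno
import time
from pathlib import Path
from typing import Callable, List, Tuple


def sample_attr(path: Path) -> Tuple[str, str]:
    """Return ("ok", value) or ("err", errno name) for one read of an attribute."""
    try:
        return ("ok", path.read_text().strip())
    except OSError as exc:
        return ("err", errno.errorcode.get(exc.errno, str(exc.errno)))


def poll_attr(path: Path, samples: int, interval_s: float = 1.0,
              sample: Callable[[Path], Tuple[str, str]] = sample_attr,
              sleep: Callable[[float], None] = time.sleep) -> List[Tuple[str, str]]:
    """Read the attribute `samples` times, pausing `interval_s` between reads."""
    results = []
    for i in range(samples):
        results.append(sample(path))
        if i + 1 < samples:
            sleep(interval_s)
    return results
```

It could be pointed at the pwmX file of the dell-smm hwmon device, once while pwmX_enable is left alone and once after setting it to "1", to compare the two runs under load.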
Thanks,
Guenter
* Re: Weird Dell SMM bug since 6.18
From: Armin Wolf @ 2026-03-17 1:29 UTC
To: Guenter Roeck, Jan Claußen; +Cc: linux-hwmon
On 17.03.26 at 01:55, Guenter Roeck wrote:
> On 3/16/26 13:10, Jan Claußen wrote:
>> I notified the coolercontrol developers about this thread. Maybe he can
>> find a solution from userspace, but a solution in kernelspace is always
>> preferred of course.
>
> The information on https://gitlab.com/coolercontrol/coolercontrol/-/work_items/557
> seems to suggest that reading pwmX sometimes works and sometimes doesn't, which
> is a bit suspicious.
>
I suspect that the successful reads happen after the pwmX attribute has been set manually
using "echo". The driver will enter manual fan control mode automatically in such a case
to keep compatibility with legacy userspace applications.
> I would suggest to add some debugging code into the kernel to determine return
> values from i8k_get_fan_status() and the actual value of data->i8k_fan_max.
> It might be useful to add some dev_dbg() into dell_smm_read() so we can do this
> in the future without having to hack the kernel.
>
We already log the individual SMM calls; I can decode them if requested.
> The described condition sounds like the returned value is >= 3 and
> data->i8k_fan_max == 2. I'd suggest to monitor the returned value over time
> and under varying load conditions to see if/how it changes on its own.
> Then set pwmX_enable to "1" and try again.
>
> Based on that we might be able to determine if i8k_get_fan_status()==3
> means "turbo" fan speed or if it means "automatic fan control".
>
> Either case, overloading I8K_FAN_TURBO and I8K_FAN_AUTO _is_ quite fragile.
> I strongly suspect that Dell Latitude D520 (and any other system where
> I8K_FAN_TURBO means a fan speed and not automatic fan control) has a problem
> with the current code.
>
The overload is necessary because both fan states are reusing the same value :/
Devices like the D520 should not be affected by this because they already set
i8k_fan_max to I8K_FAN_TURBO. This disables the handling of I8K_FAN_AUTO.
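To make the ambiguity concrete, here is a toy Python model of the behavior described in this thread; it is not the driver's code. It assumes that the raw value 3 is shared by I8K_FAN_TURBO and I8K_FAN_AUTO, and that a status above i8k_fan_max is what gets reported as unreadable (-ENODATA). The function name is hypothetical.

```python
from typing import Optional

I8K_FAN_HIGH = 2
I8K_FAN_TURBO = 3
I8K_FAN_AUTO = 3  # assumption from the thread: same raw value as I8K_FAN_TURBO


def readable_fan_state(status: int, fan_max: int) -> Optional[int]:
    """Return the speed level, or None where this model would report -ENODATA."""
    if status > fan_max:
        # Above the highest valid speed: interpreted as automatic fan control.
        return None
    return status
```

In this model a machine with fan_max == I8K_FAN_HIGH sees status 3 as "automatic", while a machine with fan_max == I8K_FAN_TURBO sees the same raw value as an ordinary speed level, which matches the D520 case described above.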
Thanks,
Armin Wolf
> Thanks,
> Guenter
>
>
* Re: Weird Dell SMM bug since 6.18
From: Guy Boldon @ 2026-03-19 9:49 UTC
To: linux-hwmon; +Cc: gb, W_Armin, linux, jan.claussen10
From: gb@guyboldon.com
Hi, I'm the CoolerControl maintainer, a few notes from the userspace side.
On Mon, Mar 16, 2026 at 17:55:01 -0700, Guenter Roeck wrote:
> Not really. -ENODATA seems to be the correct response if the current pwm
> value is not readable. Returning 0 or any other number would be misleading
> and trigger other problems (such as some userspace code believing that it
> can write the value back with no impact, which would be worse).
For context: thinkpad_acpi has long returned 255 as a dummy value for pwmX
when in auto mode (pwmX_enable=2), since it similarly cannot retrieve the
real PWM value during BIOS control. This was likely motivated by fancontrol
compatibility, which AFAIR requires a readable pwmX. CoolerControl reads pwmX to
confirm a channel is functional and to track data values over time, hence why we
need it readable. We can however adapt our handling for -ENODATA.
On the write concern: several drivers I'm familiar with (e.g. nct67xx, it87,
thinkpad_acpi) do not implicitly switch to manual mode on a pwmX write.
Writing pwmX having no effect when pwmX_enable != 1 is expected, normal
behavior from our perspective.
On Tue, Mar 17, 2026 at 02:29:39 +0100, Armin Wolf wrote:
> I suspect that the successful reads happen after the pwmX attribute has
> been set manually using "echo". The driver will enter manual fan control
> mode automatically in such a case to keep compatibility with legacy
> userspace applications.
That makes sense. Might be worth noting in the docs either way.
As a related point: gpd_fan returns -EOPNOTSUPP rather than -ENODATA
when in auto mode, and documents that behavior in the kernel docs. The
inconsistency between drivers, different errors for the same condition,
means userspace ends up needing per-driver handling for the same use case.
Not ideal, but at least documentation helps.
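The per-driver handling mentioned above might look like this in a userspace tool (a Python sketch; the table keys are simply the driver names used in this thread, and the helper name and the default set are assumptions, not an existing API).

```python
import errno

# Which errno each driver uses for "pwmX has no value while in auto mode".
# Keys are illustrative labels taken from this thread.
AUTO_MODE_ERRNOS = {
    "dell-smm-hwmon": {errno.ENODATA},
    "gpd_fan": {errno.EOPNOTSUPP},
}

# Assumed fallback for drivers that are not listed explicitly.
DEFAULT_AUTO_MODE_ERRNOS = {errno.ENODATA}


def is_auto_mode_error(driver: str, exc: OSError) -> bool:
    """True if this read error means 'value unavailable', not 'channel broken'."""
    return exc.errno in AUTO_MODE_ERRNOS.get(driver, DEFAULT_AUTO_MODE_ERRNOS)
```

If drivers converged on a single error code, the table would collapse to the default entry.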
Thanks,
Guy
* Re: Weird Dell SMM bug since 6.18
From: Guenter Roeck @ 2026-03-19 15:52 UTC
To: Guy Boldon, linux-hwmon; +Cc: W_Armin, jan.claussen10
Hi,
On 3/19/26 02:49, Guy Boldon wrote:
> From: gb@guyboldon.com
>
> Hi, I'm the CoolerControl maintainer, a few notes from the userspace side.
>
Thanks a lot for the feedback.
> On Mon, Mar 16, 2026 at 17:55:01 -0700, Guenter Roeck wrote:
>> Not really. -ENODATA seems to be the correct response if the current pwm
>> value is not readable. Returning 0 or any other number would be misleading
>> and trigger other problems (such as some userspace code believing that it
>> can write the value back with no impact, which would be worse).
>
> For context: thinkpad_acpi has long returned 255 as a dummy value for pwmX
> when in auto mode (pwmX_enable=2), since it similarly cannot retrieve the
That driver is located outside drivers/hwmon/ and thus not in hwmon subsystem
control. Such drivers often implement functionality / attributes which would
not be acceptable in drivers/hwmon/. Non-standard functionality of such drivers
is often fiercely defended by driver authors, so I gave up even trying.
> real PWM value during BIOS control. This was likely motivated by fancontrol
> compatibility, which AFAIR requires a readable pwmX. CoolerControl reads pwmX to
> confirm a channel is functional and to track data values over time, hence why we
> need it readable. We can however adapt our handling for -ENODATA.
>
hwmon drivers have existed since the beginning of Linux. You'll find _lots_ of
inconsistencies across different drivers.
The use of -ENODATA in hwmon to report that a value is not available is
relatively new and isn't even fully documented in the sysfs ABI (admittedly
a major oversight). The major driver for its use is that it more accurately
reflects reality as reported by the "sensors" command if an attribute value
is not available (sensors reports "N/A" instead of an error message if it
gets an -ENODATA error).
> On the write concern: several drivers I'm familiar with (e.g. nct67xx, it87,
> thinkpad_acpi) do not implicitly switch to manual mode on a pwmX write.
> Writing pwmX having no effect when pwmX_enable != 1 is expected, normal
> behavior from our perspective.
It always depends on the chip in question. For some chips, it is actually
necessary or at least desirable to write the pwm value before switching to
manual mode since otherwise the chip might behave erratically. That does not
mean it makes sense or is even possible for all chips. Anyway, the problem
here is (potentially) writing back a value which isn't based on real data.
The question is also what to report when _reading_ a value, not how to handle
writing it.
>
> On Tue, Mar 17, 2026 at 02:29:39 +0100, Armin Wolf wrote:
>> I suspect that the successful reads happen after the pwmX attribute has
>> been set manually using "echo". The driver will enter manual fan control
>> mode automatically in such a case to keep compatibility with legacy
>> userspace applications.
>
> That makes sense. Might be worth noting in the docs either way.
>
> As a related point: gpd_fan returns -EOPNOTSUPP rather than -ENODATA
> when in auto mode, and documents that behavior in the kernel docs. The
Please feel free to submit a patch to fix that.
> inconsistency between drivers, different errors for the same condition,
> means userspace ends up needing per-driver handling for the same use case.
> Not ideal, but at least documentation helps.
>
I can only repeat what I said above: hwmon drivers have existed since the
beginning of Linux. You'll find _lots_ of inconsistencies across different drivers.
The best we can do is to find a means to improve consistency, but as you
can see here even that is difficult because different people will have
different opinions on what that consistency should look like. Error response
will vary, as will attribute visibility.
If you would like to get actively involved, please feel free to submit patches
improving the documentation (Documentation/hwmon/sysfs-interface.rst,
Documentation/ABI/testing/sysfs-class-hwmon, and/or driver specific
documentation) as well as driver patches to help improve consistency across
drivers.
Thanks,
Guenter
* Re: Weird Dell SMM bug since 6.18
From: Guy Boldon @ 2026-03-22 10:18 UTC
To: Guenter Roeck, Guy Boldon, linux-hwmon; +Cc: W_Armin, jan.claussen10
Hi Guenter,
Thank you for the explanations, much appreciated.
On Thu Mar 19, 2026 at 4:52 PM CET, Guenter Roeck wrote:
> The use of -ENODATA in hwmon to report that a value is not available is
> relatively new and isn't even fully documented in the sysfs ABI (admittedly
> a major oversight). The major driver for its use is that it more accurately
> reflects reality as reported by the "sensors" command if an attribute value
> is not available (sensors reports "N/A" instead of an error message if it
> gets an -ENODATA error).
Ah, it seemed somewhat new and that makes sense. -ENODATA converts to
a clean N/A without an error message. We will adjust to handle -ENODATA
going forward.
>> As a related point: gpd_fan returns -EOPNOTSUPP rather than -ENODATA
>> when in auto mode, and documents that behavior in the kernel docs. The
>
> Please feel free to submit a patch to fix that.
I'll submit a patch for that.
> The best we can do is to find a means to improve consistency, but as you
> can see here even that is difficult because different people will have
> different opinions on how that consistency should look like. Error response
> will vary, as will attribute visibility.
>
> If you would like to get actively involved, please feel free to submit patches
> improving the documentation (Documentation/hwmon/sysfs-interface.rst,
> Documentation/ABI/testing/sysfs-class-hwmon, and or driver specific
> documentation) as well as driver patches to help improve consistency across
> drivers.
I agree and I would like to help improve consistency for the interface. I'll
look at the docs as well. CoolerControl is a direct consumer of this interface
across a large range of drivers, but improving consistency, I think, benefits
anyone interacting with more than a handful of drivers.
Thanks again,
Guy
* Re: Weird Dell SMM bug since 6.18
From: Armin Wolf @ 2026-03-23 10:25 UTC
To: Guy Boldon, Guenter Roeck, linux-hwmon; +Cc: jan.claussen10
On 22.03.26 at 11:18, Guy Boldon wrote:
> Hi Guenter,
>
> Thank you for the explanations, much appreciated.
>
>
> On Thu Mar 19, 2026 at 4:52 PM CET, Guenter Roeck wrote:
>> The use of -ENODATA in hwmon to report that a value is not available is
>> relatively new and isn't even fully documented in the sysfs ABI (admittedly
>> a major oversight). The major driver for its use is that it more accurately
>> reflects reality as reported by the "sensors" command if an attribute value
>> is not available (sensors reports "N/A" instead of an error message if it
>> gets an -ENODATA error).
> Ah, it seemed somewhat new and that makes sense. -ENODATA converts to
> a clean N/A without an error message. We will adjust to handle -ENODATA
> going forward.
Nice, sounds like a good plan to me.
Thanks,
Armin Wolf
>>> As a related point: gpd_fan returns -EOPNOTSUPP rather than -ENODATA
>>> when in auto mode, and documents that behavior in the kernel docs. The
>> Please feel free to submit a patch to fix that.
> I'll submit a patch for that.
>
>> The best we can do is to find a means to improve consistency, but as you
>> can see here even that is difficult because different people will have
>> different opinions on how that consistency should look like. Error response
>> will vary, as will attribute visibility.
>>
>> If you would like to get actively involved, please feel free to submit patches
>> improving the documentation (Documentation/hwmon/sysfs-interface.rst,
>> Documentation/ABI/testing/sysfs-class-hwmon, and or driver specific
>> documentation) as well as driver patches to help improve consistency across
>> drivers.
> I agree and I would like to help improve consistency for the interface. I'll
> look at the docs as well. CoolerControl is a direct consumer of this interface
> across a large range of drivers, but improving consistency, I think, benefits
> anyone interacting with more than a handful of drivers.
>
> Thanks again,
> Guy
>
>