Linux Hardware Monitor development
* Suspend/resume failing due to SPD5118
@ 2024-12-19 22:59 Lucas De Marchi
  2024-12-20  1:20 ` Guenter Roeck
  0 siblings, 1 reply; 12+ messages in thread
From: Lucas De Marchi @ 2024-12-19 22:59 UTC
  To: linux-hwmon; +Cc: linux-kernel, Guenter Roeck, Jean Delvare

Hi,

In our CI for xe and i915 drivers we are noticing some issues
with suspend/resume with these error messages from spd5118:

<3> [120.648546] spd5118 3-0051: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -110
<3> [120.648598] spd5118 3-0051: PM: failed to resume async: error -110
<3> [122.825989] spd5118 3-0053: PM: failed to resume async: error -110

Example:
https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3885

(there are a few other issues in which this error shows up, but this is
the cleanest one that doesn't mix with other bugs)

thanks
Lucas De Marchi

* Re: Suspend/resume failing due to SPD5118
  2024-12-19 22:59 Suspend/resume failing due to SPD5118 Lucas De Marchi
@ 2024-12-20  1:20 ` Guenter Roeck
  2025-05-02 23:07   ` carlon.luca
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2024-12-20  1:20 UTC
  To: Lucas De Marchi, linux-hwmon; +Cc: linux-kernel, Jean Delvare

On 12/19/24 14:59, Lucas De Marchi wrote:
> Hi,
> 
> In our CI for xe and i915 drivers we are noticing some issues
> with suspend/resume with these error messages from spd5118:
> 
> <3> [120.648546] spd5118 3-0051: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -110
> <3> [120.648598] spd5118 3-0051: PM: failed to resume async: error -110
> <3> [122.825989] spd5118 3-0053: PM: failed to resume async: error -110
> 
> Example:
> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3885
> 
> (there are a few other issues in which this error shows up, but this is
> the cleanest one that doesn't mix with other bugs)
> 
> thanks
> Lucas De Marchi
> 

The timeout is observed when the resume code tries to write data back to the
spd5118 chip. It originates from the i2c controller driver, not from the spd5118
driver. I have no idea why the i2c controller would time out in this situation.
Presumably it should have been brought out of suspend by the time devices connected
to it are re-enabled, but I don't see any associated message in the log.

I know that others have tested suspend/resume support with the driver and confirmed
that it works. It might help to enable debugging of the i2c controller driver if that
is possible. Other than that I have no idea what might cause this problem or how to
track it down.

Guenter
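
For reference, -110 is the Linux errno value ETIMEDOUT, which is why the
failure is described above as a timeout. A minimal sketch, assuming kernel
log lines shaped like the ones quoted in this thread, of pulling the failing
device and PM callback out of a saved log. The function name and the
one-entry errno table are illustrative and not part of any kernel tooling:

```python
import re

# Only the errno seen in this thread; on Linux, 110 is ETIMEDOUT.
ERRNO_NAMES = {110: "ETIMEDOUT"}

# Matches lines such as:
#   spd5118 3-0051: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -110
CALLBACK_RE = re.compile(
    r"(?P<driver>\S+) (?P<device>\d+-[0-9a-f]{4}): PM: dpm_run_callback\(\): "
    r"(?P<callback>\w+) \[\w+\] returns (?P<code>-?\d+)"
)


def parse_pm_failures(log_text):
    """Return one dict per failing PM callback found in log_text."""
    failures = []
    for line in log_text.splitlines():
        m = CALLBACK_RE.search(line)
        if m is None:
            continue
        code = int(m.group("code"))
        failures.append({
            "driver": m.group("driver"),
            "device": m.group("device"),      # i2c bus number and address
            "callback": m.group("callback"),  # e.g. spd5118_resume
            # Fall back to the raw number for codes not in the table.
            "errno": ERRNO_NAMES.get(-code, str(code)),
        })
    return failures
```

Feeding it the output of "dmesg" separates resume failures
(spd5118_resume) from suspend failures (spd5118_suspend) at a glance.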


* Re: Suspend/resume failing due to SPD5118
  2024-12-20  1:20 ` Guenter Roeck
@ 2025-05-02 23:07   ` carlon.luca
  2025-05-02 23:51     ` Armin Wolf
  2025-05-03  0:14     ` Guenter Roeck
  0 siblings, 2 replies; 12+ messages in thread
From: carlon.luca @ 2025-05-02 23:07 UTC
  To: linux; +Cc: jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

> The timeout is observed when the resume code tries to write data back to the
> spd5118 chip. It originates from the i2c controller driver, not from the spd5118
> driver. I have no idea why the i2c controller would time out in this situation.
> Presumably it should have been brought out of suspend by the time devices connected
> to it are re-enabled, but I don't see any associated message in the log.
> 
> I know that others have tested suspend/resume support with the driver and confirmed
> that it works. It might help to enable debugging of the i2c controller driver if that
> is possible. Other than that I have no idea what might cause this problem or how to
> track it down.

Hi,

I recently bought a new machine, and trying to hibernate results in these messages
from the kernel:

[  195.176483] PM: hibernation: hibernation entry
[  195.200054] Filesystems sync: 0.005 seconds
[  195.200760] Freezing user space processes
[  195.203723] Freezing user space processes completed (elapsed 0.002 seconds)
[  195.203732] OOM killer disabled.
[  195.204506] PM: hibernation: Marking nosave pages: [mem 0x00000000-0x00000fff]
[  195.204512] PM: hibernation: Marking nosave pages: [mem 0x0009f000-0x000fffff]
[  195.204517] PM: hibernation: Marking nosave pages: [mem 0x4ee2f000-0x524fefff]
[  195.204924] PM: hibernation: Marking nosave pages: [mem 0x8b93b000-0x8bb7bfff]
[  195.204941] PM: hibernation: Marking nosave pages: [mem 0x8eedd000-0x8eeddfff]
[  195.204943] PM: hibernation: Marking nosave pages: [mem 0x92fe3000-0x92fe3fff]
[  195.204945] PM: hibernation: Marking nosave pages: [mem 0x944ff000-0x97ffefff]
[  195.205340] PM: hibernation: Marking nosave pages: [mem 0x98000000-0xffffffff]
[  195.210276] PM: hibernation: Basic memory bitmaps created
[  195.212709] PM: hibernation: Preallocating image memory
[  196.875538] PM: hibernation: Allocated 1013859 pages for snapshot
[  196.875544] PM: hibernation: Allocated 4055436 kbytes in 1.66 seconds (2443.03 MB/s)
[  196.875547] Freezing remaining freezable tasks
[  196.876843] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  197.559071] printk: Suspending console(s) (use no_console_suspend to debug)
[  197.771716] spd5118 1-0051: PM: dpm_run_callback(): spd5118_suspend [spd5118] returns -110
[  197.771734] spd5118 1-0051: PM: failed to freeze async: error -110
[  197.979717] spd5118 1-0050: PM: dpm_run_callback(): spd5118_suspend [spd5118] returns -110
[  197.979739] spd5118 1-0050: PM: failed to freeze async: error -110
[  199.028103] PM: hibernation: Basic memory bitmaps freed
[  199.028080] mei_pxp i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:03:00.0 (ops i915_pxp_tee_component_ops [i915])
[  199.029101] OOM killer enabled.
[  199.029104] Restarting tasks ... done.
[  199.088935] efivarfs: resyncing variable state
[  199.326219] PM: hibernation: hibernation exit

the operation aborts and I cannot hibernate the machine. I fixed the problem by
blacklisting the driver spd5118.

I see that the reported problem was in the resume operation, mine is in the suspend
operation, so I'm not sure if this is related and if the logs can help.

My kernel version is 6.14.4-arch1-1.

Regards.

Luca Carlon

* Re: Suspend/resume failing due to SPD5118
  2025-05-02 23:07   ` carlon.luca
@ 2025-05-02 23:51     ` Armin Wolf
  2025-05-03  0:33       ` Luca Carlon
  2025-05-03  0:14     ` Guenter Roeck
  1 sibling, 1 reply; 12+ messages in thread
From: Armin Wolf @ 2025-05-02 23:51 UTC
  To: carlon.luca, linux; +Cc: jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

On 03.05.25 at 01:07, carlon.luca@gmail.com wrote:

>> The timeout is observed when the resume code tries to write data back to the
>> spd5118 chip. It originates from the i2c controller driver, not from the spd5118
>> driver. I have no idea why the i2c controller would time out in this situation.
>> Presumably it should have been brought out of suspend by the time devices connected
>> to it are re-enabled, but I don't see any associated message in the log.
>>
>> I know that others have tested suspend/resume support with the driver and confirmed
>> that it works. It might help to enable debugging of the i2c controller driver if that
>> is possible. Other than that I have no idea what might cause this problem or how to
>> track it down.
> Hi,
>
> I recently bought a new machine, and trying to hibernate results in these messages
> from the kernel:
>
> [  195.176483] PM: hibernation: hibernation entry
> [  195.200054] Filesystems sync: 0.005 seconds
> [  195.200760] Freezing user space processes
> [  195.203723] Freezing user space processes completed (elapsed 0.002 seconds)
> [  195.203732] OOM killer disabled.
> [  195.204506] PM: hibernation: Marking nosave pages: [mem 0x00000000-0x00000fff]
> [  195.204512] PM: hibernation: Marking nosave pages: [mem 0x0009f000-0x000fffff]
> [  195.204517] PM: hibernation: Marking nosave pages: [mem 0x4ee2f000-0x524fefff]
> [  195.204924] PM: hibernation: Marking nosave pages: [mem 0x8b93b000-0x8bb7bfff]
> [  195.204941] PM: hibernation: Marking nosave pages: [mem 0x8eedd000-0x8eeddfff]
> [  195.204943] PM: hibernation: Marking nosave pages: [mem 0x92fe3000-0x92fe3fff]
> [  195.204945] PM: hibernation: Marking nosave pages: [mem 0x944ff000-0x97ffefff]
> [  195.205340] PM: hibernation: Marking nosave pages: [mem 0x98000000-0xffffffff]
> [  195.210276] PM: hibernation: Basic memory bitmaps created
> [  195.212709] PM: hibernation: Preallocating image memory
> [  196.875538] PM: hibernation: Allocated 1013859 pages for snapshot
> [  196.875544] PM: hibernation: Allocated 4055436 kbytes in 1.66 seconds (2443.03 MB/s)
> [  196.875547] Freezing remaining freezable tasks
> [  196.876843] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [  197.559071] printk: Suspending console(s) (use no_console_suspend to debug)
> [  197.771716] spd5118 1-0051: PM: dpm_run_callback(): spd5118_suspend [spd5118] returns -110
> [  197.771734] spd5118 1-0051: PM: failed to freeze async: error -110
> [  197.979717] spd5118 1-0050: PM: dpm_run_callback(): spd5118_suspend [spd5118] returns -110
> [  197.979739] spd5118 1-0050: PM: failed to freeze async: error -110
> [  199.028103] PM: hibernation: Basic memory bitmaps freed
> [  199.028080] mei_pxp i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:03:00.0 (ops i915_pxp_tee_component_ops [i915])
> [  199.029101] OOM killer enabled.
> [  199.029104] Restarting tasks ... done.
> [  199.088935] efivarfs: resyncing variable state
> [  199.326219] PM: hibernation: hibernation exit
>
> the operation aborts and I cannot hibernate the machine. I fixed the problem by
> blacklisting the driver spd5118.
>
> I see that the reported problem was in the resume operation, mine is in the suspend
> operation, so I'm not sure if this is related and if the logs can help.
>
> My kernel version is 6.14.4-arch1-1.
>
> Regards.
>
> Luca Carlon

Interesting, please check which i2c controller handles the SPD5118 chip (cat /sys/bus/i2c/devices/i2c-1/name).

Can you access the i2c bus (using i2cdetect) and/or the spd5118 chip after such a failed resume attempt?

Thanks,
Armin Wolf
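
The adapter name requested above can be collected for every bus at once. A
minimal sketch, assuming the sysfs layout implied by the path in this
message (/sys/bus/i2c/devices/i2c-N/name); the function name is
illustrative:

```python
from pathlib import Path


def list_i2c_adapters(sysfs_root="/sys/bus/i2c/devices"):
    """Map each i2c-N bus directory name to the contents of its name file."""
    adapters = {}
    for entry in sorted(Path(sysfs_root).glob("i2c-*")):
        name_file = entry / "name"
        if name_file.is_file():
            adapters[entry.name] = name_file.read_text().strip()
    return adapters
```

Calling list_i2c_adapters() with no argument reads the live system; passing
another directory allows the function to be tried against a copied tree.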


* Re: Suspend/resume failing due to SPD5118
  2025-05-02 23:07   ` carlon.luca
  2025-05-02 23:51     ` Armin Wolf
@ 2025-05-03  0:14     ` Guenter Roeck
  2025-05-03  0:58       ` Luca Carlon
  1 sibling, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2025-05-03  0:14 UTC
  To: carlon.luca; +Cc: jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

On 5/2/25 16:07, carlon.luca@gmail.com wrote:
>> The timeout is observed when the resume code tries to write data back to the
>> spd5118 chip. It originates from the i2c controller driver, not from the spd5118
>> driver. I have no idea why the i2c controller would time out in this situation.
>> Presumably it should have been brought out of suspend by the time devices connected
>> to it are re-enabled, but I don't see any associated message in the log.
>>
>> I know that others have tested suspend/resume support with the driver and confirmed
>> that it works. It might help to enable debugging of the i2c controller driver if that
>> is possible. Other than that I have no idea what might cause this problem or how to
>> track it down.
> 
> Hi,
> 
> I recently bought a new machine, and trying to hibernate results in these messages
> from the kernel:
> 
> [  195.176483] PM: hibernation: hibernation entry
> [  195.200054] Filesystems sync: 0.005 seconds
> [  195.200760] Freezing user space processes
> [  195.203723] Freezing user space processes completed (elapsed 0.002 seconds)
> [  195.203732] OOM killer disabled.
> [  195.204506] PM: hibernation: Marking nosave pages: [mem 0x00000000-0x00000fff]
> [  195.204512] PM: hibernation: Marking nosave pages: [mem 0x0009f000-0x000fffff]
> [  195.204517] PM: hibernation: Marking nosave pages: [mem 0x4ee2f000-0x524fefff]
> [  195.204924] PM: hibernation: Marking nosave pages: [mem 0x8b93b000-0x8bb7bfff]
> [  195.204941] PM: hibernation: Marking nosave pages: [mem 0x8eedd000-0x8eeddfff]
> [  195.204943] PM: hibernation: Marking nosave pages: [mem 0x92fe3000-0x92fe3fff]
> [  195.204945] PM: hibernation: Marking nosave pages: [mem 0x944ff000-0x97ffefff]
> [  195.205340] PM: hibernation: Marking nosave pages: [mem 0x98000000-0xffffffff]
> [  195.210276] PM: hibernation: Basic memory bitmaps created
> [  195.212709] PM: hibernation: Preallocating image memory
> [  196.875538] PM: hibernation: Allocated 1013859 pages for snapshot
> [  196.875544] PM: hibernation: Allocated 4055436 kbytes in 1.66 seconds (2443.03 MB/s)
> [  196.875547] Freezing remaining freezable tasks
> [  196.876843] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [  197.559071] printk: Suspending console(s) (use no_console_suspend to debug)
> [  197.771716] spd5118 1-0051: PM: dpm_run_callback(): spd5118_suspend [spd5118] returns -110
> [  197.771734] spd5118 1-0051: PM: failed to freeze async: error -110
> [  197.979717] spd5118 1-0050: PM: dpm_run_callback(): spd5118_suspend [spd5118] returns -110
> [  197.979739] spd5118 1-0050: PM: failed to freeze async: error -110
> [  199.028103] PM: hibernation: Basic memory bitmaps freed
> [  199.028080] mei_pxp i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:03:00.0 (ops i915_pxp_tee_component_ops [i915])
> [  199.029101] OOM killer enabled.
> [  199.029104] Restarting tasks ... done.
> [  199.088935] efivarfs: resyncing variable state
> [  199.326219] PM: hibernation: hibernation exit
> 
> the operation aborts and I cannot hibernate the machine. I fixed the problem by
> blacklisting the driver spd5118.
> 
> I see that the reported problem was in the resume operation, mine is in the suspend
> operation, so I'm not sure if this is related and if the logs can help.
> 

That must be something different. If normal operation works (the sensors command
shows correct temperatures if the driver is loaded, and it is possible to read the
SPD eeprom), maybe the I2C controller is already suspended when the spd5118 driver's
suspend function is called. I don't know how that can happen, though. I would expect
that an I2C controller is only suspended after all its connected devices are suspended.

From the context it looks like the "sensors" command was never executed. To get
another data point, it would help if you could load the driver, run the "sensors"
command, and then try to hibernate.

Thanks,
Guenter
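
The "does normal operation work" check described above can also be done
without the sensors tool, by reading the hwmon sysfs attributes directly. A
minimal sketch, assuming the standard hwmon layout under /sys/class/hwmon,
where each device has a name attribute and temp1_input reports millidegrees
Celsius; the function name is illustrative:

```python
from pathlib import Path


def read_spd5118_temps(hwmon_root="/sys/class/hwmon"):
    """Return {hwmonN: temperature in degrees C, or None if unreadable}."""
    results = {}
    for hwmon in sorted(Path(hwmon_root).glob("hwmon*")):
        try:
            if (hwmon / "name").read_text().strip() != "spd5118":
                continue  # some other hwmon device
        except OSError:
            continue  # no readable name attribute
        try:
            millideg = int((hwmon / "temp1_input").read_text())
            results[hwmon.name] = millideg / 1000.0
        except (OSError, ValueError):
            # A failed read here corresponds to "N/A" in sensors output.
            results[hwmon.name] = None
    return results
```

A result of None for every spd5118 entry would indicate that the chips do
not answer even before any suspend or hibernate attempt.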


* Re: Suspend/resume failing due to SPD5118
  2025-05-02 23:51     ` Armin Wolf
@ 2025-05-03  0:33       ` Luca Carlon
  0 siblings, 0 replies; 12+ messages in thread
From: Luca Carlon @ 2025-05-03  0:33 UTC
  To: w_armin
  Cc: carlon.luca, jdelvare, linux-hwmon, linux-kernel, linux,
	lucas.demarchi

> Interesting, please check which i2c controller handles the SPD5118 chip (cat /sys/bus/i2c/devices/i2c-1/name).

$ cat /sys/bus/i2c/devices/i2c-1/name
SMBus I801 adapter at efa0

> Can you access the i2c bus (using i2cdetect) and/or the spd5118 chip after such a failed resume attempt?

In my case, the suspend fails, and therefore I do not think I can reach the resume.

However, if you mean to list the installed busses, this is the list before trying to suspend:

i2c-0   unknown         Synopsys DesignWare I2C adapter         N/A
i2c-1   unknown         SMBus I801 adapter at efa0              N/A
i2c-2   i2c             i915 gmbus dpa                          I2C adapter
i2c-3   i2c             i915 gmbus dpb                          I2C adapter
i2c-4   i2c             i915 gmbus dpc                          I2C adapter
i2c-5   i2c             i915 gmbus dpd                          I2C adapter
i2c-6   i2c             i915 gmbus tc1                          I2C adapter
i2c-7   i2c             AUX A/DDI A/PHY A                       I2C adapter
i2c-8   i2c             AUX C/DDI C/PHY C                       I2C adapter
i2c-9   i2c             AUX D/DDI D/PHY D                       I2C adapter

after the failed attempt, the list is unchanged.

If this is not what you wanted, please let me know, I can try to provide more info.

Thank you.

Luca Carlon

* Re: Suspend/resume failing due to SPD5118
  2025-05-03  0:14     ` Guenter Roeck
@ 2025-05-03  0:58       ` Luca Carlon
  2025-05-03  1:40         ` Guenter Roeck
  0 siblings, 1 reply; 12+ messages in thread
From: Luca Carlon @ 2025-05-03  0:58 UTC
  To: linux; +Cc: carlon.luca, jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

> From the context it looks like the "sensors" command was never executed. To get
> another data point, it would help if you could load the driver, run the "sensors"
> command, and then try to hibernate.

Hello,

yes, I did not have the sensors command installed.

I removed SPD5118 from the blacklist and I rebooted the system. This is what the
"sensors" command is reporting after the boot:

spd5118-i2c-1-50
Adapter: SMBus I801 adapter at efa0
ERROR: Can't get value of subfeature temp1_lcrit_alarm: Can't read
ERROR: Can't get value of subfeature temp1_min_alarm: Can't read
ERROR: Can't get value of subfeature temp1_max_alarm: Can't read
ERROR: Can't get value of subfeature temp1_crit_alarm: Can't read
ERROR: Can't get value of subfeature temp1_min: Can't read
ERROR: Can't get value of subfeature temp1_max: Can't read
ERROR: Can't get value of subfeature temp1_lcrit: Can't read
ERROR: Can't get value of subfeature temp1_crit: Can't read
temp1:            N/A  (low  =  +0.0°C, high =  +0.0°C)
                       (crit low =  +0.0°C, crit =  +0.0°C)

[...]

spd5118-i2c-1-51
Adapter: SMBus I801 adapter at efa0
ERROR: Can't get value of subfeature temp1_lcrit_alarm: Can't read
ERROR: Can't get value of subfeature temp1_min_alarm: Can't read
ERROR: Can't get value of subfeature temp1_max_alarm: Can't read
ERROR: Can't get value of subfeature temp1_crit_alarm: Can't read
ERROR: Can't get value of subfeature temp1_min: Can't read
ERROR: Can't get value of subfeature temp1_max: Can't read
ERROR: Can't get value of subfeature temp1_lcrit: Can't read
ERROR: Can't get value of subfeature temp1_crit: Can't read
temp1:            N/A  (low  =  +0.0°C, high =  +0.0°C)
                       (crit low =  +0.0°C, crit =  +0.0°C)

I then tried to hibernate. Hibernation failed and the output of the "sensors"
command did not change.

I also tried to rmmod spd5118 and modprobe it. The output of the sensors
command does not show spd5118 anymore.

Hope I did what you asked properly.
Thanks for your answer.

Luca Carlon

* Re: Suspend/resume failing due to SPD5118
  2025-05-03  0:58       ` Luca Carlon
@ 2025-05-03  1:40         ` Guenter Roeck
  2025-05-04 16:59           ` Armin Wolf
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2025-05-03  1:40 UTC
  To: Luca Carlon; +Cc: jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

On 5/2/25 17:58, Luca Carlon wrote:
>> From the context it looks like the "sensors" command was never executed. To get
>> another data point, it would help if you could load the driver, run the "sensors"
>> command, and then try to hibernate.
> 
> Hello,
> 
> yes, I did not have the sensors command installed.
> 
> I removed SPD5118 from the blacklist and I rebooted the system. This is what the
> "sensors" command is reporting after the boot:
> 
> spd5118-i2c-1-50
> Adapter: SMBus I801 adapter at efa0
> ERROR: Can't get value of subfeature temp1_lcrit_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_min_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_max_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_crit_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_min: Can't read
> ERROR: Can't get value of subfeature temp1_max: Can't read
> ERROR: Can't get value of subfeature temp1_lcrit: Can't read
> ERROR: Can't get value of subfeature temp1_crit: Can't read
> temp1:            N/A  (low  =  +0.0°C, high =  +0.0°C)
>                         (crit low =  +0.0°C, crit =  +0.0°C)
> 
> [...]
> 
> spd5118-i2c-1-51
> Adapter: SMBus I801 adapter at efa0
> ERROR: Can't get value of subfeature temp1_lcrit_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_min_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_max_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_crit_alarm: Can't read
> ERROR: Can't get value of subfeature temp1_min: Can't read
> ERROR: Can't get value of subfeature temp1_max: Can't read
> ERROR: Can't get value of subfeature temp1_lcrit: Can't read
> ERROR: Can't get value of subfeature temp1_crit: Can't read
> temp1:            N/A  (low  =  +0.0°C, high =  +0.0°C)
>                         (crit low =  +0.0°C, crit =  +0.0°C)
> 

That means there is a problem with the I2C controller, and you'll have to
black-list the driver. I don't have a better solution, sorry.

Guenter

> I then tried to hibernate. Hibernation failed and the output of the "sensors"
> command did not change.
> 
> I also tried to rmmod spd5118 and modprobe it. The output of the sensors
> command does not show spd5118 anymore.
> 
> Hope I did what you asked properly.
> Thanks for your answer.
> 
> Luca Carlon


* Re: Suspend/resume failing due to SPD5118
  2025-05-03  1:40         ` Guenter Roeck
@ 2025-05-04 16:59           ` Armin Wolf
  2025-05-04 18:41             ` Luca Carlon
  0 siblings, 1 reply; 12+ messages in thread
From: Armin Wolf @ 2025-05-04 16:59 UTC
  To: Guenter Roeck, Luca Carlon
  Cc: jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

On 03.05.25 at 03:40, Guenter Roeck wrote:

> On 5/2/25 17:58, Luca Carlon wrote:
>>> From the context it looks like the "sensors" command was never executed. To get
>>> another data point, it would help if you could load the driver, run the "sensors"
>>> command, and then try to hibernate.
>>
>> Hello,
>>
>> yes, I did not have the sensors command installed.
>>
>> I removed SPD5118 from the blacklist and I rebooted the system. This is what the
>> "sensors" command is reporting after the boot:
>>
>> spd5118-i2c-1-50
>> Adapter: SMBus I801 adapter at efa0
>> ERROR: Can't get value of subfeature temp1_lcrit_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_min_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_max_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_crit_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_min: Can't read
>> ERROR: Can't get value of subfeature temp1_max: Can't read
>> ERROR: Can't get value of subfeature temp1_lcrit: Can't read
>> ERROR: Can't get value of subfeature temp1_crit: Can't read
>> temp1:            N/A  (low  =  +0.0°C, high =  +0.0°C)
>>                         (crit low =  +0.0°C, crit =  +0.0°C)
>>
>> [...]
>>
>> spd5118-i2c-1-51
>> Adapter: SMBus I801 adapter at efa0
>> ERROR: Can't get value of subfeature temp1_lcrit_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_min_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_max_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_crit_alarm: Can't read
>> ERROR: Can't get value of subfeature temp1_min: Can't read
>> ERROR: Can't get value of subfeature temp1_max: Can't read
>> ERROR: Can't get value of subfeature temp1_lcrit: Can't read
>> ERROR: Can't get value of subfeature temp1_crit: Can't read
>> temp1:            N/A  (low  =  +0.0°C, high =  +0.0°C)
>>                         (crit low =  +0.0°C, crit =  +0.0°C)
>>
>
> That means there is a problem with the I2C controller, and you'll have to
> black-list the driver. I don't have a better solution, sorry.
>
> Guenter

I do not think that the i2c controller is at fault here. It seems that when loading the spd5118 driver
for the first time everything works until some point where the spd5118 device stops responding to i2c
requests.

Please load the i2c-dev module (sudo modprobe i2c-dev) and share the results of the following commands:

	sudo i2cdump 1 0x50
	sudo i2cdump 1 0x51

This should return the register contents of the spd5118 devices. Please make sure that the spd5118 driver
has been blacklisted and unloaded before executing those commands.

I suspect that somehow the spd5118 driver confuses the spd5118 devices, causing them to lock up.

Could you also please tell us the name of the RAM sticks you are using?

Thanks,
Armin Wolf

>
>> I then tried to hibernate. Hibernation failed and the output of the "sensors"
>> command did not change.
>>
>> I also tried to rmmod spd5118 and modprobe it. The output of the sensors
>> command does not show spd5118 anymore.
>>
>> Hope I did what you asked properly.
>> Thanks for your answer.
>>
>> Luca Carlon
>
>

* Re: Suspend/resume failing due to SPD5118
  2025-05-04 16:59           ` Armin Wolf
@ 2025-05-04 18:41             ` Luca Carlon
  2025-05-04 23:31               ` Guenter Roeck
  0 siblings, 1 reply; 12+ messages in thread
From: Luca Carlon @ 2025-05-04 18:41 UTC
  To: w_armin
  Cc: carlon.luca, jdelvare, linux-hwmon, linux-kernel, linux,
	lucas.demarchi

> Please load the i2c-dev module (sudo modprobe i2c-dev) and share the results of the following commands:
> 
> 	sudo i2cdump 1 0x50
> 	sudo i2cdump 1 0x51
> 
> This should return the register contents of the spd5118 devices. Please make sure that the spd5118 driver
> has been blacklisted and unloaded before executing those commands.

Hello,

I followed what you asked:

# lsmod | grep spd
# modprobe i2c-dev
# lsmod | grep i2c
i2c_algo_bit           24576  2 xe,i915
i2c_i801               40960  0
i2c_smbus              20480  1 i2c_i801
i2c_mux                16384  1 i2c_i801
i2c_hid_acpi           12288  0
i2c_hid                45056  1 i2c_hid_acpi
i2c_dev                28672  0
# i2cdump 1 0x50
No size specified (using byte-data access)
WARNING! This program can confuse your I2C bus, cause data loss and worse!
I will probe file /dev/i2c-1, address 0x50, mode byte
Continue? [Y/n]  
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
10: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
20: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
30: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
40: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
50: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
60: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
70: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
80: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
90: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
a0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
b0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
c0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
d0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
e0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
f0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
# i2cdump 1 0x51
No size specified (using byte-data access)
WARNING! This program can confuse your I2C bus, cause data loss and worse!
I will probe file /dev/i2c-1, address 0x51, mode byte
Continue? [Y/n] 
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
10: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
20: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
30: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
40: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
50: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
60: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
70: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
80: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
90: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
a0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
b0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
c0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
d0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
e0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX
f0: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX    XXXXXXXXXXXXXXXX

> Could you also please tell us the name of the RAM sticks you are using?

Of course. I have two 32GB modules. One was provided with the machine and it
would be difficult for me to reach that physically. The other one is simple
to reach and I installed it myself: it is marketed as Crucial CL46 - CT32G56C46S5.
This is what I can get from dmidecode:

Handle 0x0002, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x0001
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 32 GB
        Form Factor: SODIMM
        Set: None
        Locator: Controller0-ChannelA/B-DIMM0
        Bank Locator: BANK 0/1
        Type: DDR5
        Type Detail: Synchronous
        Speed: 5600 MT/s
        Manufacturer: Micron Technology
        Serial Number: E97A5953
        Asset Tag: None
        Part Number: CT32G56C46S5.C16D   
        Rank: 2
        Configured Memory Speed: 3600 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.1 V
        Memory Technology: DRAM
        Memory Operating Mode Capability: Volatile memory
        Firmware Version: Not Specified
        Module Manufacturer ID: Bank 1, Hex 0x2C
        Module Product ID: Unknown
        Memory Subsystem Controller Manufacturer ID: Unknown
        Memory Subsystem Controller Product ID: Unknown
        Non-Volatile Size: None
        Volatile Size: 32 GB
        Cache Size: None
        Logical Size: None

Handle 0x0003, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x0001
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 32 GB
        Form Factor: SODIMM
        Set: None
        Locator: Controller0-ChannelA/B-DIMM1
        Bank Locator: BANK 0/1
        Type: DDR5
        Type Detail: Synchronous
        Speed: 5600 MT/s
        Manufacturer: SK Hynix
        Serial Number: 2C3B11DE
        Asset Tag: None
        Part Number: HMCG88AGBSA095N     
        Rank: 2
        Configured Memory Speed: 3600 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.1 V
        Memory Technology: DRAM
        Memory Operating Mode Capability: Volatile memory
        Firmware Version: Not Specified
        Module Manufacturer ID: Bank 1, Hex 0xAD
        Module Product ID: Unknown
        Memory Subsystem Controller Manufacturer ID: Unknown
        Memory Subsystem Controller Product ID: Unknown
        Non-Volatile Size: None
        Volatile Size: 32 GB
        Cache Size: None
        Logical Size: None

When I was told that the problem may lie in the i2c bus, I started to search elsewhere
and this thread came up: https://bugzilla.kernel.org/show_bug.cgi?id=213345. I
therefore provided in that thread some more info I collected.

Please let me know if I did something wrong.
Thank you.
Luca Carlon
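
In i2cdump output, XX marks a byte whose read failed, so the two dumps
above show that not a single register could be read from either device. A
minimal sketch, assuming byte-mode output in the layout shown above, that
counts the failed cells in a saved dump; the function name is illustrative:

```python
def count_failed_reads(i2cdump_output):
    """Return (failed, total) byte cells found in i2cdump byte-mode output."""
    failed = 0
    total = 0
    for line in i2cdump_output.splitlines():
        head, sep, rest = line.partition(":")
        # Data rows start with a two-digit hex offset such as "00:" or "f0:".
        if not sep or len(head) != 2:
            continue
        try:
            int(head, 16)
        except ValueError:
            continue
        # The first 16 tokens are the byte cells; the rest is the ASCII column.
        cells = rest.split()[:16]
        failed += sum(1 for cell in cells if cell == "XX")
        total += len(cells)
    return failed, total
```

For either of the dumps above this would return (256, 256), that is, every
one of the 256 reads failed.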

* Re: Suspend/resume failing due to SPD5118
  2025-05-04 18:41             ` Luca Carlon
@ 2025-05-04 23:31               ` Guenter Roeck
  2025-05-05  0:27                 ` Armin Wolf
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2025-05-04 23:31 UTC
  To: Luca Carlon, w_armin; +Cc: jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

On 5/4/25 11:41, Luca Carlon wrote:
...
> When I was told that the problem may lie in the i2c bus, I started to search elsewhere
> and this thread came up: https://bugzilla.kernel.org/show_bug.cgi?id=213345. I
> therefore provided in that thread some more info I collected.
> 

From there:

[    5.416572] i801_smbus 0000:00:1f.4: SPD Write Disable is set

I think you are out of luck. The above is incompatible with spd5118 devices;
see [1] for details. I don't immediately see why that would cause the i2c bus
to lock up, but even if it didn't lock up the bus I don't see a means to get
this to work.

It is still puzzling that reading the i2c data using i2cdump fails. The spd5118
driver should not even probe if that is the case. Maybe Armin has an idea.

Guenter

---
[1] https://lore.kernel.org/linux-i2c/20250430-for-upstream-i801-spd5118-no-instantiate-v2-0-2f54d91ae2c7@canonical.com/
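
Whether a given machine is affected can be checked by looking for the same
i801_smbus message in its kernel log. A minimal sketch, assuming the
message format quoted above; the function name is illustrative:

```python
import re

# Matches e.g. "i801_smbus 0000:00:1f.4: SPD Write Disable is set"
SPD_WD_RE = re.compile(
    r"i801_smbus (?P<pci>[0-9a-f:.]+): SPD Write Disable is set"
)


def find_spd_write_disable(log_text):
    """Return the PCI addresses of i801 adapters reporting SPD Write Disable."""
    return [m.group("pci") for m in SPD_WD_RE.finditer(log_text)]
```

An empty list means the message was not found in the text supplied, which
may also just mean the log has already rotated past the boot messages.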


* Re: Suspend/resume failing due to SPD5118
  2025-05-04 23:31               ` Guenter Roeck
@ 2025-05-05  0:27                 ` Armin Wolf
  0 siblings, 0 replies; 12+ messages in thread
From: Armin Wolf @ 2025-05-05  0:27 UTC
  To: Guenter Roeck, Luca Carlon
  Cc: jdelvare, linux-hwmon, linux-kernel, lucas.demarchi

On 05.05.25 at 01:31, Guenter Roeck wrote:

> On 5/4/25 11:41, Luca Carlon wrote:
> ...
>> When I was told that the problem may lie in the i2c bus, I started to search elsewhere
>> and this thread came up: https://bugzilla.kernel.org/show_bug.cgi?id=213345. I
>> therefore provided in that thread some more info I collected.
>>
>
> From there:
>
> [    5.416572] i801_smbus 0000:00:1f.4: SPD Write Disable is set
>
> I think you are out of luck. The above is incompatible with spd5118 devices;
> see [1] for details. I don't immediately see why that would cause the i2c bus
> to lock up, but even if it didn't lock up the bus I don't see a means to get
> this to work.
>
> It is still puzzling that reading the i2c data using i2cdump fails. The spd5118
> driver should not even probe if that is the case. Maybe Armin has an idea.
>
> Guenter

I have an idea, see the bugzilla post for details.

Thanks,
Armin Wolf

>
> ---
> [1] https://lore.kernel.org/linux-i2c/20250430-for-upstream-i801-spd5118-no-instantiate-v2-0-2f54d91ae2c7@canonical.com/
>
>
