acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr

* acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
@ 2026-01-22 18:21 Jaroslav Pulchart
  2026-01-22 18:55 ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-01-22 18:21 UTC
  To: linux-acpi, linux-hwmon
  Cc: Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica

Hello,

after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
in the ACPI power meter hwmon interface. Reading power*_average sysfs
attributes blocks indefinitely and causes tasks to enter uninterruptible
sleep (D state).

This affects both simple tools (e.g. cat) and monitoring agents scraping
hwmon (Prometheus via Grafana Alloy), eventually leading to hung task warnings
and stalled metric collection.

$ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmonX/power1_average
# blocks forever
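
Tasks stuck like this can be listed from userspace without reading the hwmon files again. The following is a minimal sketch, assuming the usual Linux /proc layout; the helper names parse_stat and blocked_tasks are made up for illustration and are not part of any existing tool.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the last ')' instead of on whitespace.
    """
    left, right = text.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def blocked_tasks(proc_root="/proc"):
    """Return sorted (tid, comm) pairs for threads in uninterruptible sleep (D)."""
    found = []
    for pid in filter(str.isdigit, os.listdir(proc_root)):
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # the process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    tid_num, comm, state = parse_stat(f.read())
            except (OSError, ValueError):
                continue  # thread vanished or the line was not parseable
            if state == "D":
                found.append((tid_num, comm))
    return sorted(found)
```

Calling blocked_tasks() on an affected machine should list the hung reader threads by thread ID and command name; it only reads /proc, so it does not add another blocked reader.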

Blocked tasks show the following call chain:

[...]
[  617.816097]  __mutex_lock.constprop.0+0x3c9/0xa00
[  617.823094]  hwmon_attr_show+0x36/0x130
[  617.828889]  dev_attr_show+0x19/0x60
[  617.834095]  sysfs_kf_seq_show+0xbf/0x140
[  617.841102]  seq_read_iter+0x112/0x510
[  617.847099]  ? security_file_permission+0x8e/0xa0
[  617.854101]  vfs_read+0x215/0x340
[  617.860095]  ksys_read+0x61/0xe0
[  617.866096]  do_syscall_64+0x5d/0xa70
[  617.872097]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  617.880076] RIP: 0033:0x408d8e
[  617.885204] RSP: 002b:000000c004fed498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  617.896098] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
[  617.905398] RDX: 0000000000000080 RSI: 000000c004db7a80 RDI: 000000000000001e
[  617.915098] RBP: 000000c004fed4d8 R08: 0000000000000000 R09: 0000000000000000
[  617.925099] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  617.935103] R13: 0000000000000040 R14: 000000c003e03880 R15: 0000000000000054
[  617.944105]  </TASK>
[  620.891393] usb 3-1.2: USB disconnect, device number 15
[  740.286380] INFO: task alloy:5901 blocked for more than 122 seconds.
[  740.292941]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
[  740.299938] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  740.313368] task:alloy           state:D stack:0     pid:5901
tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
[  740.326290] Call Trace:
[  740.331572]  <TASK>
[  740.336311]  __schedule+0x2b5/0x690
[  740.341312]  schedule+0x23/0x80
[  740.347316]  schedule_preempt_disabled+0x11/0x20
[  740.354335]  __mutex_lock.constprop.0+0x3c9/0xa00
[  740.360318]  hwmon_attr_show+0x36/0x130
[  740.366315]  dev_attr_show+0x19/0x60
[  740.372315]  sysfs_kf_seq_show+0xbf/0x140
[  740.378315]  seq_read_iter+0x112/0x510
[  740.384312]  ? security_file_permission+0x8e/0xa0
[  740.391316]  vfs_read+0x215/0x340
[  740.397336]  ksys_read+0x61/0xe0
[  740.403310]  do_syscall_64+0x5d/0xa70
[  740.409321]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  740.417340] RIP: 0033:0x408d8e
[  740.422111] RSP: 002b:000000c003401498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  740.431439] RAX: ffffffffffffffda RBX: 000000000000001c RCX: 0000000000408d8e
[  740.441336] RDX: 0000000000000080 RSI: 000000c00479d800 RDI: 000000000000001c
[  740.451326] RBP: 000000c0034014d8 R08: 0000000000000000 R09: 0000000000000000
[  740.461320] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  740.471322] R13: 0000000000000040 R14: 000000c004d4d6c0 R15: ffffffffffffffff
[  740.480548]  </TASK>
[  740.485315] INFO: task alloy:5903 blocked for more than 123 seconds.
[  740.494310]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
[  740.503355] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  740.513305] task:alloy           state:D stack:0     pid:5903
tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
[  740.526807] Call Trace:
[  740.532010]  <TASK>
[  740.536313]  __schedule+0x2b5/0x690
[  740.542319]  schedule+0x23/0x80
[  740.547444]  schedule_preempt_disabled+0x11/0x20
[  740.555324]  __mutex_lock.constprop.0+0x3c9/0xa00
[  740.561576]  hwmon_attr_show+0x36/0x130
[  740.567326]  dev_attr_show+0x19/0x60
[  740.572315]  sysfs_kf_seq_show+0xbf/0x140
[  740.578311]  seq_read_iter+0x112/0x510
[  740.584311]  ? security_file_permission+0x8e/0xa0
[  740.591316]  vfs_read+0x215/0x340
[  740.596477]  ksys_read+0x61/0xe0
[  740.601312]  do_syscall_64+0x5d/0xa70
[  740.606333]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  740.614312] RIP: 0033:0x408d8e
[  740.618667] RSP: 002b:000000c002217498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  740.628317] RAX: ffffffffffffffda RBX: 0000000000000027 RCX: 0000000000408d8e
[  740.638353] RDX: 0000000000000080 RSI: 000000c006f78e80 RDI: 0000000000000027
[  740.648314] RBP: 000000c0022174d8 R08: 0000000000000000 R09: 0000000000000000
[  740.657348] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  740.667288] R13: 0000000000000040 R14: 000000c0047276c0 R15: ffffffffffffffff
[  740.676348]  </TASK>
[  740.680440] INFO: task alloy:5921 blocked for more than 123 seconds.
[  740.689314]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
[  740.698323] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  740.708314] task:alloy           state:D stack:0     pid:5921
tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
[  740.721209] Call Trace:
[...]
[  741.467334]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
[  741.475320]  hwmon_attr_show+0x5e/0x130
[  741.481343]  dev_attr_show+0x19/0x60
[  741.487325]  sysfs_kf_seq_show+0xbf/0x140
[  741.494315]  seq_read_iter+0x112/0x510
[  741.498537]  ? security_file_permission+0x8e/0xa0
[  741.503544]  vfs_read+0x215/0x340
[  741.507125]  ksys_read+0x61/0xe0
[  741.510650]  do_syscall_64+0x5d/0xa70
[  741.514578]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.519954] RIP: 0033:0x408d8e
[  741.523283] RSP: 002b:000000c007a8b498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  741.531260] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 0000000000408d8e
[  741.538718] RDX: 0000000000000080 RSI: 000000c0060a3000 RDI: 000000000000001f
[  741.546256] RBP: 000000c007a8b4d8 R08: 0000000000000000 R09: 0000000000000000
[  741.553731] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.561265] R13: 0000000000000040 R14: 000000c006dbae00 R15: ffffffffffffffff
[  741.568728]  </TASK>
[  741.571167] INFO: task alloy:6139 blocked for more than 123 seconds.
[  741.577893]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
[  741.585112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  741.593345] task:alloy           state:D stack:0     pid:6139
tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
[  741.604775] Call Trace:
[  741.607508]  <TASK>
[  741.609883]  __schedule+0x2b5/0x690
[  741.613629]  schedule+0x23/0x80
[  741.617023]  schedule_preempt_disabled+0x11/0x20
[  741.621921]  __mutex_lock.constprop.0+0x3c9/0xa00
[  741.626946]  hwmon_attr_show+0x36/0x130
[  741.631052]  dev_attr_show+0x19/0x60
[  741.634903]  sysfs_kf_seq_show+0xbf/0x140
[  741.639231]  seq_read_iter+0x112/0x510
[  741.643261]  ? security_file_permission+0x8e/0xa0
[  741.648236]  vfs_read+0x215/0x340
[  741.651800]  ksys_read+0x61/0xe0
[  741.655302]  do_syscall_64+0x5d/0xa70
[  741.659221]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.664566] RIP: 0033:0x408d8e
[  741.667876] RSP: 002b:000000c003805498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  741.675821] RAX: ffffffffffffffda RBX: 0000000000000023 RCX: 0000000000408d8e
[  741.683331] RDX: 0000000000000080 RSI: 000000c00731d000 RDI: 0000000000000023
[  741.690831] RBP: 000000c0038054d8 R08: 0000000000000000 R09: 0000000000000000
[  741.698340] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.705812] R13: 0000000000000040 R14: 000000c00552e380 R15: ffffffffffffffff
[  741.713298]  </TASK>
[  741.715755] INFO: task alloy:6146 blocked for more than 247 seconds.
[  741.722459]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
[  741.729641] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  741.737842] task:alloy           state:D stack:0     pid:6146
tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
[  741.749353] Call Trace:
[  741.752118]  <TASK>
[  741.754483]  __schedule+0x2b5/0x690
[  741.758268]  schedule+0x23/0x80
[  741.761738]  schedule_preempt_disabled+0x11/0x20
[  741.766704]  __mutex_lock.constprop.0+0x3c9/0xa00
[  741.771723]  hwmon_attr_show+0x36/0x130
[  741.775911]  dev_attr_show+0x19/0x60
[  741.779802]  sysfs_kf_seq_show+0xbf/0x140
[  741.784140]  seq_read_iter+0x112/0x510
[  741.788171]  ? security_file_permission+0x8e/0xa0
[  741.793223]  vfs_read+0x215/0x340
[  741.796834]  ksys_read+0x61/0xe0
[  741.800372]  do_syscall_64+0x5d/0xa70
[  741.804312]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.809713] RIP: 0033:0x408d8e
[  741.813030] RSP: 002b:000000c004fed498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  741.820985] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
[  741.828512] RDX: 0000000000000080 RSI: 000000c004db7a80 RDI: 000000000000001e
[  741.836025] RBP: 000000c004fed4d8 R08: 0000000000000000 R09: 0000000000000000
[  741.843517] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.850996] R13: 0000000000000040 R14: 000000c003e03880 R15: 0000000000000054
[  741.858511]  </TASK>

Any help or guidance on how to fix this issue would be greatly appreciated.

Best,
Jaroslav Pulchart


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-22 18:21 acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y Jaroslav Pulchart
@ 2026-01-22 18:55 ` Rafael J. Wysocki
  2026-01-22 19:51   ` Guenter Roeck
  2026-01-28 18:18   ` Guenter Roeck
  0 siblings, 2 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-01-22 18:55 UTC
  To: Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica, Guenter Roeck, Huisong Li

On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
<jaroslav.pulchart@gooddata.com> wrote:
>
> Hello,
>
> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> in the ACPI power meter hwmon interface. Reading power*_average sysfs
> attributes blocks indefinitely and causes tasks to enter uninterruptible
> sleep (D state).

The most recent change in the acpi_power_meter driver was made in
6.15, so this is not a regression in that driver.

Also, nothing suspicious is done in power1_average_min_show() and
power1_average_min_store() AFAICS.

> This affects both simple tools (e.g. cat) and monitoring agents scraping
> hwmon (Prometheus via Grafana Alloy), eventually leading to hung task warnings
> and stalled metric collection.
>
> $ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmonX/power1_average
> # blocks forever
>
> Blocked tasks show the following call chain:
>
> [...]
>
> Any help or guidance on how to fix this issue would be greatly appreciated.

To my eyes, all of the tasks above are blocking in hwmon_attr_show()
and I'm not sure how reading power*_average may lead to that.

Anyway, I suspect some unexpected interaction between acpi_power_meter
and commit 3ad2a7b9b15d ("hwmon: Serialize accesses in hwmon core").
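
The trace in the report can be read mechanically to separate a task that is already inside the driver callback from the tasks queued on the lock. The sketch below is a rough illustration: the function name classify and the regular expression are made up here, and the meaning attached to each hwmon_attr_show offset is inferred only from the frame printed directly above it in the log.

```python
import re
from collections import Counter

# Matches one stack frame such as "]  hwmon_attr_show+0x36/0x130".
FRAME = re.compile(r"\]\s+(\? )?([\w.]+)\+(0x[0-9a-f]+)/0x[0-9a-f]+")


def classify(dmesg_text):
    """Count, per hwmon_attr_show offset, which function it had called."""
    counts = Counter()
    callee = None
    for line in dmesg_text.splitlines():
        if "Call Trace:" in line:
            callee = None  # a new trace starts here
        m = FRAME.search(line)
        if not m:
            continue  # header, register dump, or wrapped module name
        unreliable, name, offset = m.groups()
        if unreliable:
            continue  # "?" frames are guesses by the unwinder
        if name == "hwmon_attr_show":
            counts[(offset, callee)] += 1
        callee = name
    return counts
```

In the log posted above, the hwmon_attr_show+0x36 frames sit directly under __mutex_lock, while the single hwmon_attr_show+0x5e frame sits under power_meter_read. That is consistent with one reader being inside the driver while the others wait for the lock taken in hwmon_attr_show().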


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-22 18:55 ` Rafael J. Wysocki
@ 2026-01-22 19:51   ` Guenter Roeck
  2026-01-22 20:42     ` Rafael J. Wysocki
  2026-01-28 18:18   ` Guenter Roeck
  1 sibling, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-22 19:51 UTC
  To: Rafael J. Wysocki, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica, Huisong Li

On 1/22/26 10:55, Rafael J. Wysocki wrote:
> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> <jaroslav.pulchart@gooddata.com> wrote:
>>
>> Hello,
>>
>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
>> attributes blocks indefinitely and causes tasks to enter uninterruptible
>> sleep (D state).
> 
> The most recent change in the acpi_power_meter driver was made in
> 6.15, so this is not a regression in that driver.
> 
> Also, nothing suspicious is done in power1_average_min_show() and
> power1_average_min_store() AFAICS.
> 

I suspect a circular locking problem between the hwmon lock and the
resource lock. Unfortunately I don't immediately see it.

Would it be possible to test this with a kernel which has lock debugging
enabled? I'd test it myself but I don't have a system available
that supports the ACPI power meter.
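
For context on what lock debugging would add: the kernel's lock validator (lockdep, enabled with CONFIG_PROVE_LOCKING) records the order in which locks are taken and warns when a new acquisition would close a cycle, even if the deadlock never fires on that run. The toy model below illustrates only that idea in Python. It is not how lockdep is implemented, and the class name and lock names used with it are placeholders, not the real kernel lock classes.

```python
from collections import defaultdict


class LockOrderTracker:
    """Toy lock-order validator: remember 'A held while taking B' edges."""

    def __init__(self):
        self.after = defaultdict(set)  # lock -> locks taken while it was held
        self.held = defaultdict(list)  # thread name -> locks currently held

    def _reachable(self, src, dst):
        """True if dst can be reached from src along recorded edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.after[node])
        return False

    def acquire(self, thread, lock):
        """Record an acquisition; return a warning string on inverted order."""
        warning = None
        for outer in self.held[thread]:
            # If 'outer' was ever taken after 'lock', this order closes a cycle.
            if self._reachable(lock, outer):
                warning = f"possible circular locking: {outer} -> {lock}"
            self.after[outer].add(lock)
        self.held[thread].append(lock)
        return warning

    def release(self, thread, lock):
        self.held[thread].remove(lock)
```

Feeding it one thread that takes a hypothetical "hwmon_lock" and then "resource_lock", followed by a second thread that takes them in the opposite order, makes the second thread's inner acquire() return a warning. No two threads ever have to collide for the inversion to be reported, which is why a single run with lock debugging enabled is often enough.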

Thanks,
Guenter


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-22 19:51   ` Guenter Roeck
@ 2026-01-22 20:42     ` Rafael J. Wysocki
  2026-01-22 22:28       ` Guenter Roeck
  2026-01-23  2:53       ` lihuisong (C)
  0 siblings, 2 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-01-22 20:42 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Jan 22, 2026 at 8:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 1/22/26 10:55, Rafael J. Wysocki wrote:
> > On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> > <jaroslav.pulchart@gooddata.com> wrote:
> >>
> >> Hello,
> >>
> >> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> >> in the ACPI power meter hwmon interface. Reading power*_average sysfs
> >> attributes blocks indefinitely and causes tasks to enter uninterruptible
> >> sleep (D state).
> >
> > The most recent change in the acpi_power_meter driver was made in
> > 6.15, so this is not a regression in that driver.
> >
> > Also, nothing suspicious is done in power1_average_min_show() and
> > power1_average_min_store() AFAICS.
> >
>
> I suspect a circular locking problem between the hwmon lock and the
> resource lock. Unfortunately I don't immediately see it.
>
> Would it be possible to test this with a kernel which has lock debugging
> enabled ? I'd test it myself but I don't have a system available
> that supports the acpi power meter.

One problem I found in acpi_power_meter through code inspection is
calling hwmon_device_unregister() under resource->lock in
acpi_power_meter_notify(), which may be responsible for the observed
symptoms, but then I'm not sure why it started to be visible after
6.18.
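
As a purely illustrative sketch (user-space C, not kernel code), here is
one hypothetical way the pattern above could wedge readers. It assumes,
without confirmation from the traces, that hwmon_device_unregister() has
to wait for an in-flight sysfs read to finish while
acpi_power_meter_notify() still holds resource->lock, and that the
in-flight read holds the hwmon core lock and wants resource->lock. The
struct, the task names, and the helpers in_wait_cycle() and is_stuck()
are made up for this model.

```c
/* Hypothetical wait-for model: each task waits on at most one other task. */
struct task {
	const char *name;
	const char *holds;	/* lock held while waiting (illustrative only) */
	int waits_for;		/* index of the task it waits on, -1 if none */
};

enum { NOTIFY, READER_A, READER_B, NTASKS };

/* Assumed scenario, not taken from the reported traces. */
static const struct task tasks[NTASKS] = {
	/* holds resource->lock, waits for reader A to leave sysfs */
	[NOTIFY]   = { "acpi_power_meter_notify", "resource->lock", READER_A },
	/* holds the hwmon lock, waits for resource->lock */
	[READER_A] = { "reader A in power_meter_read", "hwmon lock", NOTIFY },
	/* holds nothing, queues on the hwmon lock behind reader A */
	[READER_B] = { "reader B in hwmon_attr_show", "(nothing)", READER_A },
};

/* Return 1 if following waits_for edges from 'start' leads back to 'start'. */
static int in_wait_cycle(const struct task *t, int n, int start)
{
	int cur = t[start].waits_for;
	int steps;

	for (steps = 0; steps < n && cur >= 0; steps++) {
		if (cur == start)
			return 1;
		cur = t[cur].waits_for;
	}
	return 0;
}

/* Return 1 if 'start' is in a wait cycle or waits, transitively, on one. */
static int is_stuck(const struct task *t, int n, int start)
{
	int cur = start;
	int steps;

	for (steps = 0; steps <= n; steps++) {
		if (cur < 0)
			return 0;
		if (in_wait_cycle(t, n, cur))
			return 1;
		cur = t[cur].waits_for;
	}
	return 0;
}
```

With this table, in_wait_cycle() reports the notify handler and reader A
as waiting on each other, while is_stuck() also flags reader B, which
merely queues behind reader A on the hwmon lock. Whether this matches
what actually happens on the affected systems is unverified.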

> >> This affects both simple tools (e.g. cat) and monitoring agents scraping
> >> hwmon (Prometheus via Grafana Alloy), eventually leading to hung task warnings
> >> and stalled metric collection.
> >>
> >> $ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmonX/power1_average
> >> # blocks forever
> >>
> >> Blocked tasks show the following call chain:
> >>
> >> [  528.108418] u[  617.816097]  __mutex_lock.constprop.0+0x3c9/0xa00
> >> [  617.823094]  hwmon_attr_show+0x36/0x130
> >> [  617.828889]  dev_attr_show+0x19/0x60
> >> [  617.834095]  sysfs_kf_seq_show+0xbf/0x140
> >> [  617.841102]  seq_read_iter+0x112/0x510
> >> [  617.847099]  ? security_file_permission+0x8e/0xa0
> >> [  617.854101]  vfs_read+0x215/0x340
> >> [  617.860095]  ksys_read+0x61/0xe0
> >> [  617.866096]  do_syscall_64+0x5d/0xa70
> >> [  617.872097]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >> [  617.880076] RIP: 0033:0x408d8e
> >> [  617.885204] RSP: 002b:000000c004fed498 EFLAGS: 00000206 ORIG_RAX:
> >> 0000000000000000
> >> [  617.896098] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
> >> [  617.905398] RDX: 0000000000000080 RSI: 000000c004db7a80 RDI: 000000000000001e
> >> [  617.915098] RBP: 000000c004fed4d8 R08: 0000000000000000 R09: 0000000000000000
> >> [  617.925099] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> >> [  617.935103] R13: 0000000000000040 R14: 000000c003e03880 R15: 0000000000000054
> >> [  617.944105]  </TASK>
> >> [  620.891393] usb 3-1.2: USB disconnect, device number 15
> >> [  740.286380] INFO: task alloy:5901 blocked for more than 122 seconds.
> >> [  740.292941]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
> >> [  740.299938] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [  740.313368] task:alloy           state:D stack:0     pid:5901
> >> tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
> >> [  740.326290] Call Trace:
> >> [  740.331572]  <TASK>
> >> [  740.336311]  __schedule+0x2b5/0x690
> >> [  740.341312]  schedule+0x23/0x80
> >> [  740.347316]  schedule_preempt_disabled+0x11/0x20
> >> [  740.354335]  __mutex_lock.constprop.0+0x3c9/0xa00
> >> [  740.360318]  hwmon_attr_show+0x36/0x130
> >> [  740.366315]  dev_attr_show+0x19/0x60
> >> [  740.372315]  sysfs_kf_seq_show+0xbf/0x140
> >> [  740.378315]  seq_read_iter+0x112/0x510
> >> [  740.384312]  ? security_file_permission+0x8e/0xa0
> >> [  740.391316]  vfs_read+0x215/0x340
> >> [  740.397336]  ksys_read+0x61/0xe0
> >> [  740.403310]  do_syscall_64+0x5d/0xa70
> >> [  740.409321]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >> [  740.417340] RIP: 0033:0x408d8e
> >> [  740.422111] RSP: 002b:000000c003401498 EFLAGS: 00000206 ORIG_RAX:
> >> 0000000000000000
> >> [  740.431439] RAX: ffffffffffffffda RBX: 000000000000001c RCX: 0000000000408d8e
> >> [  740.441336] RDX: 0000000000000080 RSI: 000000c00479d800 RDI: 000000000000001c
> >> [  740.451326] RBP: 000000c0034014d8 R08: 0000000000000000 R09: 0000000000000000
> >> [  740.461320] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> >> [  740.471322] R13: 0000000000000040 R14: 000000c004d4d6c0 R15: ffffffffffffffff
> >> [  740.480548]  </TASK>
> >> [  740.485315] INFO: task alloy:5903 blocked for more than 123 seconds.
> >> [  740.494310]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
> >> [  740.503355] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [  740.513305] task:alloy           state:D stack:0     pid:5903
> >> tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
> >> [  740.526807] Call Trace:
> >> [  740.532010]  <TASK>
> >> [  740.536313]  __schedule+0x2b5/0x690
> >> [  740.542319]  schedule+0x23/0x80
> >> [  740.547444]  schedule_preempt_disabled+0x11/0x20
> >> [  740.555324]  __mutex_lock.constprop.0+0x3c9/0xa00
> >> [  740.561576]  hwmon_attr_show+0x36/0x130
> >> [  740.567326]  dev_attr_show+0x19/0x60
> >> [  740.572315]  sysfs_kf_seq_show+0xbf/0x140
> >> [  740.578311]  seq_read_iter+0x112/0x510
> >> [  740.584311]  ? security_file_permission+0x8e/0xa0
> >> [  740.591316]  vfs_read+0x215/0x340
> >> [  740.596477]  ksys_read+0x61/0xe0
> >> [  740.601312]  do_syscall_64+0x5d/0xa70
> >> [  740.606333]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >> [  740.614312] RIP: 0033:0x408d8e
> >> [  740.618667] RSP: 002b:000000c002217498 EFLAGS: 00000206 ORIG_RAX:
> >> 0000000000000000
> >> [  740.628317] RAX: ffffffffffffffda RBX: 0000000000000027 RCX: 0000000000408d8e
> >> [  740.638353] RDX: 0000000000000080 RSI: 000000c006f78e80 RDI: 0000000000000027
> >> [  740.648314] RBP: 000000c0022174d8 R08: 0000000000000000 R09: 0000000000000000
> >> [  740.657348] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> >> [  740.667288] R13: 0000000000000040 R14: 000000c0047276c0 R15: ffffffffffffffff
> >> [  740.676348]  </TASK>
> >> [  740.680440] INFO: task alloy:5921 blocked for more than 123 seconds.
> >> [  740.689314]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
> >> [  740.698323] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [  740.708314] task:alloy           state:D stack:0     pid:5921
> >> tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
> >> [  740.721209] Call Trace:
> >> [  741.467334]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
> >> [  741.475320]  hwmon_attr_show+0x5e/0x130
> >> [  741.481343]  dev_attr_show+0x19/0x60
> >> [  741.487325]  sysfs_kf_seq_show+0xbf/0x140
> >> [  741.494315]  seq_read_iter+0x112/0x510
> >> [  741.498537]  ? security_file_permission+0x8e/0xa0
> >> [  741.503544]  vfs_read+0x215/0x340
> >> [  741.507125]  ksys_read+0x61/0xe0
> >> [  741.510650]  do_syscall_64+0x5d/0xa70
> >> [  741.514578]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >> [  741.519954] RIP: 0033:0x408d8e
> >> [  741.523283] RSP: 002b:000000c007a8b498 EFLAGS: 00000206 ORIG_RAX:
> >> 0000000000000000
> >> [  741.531260] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 0000000000408d8e
> >> [  741.538718] RDX: 0000000000000080 RSI: 000000c0060a3000 RDI: 000000000000001f
> >> [  741.546256] RBP: 000000c007a8b4d8 R08: 0000000000000000 R09: 0000000000000000
> >> [  741.553731] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> >> [  741.561265] R13: 0000000000000040 R14: 000000c006dbae00 R15: ffffffffffffffff
> >> [  741.568728]  </TASK>
> >> [  741.571167] INFO: task alloy:6139 blocked for more than 123 seconds.
> >> [  741.577893]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
> >> [  741.585112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [  741.593345] task:alloy           state:D stack:0     pid:6139
> >> tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
> >> [  741.604775] Call Trace:
> >> [  741.607508]  <TASK>
> >> [  741.609883]  __schedule+0x2b5/0x690
> >> [  741.613629]  schedule+0x23/0x80
> >> [  741.617023]  schedule_preempt_disabled+0x11/0x20
> >> [  741.621921]  __mutex_lock.constprop.0+0x3c9/0xa00
> >> [  741.626946]  hwmon_attr_show+0x36/0x130
> >> [  741.631052]  dev_attr_show+0x19/0x60
> >> [  741.634903]  sysfs_kf_seq_show+0xbf/0x140
> >> [  741.639231]  seq_read_iter+0x112/0x510
> >> [  741.643261]  ? security_file_permission+0x8e/0xa0
> >> [  741.648236]  vfs_read+0x215/0x340
> >> [  741.651800]  ksys_read+0x61/0xe0
> >> [  741.655302]  do_syscall_64+0x5d/0xa70
> >> [  741.659221]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >> [  741.664566] RIP: 0033:0x408d8e
> >> [  741.667876] RSP: 002b:000000c003805498 EFLAGS: 00000206 ORIG_RAX:
> >> 0000000000000000
> >> [  741.675821] RAX: ffffffffffffffda RBX: 0000000000000023 RCX: 0000000000408d8e
> >> [  741.683331] RDX: 0000000000000080 RSI: 000000c00731d000 RDI: 0000000000000023
> >> [  741.690831] RBP: 000000c0038054d8 R08: 0000000000000000 R09: 0000000000000000
> >> [  741.698340] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> >> [  741.705812] R13: 0000000000000040 R14: 000000c00552e380 R15: ffffffffffffffff
> >> [  741.713298]  </TASK>
> >> [  741.715755] INFO: task alloy:6146 blocked for more than 247 seconds.
> >> [  741.722459]       Tainted: G            E       6.18.6-1.gdc.el9.x86_64 #1
> >> [  741.729641] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [  741.737842] task:alloy           state:D stack:0     pid:6146
> >> tgid:5899  ppid:1      task_flags:0x400140 flags:0x00080001
> >> [  741.749353] Call Trace:
> >> [  741.752118]  <TASK>
> >> [  741.754483]  __schedule+0x2b5/0x690
> >> [  741.758268]  schedule+0x23/0x80
> >> [  741.761738]  schedule_preempt_disabled+0x11/0x20
> >> [  741.766704]  __mutex_lock.constprop.0+0x3c9/0xa00
> >> [  741.771723]  hwmon_attr_show+0x36/0x130
> >> [  741.775911]  dev_attr_show+0x19/0x60
> >> [  741.779802]  sysfs_kf_seq_show+0xbf/0x140
> >> [  741.784140]  seq_read_iter+0x112/0x510
> >> [  741.788171]  ? security_file_permission+0x8e/0xa0
> >> [  741.793223]  vfs_read+0x215/0x340
> >> [  741.796834]  ksys_read+0x61/0xe0
> >> [  741.800372]  do_syscall_64+0x5d/0xa70
> >> [  741.804312]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >> [  741.809713] RIP: 0033:0x408d8e
> >> [  741.813030] RSP: 002b:000000c004fed498 EFLAGS: 00000206 ORIG_RAX:
> >> 0000000000000000
> >> [  741.820985] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
> >> [  741.828512] RDX: 0000000000000080 RSI: 000000c004db7a80 RDI: 000000000000001e
> >> [  741.836025] RBP: 000000c004fed4d8 R08: 0000000000000000 R09: 0000000000000000
> >> [  741.843517] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> >> [  741.850996] R13: 0000000000000040 R14: 000000c003e03880 R15: 0000000000000054
> >> [  741.858511]  </TASK>
> >>
> >> Any help or guidance on how to fix this issue would be greatly appreciated.
> >
> > To my eyes, all of the tasks above are blocking in hwmon_attr_show()
> > and I'm not sure how reading power*_average may lead to that.
> >
> > Anyway, I suspect some unexpected interaction between acpi_power_meter
> > and commit 3ad2a7b9b15d ("hwmon: Serialize accesses in hwmon core").
>


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-22 20:42     ` Rafael J. Wysocki
@ 2026-01-22 22:28       ` Guenter Roeck
  2026-01-23 10:19         ` Jaroslav Pulchart
  2026-01-23  2:53       ` lihuisong (C)
  1 sibling, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-22 22:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On 1/22/26 12:42, Rafael J. Wysocki wrote:
> On Thu, Jan 22, 2026 at 8:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 1/22/26 10:55, Rafael J. Wysocki wrote:
>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>>> <jaroslav.pulchart@gooddata.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
>>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
>>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
>>>> sleep (D state).
>>>
>>> The most recent change in the acpi_power_meter driver was made in
>>> 6.15, so this is not a regression in that driver.
>>>
>>> Also, nothing suspicious is done in power1_average_min_show() and
>>> power1_average_min_store() AFAICS.
>>>
>>
>> I suspect a circular locking problem between the hwmon lock and the
>> resource lock. Unfortunately I don't immediately see it.
>>
>> Would it be possible to test this with a kernel which has lock debugging
>> enabled ? I'd test it myself but I don't have a system available
>> that supports the acpi power meter.
> 
> One problem I found in acpi_power_meter through code inspection is
> calling hwmon_device_unregister() under resource->lock in
> acpi_power_meter_notify(), which may be responsible for the observed
> symptoms, but then I'm not sure why it started to be visible after
> 6.18.
> 

I noticed that too. It is kind of broken; a driver should never do that.
The proper solution would have been to re-evaluate sensor visibility.
I have a patch series to add that capability to the hwmon core, but that
is untested so I never submitted it.

Anyway, I don't immediately see how that would trigger the problem.

Guenter



* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-22 22:28       ` Guenter Roeck
@ 2026-01-23 10:19         ` Jaroslav Pulchart
  2026-01-23 18:23           ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-01-23 10:19 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Jan 22, 2026 at 11:28 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 1/22/26 12:42, Rafael J. Wysocki wrote:
> > On Thu, Jan 22, 2026 at 8:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >>
> >> On 1/22/26 10:55, Rafael J. Wysocki wrote:
> >>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> >>> <jaroslav.pulchart@gooddata.com> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> >>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
> >>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
> >>>> sleep (D state).
> >>>
> >>> The most recent change in the acpi_power_meter driver was made in
> >>> 6.15, so this is not a regression in that driver.
> >>>
> >>> Also, nothing suspicious is done in power1_average_min_show() and
> >>> power1_average_min_store() AFAICS.
> >>>
> >>
> >> I suspect a circular locking problem between the hwmon lock and the
> >> resource lock. Unfortunately I don't immediately see it.
> >>
> >> Would it be possible to test this with a kernel which has lock debugging
> >> enabled ? I'd test it myself but I don't have a system available
> >> that supports the acpi power meter.
> >
> > One problem I found in acpi_power_meter through code inspection is
> > calling hwmon_device_unregister() under resource->lock in
> > acpi_power_meter_notify(), which may be responsible for the observed
> > symptoms, but then I'm not sure why it started to be visible after
> > 6.18.
> >
>
> I noticed that too. It is kind of broken; a driver should never do that.
> The proper solution would have been to re-evaluate sensor visibility.
> I have a patch series to add that capability to the hwmon core, but that
> is untested so I never submitted it.
>
> Anyway, I don't immediately see how that would trigger the problem.
>
> Guenter
>

Hello,

Thank you for the analysis and insights.

To add some context from our side: we observe this issue shortly after
a server reinstall or firmware update followed by a reboot, typically when
hwmon sensors are accessed for the first time (e.g. by monitoring during
early system initialization). We have not seen it trigger during normal
operation, such as a simple reboot or on long-running systems.

Given this behavior, it is possible that the issue is related to
hwmon_device_unregister() being triggered early during system
initialization, but I do not have concrete evidence at this point.

As a workaround, we will exclude the ACPI power meter from hwmon monitoring
(Grafana Alloy) until the mentioned refactoring or a fix becomes available.

Best regards,
Jaroslav


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-23 10:19         ` Jaroslav Pulchart
@ 2026-01-23 18:23           ` Guenter Roeck
  2026-01-27 11:58             ` Jaroslav Pulchart
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-23 18:23 UTC (permalink / raw)
  To: Jaroslav Pulchart
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

Hi,

On 1/23/26 02:19, Jaroslav Pulchart wrote:
> On Thu, Jan 22, 2026 at 11:28 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 1/22/26 12:42, Rafael J. Wysocki wrote:
>>> On Thu, Jan 22, 2026 at 8:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>>
>>>> On 1/22/26 10:55, Rafael J. Wysocki wrote:
>>>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>>>>> <jaroslav.pulchart@gooddata.com> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
>>>>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
>>>>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
>>>>>> sleep (D state).
>>>>>
>>>>> The most recent change in the acpi_power_meter driver was made in
>>>>> 6.15, so this is not a regression in that driver.
>>>>>
>>>>> Also, nothing suspicious is done in power1_average_min_show() and
>>>>> power1_average_min_store() AFAICS.
>>>>>
>>>>
>>>> I suspect a circular locking problem between the hwmon lock and the
>>>> resource lock. Unfortunately I don't immediately see it.
>>>>
>>>> Would it be possible to test this with a kernel which has lock debugging
>>>> enabled ? I'd test it myself but I don't have a system available
>>>> that supports the acpi power meter.
>>>
>>> One problem I found in acpi_power_meter through code inspection is
>>> calling hwmon_device_unregister() under resource->lock in
>>> acpi_power_meter_notify(), which may be responsible for the observed
>>> symptoms, but then I'm not sure why it started to be visible after
>>> 6.18.
>>>
>>
>> I noticed that too. It is kind of broken; a driver should never do that.
>> The proper solution would have been to re-evaluate sensor visibility.
>> I have a patch series to add that capability to the hwmon core, but that
>> is untested so I never submitted it.
>>
>> Anyway, I don't immediately see how that would trigger the problem.
>>
>> Guenter
>>
> 
> Hello,
> 
> Thank you for the analysis and insights.
> 
> To add some context from our side: we observe this issue shortly after
> a server reinstall or fw updates followed by a reboot, typically when
> hwmon sensors are accessed for the first time (e.g. by monitoring during
> early system initialization). We have not seen it trigger during normal
> operation, such as a simple reboot or on long-running systems.
> 
> Given this behavior, it is possible that the issue is related to
> hwmon_device_unregister() being triggered early during system
> initialization, but I do not have concrete evidence at this point.
> 
> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> 

Would it be possible for you to test the patch series I just sent out?

Thanks,
Guenter



* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-23 18:23           ` Guenter Roeck
@ 2026-01-27 11:58             ` Jaroslav Pulchart
  2026-01-27 14:24               ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-01-27 11:58 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

>
> Hi,
>
> On 1/23/26 02:19, Jaroslav Pulchart wrote:
> > On Thu, Jan 22, 2026 at 11:28 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >>
> >> On 1/22/26 12:42, Rafael J. Wysocki wrote:
> >>> On Thu, Jan 22, 2026 at 8:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >>>>
> >>>> On 1/22/26 10:55, Rafael J. Wysocki wrote:
> >>>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> >>>>> <jaroslav.pulchart@gooddata.com> wrote:
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> >>>>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
> >>>>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
> >>>>>> sleep (D state).
> >>>>>
> >>>>> The most recent change in the acpi_power_meter driver was made in
> >>>>> 6.15, so this is not a regression in that driver.
> >>>>>
> >>>>> Also, nothing suspicious is done in power1_average_min_show() and
> >>>>> power1_average_min_store() AFAICS.
> >>>>>
> >>>>
> >>>> I suspect a circular locking problem between the hwmon lock and the
> >>>> resource lock. Unfortunately I don't immediately see it.
> >>>>
> >>>> Would it be possible to test this with a kernel which has lock debugging
> >>>> enabled ? I'd test it myself but I don't have a system available
> >>>> that supports the acpi power meter.
> >>>
> >>> One problem I found in acpi_power_meter through code inspection is
> >>> calling hwmon_device_unregister() under resource->lock in
> >>> acpi_power_meter_notify(), which may be responsible for the observed
> >>> symptoms, but then I'm not sure why it started to be visible after
> >>> 6.18.
> >>>
> >>
> >> I noticed that too. It is kind of broken; a driver should never do that.
> >> The proper solution would have been to re-evaluate sensor visibility.
> >> I have a patch series to add that capability to the hwmon core, but that
> >> is untested so I never submitted it.
> >>
> >> Anyway, I don't immediately see how that would trigger the problem.
> >>
> >> Guenter
> >>
> >
> > Hello,
> >
> > Thank you for the analysis and insights.
> >
> > To add some context from our side: we observe this issue shortly after
> > a server reinstall or fw updates followed by a reboot, typically when
> > hwmon sensors are accessed for the first time (e.g. by monitoring during
> > early system initialization). We have not seen it trigger during normal
> > operation, such as a simple reboot or on long-running systems.
> >
> > Given this behavior, it is possible that the issue is related to
> > hwmon_device_unregister() being triggered early during system
> > initialization, but I do not have concrete evidence at this point.
> >
> > As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> > (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> >
>
> Would it be possible for you to test the patch series I just sent out ?

I suppose you mean these 5 patches?
https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
I will try them and let you know.

>
> Thanks,
> Guenter
>


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-27 11:58             ` Jaroslav Pulchart
@ 2026-01-27 14:24               ` Guenter Roeck
  2026-01-29 15:26                 ` Jaroslav Pulchart
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-27 14:24 UTC (permalink / raw)
  To: Jaroslav Pulchart
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On 1/27/26 03:58, Jaroslav Pulchart wrote:
...
>>> Hello,
>>>
>>> Thank you for the analysis and insights.
>>>
>>> To add some context from our side: we observe this issue shortly after
>>> a server reinstall or fw updates followed by a reboot, typically when
>>> hwmon sensors are accessed for the first time (e.g. by monitoring during
>>> early system initialization). We have not seen it trigger during normal
>>> operation, such as a simple reboot or on long-running systems.
>>>
>>> Given this behavior, it is possible that the issue is related to
>>> hwmon_device_unregister() being triggered early during system
>>> initialization, but I do not have concrete evidence at this point.
>>>
>>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
>>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
>>>
>>
>> Would it be possible for you to test the patch series I just sent out ?
> 
> I suppose these 5 patches from
> https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
>   ?

Correct.

> I will try them and let you know.
> 
Thanks !

Guenter



* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-27 14:24               ` Guenter Roeck
@ 2026-01-29 15:26                 ` Jaroslav Pulchart
  2026-01-29 18:22                   ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-01-29 15:26 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

>
> On 1/27/26 03:58, Jaroslav Pulchart wrote:
> ...
> >>> Hello,
> >>>
> >>> Thank you for the analysis and insights.
> >>>
> >>> To add some context from our side: we observe this issue shortly after
> >>> a server reinstall or fw updates followed by a reboot, typically when
> >>> hwmon sensors are accessed for the first time (e.g. by monitoring during
> >>> early system initialization). We have not seen it trigger during normal
> >>> operation, such as a simple reboot or on long-running systems.
> >>>
> >>> Given this behavior, it is possible that the issue is related to
> >>> hwmon_device_unregister() being triggered early during system
> >>> initialization, but I do not have concrete evidence at this point.
> >>>
> >>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> >>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> >>>
> >>
> >> Would it be possible for you to test the patch series I just sent out ?
> >
> > I suppose these 5 patches from
> > https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
> >   ?
>
> Correct.
>
> > I will try them and let you know.
> >
> Thanks !


The issue is still there:
...
1769654131840 2026-01-29T02:35:31.840Z Jan 29 03:35:31 10.8.8.76  [
595.470961] usb 3-1.2: new high-speed USB device number 15 using
xhci_hcd
1769654131840 2026-01-29T02:35:31.840Z Jan 29 03:35:31 10.8.8.76  [
595.569212] usb 3-1.2: New USB device found, idVendor=413c,
idProduct=0006, bcdDevice= 0.00
1769654131840 2026-01-29T02:35:31.840Z Jan 29 03:35:31 10.8.8.76  [
595.577772] usb 3-1.2: New USB device strings: Mfr=1, Product=2,
SerialNumber=3
1769654131840 2026-01-29T02:35:31.840Z Jan 29 03:35:31 10.8.8.76  [
595.585217] usb 3-1.2: Product: DRAC 5 Virtual Keyboard and Mouse
1769654131840 2026-01-29T02:35:31.840Z Jan 29 03:35:31 10.8.8.76  [
595.591473] usb 3-1.2: Manufacturer: DELL
1769654131840 2026-01-29T02:35:31.840Z Jan 29 03:35:31 10.8.8.76  [
595.609039] usb 3-1.2: SerialNumber: DELL413C-1
1769654132091 2026-01-29T02:35:32.091Z Jan 29 03:35:32 10.8.8.76  [
595.761012] input: DELL DRAC 5 Virtual Keyboard and Mouse as
/devices/pci0000:c0/0000:c0:07.1/0000:c4:00.4/usb3/3-1/3-1.2/3-1.2:1.0/0003:413C:0006.0001/input/input1
1769654132091 2026-01-29T02:35:32.091Z Jan 29 03:35:32 10.8.8.76  [
595.783038] hid-generic 0003:413C:0006.0001: input,hidraw0: USB HID
v1.01 Mouse [DELL DRAC 5 Virtual Keyboard and Mouse] on
usb-0000:c4:00.4-1.2/input0
1769654132091 2026-01-29T02:35:32.091Z Jan 29 03:35:32 10.8.8.76  [
595.815191] input: DELL DRAC 5 Virtual Keyboard and Mouse as
/devices/pci0000:c0/0000:c0:07.1/0000:c4:00.4/usb3/3-1/3-1.2/3-1.2:1.1/0003:413C:0006.0002/input/input2
1769654132341 2026-01-29T02:35:32.341Z Jan 29 03:35:32 10.8.8.76  [
595.933058] hid-generic 0003:413C:0006.0002: input,hidraw1: USB HID
v1.01 Keyboard [DELL DRAC 5 Virtual Keyboard and Mouse] on
usb-0000:c4:00.4-1.2/input1
1769654164161 2026-01-29T02:36:04.161Z Jan 29 03:36:04 10.8.8.76  [
627.816246] usb 3-1.4.1: USB disconnect, device number 10
1769654164913 2026-01-29T02:36:04.913Z Jan 29 03:36:04 10.8.8.76  [
628.584247] usb 3-1.4.2: USB disconnect, device number 12
1769654165666 2026-01-29T02:36:05.666Z Jan 29 03:36:05 10.8.8.76  [
629.352371] usb 3-1.1.2: USB disconnect, device number 11
1769654166417 2026-01-29T02:36:06.417Z Jan 29 03:36:06 10.8.8.76  [
630.120248] usb 3-1.4.3: USB disconnect, device number 14
1769654167168 2026-01-29T02:36:07.168Z Jan 29 03:36:07 10.8.8.76  [
630.888372] usb 3-1.1.3: USB disconnect, device number 13
1769654174684 2026-01-29T02:36:14.684Z Jan 29 03:36:14 10.8.8.76  [
638.222248] usb 3-1.3: new high-speed USB device number 16 using
xhci_hcd
1769654174684 2026-01-29T02:36:14.684Z Jan 29 03:36:14 10.8.8.76  [
638.324502] usb 3-1.3: New USB device found, idVendor=413c,
idProduct=a102, bcdDevice= 3.16
1769654174684 2026-01-29T02:36:14.684Z Jan 29 03:36:14 10.8.8.76  [
638.333081] usb 3-1.3: New USB device strings: Mfr=1, Product=2,
SerialNumber=3
1769654174684 2026-01-29T02:36:14.684Z Jan 29 03:36:14 10.8.8.76  [
638.340535] usb 3-1.3: Product: iDRAC Virtual NIC USB Device
1769654174684 2026-01-29T02:36:14.684Z Jan 29 03:36:14 10.8.8.76  [
638.352328] usb 3-1.3: Manufacturer: Dell(TM)
1769654174684 2026-01-29T02:36:14.684Z Jan 29 03:36:14 10.8.8.76  [
638.356825] usb 3-1.3: SerialNumber: 5678
1769654174684 2026-01-29T02:36:14.684Z Jan 29 03:36:14 10.8.8.76  [
638.416009] cdc_ether 3-1.3:1.0 eth0: register 'cdc_ether' at
usb-0000:c4:00.4-1.3, CDC Ethernet Device, b4:e9:b8:4f:6e:91
1769654174934 2026-01-29T02:36:14.934Z Jan 29 03:36:14 10.8.8.76  [
638.454385] cdc_ether 3-1.3:1.0 idrac: renamed from eth0
1769654179695 2026-01-29T02:36:19.695Z Jan 29 03:36:19 10.8.8.76  [
643.432511] usb 3-1.3: USB disconnect, device number 16
1769654179695 2026-01-29T02:36:19.695Z Jan 29 03:36:19 10.8.8.76  [
643.444350] cdc_ether 3-1.3:1.0 idrac: unregister 'cdc_ether'
usb-0000:c4:00.4-1.3, CDC Ethernet Device
1769654179945 2026-01-29T02:36:19.945Z Jan 29 03:36:19 10.8.8.76  [
643.661285] usb 3-1.3: new high-speed USB device number 17 using
xhci_hcd
1769654180195 2026-01-29T02:36:20.195Z Jan 29 03:36:19 10.8.8.76  [
643.763508] usb 3-1.3: New USB device found, idVendor=413c,
idProduct=a102, bcdDevice= 3.16
1769654180195 2026-01-29T02:36:20.195Z Jan 29 03:36:20 10.8.8.76  [
643.772075] usb 3-1.3: New USB device strings: Mfr=1, Product=2,
SerialNumber=3
1769654180195 2026-01-29T02:36:20.195Z Jan 29 03:36:20 10.8.8.76  [
643.779541] usb 3-1.3: Product: iDRAC Virtual NIC USB Device
1769654180195 2026-01-29T02:36:20.195Z Jan 29 03:36:20 10.8.8.76  [
643.791349] usb 3-1.3: Manufacturer: Dell(TM)
1769654180195 2026-01-29T02:36:20.195Z Jan 29 03:36:20 10.8.8.76  [
643.795849] usb 3-1.3: SerialNumber: 5678
1769654180195 2026-01-29T02:36:20.195Z Jan 29 03:36:20 10.8.8.76  [
643.856016] cdc_ether 3-1.3:1.0 eth0: register 'cdc_ether' at
usb-0000:c4:00.4-1.3, CDC Ethernet Device, b4:e9:b8:4f:6e:91
1769654180195 2026-01-29T02:36:20.195Z Jan 29 03:36:20 10.8.8.76  [
643.902986] cdc_ether 3-1.3:1.0 idrac: renamed from eth0
1769654202995 2026-01-29T02:36:42.995Z Jan 29 03:36:42 10.8.8.76  [
666.728536] usb 3-1.2: USB disconnect, device number 15
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.934016] INFO: task alloy:5878 blocked for more than 122 seconds.
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.940614]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.947641] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.962034] task:alloy           state:D stack:0     pid:5878
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.973362] Call Trace:
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.979963]  <TASK>
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.982237]  __schedule+0x2b5/0x690
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.988110]  schedule+0x23/0x80
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
739.993923]  schedule_preempt_disabled+0x11/0x20
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.000940]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.007959]  hwmon_attr_show+0x36/0x130
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.014925]  dev_attr_show+0x19/0x60
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.020958]  sysfs_kf_seq_show+0xbf/0x140
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.027228]  seq_read_iter+0x112/0x510
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.033861]  ? security_file_permission+0x8e/0xa0
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.041930]  vfs_read+0x215/0x340
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.048231]  ksys_read+0x61/0xe0
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.054964]  do_syscall_64+0x5d/0xc20
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.060965]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.067924] RIP: 0033:0x408d8e
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.073918] RSP: 002b:000000c003785498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.083576] RAX: ffffffffffffffda RBX: 000000000000001e RCX:
0000000000408d8e
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.092940] RDX: 0000000000000080 RSI: 000000c007188f80 RDI:
000000000000001e
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.101775] RBP: 000000c0037854d8 R08: 0000000000000000 R09:
0000000000000000
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.111924] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.121914] R13: 0000000000000040 R14: 000000c00632c000 R15:
ffffffffffffffff
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.131932]  </TASK>
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.135945] INFO: task alloy:5883 blocked for more than 123 seconds.
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.144953]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.153936] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.163925] task:alloy           state:D stack:0     pid:5883
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654276407 2026-01-29T02:37:56.407Z Jan 29 03:37:56 10.8.8.76  [
740.177931] Call Trace:
1769654276657 2026-01-29T02:37:56.657Z Jan 29 03:37:56 10.8.8.76  [
740.182925]  <TASK>
1769654276657 2026-01-29T02:37:56.657Z Jan 29 03:37:56 10.8.8.76  [
740.186927]  __schedule+0x2b5/0x690
1769654276657 2026-01-29T02:37:56.657Z Jan 29 03:37:56 10.8.8.76  [
740.192928]  schedule+0x23/0x80
1769654276657 2026-01-29T02:37:56.657Z Jan 29 03:37:56 10.8.8.76  [
740.198920]  schedule_preempt_disabled+0x11/0x20
1769654276657 2026-01-29T02:37:56.657Z Jan 29 03:37:56 10.8.8.76  [
740.205947]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.212932]  hwmon_attr_show+0x36/0x130
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.218871]  dev_attr_show+0x19/0x60
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.224925]  sysfs_kf_seq_show+0xbf/0x140
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.230957]  seq_read_iter+0x112/0x510
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.235922]  ? security_file_permission+0x8e/0xa0
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.242918]  vfs_read+0x215/0x340
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.248342]  ksys_read+0x61/0xe0
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.252913]  do_syscall_64+0x5d/0xc20
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.258922]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.265947] RIP: 0033:0x408d8e
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.269963] RSP: 002b:000000c003705498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.279922] RAX: ffffffffffffffda RBX: 0000000000000020 RCX:
0000000000408d8e
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.289908] RDX: 0000000000000080 RSI: 000000c003fdd880 RDI:
0000000000000020
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.298953] RBP: 000000c0037054d8 R08: 0000000000000000 R09:
0000000000000000
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.307962] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.317926] R13: 0000000000000040 R14: 000000c00632d180 R15:
ffffffffffffffff
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.327908]  </TASK>
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.331948] INFO: task alloy:5894 blocked for more than 123 seconds.
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.340925]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.349925] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.358991] task:alloy           state:D stack:0     pid:5894
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.371996] Call Trace:
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.375999]  <TASK>
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.380121]  __schedule+0x2b5/0x690
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.386395]  schedule+0x23/0x80
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.391923]  schedule_preempt_disabled+0x11/0x20
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.398449]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.404927]  hwmon_attr_show+0x36/0x130
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.410924]  dev_attr_show+0x19/0x60
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.416926]  sysfs_kf_seq_show+0xbf/0x140
1769654276658 2026-01-29T02:37:56.658Z Jan 29 03:37:56 10.8.8.76  [
740.422950]  seq_read_iter+0x112/0x510
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.428929]  ? security_file_permission+0x8e/0xa0
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.435194]  vfs_read+0x215/0x340
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.440919]  ksys_read+0x61/0xe0
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.446913]  do_syscall_64+0x5d/0xc20
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.451979]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.459932] RIP: 0033:0x408d8e
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.465174] RSP: 002b:000000c0054f1498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654276908 2026-01-29T02:37:56.908Z Jan 29 03:37:56 10.8.8.76  [
740.474951] RAX: ffffffffffffffda RBX: 000000000000001f RCX:
0000000000408d8e
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.483934] RDX: 0000000000000080 RSI: 000000c0061b7c80 RDI:
000000000000001f
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.493067] RBP: 000000c0054f14d8 R08: 0000000000000000 R09:
0000000000000000
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.502497] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.511956] R13: 0000000000000040 R14: 000000c002f57340 R15:
ffffffffffffffff
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.521906]  </TASK>
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.525960] INFO: task alloy:5897 blocked for more than 123 seconds.
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.534895]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.543928] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.552923] task:alloy           state:D stack:0     pid:5897
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.565440] Call Trace:
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.569965]  <TASK>
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.574925]  __schedule+0x2b5/0x690
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.580925]  schedule+0x23/0x80
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.585336]  schedule_preempt_disabled+0x11/0x20
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.591994]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.598331]  hwmon_attr_show+0x36/0x130
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.603917]  dev_attr_show+0x19/0x60
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.609926]  sysfs_kf_seq_show+0xbf/0x140
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.615947]  seq_read_iter+0x112/0x510
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.621942]  ? security_file_permission+0x8e/0xa0
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.628930]  vfs_read+0x215/0x340
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.634928]  ksys_read+0x61/0xe0
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.640933]  do_syscall_64+0x5d/0xc20
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.646757]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.652951] RIP: 0033:0x408d8e
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.657979] RSP: 002b:000000c00319d498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654276909 2026-01-29T02:37:56.909Z Jan 29 03:37:56 10.8.8.76  [
740.666295] RAX: ffffffffffffffda RBX: 0000000000000022 RCX:
0000000000408d8e
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.675931] RDX: 0000000000000080 RSI: 000000c0068bf480 RDI:
0000000000000022
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.684957] RBP: 000000c00319d4d8 R08: 0000000000000000 R09:
0000000000000000
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.693924] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.702959] R13: 0000000000000040 R14: 000000c0054eafc0 R15:
ffffffffffffffff
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.712950]  </TASK>
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.717004] INFO: task alloy:5904 blocked for more than 123 seconds.
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.725940]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.734461] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.744928] task:alloy           state:D stack:0     pid:5904
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.757745] Call Trace:
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.762008]  <TASK>
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:56 10.8.8.76  [
740.765963]  __schedule+0x2b5/0x690
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.771933]  schedule+0x23/0x80
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.776962]  schedule_preempt_disabled+0x11/0x20
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.783960]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.789927]  hwmon_attr_show+0x36/0x130
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.795947]  dev_attr_show+0x19/0x60
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.801578]  sysfs_kf_seq_show+0xbf/0x140
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.807094]  seq_read_iter+0x112/0x510
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.812920]  ? security_file_permission+0x8e/0xa0
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.819940]  vfs_read+0x215/0x340
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.824996]  ksys_read+0x61/0xe0
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.829967]  do_syscall_64+0x5d/0xc20
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.835946]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.843929] RIP: 0033:0x408d8e
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.848963] RSP: 002b:000000c003b6f498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.858930] RAX: ffffffffffffffda RBX: 0000000000000023 RCX:
0000000000408d8e
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.868102] RDX: 0000000000000080 RSI: 000000c006359b00 RDI:
0000000000000023
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.877926] RBP: 000000c003b6f4d8 R08: 0000000000000000 R09:
0000000000000000
1769654277159 2026-01-29T02:37:57.159Z Jan 29 03:37:57 10.8.8.76  [
740.886960] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654277160 2026-01-29T02:37:57.160Z Jan 29 03:37:57 10.8.8.76  [
740.895980] R13: 0000000000000040 R14: 000000c0061cfa40 R15:
ffffffffffffffff
1769654277160 2026-01-29T02:37:57.160Z Jan 29 03:37:57 10.8.8.76  [
740.905231]  </TASK>
1769654277160 2026-01-29T02:37:57.160Z Jan 29 03:37:57 10.8.8.76  [
740.909934] INFO: task alloy:5976 blocked for more than 123 seconds.
1769654277160 2026-01-29T02:37:57.160Z Jan 29 03:37:57 10.8.8.76  [
740.918925]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654277409 2026-01-29T02:37:57.409Z Jan 29 03:37:57 10.8.8.76  [
740.927926] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654277409 2026-01-29T02:37:57.409Z Jan 29 03:37:57 10.8.8.76  [
740.937913] task:alloy           state:D stack:0     pid:5976
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654277409 2026-01-29T02:37:57.409Z Jan 29 03:37:57 10.8.8.76  [
740.950975] Call Trace:
1769654277409 2026-01-29T02:37:57.409Z Jan 29 03:37:57 10.8.8.76  [
740.955934]  <TASK>
1769654277409 2026-01-29T02:37:57.409Z Jan 29 03:37:57 10.8.8.76  [
740.959916]  __schedule+0x2b5/0x690
1769654277409 2026-01-29T02:37:57.409Z Jan 29 03:37:57 10.8.8.76  [
740.965939]  schedule+0x23/0x80
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
740.971951]  schedule_preempt_disabled+0x11/0x20
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
740.978930]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
740.985951]  hwmon_attr_show+0x36/0x130
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
740.990927]  dev_attr_show+0x19/0x60
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
740.996933]  sysfs_kf_seq_show+0xbf/0x140
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.002963]  seq_read_iter+0x112/0x510
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.008447]  ? security_file_permission+0x8e/0xa0
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.015928]  vfs_read+0x215/0x340
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.020959]  ksys_read+0x61/0xe0
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.026759]  do_syscall_64+0x5d/0xc20
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.032936]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.039236] RIP: 0033:0x408d8e
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.044928] RSP: 002b:000000c003b73498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.054940] RAX: ffffffffffffffda RBX: 0000000000000024 RCX:
0000000000408d8e
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.064928] RDX: 0000000000000080 RSI: 000000c007ca8380 RDI:
0000000000000024
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.073033] RBP: 000000c003b734d8 R08: 0000000000000000 R09:
0000000000000000
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.082927] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.091964] R13: 0000000000000040 R14: 000000c0054eb880 R15:
ffffffffffffffff
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.101031]  </TASK>
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.114475]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.132932] task:alloy           state:D stack:0     pid:5985
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.145975] Call Trace:
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.149985]  <TASK>
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.154931]  __schedule+0x2b5/0x690
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.160934]  schedule+0x23/0x80
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.166498]  schedule_timeout+0xe8/0x100
1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [
741.172936]  __wait_for_common+0x99/0x1c0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.179148]  ? __pfx_schedule_timeout+0x10/0x10
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.202218]  ? __pfx_acpi_ipmi_space_handler+0x10/0x10 [acpi_ipmi]
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.210928]  acpi_ex_access_region+0x5b/0xd0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.217934]  acpi_ex_write_serial_bus+0xb6/0x1e0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.224238]  acpi_ex_store_object_to_node+0x15e/0x200
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.231946]  acpi_ex_opcode_1A_1T_1R+0x24c/0x620
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.238714]  acpi_ds_exec_end_op+0x287/0x510
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.245939]  acpi_ps_parse_loop+0xf6/0x680
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.252154]  acpi_ps_parse_aml+0x17a/0x3d0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.257945]  acpi_ps_execute_method+0x137/0x270
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.264913]  acpi_ns_evaluate+0x1f4/0x2e0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.270934]  acpi_evaluate_object+0x13c/0x2e0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.276946]  acpi_evaluate_integer+0x4f/0xe0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.282982]  ? obj_cgroup_charge_account+0x187/0x2d0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.289892]  update_meter+0x61/0xd0 [acpi_power_meter]
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.297754]  ? __memcg_slab_post_alloc_hook+0x182/0x380
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.305928]  power_meter_read+0x18b/0x280 [acpi_power_meter]
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.313928]  hwmon_attr_show+0x5e/0x130
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.319928]  dev_attr_show+0x19/0x60
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.324928]  sysfs_kf_seq_show+0xbf/0x140
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.329939]  seq_read_iter+0x112/0x510
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.334969]  ? security_file_permission+0x8e/0xa0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.341935]  vfs_read+0x215/0x340
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.347926]  ksys_read+0x61/0xe0
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.352960]  do_syscall_64+0x5d/0xc20
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.358902]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.365954] RIP: 0033:0x408d8e
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.370813] RSP: 002b:000000c0054f3498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.380930] RAX: ffffffffffffffda RBX: 000000000000001d RCX:
0000000000408d8e
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.390928] RDX: 0000000000000080 RSI: 000000c0041e8b00 RDI:
000000000000001d
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.399692] RBP: 000000c0054f34d8 R08: 0000000000000000 R09:
0000000000000000
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.407963] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.416332] R13: 0000000000000040 R14: 000000c003e4cc40 R15:
ffffffffffffffff
1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [
741.425938]  </TASK>
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.429965] INFO: task alloy:6000 blocked for more than 124 seconds.
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.437931]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.446711] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.456578] task:alloy           state:D stack:0     pid:6000
tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.469626] Call Trace:
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.474928]  <TASK>
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.479810]  __schedule+0x2b5/0x690
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.485949]  schedule+0x23/0x80
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.490930]  schedule_preempt_disabled+0x11/0x20
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.497930]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.503935]  hwmon_attr_show+0x36/0x130
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.508962]  dev_attr_show+0x19/0x60
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.515927]  sysfs_kf_seq_show+0xbf/0x140
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.520941]  seq_read_iter+0x112/0x510
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.525978]  ? security_file_permission+0x8e/0xa0
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.532923]  vfs_read+0x215/0x340
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.537727]  ksys_read+0x61/0xe0
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.542967]  do_syscall_64+0x5d/0xc20
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.548919]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.555966] RIP: 0033:0x408d8e
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.561927] RSP: 002b:000000c006d8b498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.570935] RAX: ffffffffffffffda RBX: 000000000000001c RCX:
0000000000408d8e
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.580926] RDX: 0000000000000080 RSI: 000000c0033e1880 RDI:
000000000000001c
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.589949] RBP: 000000c006d8b4d8 R08: 0000000000000000 R09:
0000000000000000
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.598688] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.607931] R13: 0000000000000040 R14: 000000c002e22700 R15:
ffffffffffffffff
1769654277910 2026-01-29T02:37:57.910Z Jan 29 03:37:57 10.8.8.76  [
741.616972]  </TASK>
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [
862.814865] INFO: task alloy:5878 blocked for more than 245 seconds.
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [
862.821416]       Tainted: G            E
6.18.7-2.gdc.el9.x86_64 #1
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.828408] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.839936] task:alloy           state:D stack:0     pid:5878 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.853805] Call Trace:
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.858790]  <TASK>
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.862927]  __schedule+0x2b5/0x690
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.869807]  schedule+0x23/0x80
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.874791]  schedule_preempt_disabled+0x11/0x20
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.880822]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.887787]  hwmon_attr_show+0x36/0x130
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.893810]  dev_attr_show+0x19/0x60
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.898789]  sysfs_kf_seq_show+0xbf/0x140
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.904790]  seq_read_iter+0x112/0x510
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.910791]  ? security_file_permission+0x8e/0xa0
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.917785]  vfs_read+0x215/0x340
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.923818]  ksys_read+0x61/0xe0
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.928487]  do_syscall_64+0x5d/0xc20
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.934787]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.942807] RIP: 0033:0x408d8e
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.947847] RSP: 002b:000000c003785498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.957811] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.967791] RDX: 0000000000000080 RSI: 000000c007188f80 RDI: 000000000000001e
1769654399212 2026-01-29T02:39:59.212Z Jan 29 03:39:59 10.8.8.76  [ 862.977779] RBP: 000000c0037854d8 R08: 0000000000000000 R09: 0000000000000000
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 862.986816] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 862.996763] R13: 0000000000000040 R14: 000000c00632c000 R15: ffffffffffffffff
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.006797]  </TASK>
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.010603] INFO: task alloy:5883 blocked for more than 245 seconds.
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.019794]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.028778] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.038822] task:alloy           state:D stack:0     pid:5883 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.052796] Call Trace:
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.057796]  <TASK>
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.062792]  __schedule+0x2b5/0x690
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.068791]  schedule+0x23/0x80
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.073859]  schedule_preempt_disabled+0x11/0x20
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.080786]  __mutex_lock.constprop.0+0x3c9/0xa00
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.087800]  hwmon_attr_show+0x36/0x130
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.093779]  dev_attr_show+0x19/0x60
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.099798]  sysfs_kf_seq_show+0xbf/0x140
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.105817]  seq_read_iter+0x112/0x510
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.110795]  ? security_file_permission+0x8e/0xa0
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.116806]  vfs_read+0x215/0x340
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.121800]  ksys_read+0x61/0xe0
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.126850]  do_syscall_64+0x5d/0xc20
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.132861]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.139830] RIP: 0033:0x408d8e
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.144791] RSP: 002b:000000c003705498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.154779] RAX: ffffffffffffffda RBX: 0000000000000020 RCX: 0000000000408d8e
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.163829] RDX: 0000000000000080 RSI: 000000c003fdd880 RDI: 0000000000000020
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.172268] RBP: 000000c0037054d8 R08: 0000000000000000 R09: 0000000000000000
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.180839] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.190798] R13: 0000000000000040 R14: 000000c00632d180 R15: ffffffffffffffff
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.200778]  </TASK>
1769654399462 2026-01-29T02:39:59.462Z Jan 29 03:39:59 10.8.8.76  [ 863.205805] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings


>
> Guenter
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-29 15:26                 ` Jaroslav Pulchart
@ 2026-01-29 18:22                   ` Guenter Roeck
  2026-02-02 16:48                     ` Jaroslav Pulchart
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-29 18:22 UTC (permalink / raw)
  To: Jaroslav Pulchart
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
> >
> > On 1/27/26 03:58, Jaroslav Pulchart wrote:
> > ...
> > >>> Hello,
> > >>>
> > >>> Thank you for the analysis and insights.
> > >>>
> > >>> To add some context from our side: we observe this issue shortly after
> > >>> a server reinstall or fw updates followed by a reboot, typically when
> > >>> hwmon sensors are accessed for the first time (e.g. by monitoring during
> > >>> early system initialization). We have not seen it trigger during normal
> > >>> operation, such as a simple reboot or on long-running systems.
> > >>>
> > >>> Given this behavior, it is possible that the issue is related to
> > >>> hwmon_device_unregister() being triggered early during system
> > >>> initialization, but I do not have concrete evidence at this point.
> > >>>
> > >>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> > >>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> > >>>
> > >>
> > >> Would it be possible for you to test the patch series I just sent out ?
> > >
> > > I suppose these 5 patches from
> > > https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
> > >   ?
> >
> > Correct.
> >
> > > I will try them and let you know.
> > >
> > Thanks !
> 
> 
> The issue is still there:
> ...

Thanks a lot for testing.

> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0

Looks like acpi holds another lock. Can you check if the patch submitted by
Rafael fixes the problem ?
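
For anyone triaging similar reports: the hung-task traces quoted in this thread come in two shapes, tasks asleep in __mutex_lock called from hwmon_attr_show, and one task asleep in acpi_ipmi_space_handler. Below is a minimal Python sketch, not part of any patch discussed here, that extracts the first non-scheduler frame from each "INFO: task ... blocked" report. The function names (blocking_frames, group_by_frame) and the choice of which frames count as "generic" are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Frames that only say "this task is asleep", not what it is waiting for.
# This list is an illustrative assumption, not an exhaustive one.
GENERIC_FRAMES = {
    "__schedule",
    "schedule",
    "schedule_timeout",
    "schedule_preempt_disabled",
    "__wait_for_common",
}

# "INFO: task alloy:8260 blocked for more than 122 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

# "[  740.083073]  hwmon_attr_show+0x36/0x130", optionally with a leading "? "
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def blocking_frames(log_text):
    """Map each reported pid to the first reliable, non-generic frame."""
    result = {}
    pid = None
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            pid = int(task.group(2))
            continue
        if pid is None or pid in result:
            continue
        frame = FRAME_RE.search(line)
        if frame is None or frame.group(1):
            continue  # not a frame, or an unreliable "?" frame
        if frame.group(2) in GENERIC_FRAMES:
            continue
        result[pid] = frame.group(2)
    return result


def group_by_frame(log_text):
    """Group pids by the frame they are blocked in."""
    groups = defaultdict(list)
    for pid, frame in blocking_frames(log_text).items():
        groups[frame].append(pid)
    return dict(groups)
```

Fed a log like the ones in this thread, the task whose frame is not __mutex_lock is the first place to look for the lock holder, although the trace alone does not prove which lock it holds.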

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-29 18:22                   ` Guenter Roeck
@ 2026-02-02 16:48                     ` Jaroslav Pulchart
  2026-02-02 18:00                       ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-02-02 16:48 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

>
> On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
> > >
> > > On 1/27/26 03:58, Jaroslav Pulchart wrote:
> > > ...
> > > >>> Hello,
> > > >>>
> > > >>> Thank you for the analysis and insights.
> > > >>>
> > > >>> To add some context from our side: we observe this issue shortly after
> > > >>> a server reinstall or fw updates followed by a reboot, typically when
> > > >>> hwmon sensors are accessed for the first time (e.g. by monitoring during
> > > >>> early system initialization). We have not seen it trigger during normal
> > > >>> operation, such as a simple reboot or on long-running systems.
> > > >>>
> > > >>> Given this behavior, it is possible that the issue is related to
> > > >>> hwmon_device_unregister() being triggered early during system
> > > >>> initialization, but I do not have concrete evidence at this point.
> > > >>>
> > > >>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> > > >>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> > > >>>
> > > >>
> > > >> Would it be possible for you to test the patch series I just sent out ?
> > > >
> > > > I suppose these 5 patches from
> > > > https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
> > > >   ?
> > >
> > > Correct.
> > >
> > > > I will try them and let you know.
> > > >
> > > Thanks !
> >
> >
> > The issue is still there:
> > ...
>
> Thanks a lot for testing.
>
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
> > 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
> > 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
> > 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> > 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
>
> Looks like acpi holds another lock. Can you check if the patch submitted by
> Rafael fixes the problem ?

Hello,

I tested the patch as a standalone change, and the issue is still present.

However, I can now reliably reproduce the problem. It is triggered while
Dell iDRAC is being reset or updated and is temporarily unavailable. During
this time, a [kipmi0] task enters D state for an extended period, and
shortly after (or during this), the ACPI power meter hwmon interface ends up
in the locked state described earlier.

This explains why we mostly observe the issue after redeployments in our
environment: server redeployments typically include iDRAC firmware updates
or resets performed together with operating system installation or updates.
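
The [kipmi0] observation can be checked without waiting for the hung-task detector to fire. What follows is a rough Python sketch, assuming the usual Linux /proc layout, that lists every thread currently in uninterruptible sleep (state D). The helper names parse_stat and d_state_tasks are hypothetical and exist only for this example.

```python
import os


def parse_stat(stat_text):
    """Return (tid, comm, state) from the contents of a /proc stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of on whitespace.
    left, right = stat_text.rsplit(")", 1)
    tid, comm = left.split("(", 1)
    return int(tid), comm, right.split()[0]


def d_state_tasks(proc="/proc"):
    """List (tid, comm) for every thread whose state is 'D'."""
    tasks = []
    for entry in sorted(os.listdir(proc)):
        if not entry.isdigit():
            continue
        task_dir = os.path.join(proc, entry, "task")
        try:
            tids = sorted(os.listdir(task_dir))
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    parsed = parse_stat(f.read())
            except (OSError, ValueError):
                continue  # thread exited or stat was unreadable
            if parsed[2] == "D":
                tasks.append((parsed[0], parsed[1]))
    return tasks


if __name__ == "__main__":
    for tid, comm in d_state_tasks():
        print(f"{tid}\t{comm}")
```

Running this in a loop during an iDRAC reset should show whether kipmi0 goes into D state before or after the alloy threads do, which is the ordering question left open above.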

Best

>
> Thanks,
> Guenter

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-02 16:48                     ` Jaroslav Pulchart
@ 2026-02-02 18:00                       ` Guenter Roeck
  2026-02-02 18:14                         ` Jaroslav Pulchart
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-02-02 18:00 UTC (permalink / raw)
  To: Jaroslav Pulchart
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On 2/2/26 08:48, Jaroslav Pulchart wrote:
>>
>> On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
>>>>
>>>> On 1/27/26 03:58, Jaroslav Pulchart wrote:
>>>> ...
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thank you for the analysis and insights.
>>>>>>>
>>>>>>> To add some context from our side: we observe this issue shortly after
>>>>>>> a server reinstall or fw updates followed by a reboot, typically when
>>>>>>> hwmon sensors are accessed for the first time (e.g. by monitoring during
>>>>>>> early system initialization). We have not seen it trigger during normal
>>>>>>> operation, such as a simple reboot or on long-running systems.
>>>>>>>
>>>>>>> Given this behavior, it is possible that the issue is related to
>>>>>>> hwmon_device_unregister() being triggered early during system
>>>>>>> initialization, but I do not have concrete evidence at this point.
>>>>>>>
>>>>>>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
>>>>>>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
>>>>>>>
>>>>>>
>>>>>> Would it be possible for you to test the patch series I just sent out ?
>>>>>
>>>>> I suppose these 5 patches from
>>>>> https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
>>>>>    ?
>>>>
>>>> Correct.
>>>>
>>>>> I will try them and let you know.
>>>>>
>>>> Thanks !
>>>
>>>
>>> The issue is still there:
>>> ...
>>
>> Thanks a lot for testing.
>>
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
>>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
>>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
>>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
>>
>> Looks like acpi holds another lock. Can you check if the patch submitted by
>> Rafael fixes the problem ?
> 
> Hello,
> 
> I tested the patch as a standalone change, and the issue is still present.
> 
> However, I can now reliably reproduce the problem. It is triggered while
> Dell iDRAC is being reset or updated and is temporarily unavailable. During
> this time, a [kipmi0] task enters D state for an extended period, and
> shortly after (or during this), the ACPI power meter hwmon interface ends up
> in the locked state described earlier.
> 
> This explains why we mostly observe the issue after redeployments in our
> environment: server redeployments typically include iDRAC firmware updates
> or resets performed together with operating system installation or updates.
> 

Thanks for the update. Do you have a dmesg log showing the problem with Rafael's
patch applied, by any chance ?

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-02 18:00                       ` Guenter Roeck
@ 2026-02-02 18:14                         ` Jaroslav Pulchart
  2026-02-02 19:26                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-02-02 18:14 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

>
> On 2/2/26 08:48, Jaroslav Pulchart wrote:
> >>
> >> On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
> >>>>
> >>>> On 1/27/26 03:58, Jaroslav Pulchart wrote:
> >>>> ...
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> Thank you for the analysis and insights.
> >>>>>>>
> >>>>>>> To add some context from our side: we observe this issue shortly after
> >>>>>>> a server reinstall or fw updates followed by a reboot, typically when
> >>>>>>> hwmon sensors are accessed for the first time (e.g. by monitoring during
> >>>>>>> early system initialization). We have not seen it trigger during normal
> >>>>>>> operation, such as a simple reboot or on long-running systems.
> >>>>>>>
> >>>>>>> Given this behavior, it is possible that the issue is related to
> >>>>>>> hwmon_device_unregister() being triggered early during system
> >>>>>>> initialization, but I do not have concrete evidence at this point.
> >>>>>>>
> >>>>>>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> >>>>>>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> >>>>>>>
> >>>>>>
> >>>>>> Would it be possible for you to test the patch series I just sent out ?
> >>>>>
> >>>>> I suppose these 5 patches from
> >>>>> https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
> >>>>>    ?
> >>>>
> >>>> Correct.
> >>>>
> >>>>> I will try them and let you know.
> >>>>>
> >>>> Thanks !
> >>>
> >>>
> >>> The issue is still there:
> >>> ...
> >>
> >> Thanks a lot for testing.
> >>
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
> >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
> >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
> >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
> >>
> >> Looks like acpi holds another lock. Can you check if the patch submitted by
> >> Rafael fixes the problem ?
> >
> > Hello,
> >
> > I tested the patch as a standalone change, and the issue is still present.
> >
> > However, I can now reliably reproduce the problem. It is triggered while
> > Dell iDRAC is being reset or updated and is temporarily unavailable. During
> > this time, a [kipmi0] task enters D state for an extended period, and
> > shortly after (or during this), the ACPI power meter hwmon interface ends up
> > in the locked state described earlier.
> >
> > This explains why we mostly observe the issue after redeployments in our
> > environment: server redeployments typically include iDRAC firmware updates
> > or resets performed together with operating system installation or updates.
> >
>
> Thanks for the update. Do you have a dmesg log showing the problem with Rafael's
> patch applied, by any chance ?

[  740.004138] INFO: task alloy:8260 blocked for more than 122 seconds.
[  740.010752]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  740.019257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  740.030103] task:alloy           state:D stack:0     pid:8260 tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
[  740.045068] Call Trace:
[  740.050334]  <TASK>
[  740.055075]  __schedule+0x2b5/0x690
[  740.062073]  schedule+0x23/0x80
[  740.068081]  schedule_preempt_disabled+0x11/0x20
[  740.074962]  __mutex_lock.constprop.0+0x3c9/0xa00
[  740.083073]  hwmon_attr_show+0x36/0x130
[  740.089081]  dev_attr_show+0x19/0x60
[  740.095876]  sysfs_kf_seq_show+0xbf/0x140
[  740.102810]  seq_read_iter+0x112/0x510
[  740.109070]  ? security_file_permission+0x8e/0xa0
[  740.117100]  vfs_read+0x215/0x340
[  740.123113]  ksys_read+0x61/0xe0
[  740.130088]  do_syscall_64+0x5d/0xb40
[  740.137032]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  740.144601] RIP: 0033:0x408d8e
[  740.151070] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[  740.160825] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
[  740.170097] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
[  740.180085] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
[  740.190074] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  740.201069] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
[  740.211072]  </TASK>
[  740.215117] INFO: task alloy:8268 blocked for more than 123 seconds.
[  740.224109]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  740.234089] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  740.243611] task:alloy           state:D stack:0     pid:8268 tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  740.258077] Call Trace:
[  740.261260]  <TASK>
[  740.266083]  __schedule+0x2b5/0x690
[  740.273081]  schedule+0x23/0x80
[  740.279055]  schedule_preempt_disabled+0x11/0x20
[  740.286087]  __mutex_lock.constprop.0+0x3c9/0xa00
[  740.294074]  hwmon_attr_show+0x36/0x130
[  740.300089]  dev_attr_show+0x19/0x60
[  740.306074]  sysfs_kf_seq_show+0xbf/0x140
[  740.313074]  seq_read_iter+0x112/0x510
[  740.319075]  ? security_file_permission+0x8e/0xa0
[  740.326064]  vfs_read+0x215/0x340
[  740.331218]  ksys_read+0x61/0xe0
[  740.337078]  do_syscall_64+0x5d/0xb40
[  740.342083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  740.349376] RIP: 0033:0x408d8e
[  740.355086] RSP: 002b:000000c008403498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[  740.365088] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 0000000000408d8e
[  740.375072] RDX: 0000000000000080 RSI: 000000c006a47d80 RDI: 000000000000001f
[  740.383669] RBP: 000000c0084034d8 R08: 0000000000000000 R09: 0000000000000000
[  740.393072] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  740.401326] R13: 0000000000000040 R14: 000000c005ea7880 R15: ffffffffffffffff
[  740.411087]  </TASK>
[  740.416089] INFO: task alloy:8270 blocked for more than 123 seconds.
[  740.425081]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  740.434091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  740.444078] task:alloy           state:D stack:0     pid:8270 tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  740.457108] Call Trace:
[  740.462080]  <TASK>
[  740.467073]  __schedule+0x2b5/0x690
[  740.472252]  schedule+0x23/0x80
[  740.478087]  schedule_preempt_disabled+0x11/0x20
[  740.485099]  __mutex_lock.constprop.0+0x3c9/0xa00
[  740.491096]  hwmon_attr_show+0x36/0x130
[  740.497082]  dev_attr_show+0x19/0x60
[  740.503078]  sysfs_kf_seq_show+0xbf/0x140
[  740.510030]  seq_read_iter+0x112/0x510
[  740.516073]  ? security_file_permission+0x8e/0xa0
[  740.523355]  vfs_read+0x215/0x340
[  740.529077]  ksys_read+0x61/0xe0
[  740.534925]  do_syscall_64+0x5d/0xb40
[  740.540253]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  740.548077] RIP: 0033:0x408d8e
[  740.553078] RSP: 002b:000000c0085cf498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[  740.563075] RAX: ffffffffffffffda RBX: 000000000000001b RCX: 0000000000408d8e
[  740.572088] RDX: 0000000000000080 RSI: 000000c006d44f80 RDI: 000000000000001b
[  740.582076] RBP: 000000c0085cf4d8 R08: 0000000000000000 R09: 0000000000000000
[  740.591086] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  740.600086] R13: 0000000000000040 R14: 000000c005703c00 R15: 0000000000000054
[  740.609481]  </TASK>
[  740.614084] INFO: task alloy:8287 blocked for more than 123 seconds.
[  740.623089]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  740.632094] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  740.642083] task:alloy           state:D stack:0     pid:8287 tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  740.656080] Call Trace:
[  740.660694]  <TASK>
[  740.665087]  __schedule+0x2b5/0x690
[  740.671086]  schedule+0x23/0x80
[  740.677080]  schedule_timeout+0xe8/0x100
[  740.683134]  __wait_for_common+0x99/0x1c0
[  740.690093]  ? __pfx_schedule_timeout+0x10/0x10
[  740.697081]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
[  740.706086]  acpi_ev_address_space_dispatch+0x16d/0x3c0
[  740.713584]  ? __pfx_acpi_ipmi_space_handler+0x10/0x10 [acpi_ipmi]
[  740.722080]  acpi_ex_access_region+0x5b/0xd0
[  740.729078]  acpi_ex_write_serial_bus+0xb6/0x1e0
[  740.736083]  acpi_ex_store_object_to_node+0x15e/0x200
[  740.744083]  acpi_ex_opcode_1A_1T_1R+0x24c/0x620
[  740.750088]  acpi_ds_exec_end_op+0x287/0x510
[  740.757076]  acpi_ps_parse_loop+0xf6/0x680
[  740.763177]  acpi_ps_parse_aml+0x17a/0x3d0
[  740.770074]  acpi_ps_execute_method+0x137/0x270
[  740.776256]  acpi_ns_evaluate+0x1f4/0x2e0
[  740.783076]  acpi_evaluate_object+0x13c/0x2e0
[  740.790118]  acpi_evaluate_integer+0x4f/0xe0
[  740.796571]  ? obj_cgroup_charge_account+0x187/0x2d0
[  740.804074]  update_meter+0x61/0xd0 [acpi_power_meter]
[  740.811082]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
[  740.819088]  hwmon_attr_show+0x5e/0x130
[  740.825093]  dev_attr_show+0x19/0x60
[  740.830810]  sysfs_kf_seq_show+0xbf/0x140
[  740.837076]  seq_read_iter+0x112/0x510
[  740.843079]  ? security_file_permission+0x8e/0xa0
[  740.849058]  vfs_read+0x215/0x340
[  740.855086]  ksys_read+0x61/0xe0
[  740.861058]  do_syscall_64+0x5d/0xb40
[  740.866095]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  740.874104] RIP: 0033:0x408d8e
[  740.880067] RSP: 002b:000000c007e81498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[  740.890076] RAX: ffffffffffffffda RBX: 000000000000001d RCX: 0000000000408d8e
[  740.900075] RDX: 0000000000000080 RSI: 000000c007383380 RDI: 000000000000001d
[  740.910080] RBP: 000000c007e814d8 R08: 0000000000000000 R09: 0000000000000000
[  740.918446] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  740.928087] R13: 0000000000000040 R14: 000000c00560ec40 R15: ffffffffffffffff
[  740.937080]  </TASK>
[  740.942085] INFO: task alloy:8463 blocked for more than 123 seconds.
[  740.951076]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  740.959660] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  740.970077] task:alloy           state:D stack:0     pid:8463 tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  740.983108] Call Trace:
[  740.987305]  <TASK>
[  740.992084]  __schedule+0x2b5/0x690
[  740.998077]  schedule+0x23/0x80
[  741.003472]  schedule_preempt_disabled+0x11/0x20
[  741.009738]  __mutex_lock.constprop.0+0x3c9/0xa00
[  741.017079]  hwmon_attr_show+0x36/0x130
[  741.023094]  dev_attr_show+0x19/0x60
[  741.029076]  sysfs_kf_seq_show+0xbf/0x140
[  741.035084]  seq_read_iter+0x112/0x510
[  741.039031]  ? security_file_permission+0x8e/0xa0
[  741.045212]  vfs_read+0x215/0x340
[  741.051081]  ksys_read+0x61/0xe0
[  741.056076]  do_syscall_64+0x5d/0xb40
[  741.061744]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.069081] RIP: 0033:0x408d8e
[  741.074087] RSP: 002b:000000c006127498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[  741.084107] RAX: ffffffffffffffda RBX: 0000000000000020 RCX: 0000000000408d8e
[  741.093151] RDX: 0000000000000080 RSI: 000000c005b15b00 RDI: 0000000000000020
[  741.103087] RBP: 000000c0061274d8 R08: 0000000000000000 R09: 0000000000000000
[  741.113084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.123120] R13: 0000000000000040 R14: 000000c005ba1dc0 R15: ffffffffffffffff
[  741.133093]  </TASK>
[  741.138081] INFO: task alloy:8465 blocked for more than 124 seconds.
[  741.147085]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  741.155079] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  741.165084] task:alloy           state:D stack:0     pid:8465 tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  741.178095] Call Trace:
[  741.182595]  <TASK>
[  741.187079]  __schedule+0x2b5/0x690
[  741.193014]  schedule+0x23/0x80
[  741.199071]  schedule_preempt_disabled+0x11/0x20
[  741.206102]  __mutex_lock.constprop.0+0x3c9/0xa00
[  741.214091]  hwmon_attr_show+0x36/0x130
[  741.221079]  dev_attr_show+0x19/0x60
[  741.226081]  sysfs_kf_seq_show+0xbf/0x140
[  741.232083]  seq_read_iter+0x112/0x510
[  741.237943]  ? security_file_permission+0x8e/0xa0
[  741.245080]  vfs_read+0x215/0x340
[  741.251080]  ksys_read+0x61/0xe0
[  741.257068]  do_syscall_64+0x5d/0xb40
[  741.262084]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.270083] RIP: 0033:0x408d8e
[  741.275229] RSP: 002b:000000c007e85498 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[  741.285086] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
[  741.294078] RDX: 0000000000000080 RSI: 000000c0066a5180 RDI: 000000000000001e
[  741.304078] RBP: 000000c007e854d8 R08: 0000000000000000 R09: 0000000000000000
[  741.313085] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.323084] R13: 0000000000000040 R14: 000000c0040f8fc0 R15: ffffffffffffffff
[  741.333078]  </TASK>
[  741.337087] INFO: task alloy:8470 blocked for more than 124 seconds.
[  741.346079]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  741.355091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  741.365102] task:alloy           state:D stack:0     pid:8470
tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  741.378081] Call Trace:
[  741.383127]  <TASK>
[  741.388089]  __schedule+0x2b5/0x690
[  741.394085]  schedule+0x23/0x80
[  741.400084]  schedule_preempt_disabled+0x11/0x20
[  741.408085]  __mutex_lock.constprop.0+0x3c9/0xa00
[  741.415246]  hwmon_attr_show+0x36/0x130
[  741.422088]  dev_attr_show+0x19/0x60
[  741.428079]  sysfs_kf_seq_show+0xbf/0x140
[  741.434088]  seq_read_iter+0x112/0x510
[  741.440075]  ? security_file_permission+0x8e/0xa0
[  741.446661]  vfs_read+0x215/0x340
[  741.452104]  ksys_read+0x61/0xe0
[  741.458082]  do_syscall_64+0x5d/0xb40
[  741.464083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.471412] RIP: 0033:0x408d8e
[  741.477072] RSP: 002b:000000c007369498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  741.487081] RAX: ffffffffffffffda RBX: 0000000000000021 RCX: 0000000000408d8e
[  741.496078] RDX: 0000000000000080 RSI: 000000c005cff500 RDI: 0000000000000021
[  741.506094] RBP: 000000c0073694d8 R08: 0000000000000000 R09: 0000000000000000
[  741.515087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.525084] R13: 0000000000000040 R14: 000000c001db6c40 R15: ffffffffffffffff
[  741.534086]  </TASK>
[  741.539094] INFO: task alloy:8474 blocked for more than 124 seconds.
[  741.548092]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  741.557097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  741.567095] task:alloy           state:D stack:0     pid:8474
tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  741.581079] Call Trace:
[  741.586087]  <TASK>
[  741.591162]  __schedule+0x2b5/0x690
[  741.597103]  schedule+0x23/0x80
[  741.603109]  schedule_preempt_disabled+0x11/0x20
[  741.610098]  __mutex_lock.constprop.0+0x3c9/0xa00
[  741.617086]  hwmon_attr_show+0x36/0x130
[  741.623137]  dev_attr_show+0x19/0x60
[  741.629087]  sysfs_kf_seq_show+0xbf/0x140
[  741.635090]  seq_read_iter+0x112/0x510
[  741.641083]  ? security_file_permission+0x8e/0xa0
[  741.648095]  vfs_read+0x215/0x340
[  741.654101]  ksys_read+0x61/0xe0
[  741.660087]  do_syscall_64+0x5d/0xb40
[  741.666090]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.672343] RIP: 0033:0x408d8e
[  741.678083] RSP: 002b:000000c000b03498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  741.688096] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 0000000000408d8e
[  741.698084] RDX: 0000000000000080 RSI: 000000c0069df000 RDI: 0000000000000025
[  741.706728] RBP: 000000c000b034d8 R08: 0000000000000000 R09: 0000000000000000
[  741.716084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.725093] R13: 0000000000000040 R14: 000000c004075340 R15: ffffffffffffffff
[  741.735124]  </TASK>
[  741.737900] INFO: task alloy:8726 blocked for more than 124 seconds.
[  741.747088]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  741.756086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  741.766111] task:alloy           state:D stack:0     pid:8726
tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
[  741.780116] Call Trace:
[  741.785093]  <TASK>
[  741.790100]  __schedule+0x2b5/0x690
[  741.796596]  schedule+0x23/0x80
[  741.802109]  schedule_preempt_disabled+0x11/0x20
[  741.809091]  __mutex_lock.constprop.0+0x3c9/0xa00
[  741.815262]  hwmon_attr_show+0x36/0x130
[  741.822096]  dev_attr_show+0x19/0x60
[  741.828105]  sysfs_kf_seq_show+0xbf/0x140
[  741.834088]  seq_read_iter+0x112/0x510
[  741.839547]  ? security_file_permission+0x8e/0xa0
[  741.846570]  vfs_read+0x215/0x340
[  741.850048] perf: interrupt took too long (2993 > 2500), lowering
kernel.perf_event_max_sample_rate to 66000
[  741.863112]  ksys_read+0x61/0xe0
[  741.868778]  do_syscall_64+0x5d/0xb40
[  741.875104]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  741.883101] RIP: 0033:0x408d8e
[  741.888436] RSP: 002b:000000c0085cb498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  741.898081] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 0000000000408d8e
[  741.907092] RDX: 0000000000000080 RSI: 000000c006424e00 RDI: 0000000000000024
[  741.917092] RBP: 000000c0085cb4d8 R08: 0000000000000000 R09: 0000000000000000
[  741.926087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  741.936089] R13: 0000000000000040 R14: 000000c00565aa80 R15: ffffffffffffffff
[  741.945098]  </TASK>
[  862.885552] INFO: task alloy:8260 blocked for more than 245 seconds.
[  862.892117]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
[  862.904592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  862.914632] task:alloy           state:D stack:0     pid:8260
tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
[  862.928478] Call Trace:
[  862.932477]  <TASK>
[  862.938482]  __schedule+0x2b5/0x690
[  862.944502]  schedule+0x23/0x80
[  862.949478]  schedule_preempt_disabled+0x11/0x20
[  862.955487]  __mutex_lock.constprop.0+0x3c9/0xa00
[  862.962489]  hwmon_attr_show+0x36/0x130
[  862.968485]  dev_attr_show+0x19/0x60
[  862.974482]  sysfs_kf_seq_show+0xbf/0x140
[  862.981517]  seq_read_iter+0x112/0x510
[  862.987485]  ? security_file_permission+0x8e/0xa0
[  862.994505]  vfs_read+0x215/0x340
[  863.000295]  ksys_read+0x61/0xe0
[  863.006484]  do_syscall_64+0x5d/0xb40
[  863.013226]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  863.021497] RIP: 0033:0x408d8e
[  863.027499] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
0000000000000000
[  863.037483] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
[  863.047483] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
[  863.057487] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
[  863.066499] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[  863.076485] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
[  863.087007]  </TASK>


>
> Thanks,
> Guenter
>


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-02 18:14                         ` Jaroslav Pulchart
@ 2026-02-02 19:26                           ` Rafael J. Wysocki
  2026-02-03  0:26                             ` Guenter Roeck
  2026-02-03  8:23                             ` Jaroslav Pulchart
  0 siblings, 2 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-02 19:26 UTC (permalink / raw)
  To: Jaroslav Pulchart
  Cc: Guenter Roeck, Rafael J. Wysocki, linux-acpi, linux-hwmon,
	Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Mon, Feb 2, 2026 at 7:14 PM Jaroslav Pulchart
<jaroslav.pulchart@gooddata.com> wrote:
>
> >
> > On 2/2/26 08:48, Jaroslav Pulchart wrote:
> > >>
> > >> On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
> > >>>>
> > >>>> On 1/27/26 03:58, Jaroslav Pulchart wrote:
> > >>>> ...
> > >>>>>>> Hello,
> > >>>>>>>
> > >>>>>>> Thank you for the analysis and insights.
> > >>>>>>>
> > >>>>>>> To add some context from our side: we observe this issue shortly after
> > >>>>>>> a server reinstall or fw updates followed by a reboot, typically when
> > >>>>>>> hwmon sensors are accessed for the first time (e.g. by monitoring during
> > >>>>>>> early system initialization). We have not seen it trigger during normal
> > >>>>>>> operation, such as a simple reboot or on long-running systems.
> > >>>>>>>
> > >>>>>>> Given this behavior, it is possible that the issue is related to
> > >>>>>>> hwmon_device_unregister() being triggered early during system
> > >>>>>>> initialization, but I do not have concrete evidence at this point.
> > >>>>>>>
> > >>>>>>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> > >>>>>>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> > >>>>>>>
> > >>>>>>
> > >>>>>> Would it be possible for you to test the patch series I just sent out ?
> > >>>>>
> > >>>>> I suppose these 5 patches from
> > >>>>> https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
> > >>>>>    ?
> > >>>>
> > >>>> Correct.
> > >>>>
> > >>>>> I will try them and let you know.
> > >>>>>
> > >>>> Thanks !
> > >>>
> > >>>
> > >>> The issue is still there:
> > >>> ...
> > >>
> > >> Thanks a lot for testing.
> > >>
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
> > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
> > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
> > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
> > >>
> > >> Looks like acpi holds another lock. Can you check if the patch submitted by
> > >> Rafael fixes the problem ?
> > >
> > > Hello,
> > >
> > > I tested the patch as a standalone change, and the issue is still present.
> > >
> > > However, I can now reliably reproduce the problem. It is triggered while
> > > Dell iDRAC is being reset or updated and is temporarily unavailable. During
> > > this time, a [kipmi0] task enters D state for an extended period, and
> > > shortly after (or during this), the ACPI power meter hwmon interface ends up
> > > in the locked state described earlier.
> > >
> > > This explains why we mostly observe the issue after redeployments in our
> > > environment: server redeployments typically include iDRAC firmware updates
> > > or resets performed together with operating system installation or updates.
> > >
> >
> > Thanks for the update. Do you have a dmesg log showing the problem with Rafael's
> > patch applied, by any chance ?
>
> [  740.004138] INFO: task alloy:8260 blocked for more than 122 seconds.
> [  740.010752]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  740.019257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  740.030103] task:alloy           state:D stack:0     pid:8260
> tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
> [  740.045068] Call Trace:
> [  740.050334]  <TASK>
> [  740.055075]  __schedule+0x2b5/0x690
> [  740.062073]  schedule+0x23/0x80
> [  740.068081]  schedule_preempt_disabled+0x11/0x20
> [  740.074962]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  740.083073]  hwmon_attr_show+0x36/0x130
> [  740.089081]  dev_attr_show+0x19/0x60
> [  740.095876]  sysfs_kf_seq_show+0xbf/0x140
> [  740.102810]  seq_read_iter+0x112/0x510
> [  740.109070]  ? security_file_permission+0x8e/0xa0
> [  740.117100]  vfs_read+0x215/0x340
> [  740.123113]  ksys_read+0x61/0xe0
> [  740.130088]  do_syscall_64+0x5d/0xb40
> [  740.137032]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  740.144601] RIP: 0033:0x408d8e
> [  740.151070] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  740.160825] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
> [  740.170097] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
> [  740.180085] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
> [  740.190074] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  740.201069] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
> [  740.211072]  </TASK>
> [  740.215117] INFO: task alloy:8268 blocked for more than 123 seconds.
> [  740.224109]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  740.234089] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  740.243611] task:alloy           state:D stack:0     pid:8268
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  740.258077] Call Trace:
> [  740.261260]  <TASK>
> [  740.266083]  __schedule+0x2b5/0x690
> [  740.273081]  schedule+0x23/0x80
> [  740.279055]  schedule_preempt_disabled+0x11/0x20
> [  740.286087]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  740.294074]  hwmon_attr_show+0x36/0x130
> [  740.300089]  dev_attr_show+0x19/0x60
> [  740.306074]  sysfs_kf_seq_show+0xbf/0x140
> [  740.313074]  seq_read_iter+0x112/0x510
> [  740.319075]  ? security_file_permission+0x8e/0xa0
> [  740.326064]  vfs_read+0x215/0x340
> [  740.331218]  ksys_read+0x61/0xe0
> [  740.337078]  do_syscall_64+0x5d/0xb40
> [  740.342083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  740.349376] RIP: 0033:0x408d8e
> [  740.355086] RSP: 002b:000000c008403498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  740.365088] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 0000000000408d8e
> [  740.375072] RDX: 0000000000000080 RSI: 000000c006a47d80 RDI: 000000000000001f
> [  740.383669] RBP: 000000c0084034d8 R08: 0000000000000000 R09: 0000000000000000
> [  740.393072] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  740.401326] R13: 0000000000000040 R14: 000000c005ea7880 R15: ffffffffffffffff
> [  740.411087]  </TASK>
> [  740.416089] INFO: task alloy:8270 blocked for more than 123 seconds.
> [  740.425081]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  740.434091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  740.444078] task:alloy           state:D stack:0     pid:8270
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  740.457108] Call Trace:
> [  740.462080]  <TASK>
> [  740.467073]  __schedule+0x2b5/0x690
> [  740.472252]  schedule+0x23/0x80
> [  740.478087]  schedule_preempt_disabled+0x11/0x20
> [  740.485099]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  740.491096]  hwmon_attr_show+0x36/0x130
> [  740.497082]  dev_attr_show+0x19/0x60
> [  740.503078]  sysfs_kf_seq_show+0xbf/0x140
> [  740.510030]  seq_read_iter+0x112/0x510
> [  740.516073]  ? security_file_permission+0x8e/0xa0
> [  740.523355]  vfs_read+0x215/0x340
> [  740.529077]  ksys_read+0x61/0xe0
> [  740.534925]  do_syscall_64+0x5d/0xb40
> [  740.540253]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  740.548077] RIP: 0033:0x408d8e
> [  740.553078] RSP: 002b:000000c0085cf498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  740.563075] RAX: ffffffffffffffda RBX: 000000000000001b RCX: 0000000000408d8e
> [  740.572088] RDX: 0000000000000080 RSI: 000000c006d44f80 RDI: 000000000000001b
> [  740.582076] RBP: 000000c0085cf4d8 R08: 0000000000000000 R09: 0000000000000000
> [  740.591086] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  740.600086] R13: 0000000000000040 R14: 000000c005703c00 R15: 0000000000000054
> [  740.609481]  </TASK>
> [  740.614084] INFO: task alloy:8287 blocked for more than 123 seconds.
> [  740.623089]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  740.632094] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  740.642083] task:alloy           state:D stack:0     pid:8287
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  740.656080] Call Trace:
> [  740.660694]  <TASK>
> [  740.665087]  __schedule+0x2b5/0x690
> [  740.671086]  schedule+0x23/0x80
> [  740.677080]  schedule_timeout+0xe8/0x100
> [  740.683134]  __wait_for_common+0x99/0x1c0

So this is wait_for_completion(&tx_msg->tx_complete) in
acpi_ipmi_space_handler(), if I'm not mistaken, which never gets
completed, so stuff cannot make progress.

I guess what happens is that one of the ACPI methods evaluated by the
acpi_power_meter driver trips on an IPMI operation region and locks up
because of the above.
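
For illustration only, here is a minimal userspace sketch of the pattern the
traces suggest. It is written in Python rather than kernel C, and it is not
the driver code. It assumes that the task shown in power_meter_read() (pid
8287) holds the mutex taken in hwmon_attr_show() while it waits for the IPMI
reply, and that every other reader is queued on that same mutex. The names
attr_lock, tx_complete and simulate() are hypothetical stand-ins.

```python
import threading


def simulate(reply_arrives: bool, timeout: float = 0.5) -> bool:
    """Return True if a second reader can take the lock, False if it is stuck."""
    attr_lock = threading.Lock()        # stand-in for the mutex taken in hwmon_attr_show()
    tx_complete = threading.Event()     # stand-in for tx_msg->tx_complete
    holder_has_lock = threading.Event()

    def first_reader() -> None:
        # Models the reader that ends up in acpi_ipmi_space_handler():
        # it waits for the IPMI reply while still holding the lock.
        with attr_lock:
            holder_has_lock.set()
            tx_complete.wait()          # no timeout, like wait_for_completion()

    holder = threading.Thread(target=first_reader)
    holder.start()
    holder_has_lock.wait()

    if reply_arrives:
        tx_complete.set()               # the BMC answered, the holder can finish

    # Second reader: the kernel would block here indefinitely (D state);
    # the sketch uses a timeout only so that it terminates.
    got_lock = attr_lock.acquire(timeout=timeout)
    if got_lock:
        attr_lock.release()

    tx_complete.set()                   # cleanup so the helper thread exits
    holder.join()
    return got_lock


if __name__ == "__main__":
    print("BMC answers:       second reader got lock =", simulate(True))
    print("BMC never answers: second reader got lock =", simulate(False))
```

In this model the second reader only makes progress once the completion is
signalled, which would match the hang appearing while the BMC is unavailable.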

Did you, by any chance, upgrade the firmware in the time frame between
6.17 and 6.18?

> [  740.690093]  ? __pfx_schedule_timeout+0x10/0x10
> [  740.697081]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> [  740.706086]  acpi_ev_address_space_dispatch+0x16d/0x3c0
> [  740.713584]  ? __pfx_acpi_ipmi_space_handler+0x10/0x10 [acpi_ipmi]
> [  740.722080]  acpi_ex_access_region+0x5b/0xd0
> [  740.729078]  acpi_ex_write_serial_bus+0xb6/0x1e0
> [  740.736083]  acpi_ex_store_object_to_node+0x15e/0x200
> [  740.744083]  acpi_ex_opcode_1A_1T_1R+0x24c/0x620
> [  740.750088]  acpi_ds_exec_end_op+0x287/0x510
> [  740.757076]  acpi_ps_parse_loop+0xf6/0x680
> [  740.763177]  acpi_ps_parse_aml+0x17a/0x3d0
> [  740.770074]  acpi_ps_execute_method+0x137/0x270
> [  740.776256]  acpi_ns_evaluate+0x1f4/0x2e0
> [  740.783076]  acpi_evaluate_object+0x13c/0x2e0
> [  740.790118]  acpi_evaluate_integer+0x4f/0xe0
> [  740.796571]  ? obj_cgroup_charge_account+0x187/0x2d0
> [  740.804074]  update_meter+0x61/0xd0 [acpi_power_meter]
> [  740.811082]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
> [  740.819088]  hwmon_attr_show+0x5e/0x130
> [  740.825093]  dev_attr_show+0x19/0x60
> [  740.830810]  sysfs_kf_seq_show+0xbf/0x140
> [  740.837076]  seq_read_iter+0x112/0x510
> [  740.843079]  ? security_file_permission+0x8e/0xa0
> [  740.849058]  vfs_read+0x215/0x340
> [  740.855086]  ksys_read+0x61/0xe0
> [  740.861058]  do_syscall_64+0x5d/0xb40
> [  740.866095]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  740.874104] RIP: 0033:0x408d8e
> [  740.880067] RSP: 002b:000000c007e81498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  740.890076] RAX: ffffffffffffffda RBX: 000000000000001d RCX: 0000000000408d8e
> [  740.900075] RDX: 0000000000000080 RSI: 000000c007383380 RDI: 000000000000001d
> [  740.910080] RBP: 000000c007e814d8 R08: 0000000000000000 R09: 0000000000000000
> [  740.918446] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  740.928087] R13: 0000000000000040 R14: 000000c00560ec40 R15: ffffffffffffffff
> [  740.937080]  </TASK>
> [  740.942085] INFO: task alloy:8463 blocked for more than 123 seconds.
> [  740.951076]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  740.959660] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  740.970077] task:alloy           state:D stack:0     pid:8463
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  740.983108] Call Trace:
> [  740.987305]  <TASK>
> [  740.992084]  __schedule+0x2b5/0x690
> [  740.998077]  schedule+0x23/0x80
> [  741.003472]  schedule_preempt_disabled+0x11/0x20
> [  741.009738]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  741.017079]  hwmon_attr_show+0x36/0x130
> [  741.023094]  dev_attr_show+0x19/0x60
> [  741.029076]  sysfs_kf_seq_show+0xbf/0x140
> [  741.035084]  seq_read_iter+0x112/0x510
> [  741.039031]  ? security_file_permission+0x8e/0xa0
> [  741.045212]  vfs_read+0x215/0x340
> [  741.051081]  ksys_read+0x61/0xe0
> [  741.056076]  do_syscall_64+0x5d/0xb40
> [  741.061744]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  741.069081] RIP: 0033:0x408d8e
> [  741.074087] RSP: 002b:000000c006127498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  741.084107] RAX: ffffffffffffffda RBX: 0000000000000020 RCX: 0000000000408d8e
> [  741.093151] RDX: 0000000000000080 RSI: 000000c005b15b00 RDI: 0000000000000020
> [  741.103087] RBP: 000000c0061274d8 R08: 0000000000000000 R09: 0000000000000000
> [  741.113084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  741.123120] R13: 0000000000000040 R14: 000000c005ba1dc0 R15: ffffffffffffffff
> [  741.133093]  </TASK>
> [  741.138081] INFO: task alloy:8465 blocked for more than 124 seconds.
> [  741.147085]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  741.155079] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  741.165084] task:alloy           state:D stack:0     pid:8465
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  741.178095] Call Trace:
> [  741.182595]  <TASK>
> [  741.187079]  __schedule+0x2b5/0x690
> [  741.193014]  schedule+0x23/0x80
> [  741.199071]  schedule_preempt_disabled+0x11/0x20
> [  741.206102]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  741.214091]  hwmon_attr_show+0x36/0x130
> [  741.221079]  dev_attr_show+0x19/0x60
> [  741.226081]  sysfs_kf_seq_show+0xbf/0x140
> [  741.232083]  seq_read_iter+0x112/0x510
> [  741.237943]  ? security_file_permission+0x8e/0xa0
> [  741.245080]  vfs_read+0x215/0x340
> [  741.251080]  ksys_read+0x61/0xe0
> [  741.257068]  do_syscall_64+0x5d/0xb40
> [  741.262084]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  741.270083] RIP: 0033:0x408d8e
> [  741.275229] RSP: 002b:000000c007e85498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  741.285086] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
> [  741.294078] RDX: 0000000000000080 RSI: 000000c0066a5180 RDI: 000000000000001e
> [  741.304078] RBP: 000000c007e854d8 R08: 0000000000000000 R09: 0000000000000000
> [  741.313085] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  741.323084] R13: 0000000000000040 R14: 000000c0040f8fc0 R15: ffffffffffffffff
> [  741.333078]  </TASK>
> [  741.337087] INFO: task alloy:8470 blocked for more than 124 seconds.
> [  741.346079]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  741.355091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  741.365102] task:alloy           state:D stack:0     pid:8470
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  741.378081] Call Trace:
> [  741.383127]  <TASK>
> [  741.388089]  __schedule+0x2b5/0x690
> [  741.394085]  schedule+0x23/0x80
> [  741.400084]  schedule_preempt_disabled+0x11/0x20
> [  741.408085]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  741.415246]  hwmon_attr_show+0x36/0x130
> [  741.422088]  dev_attr_show+0x19/0x60
> [  741.428079]  sysfs_kf_seq_show+0xbf/0x140
> [  741.434088]  seq_read_iter+0x112/0x510
> [  741.440075]  ? security_file_permission+0x8e/0xa0
> [  741.446661]  vfs_read+0x215/0x340
> [  741.452104]  ksys_read+0x61/0xe0
> [  741.458082]  do_syscall_64+0x5d/0xb40
> [  741.464083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  741.471412] RIP: 0033:0x408d8e
> [  741.477072] RSP: 002b:000000c007369498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  741.487081] RAX: ffffffffffffffda RBX: 0000000000000021 RCX: 0000000000408d8e
> [  741.496078] RDX: 0000000000000080 RSI: 000000c005cff500 RDI: 0000000000000021
> [  741.506094] RBP: 000000c0073694d8 R08: 0000000000000000 R09: 0000000000000000
> [  741.515087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  741.525084] R13: 0000000000000040 R14: 000000c001db6c40 R15: ffffffffffffffff
> [  741.534086]  </TASK>
> [  741.539094] INFO: task alloy:8474 blocked for more than 124 seconds.
> [  741.548092]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  741.557097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  741.567095] task:alloy           state:D stack:0     pid:8474
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  741.581079] Call Trace:
> [  741.586087]  <TASK>
> [  741.591162]  __schedule+0x2b5/0x690
> [  741.597103]  schedule+0x23/0x80
> [  741.603109]  schedule_preempt_disabled+0x11/0x20
> [  741.610098]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  741.617086]  hwmon_attr_show+0x36/0x130
> [  741.623137]  dev_attr_show+0x19/0x60
> [  741.629087]  sysfs_kf_seq_show+0xbf/0x140
> [  741.635090]  seq_read_iter+0x112/0x510
> [  741.641083]  ? security_file_permission+0x8e/0xa0
> [  741.648095]  vfs_read+0x215/0x340
> [  741.654101]  ksys_read+0x61/0xe0
> [  741.660087]  do_syscall_64+0x5d/0xb40
> [  741.666090]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  741.672343] RIP: 0033:0x408d8e
> [  741.678083] RSP: 002b:000000c000b03498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  741.688096] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 0000000000408d8e
> [  741.698084] RDX: 0000000000000080 RSI: 000000c0069df000 RDI: 0000000000000025
> [  741.706728] RBP: 000000c000b034d8 R08: 0000000000000000 R09: 0000000000000000
> [  741.716084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  741.725093] R13: 0000000000000040 R14: 000000c004075340 R15: ffffffffffffffff
> [  741.735124]  </TASK>
> [  741.737900] INFO: task alloy:8726 blocked for more than 124 seconds.
> [  741.747088]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  741.756086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  741.766111] task:alloy           state:D stack:0     pid:8726
> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> [  741.780116] Call Trace:
> [  741.785093]  <TASK>
> [  741.790100]  __schedule+0x2b5/0x690
> [  741.796596]  schedule+0x23/0x80
> [  741.802109]  schedule_preempt_disabled+0x11/0x20
> [  741.809091]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  741.815262]  hwmon_attr_show+0x36/0x130
> [  741.822096]  dev_attr_show+0x19/0x60
> [  741.828105]  sysfs_kf_seq_show+0xbf/0x140
> [  741.834088]  seq_read_iter+0x112/0x510
> [  741.839547]  ? security_file_permission+0x8e/0xa0
> [  741.846570]  vfs_read+0x215/0x340
> [  741.850048] perf: interrupt took too long (2993 > 2500), lowering
> kernel.perf_event_max_sample_rate to 66000
> [  741.863112]  ksys_read+0x61/0xe0
> [  741.868778]  do_syscall_64+0x5d/0xb40
> [  741.875104]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  741.883101] RIP: 0033:0x408d8e
> [  741.888436] RSP: 002b:000000c0085cb498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  741.898081] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 0000000000408d8e
> [  741.907092] RDX: 0000000000000080 RSI: 000000c006424e00 RDI: 0000000000000024
> [  741.917092] RBP: 000000c0085cb4d8 R08: 0000000000000000 R09: 0000000000000000
> [  741.926087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  741.936089] R13: 0000000000000040 R14: 000000c00565aa80 R15: ffffffffffffffff
> [  741.945098]  </TASK>
> [  862.885552] INFO: task alloy:8260 blocked for more than 245 seconds.
> [  862.892117]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> [  862.904592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  862.914632] task:alloy           state:D stack:0     pid:8260
> tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
> [  862.928478] Call Trace:
> [  862.932477]  <TASK>
> [  862.938482]  __schedule+0x2b5/0x690
> [  862.944502]  schedule+0x23/0x80
> [  862.949478]  schedule_preempt_disabled+0x11/0x20
> [  862.955487]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  862.962489]  hwmon_attr_show+0x36/0x130
> [  862.968485]  dev_attr_show+0x19/0x60
> [  862.974482]  sysfs_kf_seq_show+0xbf/0x140
> [  862.981517]  seq_read_iter+0x112/0x510
> [  862.987485]  ? security_file_permission+0x8e/0xa0
> [  862.994505]  vfs_read+0x215/0x340
> [  863.000295]  ksys_read+0x61/0xe0
> [  863.006484]  do_syscall_64+0x5d/0xb40
> [  863.013226]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  863.021497] RIP: 0033:0x408d8e
> [  863.027499] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000000
> [  863.037483] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
> [  863.047483] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
> [  863.057487] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
> [  863.066499] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> [  863.076485] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
> [  863.087007]  </TASK>
>
>
> >
> > Thanks,
> > Guenter
> >
>


* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-02 19:26                           ` Rafael J. Wysocki
@ 2026-02-03  0:26                             ` Guenter Roeck
  2026-02-03  8:23                             ` Jaroslav Pulchart
  1 sibling, 0 replies; 60+ messages in thread
From: Guenter Roeck @ 2026-02-03  0:26 UTC (permalink / raw)
  To: Rafael J. Wysocki, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica, Huisong Li

On 2/2/26 11:26, Rafael J. Wysocki wrote:
> On Mon, Feb 2, 2026 at 7:14 PM Jaroslav Pulchart
> <jaroslav.pulchart@gooddata.com> wrote:
>>
>>>
>>> On 2/2/26 08:48, Jaroslav Pulchart wrote:
>>>>>
>>>>> On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
>>>>>>>
>>>>>>> On 1/27/26 03:58, Jaroslav Pulchart wrote:
>>>>>>> ...
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Thank you for the analysis and insights.
>>>>>>>>>>
>>>>>>>>>> To add some context from our side: we observe this issue shortly after
>>>>>>>>>> a server reinstall or fw updates followed by a reboot, typically when
>>>>>>>>>> hwmon sensors are accessed for the first time (e.g. by monitoring during
>>>>>>>>>> early system initialization). We have not seen it trigger during normal
>>>>>>>>>> operation, such as a simple reboot or on long-running systems.
>>>>>>>>>>
>>>>>>>>>> Given this behavior, it is possible that the issue is related to
>>>>>>>>>> hwmon_device_unregister() being triggered early during system
>>>>>>>>>> initialization, but I do not have concrete evidence at this point.
>>>>>>>>>>
>>>>>>>>>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
>>>>>>>>>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Would it be possible for you to test the patch series I just sent out ?
>>>>>>>>
>>>>>>>> I suppose these 5 patches from
>>>>>>>> https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
>>>>>>>>     ?
>>>>>>>
>>>>>>> Correct.
>>>>>>>
>>>>>>>> I will try them and let you know.
>>>>>>>>
>>>>>>> Thanks !
>>>>>>
>>>>>>
>>>>>> The issue is still there:
>>>>>> ...
>>>>>
>>>>> Thanks a lot for testing.
>>>>>
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
>>>>>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
>>>>>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
>>>>>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
>>>>>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
>>>>>
>>>>> Looks like acpi holds another lock. Can you check if the patch submitted by
>>>>> Rafael fixes the problem ?
>>>>
>>>> Hello,
>>>>
>>>> I tested the patch as a standalone change, and the issue is still present.
>>>>
>>>> However, I can now reliably reproduce the problem. It is triggered while
>>>> Dell iDRAC is being reset or updated and is temporarily unavailable. During
>>>> this time, a [kipmi0] task enters D state for an extended period, and
>>>> shortly after (or during this), the ACPI power meter hwmon interface ends up
>>>> in the locked state described earlier.
>>>>
>>>> This explains why we mostly observe the issue after redeployments in our
>>>> environment: server redeployments typically include iDRAC firmware updates
>>>> or resets performed together with operating system installation or updates.
>>>>
>>>
>>> Thanks for the update. Do you have a dmesg log showing the problem with Rafael's
>>> patch applied, by any chance ?
>>
>> [  740.004138] INFO: task alloy:8260 blocked for more than 122 seconds.
>> [  740.010752]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
>> [  740.019257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [  740.030103] task:alloy           state:D stack:0     pid:8260
>> tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
>> [  740.045068] Call Trace:
>> [  740.050334]  <TASK>
>> [  740.055075]  __schedule+0x2b5/0x690
>> [  740.062073]  schedule+0x23/0x80
>> [  740.068081]  schedule_preempt_disabled+0x11/0x20
>> [  740.074962]  __mutex_lock.constprop.0+0x3c9/0xa00
>> [  740.083073]  hwmon_attr_show+0x36/0x130
>> [  740.089081]  dev_attr_show+0x19/0x60
>> [  740.095876]  sysfs_kf_seq_show+0xbf/0x140
>> [  740.102810]  seq_read_iter+0x112/0x510
>> [  740.109070]  ? security_file_permission+0x8e/0xa0
>> [  740.117100]  vfs_read+0x215/0x340
>> [  740.123113]  ksys_read+0x61/0xe0
>> [  740.130088]  do_syscall_64+0x5d/0xb40
>> [  740.137032]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [  740.144601] RIP: 0033:0x408d8e
>> [  740.151070] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
>> 0000000000000000
>> [  740.160825] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
>> [  740.170097] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
>> [  740.180085] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
>> [  740.190074] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
>> [  740.201069] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
>> [  740.211072]  </TASK>
>> [  740.215117] INFO: task alloy:8268 blocked for more than 123 seconds.
>> [  740.224109]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
>> [  740.234089] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [  740.243611] task:alloy           state:D stack:0     pid:8268
>> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
>> [  740.258077] Call Trace:
>> [  740.261260]  <TASK>
>> [  740.266083]  __schedule+0x2b5/0x690
>> [  740.273081]  schedule+0x23/0x80
>> [  740.279055]  schedule_preempt_disabled+0x11/0x20
>> [  740.286087]  __mutex_lock.constprop.0+0x3c9/0xa00
>> [  740.294074]  hwmon_attr_show+0x36/0x130
>> [  740.300089]  dev_attr_show+0x19/0x60
>> [  740.306074]  sysfs_kf_seq_show+0xbf/0x140
>> [  740.313074]  seq_read_iter+0x112/0x510
>> [  740.319075]  ? security_file_permission+0x8e/0xa0
>> [  740.326064]  vfs_read+0x215/0x340
>> [  740.331218]  ksys_read+0x61/0xe0
>> [  740.337078]  do_syscall_64+0x5d/0xb40
>> [  740.342083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [  740.349376] RIP: 0033:0x408d8e
>> [  740.355086] RSP: 002b:000000c008403498 EFLAGS: 00000206 ORIG_RAX:
>> 0000000000000000
>> [  740.365088] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 0000000000408d8e
>> [  740.375072] RDX: 0000000000000080 RSI: 000000c006a47d80 RDI: 000000000000001f
>> [  740.383669] RBP: 000000c0084034d8 R08: 0000000000000000 R09: 0000000000000000
>> [  740.393072] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
>> [  740.401326] R13: 0000000000000040 R14: 000000c005ea7880 R15: ffffffffffffffff
>> [  740.411087]  </TASK>
>> [  740.416089] INFO: task alloy:8270 blocked for more than 123 seconds.
>> [  740.425081]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
>> [  740.434091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [  740.444078] task:alloy           state:D stack:0     pid:8270
>> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
>> [  740.457108] Call Trace:
>> [  740.462080]  <TASK>
>> [  740.467073]  __schedule+0x2b5/0x690
>> [  740.472252]  schedule+0x23/0x80
>> [  740.478087]  schedule_preempt_disabled+0x11/0x20
>> [  740.485099]  __mutex_lock.constprop.0+0x3c9/0xa00
>> [  740.491096]  hwmon_attr_show+0x36/0x130
>> [  740.497082]  dev_attr_show+0x19/0x60
>> [  740.503078]  sysfs_kf_seq_show+0xbf/0x140
>> [  740.510030]  seq_read_iter+0x112/0x510
>> [  740.516073]  ? security_file_permission+0x8e/0xa0
>> [  740.523355]  vfs_read+0x215/0x340
>> [  740.529077]  ksys_read+0x61/0xe0
>> [  740.534925]  do_syscall_64+0x5d/0xb40
>> [  740.540253]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [  740.548077] RIP: 0033:0x408d8e
>> [  740.553078] RSP: 002b:000000c0085cf498 EFLAGS: 00000206 ORIG_RAX:
>> 0000000000000000
>> [  740.563075] RAX: ffffffffffffffda RBX: 000000000000001b RCX: 0000000000408d8e
>> [  740.572088] RDX: 0000000000000080 RSI: 000000c006d44f80 RDI: 000000000000001b
>> [  740.582076] RBP: 000000c0085cf4d8 R08: 0000000000000000 R09: 0000000000000000
>> [  740.591086] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
>> [  740.600086] R13: 0000000000000040 R14: 000000c005703c00 R15: 0000000000000054
>> [  740.609481]  </TASK>
>> [  740.614084] INFO: task alloy:8287 blocked for more than 123 seconds.
>> [  740.623089]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
>> [  740.632094] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [  740.642083] task:alloy           state:D stack:0     pid:8287
>> tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
>> [  740.656080] Call Trace:
>> [  740.660694]  <TASK>
>> [  740.665087]  __schedule+0x2b5/0x690
>> [  740.671086]  schedule+0x23/0x80
>> [  740.677080]  schedule_timeout+0xe8/0x100
>> [  740.683134]  __wait_for_common+0x99/0x1c0
> 
> So this is wait_for_completion(&tx_msg->tx_complete) in
> acpi_ipmi_space_handler(), if I'm not mistaken, which never gets
> completed, so stuff cannot make progress.
> 
> I guess what happens is that one of the ACPI methods evaluated by the
> acpi_power_meter driver trips on an IPMI operation region and locks up
> because of the above.
> 
> Did you, by any chance, upgrade the firmware in the time frame between
> 6.17 and 6.18?
> 

"It is triggered while Dell iDRAC is being reset or updated and is temporarily unavailable."

so I guess this is a yes. What I don't understand is why this wasn't seen before.
I would have assumed that the problem exists even without the additional
lock in the hardware monitoring core. Obviously I am missing something.

Any idea ?
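
For reference, the pattern visible in the traces (one reader holding the
lock taken in hwmon_attr_show() while it waits in acpi_ipmi_space_handler(),
all other readers queued behind that lock) can be modelled in user space.
This is only a sketch under stated assumptions: attr_lock, tx_complete,
read_attr() and simulate() are hypothetical stand-ins, not kernel code, and
the waits are bounded so that the demo terminates.

```python
import threading
import time

# Hypothetical stand-ins, not kernel objects:
attr_lock = threading.Lock()     # models the lock taken at the top of hwmon_attr_show()
tx_complete = threading.Event()  # models the completion acpi_ipmi waits on; never set here


def read_attr(name, results, lock_timeout, wait_timeout):
    """Model one sysfs read: take the lock, then wait for the IPMI reply."""
    if not attr_lock.acquire(timeout=lock_timeout):
        # In the kernel there is no timeout; the task sits in __mutex_lock().
        results[name] = "blocked on lock"
        return
    try:
        # The BMC is unavailable, so nobody signals the completion.
        done = tx_complete.wait(timeout=wait_timeout)
        results[name] = "ok" if done else "stuck in wait"
    finally:
        attr_lock.release()


def simulate(n_readers=3):
    """Start one reader that gets the lock, then the remaining readers."""
    results = {}
    first = threading.Thread(target=read_attr, args=("reader0", results, 0.2, 1.0))
    first.start()
    time.sleep(0.1)  # make sure reader0 owns the lock before the others start
    others = [
        threading.Thread(target=read_attr, args=(f"reader{i}", results, 0.2, 1.0))
        for i in range(1, n_readers)
    ]
    for t in others:
        t.start()
    for t in [first] + others:
        t.join()
    return results


if __name__ == "__main__":
    for name, state in sorted(simulate().items()):
        print(name, state)
```

In this model only reader0 ever reaches the wait; every other reader gives
up on the lock, which matches one task in __wait_for_common and the rest in
__mutex_lock in the log above.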

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-02 19:26                           ` Rafael J. Wysocki
  2026-02-03  0:26                             ` Guenter Roeck
@ 2026-02-03  8:23                             ` Jaroslav Pulchart
  2026-02-03 23:21                               ` [BISECTED]: " Jaroslav Pulchart
  1 sibling, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-02-03  8:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Guenter Roeck, linux-acpi, linux-hwmon, Igor Raits, Daniel Secik,
	Zdenek Pesek, Jiri Jurica, Huisong Li

> On Mon, Feb 2, 2026 at 7:14 PM Jaroslav Pulchart
> <jaroslav.pulchart@gooddata.com> wrote:
> >
> > >
> > > On 2/2/26 08:48, Jaroslav Pulchart wrote:
> > > >>
> > > >> On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
> > > >>>>
> > > >>>> On 1/27/26 03:58, Jaroslav Pulchart wrote:
> > > >>>> ...
> > > >>>>>>> Hello,
> > > >>>>>>>
> > > >>>>>>> Thank you for the analysis and insights.
> > > >>>>>>>
> > > >>>>>>> To add some context from our side: we observe this issue shortly after
> > > >>>>>>> a server reinstall or fw updates followed by a reboot, typically when
> > > >>>>>>> hwmon sensors are accessed for the first time (e.g. by monitoring during
> > > >>>>>>> early system initialization). We have not seen it trigger during normal
> > > >>>>>>> operation, such as a simple reboot or on long-running systems.
> > > >>>>>>>
> > > >>>>>>> Given this behavior, it is possible that the issue is related to
> > > >>>>>>> hwmon_device_unregister() being triggered early during system
> > > >>>>>>> initialization, but I do not have concrete evidence at this point.
> > > >>>>>>>
> > > >>>>>>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> > > >>>>>>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>> Would it be possible for you to test the patch series I just sent out ?
> > > >>>>>
> > > >>>>> I suppose these 5 patches from
> > > >>>>> https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
> > > >>>>>    ?
> > > >>>>
> > > >>>> Correct.
> > > >>>>
> > > >>>>> I will try them and let you know.
> > > >>>>>
> > > >>>> Thanks !
> > > >>>
> > > >>>
> > > >>> The issue is still there:
> > > >>> ...
> > > >>
> > > >> Thanks a lot for testing.
> > > >>
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
> > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
> > > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
> > > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> > > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
> > > >>
> > > >> Looks like acpi holds another lock. Can you check if the patch submitted by
> > > >> Rafael fixes the problem ?
> > > >
> > > > Hello,
> > > >
> > > > I tested the patch as a standalone change, and the issue is still present.
> > > >
> > > > However, I can now reliably reproduce the problem. It is triggered while
> > > > Dell iDRAC is being reset or updated and is temporarily unavailable. During
> > > > this time, a [kipmi0] task enters D state for an extended period, and
> > > > shortly after (or during this), the ACPI power meter hwmon interface ends up
> > > > in the locked state described earlier.
> > > >
> > > > This explains why we mostly observe the issue after redeployments in our
> > > > environment: server redeployments typically include iDRAC firmware updates
> > > > or resets performed together with operating system installation or updates.
> > > >
> > >
> > > Thanks for the update. Do you have a dmesg log showing the problem with Rafael's
> > > patch applied, by any chance ?
> >
> > [  740.004138] INFO: task alloy:8260 blocked for more than 122 seconds.
> > [  740.010752]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  740.019257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  740.030103] task:alloy           state:D stack:0     pid:8260
> > tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
> > [  740.045068] Call Trace:
> > [  740.050334]  <TASK>
> > [  740.055075]  __schedule+0x2b5/0x690
> > [  740.062073]  schedule+0x23/0x80
> > [  740.068081]  schedule_preempt_disabled+0x11/0x20
> > [  740.074962]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  740.083073]  hwmon_attr_show+0x36/0x130
> > [  740.089081]  dev_attr_show+0x19/0x60
> > [  740.095876]  sysfs_kf_seq_show+0xbf/0x140
> > [  740.102810]  seq_read_iter+0x112/0x510
> > [  740.109070]  ? security_file_permission+0x8e/0xa0
> > [  740.117100]  vfs_read+0x215/0x340
> > [  740.123113]  ksys_read+0x61/0xe0
> > [  740.130088]  do_syscall_64+0x5d/0xb40
> > [  740.137032]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  740.144601] RIP: 0033:0x408d8e
> > [  740.151070] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  740.160825] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
> > [  740.170097] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
> > [  740.180085] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
> > [  740.190074] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  740.201069] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
> > [  740.211072]  </TASK>
> > [  740.215117] INFO: task alloy:8268 blocked for more than 123 seconds.
> > [  740.224109]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  740.234089] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  740.243611] task:alloy           state:D stack:0     pid:8268
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  740.258077] Call Trace:
> > [  740.261260]  <TASK>
> > [  740.266083]  __schedule+0x2b5/0x690
> > [  740.273081]  schedule+0x23/0x80
> > [  740.279055]  schedule_preempt_disabled+0x11/0x20
> > [  740.286087]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  740.294074]  hwmon_attr_show+0x36/0x130
> > [  740.300089]  dev_attr_show+0x19/0x60
> > [  740.306074]  sysfs_kf_seq_show+0xbf/0x140
> > [  740.313074]  seq_read_iter+0x112/0x510
> > [  740.319075]  ? security_file_permission+0x8e/0xa0
> > [  740.326064]  vfs_read+0x215/0x340
> > [  740.331218]  ksys_read+0x61/0xe0
> > [  740.337078]  do_syscall_64+0x5d/0xb40
> > [  740.342083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  740.349376] RIP: 0033:0x408d8e
> > [  740.355086] RSP: 002b:000000c008403498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  740.365088] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 0000000000408d8e
> > [  740.375072] RDX: 0000000000000080 RSI: 000000c006a47d80 RDI: 000000000000001f
> > [  740.383669] RBP: 000000c0084034d8 R08: 0000000000000000 R09: 0000000000000000
> > [  740.393072] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  740.401326] R13: 0000000000000040 R14: 000000c005ea7880 R15: ffffffffffffffff
> > [  740.411087]  </TASK>
> > [  740.416089] INFO: task alloy:8270 blocked for more than 123 seconds.
> > [  740.425081]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  740.434091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  740.444078] task:alloy           state:D stack:0     pid:8270
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  740.457108] Call Trace:
> > [  740.462080]  <TASK>
> > [  740.467073]  __schedule+0x2b5/0x690
> > [  740.472252]  schedule+0x23/0x80
> > [  740.478087]  schedule_preempt_disabled+0x11/0x20
> > [  740.485099]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  740.491096]  hwmon_attr_show+0x36/0x130
> > [  740.497082]  dev_attr_show+0x19/0x60
> > [  740.503078]  sysfs_kf_seq_show+0xbf/0x140
> > [  740.510030]  seq_read_iter+0x112/0x510
> > [  740.516073]  ? security_file_permission+0x8e/0xa0
> > [  740.523355]  vfs_read+0x215/0x340
> > [  740.529077]  ksys_read+0x61/0xe0
> > [  740.534925]  do_syscall_64+0x5d/0xb40
> > [  740.540253]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  740.548077] RIP: 0033:0x408d8e
> > [  740.553078] RSP: 002b:000000c0085cf498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  740.563075] RAX: ffffffffffffffda RBX: 000000000000001b RCX: 0000000000408d8e
> > [  740.572088] RDX: 0000000000000080 RSI: 000000c006d44f80 RDI: 000000000000001b
> > [  740.582076] RBP: 000000c0085cf4d8 R08: 0000000000000000 R09: 0000000000000000
> > [  740.591086] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  740.600086] R13: 0000000000000040 R14: 000000c005703c00 R15: 0000000000000054
> > [  740.609481]  </TASK>
> > [  740.614084] INFO: task alloy:8287 blocked for more than 123 seconds.
> > [  740.623089]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  740.632094] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  740.642083] task:alloy           state:D stack:0     pid:8287
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  740.656080] Call Trace:
> > [  740.660694]  <TASK>
> > [  740.665087]  __schedule+0x2b5/0x690
> > [  740.671086]  schedule+0x23/0x80
> > [  740.677080]  schedule_timeout+0xe8/0x100
> > [  740.683134]  __wait_for_common+0x99/0x1c0
>
> So this is wait_for_completion(&tx_msg->tx_complete) in
> acpi_ipmi_space_handler(), if I'm not mistaken, which never gets
> completed, so stuff cannot make progress.
>
> I guess what happens is that one of the ACPI methods evaluated by the
> acpi_power_meter driver trips on an IPMI operation region and locks up
> because of the above.
>
> Did you, by any chance, upgrade the firmware in the time frame between
> 6.17 and 6.18?

I retested this now with the same firmware versions on both kernel
lines: an iDRAC reset reliably triggers the issue on 6.18.y, while the
same reproducer does not trigger any issue on 6.17.12.
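
To check whether a given attribute is currently hung without blocking the
caller as well, something along these lines can be used. It is a minimal
sketch, not what our monitoring does: probe_read() and its timeout are
hypothetical, it assumes cat is available, and it deliberately does not wait
for the child after a timeout, because a task in D state does not act on
SIGKILL until it wakes up.

```python
import subprocess
import sys
import tempfile


def probe_read(path, timeout_s=5.0):
    """Return the attribute text, or None if the read failed or timed out."""
    with tempfile.TemporaryFile() as out:
        child = subprocess.Popen(
            ["cat", path], stdout=out, stderr=subprocess.DEVNULL
        )
        try:
            child.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Best effort only: do not wait() again, the child may be
            # unkillable while it sits in uninterruptible sleep.
            child.kill()
            return None
        if child.returncode != 0:
            return None
        out.seek(0)
        return out.read().decode().strip()


if __name__ == "__main__":
    # Usage: python3 probe.py <sysfs attribute path> [...]
    for attr in sys.argv[1:]:
        value = probe_read(attr)
        print(attr, "no reply within timeout or read error" if value is None else value)
```

A child that is stuck stays around in D state until the underlying wait
finishes, so this only tells you that the attribute is hung; it does not
clean anything up.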

>
> > [  740.690093]  ? __pfx_schedule_timeout+0x10/0x10
> > [  740.697081]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> > [  740.706086]  acpi_ev_address_space_dispatch+0x16d/0x3c0
> > [  740.713584]  ? __pfx_acpi_ipmi_space_handler+0x10/0x10 [acpi_ipmi]
> > [  740.722080]  acpi_ex_access_region+0x5b/0xd0
> > [  740.729078]  acpi_ex_write_serial_bus+0xb6/0x1e0
> > [  740.736083]  acpi_ex_store_object_to_node+0x15e/0x200
> > [  740.744083]  acpi_ex_opcode_1A_1T_1R+0x24c/0x620
> > [  740.750088]  acpi_ds_exec_end_op+0x287/0x510
> > [  740.757076]  acpi_ps_parse_loop+0xf6/0x680
> > [  740.763177]  acpi_ps_parse_aml+0x17a/0x3d0
> > [  740.770074]  acpi_ps_execute_method+0x137/0x270
> > [  740.776256]  acpi_ns_evaluate+0x1f4/0x2e0
> > [  740.783076]  acpi_evaluate_object+0x13c/0x2e0
> > [  740.790118]  acpi_evaluate_integer+0x4f/0xe0
> > [  740.796571]  ? obj_cgroup_charge_account+0x187/0x2d0
> > [  740.804074]  update_meter+0x61/0xd0 [acpi_power_meter]
> > [  740.811082]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
> > [  740.819088]  hwmon_attr_show+0x5e/0x130
> > [  740.825093]  dev_attr_show+0x19/0x60
> > [  740.830810]  sysfs_kf_seq_show+0xbf/0x140
> > [  740.837076]  seq_read_iter+0x112/0x510
> > [  740.843079]  ? security_file_permission+0x8e/0xa0
> > [  740.849058]  vfs_read+0x215/0x340
> > [  740.855086]  ksys_read+0x61/0xe0
> > [  740.861058]  do_syscall_64+0x5d/0xb40
> > [  740.866095]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  740.874104] RIP: 0033:0x408d8e
> > [  740.880067] RSP: 002b:000000c007e81498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  740.890076] RAX: ffffffffffffffda RBX: 000000000000001d RCX: 0000000000408d8e
> > [  740.900075] RDX: 0000000000000080 RSI: 000000c007383380 RDI: 000000000000001d
> > [  740.910080] RBP: 000000c007e814d8 R08: 0000000000000000 R09: 0000000000000000
> > [  740.918446] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  740.928087] R13: 0000000000000040 R14: 000000c00560ec40 R15: ffffffffffffffff
> > [  740.937080]  </TASK>
> > [  740.942085] INFO: task alloy:8463 blocked for more than 123 seconds.
> > [  740.951076]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  740.959660] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  740.970077] task:alloy           state:D stack:0     pid:8463
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  740.983108] Call Trace:
> > [  740.987305]  <TASK>
> > [  740.992084]  __schedule+0x2b5/0x690
> > [  740.998077]  schedule+0x23/0x80
> > [  741.003472]  schedule_preempt_disabled+0x11/0x20
> > [  741.009738]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  741.017079]  hwmon_attr_show+0x36/0x130
> > [  741.023094]  dev_attr_show+0x19/0x60
> > [  741.029076]  sysfs_kf_seq_show+0xbf/0x140
> > [  741.035084]  seq_read_iter+0x112/0x510
> > [  741.039031]  ? security_file_permission+0x8e/0xa0
> > [  741.045212]  vfs_read+0x215/0x340
> > [  741.051081]  ksys_read+0x61/0xe0
> > [  741.056076]  do_syscall_64+0x5d/0xb40
> > [  741.061744]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  741.069081] RIP: 0033:0x408d8e
> > [  741.074087] RSP: 002b:000000c006127498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  741.084107] RAX: ffffffffffffffda RBX: 0000000000000020 RCX: 0000000000408d8e
> > [  741.093151] RDX: 0000000000000080 RSI: 000000c005b15b00 RDI: 0000000000000020
> > [  741.103087] RBP: 000000c0061274d8 R08: 0000000000000000 R09: 0000000000000000
> > [  741.113084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  741.123120] R13: 0000000000000040 R14: 000000c005ba1dc0 R15: ffffffffffffffff
> > [  741.133093]  </TASK>
> > [  741.138081] INFO: task alloy:8465 blocked for more than 124 seconds.
> > [  741.147085]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  741.155079] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  741.165084] task:alloy           state:D stack:0     pid:8465
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  741.178095] Call Trace:
> > [  741.182595]  <TASK>
> > [  741.187079]  __schedule+0x2b5/0x690
> > [  741.193014]  schedule+0x23/0x80
> > [  741.199071]  schedule_preempt_disabled+0x11/0x20
> > [  741.206102]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  741.214091]  hwmon_attr_show+0x36/0x130
> > [  741.221079]  dev_attr_show+0x19/0x60
> > [  741.226081]  sysfs_kf_seq_show+0xbf/0x140
> > [  741.232083]  seq_read_iter+0x112/0x510
> > [  741.237943]  ? security_file_permission+0x8e/0xa0
> > [  741.245080]  vfs_read+0x215/0x340
> > [  741.251080]  ksys_read+0x61/0xe0
> > [  741.257068]  do_syscall_64+0x5d/0xb40
> > [  741.262084]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  741.270083] RIP: 0033:0x408d8e
> > [  741.275229] RSP: 002b:000000c007e85498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  741.285086] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
> > [  741.294078] RDX: 0000000000000080 RSI: 000000c0066a5180 RDI: 000000000000001e
> > [  741.304078] RBP: 000000c007e854d8 R08: 0000000000000000 R09: 0000000000000000
> > [  741.313085] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  741.323084] R13: 0000000000000040 R14: 000000c0040f8fc0 R15: ffffffffffffffff
> > [  741.333078]  </TASK>
> > [  741.337087] INFO: task alloy:8470 blocked for more than 124 seconds.
> > [  741.346079]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  741.355091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  741.365102] task:alloy           state:D stack:0     pid:8470
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  741.378081] Call Trace:
> > [  741.383127]  <TASK>
> > [  741.388089]  __schedule+0x2b5/0x690
> > [  741.394085]  schedule+0x23/0x80
> > [  741.400084]  schedule_preempt_disabled+0x11/0x20
> > [  741.408085]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  741.415246]  hwmon_attr_show+0x36/0x130
> > [  741.422088]  dev_attr_show+0x19/0x60
> > [  741.428079]  sysfs_kf_seq_show+0xbf/0x140
> > [  741.434088]  seq_read_iter+0x112/0x510
> > [  741.440075]  ? security_file_permission+0x8e/0xa0
> > [  741.446661]  vfs_read+0x215/0x340
> > [  741.452104]  ksys_read+0x61/0xe0
> > [  741.458082]  do_syscall_64+0x5d/0xb40
> > [  741.464083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  741.471412] RIP: 0033:0x408d8e
> > [  741.477072] RSP: 002b:000000c007369498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  741.487081] RAX: ffffffffffffffda RBX: 0000000000000021 RCX: 0000000000408d8e
> > [  741.496078] RDX: 0000000000000080 RSI: 000000c005cff500 RDI: 0000000000000021
> > [  741.506094] RBP: 000000c0073694d8 R08: 0000000000000000 R09: 0000000000000000
> > [  741.515087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  741.525084] R13: 0000000000000040 R14: 000000c001db6c40 R15: ffffffffffffffff
> > [  741.534086]  </TASK>
> > [  741.539094] INFO: task alloy:8474 blocked for more than 124 seconds.
> > [  741.548092]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  741.557097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  741.567095] task:alloy           state:D stack:0     pid:8474
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  741.581079] Call Trace:
> > [  741.586087]  <TASK>
> > [  741.591162]  __schedule+0x2b5/0x690
> > [  741.597103]  schedule+0x23/0x80
> > [  741.603109]  schedule_preempt_disabled+0x11/0x20
> > [  741.610098]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  741.617086]  hwmon_attr_show+0x36/0x130
> > [  741.623137]  dev_attr_show+0x19/0x60
> > [  741.629087]  sysfs_kf_seq_show+0xbf/0x140
> > [  741.635090]  seq_read_iter+0x112/0x510
> > [  741.641083]  ? security_file_permission+0x8e/0xa0
> > [  741.648095]  vfs_read+0x215/0x340
> > [  741.654101]  ksys_read+0x61/0xe0
> > [  741.660087]  do_syscall_64+0x5d/0xb40
> > [  741.666090]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  741.672343] RIP: 0033:0x408d8e
> > [  741.678083] RSP: 002b:000000c000b03498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  741.688096] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 0000000000408d8e
> > [  741.698084] RDX: 0000000000000080 RSI: 000000c0069df000 RDI: 0000000000000025
> > [  741.706728] RBP: 000000c000b034d8 R08: 0000000000000000 R09: 0000000000000000
> > [  741.716084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  741.725093] R13: 0000000000000040 R14: 000000c004075340 R15: ffffffffffffffff
> > [  741.735124]  </TASK>
> > [  741.737900] INFO: task alloy:8726 blocked for more than 124 seconds.
> > [  741.747088]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  741.756086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  741.766111] task:alloy           state:D stack:0     pid:8726
> > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > [  741.780116] Call Trace:
> > [  741.785093]  <TASK>
> > [  741.790100]  __schedule+0x2b5/0x690
> > [  741.796596]  schedule+0x23/0x80
> > [  741.802109]  schedule_preempt_disabled+0x11/0x20
> > [  741.809091]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  741.815262]  hwmon_attr_show+0x36/0x130
> > [  741.822096]  dev_attr_show+0x19/0x60
> > [  741.828105]  sysfs_kf_seq_show+0xbf/0x140
> > [  741.834088]  seq_read_iter+0x112/0x510
> > [  741.839547]  ? security_file_permission+0x8e/0xa0
> > [  741.846570]  vfs_read+0x215/0x340
> > [  741.850048] perf: interrupt took too long (2993 > 2500), lowering
> > kernel.perf_event_max_sample_rate to 66000
> > [  741.863112]  ksys_read+0x61/0xe0
> > [  741.868778]  do_syscall_64+0x5d/0xb40
> > [  741.875104]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  741.883101] RIP: 0033:0x408d8e
> > [  741.888436] RSP: 002b:000000c0085cb498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  741.898081] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 0000000000408d8e
> > [  741.907092] RDX: 0000000000000080 RSI: 000000c006424e00 RDI: 0000000000000024
> > [  741.917092] RBP: 000000c0085cb4d8 R08: 0000000000000000 R09: 0000000000000000
> > [  741.926087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  741.936089] R13: 0000000000000040 R14: 000000c00565aa80 R15: ffffffffffffffff
> > [  741.945098]  </TASK>
> > [  862.885552] INFO: task alloy:8260 blocked for more than 245 seconds.
> > [  862.892117]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > [  862.904592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  862.914632] task:alloy           state:D stack:0     pid:8260
> > tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
> > [  862.928478] Call Trace:
> > [  862.932477]  <TASK>
> > [  862.938482]  __schedule+0x2b5/0x690
> > [  862.944502]  schedule+0x23/0x80
> > [  862.949478]  schedule_preempt_disabled+0x11/0x20
> > [  862.955487]  __mutex_lock.constprop.0+0x3c9/0xa00
> > [  862.962489]  hwmon_attr_show+0x36/0x130
> > [  862.968485]  dev_attr_show+0x19/0x60
> > [  862.974482]  sysfs_kf_seq_show+0xbf/0x140
> > [  862.981517]  seq_read_iter+0x112/0x510
> > [  862.987485]  ? security_file_permission+0x8e/0xa0
> > [  862.994505]  vfs_read+0x215/0x340
> > [  863.000295]  ksys_read+0x61/0xe0
> > [  863.006484]  do_syscall_64+0x5d/0xb40
> > [  863.013226]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [  863.021497] RIP: 0033:0x408d8e
> > [  863.027499] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000000
> > [  863.037483] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
> > [  863.047483] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
> > [  863.057487] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
> > [  863.066499] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > [  863.076485] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
> > [  863.087007]  </TASK>
> >
> >
> > >
> > > Thanks,
> > > Guenter
> > >
> >

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-03  8:23                             ` Jaroslav Pulchart
@ 2026-02-03 23:21                               ` Jaroslav Pulchart
  2026-02-04  1:01                                 ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-02-03 23:21 UTC (permalink / raw)
  To: Guenter Roeck, Rafael J. Wysocki, linux-acpi, linux-hwmon
  Cc: Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

> > On Mon, Feb 2, 2026 at 7:14 PM Jaroslav Pulchart
> > <jaroslav.pulchart@gooddata.com> wrote:
> > >
> > > >
> > > > On 2/2/26 08:48, Jaroslav Pulchart wrote:
> > > > >>
> > > > >> On Thu, Jan 29, 2026 at 04:26:37PM +0100, Jaroslav Pulchart wrote:
> > > > >>>>
> > > > >>>> On 1/27/26 03:58, Jaroslav Pulchart wrote:
> > > > >>>> ...
> > > > >>>>>>> Hello,
> > > > >>>>>>>
> > > > >>>>>>> Thank you for the analysis and insights.
> > > > >>>>>>>
> > > > >>>>>>> To add some context from our side: we observe this issue shortly after
> > > > >>>>>>> a server reinstall or fw updates followed by a reboot, typically when
> > > > >>>>>>> hwmon sensors are accessed for the first time (e.g. by monitoring during
> > > > >>>>>>> early system initialization). We have not seen it trigger during normal
> > > > >>>>>>> operation, such as a simple reboot or on long-running systems.
> > > > >>>>>>>
> > > > >>>>>>> Given this behavior, it is possible that the issue is related to
> > > > >>>>>>> hwmon_device_unregister() being triggered early during system
> > > > >>>>>>> initialization, but I do not have concrete evidence at this point.
> > > > >>>>>>>
> > > > >>>>>>> As a workaround, we will exclude the ACPI power meter from hwmon monitoring
> > > > >>>>>>> (Grafana Alloy) until the mentioned refactoring or a fix becomes available.
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>> Would it be possible for you to test the patch series I just sent out ?
> > > > >>>>>
> > > > >>>>> I suppose these 5 patches from
> > > > >>>>> https://lore.kernel.org/lkml/20260123182208.2229670-1-linux@roeck-us.net/
> > > > >>>>>    ?
> > > > >>>>
> > > > >>>> Correct.
> > > > >>>>
> > > > >>>>> I will try them and let you know.
> > > > >>>>>
> > > > >>>> Thanks !
> > > > >>>
> > > > >>>
> > > > >>> The issue is still there:
> > > > >>> ...
> > > > >>
> > > > > >> Thanks a lot for testing.
> > > > >>
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.105931] INFO: task alloy:5985 blocked for more than 124 seconds.
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.114475]       Tainted: G            E 6.18.7-2.gdc.el9.x86_64 #1
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.122872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.132932] task:alloy           state:D stack:0     pid:5985 tgid:5871  ppid:1      task_flags:0x400140 flags:0x00080001
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.145975] Call Trace:
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.149985]  <TASK>
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.154931]  __schedule+0x2b5/0x690
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.160934]  schedule+0x23/0x80
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.166498]  schedule_timeout+0xe8/0x100
> > > > >>> 1769654277410 2026-01-29T02:37:57.410Z Jan 29 03:37:57 10.8.8.76  [ 741.172936]  __wait_for_common+0x99/0x1c0
> > > > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.179148]  ? __pfx_schedule_timeout+0x10/0x10
> > > > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.185936]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> > > > >>> 1769654277660 2026-01-29T02:37:57.660Z Jan 29 03:37:57 10.8.8.76  [ 741.194929]  acpi_ev_address_space_dispatch+0x16d/0x3c0
> > > > >>
> > > > >> Looks like acpi holds another lock. Can you check if the patch submitted by
> > > > >> Rafael fixes the problem ?
> > > > >
> > > > > Hello,
> > > > >
> > > > > I tested the patch as a standalone change, and the issue is still present.
> > > > >
> > > > > However, I can now reliably reproduce the problem. It is triggered while
> > > > > Dell iDRAC is being reset or updated and is temporarily unavailable. During
> > > > > this time, a [kipmi0] task enters D state for an extended period, and
> > > > > shortly after (or during this), the ACPI power meter hwmon interface ends up
> > > > > in the locked state described earlier.
> > > > >
> > > > > This explains why we mostly observe the issue after redeployments in our
> > > > > environment: server redeployments typically include iDRAC firmware updates
> > > > > or resets performed together with operating system installation or updates.
> > > > >
> > > >
> > > > Thanks for the update. Do you have a dmesg log showing the problem with Rafael's
> > > > patch applied, by any chance ?
> > >
> > > [  740.004138] INFO: task alloy:8260 blocked for more than 122 seconds.
> > > [  740.010752]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  740.019257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  740.030103] task:alloy           state:D stack:0     pid:8260
> > > tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
> > > [  740.045068] Call Trace:
> > > [  740.050334]  <TASK>
> > > [  740.055075]  __schedule+0x2b5/0x690
> > > [  740.062073]  schedule+0x23/0x80
> > > [  740.068081]  schedule_preempt_disabled+0x11/0x20
> > > [  740.074962]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  740.083073]  hwmon_attr_show+0x36/0x130
> > > [  740.089081]  dev_attr_show+0x19/0x60
> > > [  740.095876]  sysfs_kf_seq_show+0xbf/0x140
> > > [  740.102810]  seq_read_iter+0x112/0x510
> > > [  740.109070]  ? security_file_permission+0x8e/0xa0
> > > [  740.117100]  vfs_read+0x215/0x340
> > > [  740.123113]  ksys_read+0x61/0xe0
> > > [  740.130088]  do_syscall_64+0x5d/0xb40
> > > [  740.137032]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  740.144601] RIP: 0033:0x408d8e
> > > [  740.151070] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  740.160825] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
> > > [  740.170097] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
> > > [  740.180085] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  740.190074] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  740.201069] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
> > > [  740.211072]  </TASK>
> > > [  740.215117] INFO: task alloy:8268 blocked for more than 123 seconds.
> > > [  740.224109]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  740.234089] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  740.243611] task:alloy           state:D stack:0     pid:8268
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  740.258077] Call Trace:
> > > [  740.261260]  <TASK>
> > > [  740.266083]  __schedule+0x2b5/0x690
> > > [  740.273081]  schedule+0x23/0x80
> > > [  740.279055]  schedule_preempt_disabled+0x11/0x20
> > > [  740.286087]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  740.294074]  hwmon_attr_show+0x36/0x130
> > > [  740.300089]  dev_attr_show+0x19/0x60
> > > [  740.306074]  sysfs_kf_seq_show+0xbf/0x140
> > > [  740.313074]  seq_read_iter+0x112/0x510
> > > [  740.319075]  ? security_file_permission+0x8e/0xa0
> > > [  740.326064]  vfs_read+0x215/0x340
> > > [  740.331218]  ksys_read+0x61/0xe0
> > > [  740.337078]  do_syscall_64+0x5d/0xb40
> > > [  740.342083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  740.349376] RIP: 0033:0x408d8e
> > > [  740.355086] RSP: 002b:000000c008403498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  740.365088] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 0000000000408d8e
> > > [  740.375072] RDX: 0000000000000080 RSI: 000000c006a47d80 RDI: 000000000000001f
> > > [  740.383669] RBP: 000000c0084034d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  740.393072] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  740.401326] R13: 0000000000000040 R14: 000000c005ea7880 R15: ffffffffffffffff
> > > [  740.411087]  </TASK>
> > > [  740.416089] INFO: task alloy:8270 blocked for more than 123 seconds.
> > > [  740.425081]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  740.434091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  740.444078] task:alloy           state:D stack:0     pid:8270
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  740.457108] Call Trace:
> > > [  740.462080]  <TASK>
> > > [  740.467073]  __schedule+0x2b5/0x690
> > > [  740.472252]  schedule+0x23/0x80
> > > [  740.478087]  schedule_preempt_disabled+0x11/0x20
> > > [  740.485099]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  740.491096]  hwmon_attr_show+0x36/0x130
> > > [  740.497082]  dev_attr_show+0x19/0x60
> > > [  740.503078]  sysfs_kf_seq_show+0xbf/0x140
> > > [  740.510030]  seq_read_iter+0x112/0x510
> > > [  740.516073]  ? security_file_permission+0x8e/0xa0
> > > [  740.523355]  vfs_read+0x215/0x340
> > > [  740.529077]  ksys_read+0x61/0xe0
> > > [  740.534925]  do_syscall_64+0x5d/0xb40
> > > [  740.540253]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  740.548077] RIP: 0033:0x408d8e
> > > [  740.553078] RSP: 002b:000000c0085cf498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  740.563075] RAX: ffffffffffffffda RBX: 000000000000001b RCX: 0000000000408d8e
> > > [  740.572088] RDX: 0000000000000080 RSI: 000000c006d44f80 RDI: 000000000000001b
> > > [  740.582076] RBP: 000000c0085cf4d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  740.591086] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  740.600086] R13: 0000000000000040 R14: 000000c005703c00 R15: 0000000000000054
> > > [  740.609481]  </TASK>
> > > [  740.614084] INFO: task alloy:8287 blocked for more than 123 seconds.
> > > [  740.623089]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  740.632094] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  740.642083] task:alloy           state:D stack:0     pid:8287
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  740.656080] Call Trace:
> > > [  740.660694]  <TASK>
> > > [  740.665087]  __schedule+0x2b5/0x690
> > > [  740.671086]  schedule+0x23/0x80
> > > [  740.677080]  schedule_timeout+0xe8/0x100
> > > [  740.683134]  __wait_for_common+0x99/0x1c0
> >
> > So this is wait_for_completion(&tx_msg->tx_complete) in
> > acpi_ipmi_space_handler(), if I'm not mistaken, which never gets
> > completed, so stuff cannot make progress.
> >
> > I guess what happens is that one of the ACPI methods evaluated by the
> > acpi_power_meter driver trips on an IPMI operation region and locks up
> > because of the above.
> >
> > Did you, by any chance, upgrade the firmware in the time frame between
> > 6.17 and 6.18?
>
> I retested this now with the same firmware versions on both kernel
> lines: an iDRAC reset reliably triggers the issue on 6.18.y, while the
> same reproducer does not trigger any issue on 6.17.12.
>
> >
> > > [  740.690093]  ? __pfx_schedule_timeout+0x10/0x10
> > > [  740.697081]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
> > > [  740.706086]  acpi_ev_address_space_dispatch+0x16d/0x3c0
> > > [  740.713584]  ? __pfx_acpi_ipmi_space_handler+0x10/0x10 [acpi_ipmi]
> > > [  740.722080]  acpi_ex_access_region+0x5b/0xd0
> > > [  740.729078]  acpi_ex_write_serial_bus+0xb6/0x1e0
> > > [  740.736083]  acpi_ex_store_object_to_node+0x15e/0x200
> > > [  740.744083]  acpi_ex_opcode_1A_1T_1R+0x24c/0x620
> > > [  740.750088]  acpi_ds_exec_end_op+0x287/0x510
> > > [  740.757076]  acpi_ps_parse_loop+0xf6/0x680
> > > [  740.763177]  acpi_ps_parse_aml+0x17a/0x3d0
> > > [  740.770074]  acpi_ps_execute_method+0x137/0x270
> > > [  740.776256]  acpi_ns_evaluate+0x1f4/0x2e0
> > > [  740.783076]  acpi_evaluate_object+0x13c/0x2e0
> > > [  740.790118]  acpi_evaluate_integer+0x4f/0xe0
> > > [  740.796571]  ? obj_cgroup_charge_account+0x187/0x2d0
> > > [  740.804074]  update_meter+0x61/0xd0 [acpi_power_meter]
> > > [  740.811082]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
> > > [  740.819088]  hwmon_attr_show+0x5e/0x130
> > > [  740.825093]  dev_attr_show+0x19/0x60
> > > [  740.830810]  sysfs_kf_seq_show+0xbf/0x140
> > > [  740.837076]  seq_read_iter+0x112/0x510
> > > [  740.843079]  ? security_file_permission+0x8e/0xa0
> > > [  740.849058]  vfs_read+0x215/0x340
> > > [  740.855086]  ksys_read+0x61/0xe0
> > > [  740.861058]  do_syscall_64+0x5d/0xb40
> > > [  740.866095]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  740.874104] RIP: 0033:0x408d8e
> > > [  740.880067] RSP: 002b:000000c007e81498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  740.890076] RAX: ffffffffffffffda RBX: 000000000000001d RCX: 0000000000408d8e
> > > [  740.900075] RDX: 0000000000000080 RSI: 000000c007383380 RDI: 000000000000001d
> > > [  740.910080] RBP: 000000c007e814d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  740.918446] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  740.928087] R13: 0000000000000040 R14: 000000c00560ec40 R15: ffffffffffffffff
> > > [  740.937080]  </TASK>
> > > [  740.942085] INFO: task alloy:8463 blocked for more than 123 seconds.
> > > [  740.951076]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  740.959660] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  740.970077] task:alloy           state:D stack:0     pid:8463
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  740.983108] Call Trace:
> > > [  740.987305]  <TASK>
> > > [  740.992084]  __schedule+0x2b5/0x690
> > > [  740.998077]  schedule+0x23/0x80
> > > [  741.003472]  schedule_preempt_disabled+0x11/0x20
> > > [  741.009738]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  741.017079]  hwmon_attr_show+0x36/0x130
> > > [  741.023094]  dev_attr_show+0x19/0x60
> > > [  741.029076]  sysfs_kf_seq_show+0xbf/0x140
> > > [  741.035084]  seq_read_iter+0x112/0x510
> > > [  741.039031]  ? security_file_permission+0x8e/0xa0
> > > [  741.045212]  vfs_read+0x215/0x340
> > > [  741.051081]  ksys_read+0x61/0xe0
> > > [  741.056076]  do_syscall_64+0x5d/0xb40
> > > [  741.061744]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  741.069081] RIP: 0033:0x408d8e
> > > [  741.074087] RSP: 002b:000000c006127498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  741.084107] RAX: ffffffffffffffda RBX: 0000000000000020 RCX: 0000000000408d8e
> > > [  741.093151] RDX: 0000000000000080 RSI: 000000c005b15b00 RDI: 0000000000000020
> > > [  741.103087] RBP: 000000c0061274d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  741.113084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  741.123120] R13: 0000000000000040 R14: 000000c005ba1dc0 R15: ffffffffffffffff
> > > [  741.133093]  </TASK>
> > > [  741.138081] INFO: task alloy:8465 blocked for more than 124 seconds.
> > > [  741.147085]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  741.155079] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  741.165084] task:alloy           state:D stack:0     pid:8465
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  741.178095] Call Trace:
> > > [  741.182595]  <TASK>
> > > [  741.187079]  __schedule+0x2b5/0x690
> > > [  741.193014]  schedule+0x23/0x80
> > > [  741.199071]  schedule_preempt_disabled+0x11/0x20
> > > [  741.206102]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  741.214091]  hwmon_attr_show+0x36/0x130
> > > [  741.221079]  dev_attr_show+0x19/0x60
> > > [  741.226081]  sysfs_kf_seq_show+0xbf/0x140
> > > [  741.232083]  seq_read_iter+0x112/0x510
> > > [  741.237943]  ? security_file_permission+0x8e/0xa0
> > > [  741.245080]  vfs_read+0x215/0x340
> > > [  741.251080]  ksys_read+0x61/0xe0
> > > [  741.257068]  do_syscall_64+0x5d/0xb40
> > > [  741.262084]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  741.270083] RIP: 0033:0x408d8e
> > > [  741.275229] RSP: 002b:000000c007e85498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  741.285086] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000408d8e
> > > [  741.294078] RDX: 0000000000000080 RSI: 000000c0066a5180 RDI: 000000000000001e
> > > [  741.304078] RBP: 000000c007e854d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  741.313085] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  741.323084] R13: 0000000000000040 R14: 000000c0040f8fc0 R15: ffffffffffffffff
> > > [  741.333078]  </TASK>
> > > [  741.337087] INFO: task alloy:8470 blocked for more than 124 seconds.
> > > [  741.346079]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  741.355091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  741.365102] task:alloy           state:D stack:0     pid:8470
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  741.378081] Call Trace:
> > > [  741.383127]  <TASK>
> > > [  741.388089]  __schedule+0x2b5/0x690
> > > [  741.394085]  schedule+0x23/0x80
> > > [  741.400084]  schedule_preempt_disabled+0x11/0x20
> > > [  741.408085]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  741.415246]  hwmon_attr_show+0x36/0x130
> > > [  741.422088]  dev_attr_show+0x19/0x60
> > > [  741.428079]  sysfs_kf_seq_show+0xbf/0x140
> > > [  741.434088]  seq_read_iter+0x112/0x510
> > > [  741.440075]  ? security_file_permission+0x8e/0xa0
> > > [  741.446661]  vfs_read+0x215/0x340
> > > [  741.452104]  ksys_read+0x61/0xe0
> > > [  741.458082]  do_syscall_64+0x5d/0xb40
> > > [  741.464083]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  741.471412] RIP: 0033:0x408d8e
> > > [  741.477072] RSP: 002b:000000c007369498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  741.487081] RAX: ffffffffffffffda RBX: 0000000000000021 RCX: 0000000000408d8e
> > > [  741.496078] RDX: 0000000000000080 RSI: 000000c005cff500 RDI: 0000000000000021
> > > [  741.506094] RBP: 000000c0073694d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  741.515087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  741.525084] R13: 0000000000000040 R14: 000000c001db6c40 R15: ffffffffffffffff
> > > [  741.534086]  </TASK>
> > > [  741.539094] INFO: task alloy:8474 blocked for more than 124 seconds.
> > > [  741.548092]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  741.557097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  741.567095] task:alloy           state:D stack:0     pid:8474
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  741.581079] Call Trace:
> > > [  741.586087]  <TASK>
> > > [  741.591162]  __schedule+0x2b5/0x690
> > > [  741.597103]  schedule+0x23/0x80
> > > [  741.603109]  schedule_preempt_disabled+0x11/0x20
> > > [  741.610098]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  741.617086]  hwmon_attr_show+0x36/0x130
> > > [  741.623137]  dev_attr_show+0x19/0x60
> > > [  741.629087]  sysfs_kf_seq_show+0xbf/0x140
> > > [  741.635090]  seq_read_iter+0x112/0x510
> > > [  741.641083]  ? security_file_permission+0x8e/0xa0
> > > [  741.648095]  vfs_read+0x215/0x340
> > > [  741.654101]  ksys_read+0x61/0xe0
> > > [  741.660087]  do_syscall_64+0x5d/0xb40
> > > [  741.666090]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  741.672343] RIP: 0033:0x408d8e
> > > [  741.678083] RSP: 002b:000000c000b03498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  741.688096] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 0000000000408d8e
> > > [  741.698084] RDX: 0000000000000080 RSI: 000000c0069df000 RDI: 0000000000000025
> > > [  741.706728] RBP: 000000c000b034d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  741.716084] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  741.725093] R13: 0000000000000040 R14: 000000c004075340 R15: ffffffffffffffff
> > > [  741.735124]  </TASK>
> > > [  741.737900] INFO: task alloy:8726 blocked for more than 124 seconds.
> > > [  741.747088]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  741.756086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  741.766111] task:alloy           state:D stack:0     pid:8726
> > > tgid:8260  ppid:1      task_flags:0x400140 flags:0x00080001
> > > [  741.780116] Call Trace:
> > > [  741.785093]  <TASK>
> > > [  741.790100]  __schedule+0x2b5/0x690
> > > [  741.796596]  schedule+0x23/0x80
> > > [  741.802109]  schedule_preempt_disabled+0x11/0x20
> > > [  741.809091]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  741.815262]  hwmon_attr_show+0x36/0x130
> > > [  741.822096]  dev_attr_show+0x19/0x60
> > > [  741.828105]  sysfs_kf_seq_show+0xbf/0x140
> > > [  741.834088]  seq_read_iter+0x112/0x510
> > > [  741.839547]  ? security_file_permission+0x8e/0xa0
> > > [  741.846570]  vfs_read+0x215/0x340
> > > [  741.850048] perf: interrupt took too long (2993 > 2500), lowering
> > > kernel.perf_event_max_sample_rate to 66000
> > > [  741.863112]  ksys_read+0x61/0xe0
> > > [  741.868778]  do_syscall_64+0x5d/0xb40
> > > [  741.875104]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  741.883101] RIP: 0033:0x408d8e
> > > [  741.888436] RSP: 002b:000000c0085cb498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  741.898081] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 0000000000408d8e
> > > [  741.907092] RDX: 0000000000000080 RSI: 000000c006424e00 RDI: 0000000000000024
> > > [  741.917092] RBP: 000000c0085cb4d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  741.926087] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  741.936089] R13: 0000000000000040 R14: 000000c00565aa80 R15: ffffffffffffffff
> > > [  741.945098]  </TASK>
> > > [  862.885552] INFO: task alloy:8260 blocked for more than 245 seconds.
> > > [  862.892117]       Tainted: G            E       6.18.8-1.gdc.el9.x86_64 #1
> > > [  862.904592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  862.914632] task:alloy           state:D stack:0     pid:8260
> > > tgid:8260  ppid:1      task_flags:0x400100 flags:0x00080001
> > > [  862.928478] Call Trace:
> > > [  862.932477]  <TASK>
> > > [  862.938482]  __schedule+0x2b5/0x690
> > > [  862.944502]  schedule+0x23/0x80
> > > [  862.949478]  schedule_preempt_disabled+0x11/0x20
> > > [  862.955487]  __mutex_lock.constprop.0+0x3c9/0xa00
> > > [  862.962489]  hwmon_attr_show+0x36/0x130
> > > [  862.968485]  dev_attr_show+0x19/0x60
> > > [  862.974482]  sysfs_kf_seq_show+0xbf/0x140
> > > [  862.981517]  seq_read_iter+0x112/0x510
> > > [  862.987485]  ? security_file_permission+0x8e/0xa0
> > > [  862.994505]  vfs_read+0x215/0x340
> > > [  863.000295]  ksys_read+0x61/0xe0
> > > [  863.006484]  do_syscall_64+0x5d/0xb40
> > > [  863.013226]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  863.021497] RIP: 0033:0x408d8e
> > > [  863.027499] RSP: 002b:000000c001cad498 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000000
> > > [  863.037483] RAX: ffffffffffffffda RBX: 0000000000000022 RCX: 0000000000408d8e
> > > [  863.047483] RDX: 0000000000000080 RSI: 000000c007c28880 RDI: 0000000000000022
> > > [  863.057487] RBP: 000000c001cad4d8 R08: 0000000000000000 R09: 0000000000000000
> > > [  863.066499] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> > > [  863.076485] R13: 0000000000000040 R14: 000000c0058b1dc0 R15: ffffffffffffffff
> > > [  863.087007]  </TASK>
> > >
> > >
> > > >
> > > > Thanks,
> > > > Guenter
> > > >
> > >

I was able to bisect this to the first bad commit:

$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [e5f0a698b34ed76002dc5cff3804a61c80233a7a] Linux 6.17
git bisect good e5f0a698b34ed76002dc5cff3804a61c80233a7a
# status: waiting for bad commit, 1 good commit known
# bad: [7d0a66e4bb9081d75c82ec4957c50034cb0ea449] Linux 6.18
git bisect bad 7d0a66e4bb9081d75c82ec4957c50034cb0ea449
# bad: [f79e772258df311c2cb21594ca0996318e720d28] Merge tag 'media/v6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect bad f79e772258df311c2cb21594ca0996318e720d28
# bad: [0f048c878ee32a4259dbf28e0ad8fd0b71ee0085] Merge tag 'soc-dt-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad 0f048c878ee32a4259dbf28e0ad8fd0b71ee0085
# good: [c050daf69f3edf72e274eaa321f663b1779c4391] Merge tag 'pwm/for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
git bisect good c050daf69f3edf72e274eaa321f663b1779c4391
# bad: [9792d660a4e91d31a6b1af105ae3f1c29107e94b] Merge tag 'devicetree-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect bad 9792d660a4e91d31a6b1af105ae3f1c29107e94b
# good: [f07c3695bf65220a69a848478bd9099bdeaafa78] Merge tag 'firewire-updates-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
git bisect good f07c3695bf65220a69a848478bd9099bdeaafa78
# bad: [080ffb4bec4d49cdedca11810395f8cad812471e] Merge tag 'i3c/for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
git bisect bad 080ffb4bec4d49cdedca11810395f8cad812471e
# good: [da3a88e9656c17a34daf49c9acc6d85f73b4d3d9] pinctrl: use more common syntax for compound literals
git bisect good da3a88e9656c17a34daf49c9acc6d85f73b4d3d9
# bad: [5fb024931949f3475260c84a0e4b0997af9c5530] Merge tag 'pinctrl-v6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
git bisect bad 5fb024931949f3475260c84a0e4b0997af9c5530
# good: [cc3e883a06251ba835f15672dbe8724f2687971b] power: supply: qcom_battmgr: Add charge control support
git bisect good cc3e883a06251ba835f15672dbe8724f2687971b
# bad: [d46651d4e3c0caab554c4c591c0b6c3b026b1e93] ipmi: Add Loongson-2K BMC support
git bisect bad d46651d4e3c0caab554c4c591c0b6c3b026b1e93
# good: [627118470fccc61d7763aa667fcab0a9476843f6] ipmi: Add a maintenance mode sysfs file
git bisect good 627118470fccc61d7763aa667fcab0a9476843f6
# good: [abe4918a942e17f997f62b1e12bc8a15d8ff5f7e] ipmi:si: Move flags get start to its own function
git bisect good abe4918a942e17f997f62b1e12bc8a15d8ff5f7e
# good: [3bc54ab3b9790ca92f197e9822e486665daa321c] ipmi: Rename "user_data" to "recv_msg" in an SMI message
git bisect good 3bc54ab3b9790ca92f197e9822e486665daa321c
# bad: [bc3a9d217755f65c137f145600f23bf1d6c31ea9] ipmi:si: Gracefully handle if the BMC is non-functional
git bisect bad bc3a9d217755f65c137f145600f23bf1d6c31ea9
# first bad commit: [bc3a9d217755f65c137f145600f23bf1d6c31ea9] ipmi:si: Gracefully handle if the BMC is non-functional

...
$ git bisect bad
bc3a9d217755f65c137f145600f23bf1d6c31ea9 is the first bad commit
commit bc3a9d217755f65c137f145600f23bf1d6c31ea9 (HEAD)
Author: Corey Minyard <corey@minyard.net>
Date:   Wed Aug 20 14:56:50 2025 -0500

    ipmi:si: Gracefully handle if the BMC is non-functional

    If the BMC is not functional, the driver goes into an error state and
    starts a 1 second timer.  When the timer times out, it will attempt a
    simple message.  If the BMC interacts correctly, the driver will start
    accepting messages again.  If not, it remains in error state.

    If the driver goes into error state, all messages current and pending
    will return with an error.

    This should more gracefully handle when the BMC becomes non-operational,
    as opposed to trying each messages individually and failing them.

    Signed-off-by: Corey Minyard <corey@minyard.net>

 drivers/char/ipmi/ipmi_si_intf.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)
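
For anyone triaging similar reports: in the hung-task output quoted earlier
in this message, most alloy threads are parked in __mutex_lock under
hwmon_attr_show, and only one (pid 8287) is somewhere else, inside
acpi_ipmi_space_handler. A small helper can pick out that odd one
automatically. The following is only a sketch; the function name, the regular
expressions and the list of "sleep" frames are my own assumptions and are not
part of any kernel tooling. The sample lines are abbreviated copies of the
traces above.

```python
import re

# Frames that only show a task going to sleep; they are skipped when
# looking for the first informative frame of a trace (assumed list).
SLEEP_FRAMES = {
    "__schedule", "schedule", "schedule_preempt_disabled",
    "schedule_timeout", "__wait_for_common",
}

TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than")
# Matches "] [? ]symbol+0xoff/0xsize"; group 1 is set for unreliable "?" frames.
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def first_blocking_frames(log_text):
    """Map (comm, pid) to the first non-sleep frame of its latest report."""
    result = {}
    current = None
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = (task.group(1), int(task.group(2)))
            result.pop(current, None)  # a repeated report replaces the old one
            continue
        if current is None or current in result:
            continue
        frame = FRAME_RE.search(line)
        if frame and not frame.group(1) and frame.group(2) not in SLEEP_FRAMES:
            result[current] = frame.group(2)
    return result


SAMPLE = """\
[  740.004138] INFO: task alloy:8260 blocked for more than 122 seconds.
[  740.055075]  __schedule+0x2b5/0x690
[  740.062073]  schedule+0x23/0x80
[  740.068081]  schedule_preempt_disabled+0x11/0x20
[  740.074962]  __mutex_lock.constprop.0+0x3c9/0xa00
[  740.083073]  hwmon_attr_show+0x36/0x130
[  740.614084] INFO: task alloy:8287 blocked for more than 123 seconds.
[  740.665087]  __schedule+0x2b5/0x690
[  740.671086]  schedule+0x23/0x80
[  740.677080]  schedule_timeout+0xe8/0x100
[  740.683134]  __wait_for_common+0x99/0x1c0
[  740.690093]  ? __pfx_schedule_timeout+0x10/0x10
[  740.697081]  acpi_ipmi_space_handler.part.0+0x1ac/0x370 [acpi_ipmi]
[  740.706086]  acpi_ev_address_space_dispatch+0x16d/0x3c0
"""

if __name__ == "__main__":
    for (comm, pid), frame in first_blocking_frames(SAMPLE).items():
        print(f"{comm}:{pid} -> {frame}")
```

Tasks reported with __mutex_lock as their first frame are waiters; the task
reported with a different frame is the one worth looking at first. The
patterns search anywhere in the line, so quoted ("> ") or syslog-prefixed
lines should work as well.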

* Re: [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-03 23:21                               ` [BISECTED]: " Jaroslav Pulchart
@ 2026-02-04  1:01                                 ` Guenter Roeck
  2026-02-04  8:20                                   ` Jaroslav Pulchart
  2026-02-04 19:54                                   ` Igor Raits
  0 siblings, 2 replies; 60+ messages in thread
From: Guenter Roeck @ 2026-02-04  1:01 UTC (permalink / raw)
  To: Jaroslav Pulchart, Rafael J. Wysocki, linux-acpi, linux-hwmon
  Cc: Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On 2/3/26 15:21, Jaroslav Pulchart wrote:

> ...
> $ git bisect bad
> bc3a9d217755f65c137f145600f23bf1d6c31ea9 is the first bad commit
> commit bc3a9d217755f65c137f145600f23bf1d6c31ea9 (HEAD)
> Author: Corey Minyard <corey@minyard.net>
> Date:   Wed Aug 20 14:56:50 2025 -0500
> 
>      ipmi:si: Gracefully handle if the BMC is non-functional
> 
>      If the BMC is not functional, the driver goes into an error state and
>      starts a 1 second timer.  When the timer times out, it will attempt a
>      simple message.  If the BMC interacts correctly, the driver will start
>      accepting messages again.  If not, it remains in error state.
> 
>      If the driver goes into error state, all messages current and pending
>      will return with an error.
> 
>      This should more gracefully handle when the BMC becomes non-operational,
>      as opposed to trying each messages individually and failing them.
> 
>      Signed-off-by: Corey Minyard <corey@minyard.net>
> 
>   drivers/char/ipmi/ipmi_si_intf.c | 29 +++++++++++++++++++++++------
>   1 file changed, 23 insertions(+), 6 deletions(-)

Now that is interesting. Does reverting that patch on top of 6.18.8
(or any other 6.18 release) solve the problem ?

Thanks,
Guenter


* Re: [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-04  1:01                                 ` Guenter Roeck
@ 2026-02-04  8:20                                   ` Jaroslav Pulchart
  2026-02-04 19:54                                   ` Igor Raits
  1 sibling, 0 replies; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-02-04  8:20 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

>
> On 2/3/26 15:21, Jaroslav Pulchart wrote:
>
> > ...
> > $ git bisect bad
> > bc3a9d217755f65c137f145600f23bf1d6c31ea9 is the first bad commit
> > commit bc3a9d217755f65c137f145600f23bf1d6c31ea9 (HEAD)
> > Author: Corey Minyard <corey@minyard.net>
> > Date:   Wed Aug 20 14:56:50 2025 -0500
> >
> >      ipmi:si: Gracefully handle if the BMC is non-functional
> >
> >      If the BMC is not functional, the driver goes into an error state and
> >      starts a 1 second timer.  When the timer times out, it will attempt a
> >      simple message.  If the BMC interacts correctly, the driver will start
> >      accepting messages again.  If not, it remains in error state.
> >
> >      If the driver goes into error state, all messages current and pending
> >      will return with an error.
> >
> >      This should more gracefully handle when the BMC becomes non-operational,
> >      as opposed to trying each messages individually and failing them.
> >
> >      Signed-off-by: Corey Minyard <corey@minyard.net>
> >
> >   drivers/char/ipmi/ipmi_si_intf.c | 29 +++++++++++++++++++++++------
> >   1 file changed, 23 insertions(+), 6 deletions(-)
>
> Now that is interesting. Does reverting that patch on top of 6.18.8
> (or any other 6.18 release) solve the problem ?
>
> Thanks,
> Guenter
>

Yes, reverting commit bc3a9d217755f65c137f145600f23bf1d6c31ea9 on top
of 6.18.8 prevents the issue, and the problem no longer triggers with
the same reproducer.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-04  1:01                                 ` Guenter Roeck
  2026-02-04  8:20                                   ` Jaroslav Pulchart
@ 2026-02-04 19:54                                   ` Igor Raits
  2026-02-04 22:48                                     ` Guenter Roeck
  1 sibling, 1 reply; 60+ messages in thread
From: Igor Raits @ 2026-02-04 19:54 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jaroslav Pulchart, Rafael J. Wysocki, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

[-- Attachment #1: Type: text/plain, Size: 1715 bytes --]

I have written a patch with the help of AI and it fixes the problem. Attached.

I don't know how correct it is, but it works :)


-- 

Igor Raits | Sr. Technical Manager, Platform Delivery

igor@gooddata.com

+420 775 117 817


Moravske namesti 1007/14

602 00 Brno-Veveri, Czech Republic

[-- Attachment #2: 0001-ipmi-Fix-multiple-races-and-add-timeout-to-prevent-d.patch --]
[-- Type: text/x-patch, Size: 5912 bytes --]

From 218c72134d5a5d55845d04ffcc2e5698d6e31abc Mon Sep 17 00:00:00 2001
From: Igor Raits <igor@gooddata.com>
Date: Wed, 4 Feb 2026 10:41:45 +0100
Subject: [PATCH] ipmi: Fix multiple races and add timeout to prevent deadlock
 when BMC is slow/hosed

When the BMC is non-functional or slow (e.g., during iDRAC reset),
multiple code paths could leave IPMI messages without responses, causing
callers to block indefinitely.

In the SI driver (ipmi_si_intf.c), fix three races:

1. The sender() callback returns an error (IPMI_BUS_ERR) immediately,
   but smi_send() in ipmi_msghandler.c was ignoring this return value,
   meaning no error response was delivered to waiting callers.

2. When entering SI_HOSED state, only curr_msg was returned with an
   error response. If there was a waiting_msg queued, it would be
   orphaned and never receive a response.

3. The SI_HOSED check in sender() was done outside the si_lock,
   creating a race window where:
   - sender() checks SI_HOSED (not hosed) outside lock
   - smi_event_handler() under lock sets SI_HOSED and handles waiting_msg
   - sender() takes lock and sets waiting_msg
   - Message is orphaned since hosed handler already ran

Additionally, in acpi_ipmi.c, the wait_for_completion() had no timeout,
so if the BMC was slow (but not completely dead, e.g., during iDRAC
reset), the ACPI IPMI handler would block indefinitely. This caused
hung tasks in hwmon attribute reads that use ACPI methods accessing
IPMI. Add a 15-second timeout (3x IPMI_TIMEOUT) to prevent this.

Fixes: bc3a9d217755 ("ipmi:si: Gracefully handle if the BMC is non-functional")
Reported-and-bisected-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Signed-off-by: Igor Raits <igor@gooddata.com>
---
 drivers/acpi/acpi_ipmi.c            | 13 ++++++++++++-
 drivers/char/ipmi/ipmi_msghandler.c | 19 +++++++++++++++++--
 drivers/char/ipmi/ipmi_si_intf.c    | 26 ++++++++++++++++++--------
 3 files changed, 47 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/acpi_ipmi.c b/drivers/acpi/acpi_ipmi.c
index 5fba4dab5d08..ab1f4b27741d 100644
--- a/drivers/acpi/acpi_ipmi.c
+++ b/drivers/acpi/acpi_ipmi.c
@@ -572,7 +572,18 @@ acpi_ipmi_space_handler(u32 function, acpi_physical_address address,
 		status = AE_ERROR;
 		goto out_msg;
 	}
-	wait_for_completion(&tx_msg->tx_complete);
+
+	/*
+	 * Use a timeout to avoid blocking forever if the BMC is slow
+	 * or non-responsive (e.g., during iDRAC reset). The timeout
+	 * should be longer than IPMI_TIMEOUT to allow for retries.
+	 */
+	if (!wait_for_completion_timeout(&tx_msg->tx_complete,
+					 msecs_to_jiffies(IPMI_TIMEOUT * 3))) {
+		dev_warn_once(ipmi_device->dev,
+			      "ACPI IPMI request timed out\n");
+		tx_msg->msg_done = ACPI_IPMI_TIMEOUT;
+	}
 
 	acpi_format_ipmi_response(tx_msg, value);
 	status = AE_OK;
diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index 3700ab4eba3e..a4ee28f13615 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -1876,6 +1876,7 @@ static void smi_send(struct ipmi_smi *intf,
 {
 	int run_to_completion = READ_ONCE(intf->run_to_completion);
 	unsigned long flags = 0;
+	int rv;
 
 	if (!run_to_completion)
 		spin_lock_irqsave(&intf->xmit_msgs_lock, flags);
@@ -1883,8 +1884,22 @@ static void smi_send(struct ipmi_smi *intf,
 	if (!run_to_completion)
 		spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);
 
-	if (smi_msg)
-		handlers->sender(intf->send_info, smi_msg);
+	if (smi_msg) {
+		rv = handlers->sender(intf->send_info, smi_msg);
+		if (rv) {
+			/*
+			 * The sender() failed (e.g., BMC in hosed state).
+			 * Deliver an error response so callers don't block
+			 * indefinitely waiting for a response that will
+			 * never come.
+			 */
+			if (smi_msg->recv_msg)
+				deliver_err_response(intf, smi_msg->recv_msg,
+						     rv);
+			else
+				ipmi_free_smi_msg(smi_msg);
+		}
+	}
 }
 
 static bool is_maintenance_mode_cmd(struct kernel_ipmi_msg *msg)
diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 70e55f5ff85e..698dd4072af5 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -810,6 +810,15 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
 			 */
 			return_hosed_msg(smi_info, IPMI_BUS_ERR);
 		}
+		if (smi_info->waiting_msg != NULL) {
+			/*
+			 * If there's a waiting message, return it with
+			 * error too so callers don't block indefinitely.
+			 */
+			smi_info->curr_msg = smi_info->waiting_msg;
+			smi_info->waiting_msg = NULL;
+			return_hosed_msg(smi_info, IPMI_BUS_ERR);
+		}
 		smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
 		goto out;
 	}
@@ -922,26 +931,27 @@ static int sender(void *send_info, struct ipmi_smi_msg *msg)
 
 	debug_timestamp(smi_info, "Enqueue");
 
-	if (smi_info->si_state == SI_HOSED)
-		return IPMI_BUS_ERR;
-
 	if (smi_info->run_to_completion) {
 		/*
 		 * If we are running to completion, start it.  Upper
 		 * layer will call flush_messages to clear it out.
 		 */
+		if (smi_info->si_state == SI_HOSED)
+			return IPMI_BUS_ERR;
 		smi_info->waiting_msg = msg;
 		return IPMI_CC_NO_ERROR;
 	}
 
 	spin_lock_irqsave(&smi_info->si_lock, flags);
 	/*
-	 * The following two lines don't need to be under the lock for
-	 * the lock's sake, but they do need SMP memory barriers to
-	 * avoid getting things out of order.  We are already claiming
-	 * the lock, anyway, so just do it under the lock to avoid the
-	 * ordering problem.
+	 * Check SI_HOSED under the lock to avoid race with
+	 * smi_event_handler() which also handles waiting_msg
+	 * under this lock when entering hosed state.
 	 */
+	if (smi_info->si_state == SI_HOSED) {
+		spin_unlock_irqrestore(&smi_info->si_lock, flags);
+		return IPMI_BUS_ERR;
+	}
 	BUG_ON(smi_info->waiting_msg);
 	smi_info->waiting_msg = msg;
 	check_start_timer_thread(smi_info);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-04 19:54                                   ` Igor Raits
@ 2026-02-04 22:48                                     ` Guenter Roeck
  2026-02-05  7:25                                       ` Igor Raits
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-02-04 22:48 UTC (permalink / raw)
  To: Igor Raits
  Cc: Jaroslav Pulchart, Rafael J. Wysocki, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On 2/4/26 11:54, Igor Raits wrote:
> I have written a patch with the help of AI and it fixes the problem. Attached.
> 

"No MIME, no links, no compression, no attachments.  Just plain text"

... which means I can not provide inline feedback, which is the whole
point of the above.

Your patch crosses subsystems, so it will need to be split in two
(assuming the ACPI side is even needed). Also, references to iDRAC
in common code seem inappropriate.

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-04 22:48                                     ` Guenter Roeck
@ 2026-02-05  7:25                                       ` Igor Raits
  2026-02-05 17:51                                         ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Igor Raits @ 2026-02-05  7:25 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jaroslav Pulchart, Rafael J. Wysocki, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 2/4/26 11:54, Igor Raits wrote:
> > I have written a patch with the help of AI and it fixes the problem. Attached.
> >
>
> "No MIME, no links, no compression, no attachments.  Just plain text"

Sorry for that, I had assumed that attaching the file would make it in-line.

> ... which means I can not provide inline feedback, which is the whole
> point of the above.
>
> Your patch crosses subsystems, so it will need to be split in two
> (assuming the ACPI side is even needed). Also, references to iDRAC
> in common code seem inappropriate.

Yes, this I believe was the essential part (it was the last piece in
my testing which fixed the hanging):

diff --git a/drivers/acpi/acpi_ipmi.c b/drivers/acpi/acpi_ipmi.c
index 5fba4dab5d08..ab1f4b27741d 100644
--- a/drivers/acpi/acpi_ipmi.c
+++ b/drivers/acpi/acpi_ipmi.c
@@ -572,7 +572,18 @@ acpi_ipmi_space_handler(u32 function,
acpi_physical_address address,
                status = AE_ERROR;
                goto out_msg;
        }
-       wait_for_completion(&tx_msg->tx_complete);
+
+       /*
+        * Use a timeout to avoid blocking forever if the BMC is slow
+        * or non-responsive (e.g., during iDRAC reset). The timeout
+        * should be longer than IPMI_TIMEOUT to allow for retries.
+        */
+       if (!wait_for_completion_timeout(&tx_msg->tx_complete,
+                                        msecs_to_jiffies(IPMI_TIMEOUT * 3))) {
+               dev_warn_once(ipmi_device->dev,
+                             "ACPI IPMI request timed out\n");
+               tx_msg->msg_done = ACPI_IPMI_TIMEOUT;
+       }

        acpi_format_ipmi_response(tx_msg, value);
        status = AE_OK;


-- 

Igor Raits | Sr. Technical Manager, Platform Delivery

igor@gooddata.com

+420 775 117 817


Moravske namesti 1007/14

602 00 Brno-Veveri, Czech Republic

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-05  7:25                                       ` Igor Raits
@ 2026-02-05 17:51                                         ` Guenter Roeck
  2026-02-05 19:04                                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-02-05 17:51 UTC (permalink / raw)
  To: Igor Raits
  Cc: Jaroslav Pulchart, Rafael J. Wysocki, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On 2/4/26 11:54, Igor Raits wrote:
> > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > >
> >
> > "No MIME, no links, no compression, no attachments.  Just plain text"
> 
> Sorry for that, I had assumed that attaching the file would make it in-line.
> 
> > ... which means I can not provide inline feedback, which is the whole
> > point of the above.
> >
> > Your patch crosses subsystems, so it will need to be split in two
> > (assuming the ACPI side is even needed). Also, references to iDRAC
> > in common code seem inappropriate.
> 
> Yes, this I believe was the essential part (it was the last piece in
> my testing which fixed the hanging):
> 

Then I'll need to ask differently: What happens if you drop the IPMI code,
and just keep the wait_for_completion -> wait_for_completion_timeout
change ? Would that be sufficient to solve the problem ?

Either case, the need for this change suggests that the ipmi change
may not be complete, since it should send a completion with an error.

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-05 17:51                                         ` Guenter Roeck
@ 2026-02-05 19:04                                           ` Rafael J. Wysocki
  2026-02-05 20:57                                             ` [BISECTED - impi related]: " Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-05 19:04 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Igor Raits, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li,
	Corey Minyard

Cc: Corey

On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > On 2/4/26 11:54, Igor Raits wrote:
> > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > >
> > >
> > > "No MIME, no links, no compression, no attachments.  Just plain text"
> >
> > Sorry for that, I had assumed that attaching the file would make it in-line.
> >
> > > ... which means I can not provide inline feedback, which is the whole
> > > point of the above.
> > >
> > > Your patch crosses subsystems, so it will need to be split in two
> > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > in common code seem inappropriate.
> >
> > Yes, this I believe was the essential part (it was the last piece in
> > my testing which fixed the hanging):
> >
>
> Then I'll need to ask differently: What happens if you drop the IPMI code,
> and just keep the wait_for_completion -> wait_for_completion_timeout
> change ? Would that be sufficient to solve the problem ?

I'd rather say "Would that be sufficient to make the symptoms go
away?" as it most likely papers over the real problem.

> Either case, the need for this change suggests that the ipmi change
> may not be complete, since it should send a completion with an error.

I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
handle if the BMC is non-functional") should also be considered as a
possible way forward because it clearly did not improve things as
expected, at least in this particular case.

It evidently did something that confuses things quite a bit.  Either
it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
the "hosed" state and refusing to accept messages.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-05 19:04                                           ` Rafael J. Wysocki
@ 2026-02-05 20:57                                             ` Guenter Roeck
  2026-02-06 12:08                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-02-05 20:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Igor Raits, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li,
	Corey Minyard

On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> Cc: Corey
> 
> On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > >
> > > >
> > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > >
> > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > >
> > > > ... which means I can not provide inline feedback, which is the whole
> > > > point of the above.
> > > >
> > > > Your patch crosses subsystems, so it will need to be split in two
> > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > in common code seem inappropriate.
> > >
> > > Yes, this I believe was the essential part (it was the last piece in
> > > my testing which fixed the hanging):
> > >
> >
> > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > and just keep the wait_for_completion -> wait_for_completion_timeout
> > change ? Would that be sufficient to solve the problem ?
> 
> I'd rather say "Would that be sufficient to make the symptoms go
> away?" as it most likely papers over the real problem.
> 

Good point. Worse, it may result in UAF or memory leaks.

> > Either case, the need for this change suggests that the ipmi change
> > may not be complete, since it should send a completion with an error.
> 
> I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> handle if the BMC is non-functional") should also be considered as a
> possible way forward because it clearly did not improve things as
> expected, at least in this particular case.
> 

I tend to agree. I ran a number of AI code reviews over the patch, and
each run found new (and different) problems. The fact that the acpi
patch is still needed even after applying the ipmi changes suggests that
something is still missing in the ipmi code.

> It evidently did something that confuses things quite a bit.  Either
> it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> the "hosed" state and refusing to accept messages.
> 

More than that. My latest AI results are below, just for reference
(using Gemini 3 with Chris Mason's debug prompts). The prompt I used
for this run is:

"
The top commit in the linux/ directory results in hung tasks if the BMC
stops responding. Using @review-prompts/kernel/debugging.md analyze the
patch, identify the reason for the hung task problem, suggest and implement
a fix. Note that there may be more than one problem in the patch, so analyze
the complete patch and do not stop after finding the first regression.
"

I think that catches most of the problem, but not all of it.

Guenter

---

Summary of crash or warning:
Hung task detected in ipmi_si driver when BMC becomes non-functional.
Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.

Kernel version if available:
Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)

Machine type if available:
Generic Server with BMC

Cleaned up copy of oops or stack trace:
[  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
[  120.123457]       Not tainted 6.14.0-rc1 #1
[  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
[  120.123460] Call Trace:
[  120.123461]  <TASK>
[  120.123462]  __schedule+0x123/0x456
[  120.123463]  schedule+0x45/0x78
[  120.123464]  schedule_timeout+0x9a/0xbc
[  120.123465]  wait_for_completion+0xde/0xf0
[  120.123466]  ipmi_request_settime+0x123/0x145
[  120.123467]  ...
[  120.123468]  </TASK>

Any other kernel messages you found relevant:
N/A

Explanation of the problem:
1. Hung Task:
The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.

2. Excessive Polling (Timer Overwrite):
In `smi_timeout()`, the timer is unconditionally reset to a short timeout (e.g., 10ms) at the end of the function, unless the state machine is IDLE. When the state machine returns `SI_SM_HOSED`, `smi_event_handler()` correctly sets the timer to a long backoff (1 second). However, `smi_timeout()` subsequently overwrites this with the short timeout. This causes the driver to poll the hosed BMC every 10ms instead of backing off for 1 second, resulting in unnecessary CPU overhead.
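
To make item 1 concrete, here is a minimal userspace sketch of the transition into the hosed state. It is not kernel code and does not reproduce the driver's locking or state machine. `struct smi_model`, `enter_hosed()`, `waiting_msg_orphaned()` and `MODEL_BUS_ERR` are hypothetical stand-ins; only the names `curr_msg`, `waiting_msg` and `SI_HOSED` come from the report. It assumes one in-flight message and one queued message at the moment the BMC is declared hosed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary stand-in error code; not the kernel's IPMI_BUS_ERR value. */
#define MODEL_BUS_ERR 1

enum si_state { SI_NORMAL, SI_HOSED };

/* A message is "completed" once its submitter has been given a result. */
struct msg {
	bool completed;
	int err;
};

struct smi_model {
	enum si_state state;
	struct msg *curr_msg;    /* message currently being processed */
	struct msg *waiting_msg; /* message queued behind curr_msg */
};

static void fail_msg(struct msg *m, int err)
{
	m->completed = true;
	m->err = err;
}

/*
 * Model of the transition to the hosed state.  With fail_waiting == false
 * only curr_msg is failed (the behaviour the report calls BUG 1); with
 * fail_waiting == true the queued message is failed as well (the
 * suggested fix).
 */
static void enter_hosed(struct smi_model *s, bool fail_waiting)
{
	s->state = SI_HOSED;
	if (s->curr_msg) {
		fail_msg(s->curr_msg, MODEL_BUS_ERR);
		s->curr_msg = NULL;
	}
	if (fail_waiting && s->waiting_msg) {
		fail_msg(s->waiting_msg, MODEL_BUS_ERR);
		s->waiting_msg = NULL;
	}
}

/* Returns true if the queued message never receives a completion. */
static bool waiting_msg_orphaned(bool fail_waiting)
{
	struct msg cur = { false, 0 };
	struct msg queued = { false, 0 };
	struct smi_model s = { SI_NORMAL, &cur, &queued };

	enter_hosed(&s, fail_waiting);
	assert(cur.completed); /* the in-flight message is failed in both variants */
	return !queued.completed;
}
```

In this model, `waiting_msg_orphaned(false)` reports an orphaned message, which corresponds to a caller that sleeps forever in the real driver, while `waiting_msg_orphaned(true)` does not.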

Functions, snippets and call traces of code related to the problem:

drivers/char/ipmi/ipmi_si_intf.c: smi_event_handler()

	} else if (si_sm_result == SI_SM_HOSED) {
		/* ... */
		smi_info->si_state = SI_HOSED;
		if (smi_info->curr_msg != NULL) {
			return_hosed_msg(smi_info, IPMI_BUS_ERR);
		}
		/* BUG 1: waiting_msg is NOT checked here! */

		smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED); /* Sets 1s timeout */
		goto out;
	}

drivers/char/ipmi/ipmi_si_intf.c: smi_timeout()

	smi_result = smi_event_handler(smi_info, time_diff);
	/* ... */
	if (smi_result != SI_SM_IDLE)
		smi_mod_timer(smi_info, timeout); /* BUG 2: Overwrites 1s timeout with ~10ms */

List of potential commits related to the problem:
bc3a9d217755f65c137f145600f23bf1d6c31ea9 ipmi:si: Gracefully handle if the BMC is non-functional

Suggested fix:
1. When entering `SI_HOSED` state in `smi_event_handler`, check for and fail any pending `waiting_msg` in addition to `curr_msg`.
2. In `smi_timeout`, do not reset the timer if the state is `SI_HOSED`, preserving the backoff timeout set by `smi_event_handler`.
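
The timer overwrite behind suggested fix 2 can be modelled the same way. Again this is a sketch under assumptions, not driver code: the `toy_*` names are hypothetical, the tick values are made up, and the `keep_backoff` flag exists only to switch between the reported behaviour and the suggested one.

```c
#include <assert.h>
#include <stddef.h>

enum toy_result { TOY_SM_IDLE, TOY_SM_CALL_WITH_DELAY, TOY_SM_HOSED };

struct toy_timer_state {
	int hosed;		/* nonzero once the interface is hosed */
	unsigned long expires;	/* tick at which the timer fires next */
	int timer_running;
};

#define TOY_SHORT_TICKS	1UL	/* stands in for the ~10ms poll */
#define TOY_HOSED_TICKS	100UL	/* stands in for the 1s backoff */

/* Model of the event handler: on a hosed result, arm the long backoff. */
static enum toy_result toy_event_handler(struct toy_timer_state *st,
					 unsigned long now,
					 enum toy_result result)
{
	assert(st != NULL);
	if (result == TOY_SM_HOSED) {
		st->hosed = 1;
		st->expires = now + TOY_HOSED_TICKS;
	}
	return result;
}

/*
 * Model of the timeout handler. With keep_backoff == 0 the short timeout
 * is applied for every non-idle result, overwriting the backoff. With
 * keep_backoff != 0 the timer is left alone while the interface is hosed.
 */
static void toy_timeout(struct toy_timer_state *st, unsigned long now,
			enum toy_result sm_result, int keep_backoff)
{
	enum toy_result r = toy_event_handler(st, now, sm_result);

	if (r != TOY_SM_IDLE) {
		if (!(keep_backoff && st->hosed))
			st->expires = now + TOY_SHORT_TICKS;
	} else {
		st->timer_running = 0;
	}
}
```

Calling `toy_timeout()` with a hosed result and `keep_backoff` set to 0 leaves `expires` at the short interval. With `keep_backoff` set to 1 the long backoff armed by the event handler survives.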

---
Suggested diffs from that run:

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 1c65275906b4..023b0e819f91 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -810,6 +810,11 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
 			 */
 			return_hosed_msg(smi_info, IPMI_BUS_ERR);
 		}
+		if (smi_info->waiting_msg != NULL) {
+			smi_info->curr_msg = smi_info->waiting_msg;
+			smi_info->waiting_msg = NULL;
+			return_hosed_msg(smi_info, IPMI_BUS_ERR);
+		}
 		smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
 		goto out;
 	}
@@ -1130,10 +1135,12 @@ static void smi_timeout(struct timer_list *t)
 		timeout = jiffies + SI_TIMEOUT_JIFFIES;
 	}

-	if (smi_result != SI_SM_IDLE)
-		smi_mod_timer(smi_info, timeout);
-	else
+	if (smi_result != SI_SM_IDLE) {
+		if (smi_info->si_state != SI_HOSED)
+			smi_mod_timer(smi_info, timeout);
+	} else {
 		smi_info->timer_running = false;
+	}
 	spin_unlock_irqrestore(&(smi_info->si_lock), flags);
 }


* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-05 20:57                                             ` [BISECTED - impi related]: " Guenter Roeck
@ 2026-02-06 12:08                                               ` Rafael J. Wysocki
  2026-02-06 15:58                                                 ` Corey Minyard
  2026-02-06 16:08                                                 ` Corey Minyard
  0 siblings, 2 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-06 12:08 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Igor Raits, Jaroslav Pulchart, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li,
	Corey Minyard

On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > Cc: Corey
> >
> > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > >
> > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > >
> > > > >
> > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > >
> > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > >
> > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > point of the above.
> > > > >
> > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > in common code seem inappropriate.
> > > >
> > > > Yes, this I believe was the essential part (it was the last piece in
> > > > my testing which fixed the hanging):
> > > >
> > >
> > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > change ? Would that be sufficient to solve the problem ?
> >
> > I'd rather say "Would that be sufficient to make the symptoms go
> > away?" as it most likely papers over the real problem.
> >
>
> Good point. Worse, it may result in UAF or memory leaks.
>
> > > Either case, the need for this change suggests that the ipmi change
> > > may not be complete, since it should send a completion with an error.
> >
> > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > handle if the BMC is non-functional") should also be considered as a
> > possible way forward because it clearly did not improve things as
> > expected, at least in this particular case.
> >
>
> I tend to agree. I ran a number of AI code reviews over the patch, and
> each time it finds new (and different) problems. The fact that the acpi
> patch is still needed even after applying the ipmi changes suggests that
> something is still missing in the ipmi code.
>
> > It evidently did something that confuses things quite a bit.  Either
> > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > the "hosed" state and refusing to accept messages.
> >
>
> More than that. My latest AI results are below, just for reference
> (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> for this run is:

Well, I guess it's time to send a revert patch then.

> "
> The top commit in the linux/ directory results in hung tasks if the BMC
> stops responding. Using @review-prompts/kernel/debugging.md analyze the
> patch, identify the reason for the hung task problem, suggest and implement
> a fix. Note that there may be more than one problem in the patch, so analyze
> the complete patch and do not stop after finding the first regression.
> "
>
> I think that catches most of the problem, but not all of it.
>
> Guenter
>
> ---
>
> Summary of crash or warning:
> Hung task detected in ipmi_si driver when BMC becomes non-functional.
> Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
>
> Kernel version if available:
> Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
>
> Machine type if available:
> Generic Server with BMC
>
> Cleaned up copy of oops or stack trace:
> [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> [  120.123457]       Not tainted 6.14.0-rc1 #1
> [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> [  120.123460] Call Trace:
> [  120.123461]  <TASK>
> [  120.123462]  __schedule+0x123/0x456
> [  120.123463]  schedule+0x45/0x78
> [  120.123464]  schedule_timeout+0x9a/0xbc
> [  120.123465]  wait_for_completion+0xde/0xf0
> [  120.123466]  ipmi_request_settime+0x123/0x145
> [  120.123467]  ...
> [  120.123468]  </TASK>
>
> Any other kernel messages you found relevant:
> N/A
>
> Explanation of the problem:
> 1. Hung Task:
> The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.

That's quite convincing and it would explain the observed symptoms.

> 2. Excessive Polling (Timer Overwrite):
> In `smi_timeout()`, the timer is unconditionally reset to a short timeout (e.g., 10ms) at the end of the function, unless the state machine is IDLE. When the state machine returns `SI_SM_HOSED`, `smi_event_handler()` correctly sets the timer to a long backoff (1 second). However, `smi_timeout()` subsequently overwrites this with the short timeout. This causes the driver to poll the hosed BMC every 10ms instead of backing off for 1 second, resulting in unnecessary CPU overhead.

Well, that's not nice, but it doesn't seem to be related to the
observed symptoms.

> Functions, snippets and call traces of code related to the problem:
>
> drivers/char/ipmi/ipmi_si_intf.c: smi_event_handler()
>
>         } else if (si_sm_result == SI_SM_HOSED) {
>                 /* ... */
>                 smi_info->si_state = SI_HOSED;
>                 if (smi_info->curr_msg != NULL) {
>                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
>                 }
>                 /* BUG 1: waiting_msg is NOT checked here! */
>
>                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED); /* Sets 1s timeout */
>                 goto out;
>         }
>
> drivers/char/ipmi/ipmi_si_intf.c: smi_timeout()
>
>         smi_result = smi_event_handler(smi_info, time_diff);
>         /* ... */
>         if (smi_result != SI_SM_IDLE)
>                 smi_mod_timer(smi_info, timeout); /* BUG 2: Overwrites 1s timeout with ~10ms */
>
> List of potential commits related to the problem:
> bc3a9d217755f65c137f145600f23bf1d6c31ea9 ipmi:si: Gracefully handle if the BMC is non-functional
>
> Suggested fix:
> 1. When entering `SI_HOSED` state in `smi_event_handler`, check for and fail any pending `waiting_msg` in addition to `curr_msg`.
> 2. In `smi_timeout`, do not reset the timer if the state is `SI_HOSED`, preserving the backoff timeout set by `smi_event_handler`.
>
> ---
> Suggested diffs from that run:
>
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index 1c65275906b4..023b0e819f91 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -810,6 +810,11 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
>                          */
>                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
>                 }
> +               if (smi_info->waiting_msg != NULL) {
> +                       smi_info->curr_msg = smi_info->waiting_msg;
> +                       smi_info->waiting_msg = NULL;
> +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> +               }
>                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
>                 goto out;
>         }
> @@ -1130,10 +1135,12 @@ static void smi_timeout(struct timer_list *t)
>                 timeout = jiffies + SI_TIMEOUT_JIFFIES;
>         }
>
> -       if (smi_result != SI_SM_IDLE)
> -               smi_mod_timer(smi_info, timeout);
> -       else
> +       if (smi_result != SI_SM_IDLE) {
> +               if (smi_info->si_state != SI_HOSED)
> +                       smi_mod_timer(smi_info, timeout);
> +       } else {
>                 smi_info->timer_running = false;
> +       }
>         spin_unlock_irqrestore(&(smi_info->si_lock), flags);
>  }

Well, I'd leave fixing the IPMI handling to somebody familiar with that code.


* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-06 12:08                                               ` Rafael J. Wysocki
@ 2026-02-06 15:58                                                 ` Corey Minyard
  2026-02-06 19:33                                                   ` Rafael J. Wysocki
  2026-02-10 16:31                                                   ` Rafael J. Wysocki
  2026-02-06 16:08                                                 ` Corey Minyard
  1 sibling, 2 replies; 60+ messages in thread
From: Corey Minyard @ 2026-02-06 15:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Guenter Roeck, Igor Raits, Jaroslav Pulchart, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > Cc: Corey
> > >
> > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > >
> > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > >
> > > > > >
> > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > >
> > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > >
> > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > point of the above.
> > > > > >
> > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > in common code seem inappropriate.
> > > > >
> > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > my testing which fixed the hanging):
> > > > >
> > > >
> > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > change ? Would that be sufficient to solve the problem ?
> > >
> > > I'd rather say "Would that be sufficient to make the symptoms go
> > > away?" as it most likely papers over the real problem.
> > >
> >
> > Good point. Worse, it may result in UAF or memory leaks.
> >
> > > > Either case, the need for this change suggests that the ipmi change
> > > > may not be complete, since it should send a completion with an error.
> > >
> > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > handle if the BMC is non-functional") should also be considered as a
> > > possible way forward because it clearly did not improve things as
> > > expected, at least in this particular case.
> > >
> >
> > I tend to agree. I ran a number of AI code reviews over the patch, and
> > each time it finds new (and different) problems. The fact that the acpi
> > patch is still needed even after applying the ipmi changes suggests that
> > something is still missing in the ipmi code.
> >
> > > It evidently did something that confuses things quite a bit.  Either
> > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > the "hosed" state and refusing to accept messages.
> > >
> >
> > More than that. My latest AI results are below, just for reference
> > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > for this run is:
> 
> Well, I guess it's time to send a revert patch then.

Thanks for the CC.

Let's fix it right in the IPMI driver.

> 
> > "
> > The top commit in the linux/ directory results in hung tasks if the BMC
> > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > patch, identify the reason for the hung task problem, suggest and implement
> > a fix. Note that there may be more than one problem in the patch, so analyze
> > the complete patch and do not stop after finding the first regression.
> > "
> >
> > I think that catches most of the problem, but not all of it.
> >
> > Guenter
> >
> > ---
> >
> > Summary of crash or warning:
> > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> >
> > Kernel version if available:
> > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> >
> > Machine type if available:
> > Generic Server with BMC
> >
> > Cleaned up copy of oops or stack trace:
> > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > [  120.123460] Call Trace:
> > [  120.123461]  <TASK>
> > [  120.123462]  __schedule+0x123/0x456
> > [  120.123463]  schedule+0x45/0x78
> > [  120.123464]  schedule_timeout+0x9a/0xbc
> > [  120.123465]  wait_for_completion+0xde/0xf0
> > [  120.123466]  ipmi_request_settime+0x123/0x145
> > [  120.123467]  ...
> > [  120.123468]  </TASK>
> >
> > Any other kernel messages you found relevant:
> > N/A
> >
> > Explanation of the problem:
> > 1. Hung Task:
> > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> 
> That's quite convincing and it would explain the observed symptoms.

Yes, and it's a fairly easy fix, I think.  The waiting message just
needs to be returned in that case.  The following patch should do it:

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 5459ffdde8dc..ff159b1162b9 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
                         */
                        return_hosed_msg(smi_info, IPMI_BUS_ERR);
                }
+               if (smi_info->waiting_msg != NULL) {
+                       /* Also handle if there was a message waiting. */
+                       smi_info->curr_msg = smi_info->waiting_msg;
+                       smi_info->waiting_msg = NULL;
+                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
+               }
                smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
                goto out;
        }


Thanks,

-corey

> 
> > 2. Excessive Polling (Timer Overwrite):
> > In `smi_timeout()`, the timer is unconditionally reset to a short timeout (e.g., 10ms) at the end of the function, unless the state machine is IDLE. When the state machine returns `SI_SM_HOSED`, `smi_event_handler()` correctly sets the timer to a long backoff (1 second). However, `smi_timeout()` subsequently overwrites this with the short timeout. This causes the driver to poll the hosed BMC every 10ms instead of backing off for 1 second, resulting in unnecessary CPU overhead.
> 
> Well, that's not nice, but it doesn't seem to be related to the
> observed symptoms.
> 
> > Functions, snippets and call traces of code related to the problem:
> >
> > drivers/char/ipmi/ipmi_si_intf.c: smi_event_handler()
> >
> >         } else if (si_sm_result == SI_SM_HOSED) {
> >                 /* ... */
> >                 smi_info->si_state = SI_HOSED;
> >                 if (smi_info->curr_msg != NULL) {
> >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> >                 }
> >                 /* BUG 1: waiting_msg is NOT checked here! */
> >
> >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED); /* Sets 1s timeout */
> >                 goto out;
> >         }
> >
> > drivers/char/ipmi/ipmi_si_intf.c: smi_timeout()
> >
> >         smi_result = smi_event_handler(smi_info, time_diff);
> >         /* ... */
> >         if (smi_result != SI_SM_IDLE)
> >                 smi_mod_timer(smi_info, timeout); /* BUG 2: Overwrites 1s timeout with ~10ms */
> >
> > List of potential commits related to the problem:
> > bc3a9d217755f65c137f145600f23bf1d6c31ea9 ipmi:si: Gracefully handle if the BMC is non-functional
> >
> > Suggested fix:
> > 1. When entering `SI_HOSED` state in `smi_event_handler`, check for and fail any pending `waiting_msg` in addition to `curr_msg`.
> > 2. In `smi_timeout`, do not reset the timer if the state is `SI_HOSED`, preserving the backoff timeout set by `smi_event_handler`.
> >
> > ---
> > Suggested diffs from that run:
> >
> > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > index 1c65275906b4..023b0e819f91 100644
> > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > @@ -810,6 +810,11 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> >                          */
> >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> >                 }
> > +               if (smi_info->waiting_msg != NULL) {
> > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > +                       smi_info->waiting_msg = NULL;
> > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > +               }
> >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> >                 goto out;
> >         }
> > @@ -1130,10 +1135,12 @@ static void smi_timeout(struct timer_list *t)
> >                 timeout = jiffies + SI_TIMEOUT_JIFFIES;
> >         }
> >
> > -       if (smi_result != SI_SM_IDLE)
> > -               smi_mod_timer(smi_info, timeout);
> > -       else
> > +       if (smi_result != SI_SM_IDLE) {
> > +               if (smi_info->si_state != SI_HOSED)
> > +                       smi_mod_timer(smi_info, timeout);
> > +       } else {
> >                 smi_info->timer_running = false;
> > +       }
> >         spin_unlock_irqrestore(&(smi_info->si_lock), flags);
> >  }
> 
> Well, I'd leave fixing the IPMI handling to somebody familiar with that code.


* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-06 15:58                                                 ` Corey Minyard
@ 2026-02-06 19:33                                                   ` Rafael J. Wysocki
  2026-02-10 16:31                                                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-06 19:33 UTC (permalink / raw)
  To: corey
  Cc: Rafael J. Wysocki, Guenter Roeck, Igor Raits, Jaroslav Pulchart,
	linux-acpi, linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica,
	Huisong Li

On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
>
> On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > Cc: Corey
> > > >
> > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > >
> > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > >
> > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > >
> > > > > > >
> > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > >
> > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > >
> > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > point of the above.
> > > > > > >
> > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > in common code seem inappropriate.
> > > > > >
> > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > my testing which fixed the hanging):
> > > > > >
> > > > >
> > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > change ? Would that be sufficient to solve the problem ?
> > > >
> > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > away?" as it most likely papers over the real problem.
> > > >
> > >
> > > Good point. Worse, it may result in UAF or memory leaks.
> > >
> > > > > Either case, the need for this change suggests that the ipmi change
> > > > > may not be complete, since it should send a completion with an error.
> > > >
> > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > handle if the BMC is non-functional") should also be considered as a
> > > > possible way forward because it clearly did not improve things as
> > > > expected, at least in this particular case.
> > > >
> > >
> > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > each time it finds new (and different) problems. The fact that the acpi
> > > patch is still needed even after applying the ipmi changes suggests that
> > > something is still missing in the ipmi code.
> > >
> > > > It evidently did something that confuses things quite a bit.  Either
> > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > the "hosed" state and refusing to accept messages.
> > > >
> > >
> > > More than that. My latest AI results are below, just for reference
> > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > for this run is:
> >
> > Well, I guess it's time to send a revert patch then.
>
> Thanks for the CC.

Thanks for taking care of it!

> Let's fix it right in the IPMI driver.
>
> >
> > > "
> > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > patch, identify the reason for the hung task problem, suggest and implement
> > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > the complete patch and do not stop after finding the first regression.
> > > "
> > >
> > > I think that catches most of the problem, but not all of it.
> > >
> > > Guenter
> > >
> > > ---
> > >
> > > Summary of crash or warning:
> > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > >
> > > Kernel version if available:
> > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > >
> > > Machine type if available:
> > > Generic Server with BMC
> > >
> > > Cleaned up copy of oops or stack trace:
> > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > [  120.123460] Call Trace:
> > > [  120.123461]  <TASK>
> > > [  120.123462]  __schedule+0x123/0x456
> > > [  120.123463]  schedule+0x45/0x78
> > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > [  120.123467]  ...
> > > [  120.123468]  </TASK>
> > >
> > > Any other kernel messages you found relevant:
> > > N/A
> > >
> > > Explanation of the problem:
> > > 1. Hung Task:
> > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> >
> > That's quite convincing and it would explain the observed symptoms.
>
> Yes, and it's a fairly easy fix, I think.  The waiting message just
> needs to be returned in that case.  The following patch should do it:
>
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index 5459ffdde8dc..ff159b1162b9 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
>                          */
>                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
>                 }
> +               if (smi_info->waiting_msg != NULL) {
> +                       /* Also handle if there was a message waiting. */
> +                       smi_info->curr_msg = smi_info->waiting_msg;
> +                       smi_info->waiting_msg = NULL;
> +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> +               }
>                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
>                 goto out;
>         }
>

So the AI was right on the money with this one.  Impressive.

[cut]

> > > ---
> > > Suggested diffs from that run:
> > >
> > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > index 1c65275906b4..023b0e819f91 100644
> > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > @@ -810,6 +810,11 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> > >                          */
> > >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > >                 }
> > > +               if (smi_info->waiting_msg != NULL) {
> > > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > > +                       smi_info->waiting_msg = NULL;
> > > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > +               }
> > >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> > >                 goto out;
> > >         }
> > > @@ -1130,10 +1135,12 @@ static void smi_timeout(struct timer_list *t)
> > >                 timeout = jiffies + SI_TIMEOUT_JIFFIES;
> > >         }
> > >
> > > -       if (smi_result != SI_SM_IDLE)
> > > -               smi_mod_timer(smi_info, timeout);
> > > -       else
> > > +       if (smi_result != SI_SM_IDLE) {
> > > +               if (smi_info->si_state != SI_HOSED)
> > > +                       smi_mod_timer(smi_info, timeout);
> > > +       } else {
> > >                 smi_info->timer_running = false;
> > > +       }
> > >         spin_unlock_irqrestore(&(smi_info->si_lock), flags);
> > >  }


* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-06 15:58                                                 ` Corey Minyard
  2026-02-06 19:33                                                   ` Rafael J. Wysocki
@ 2026-02-10 16:31                                                   ` Rafael J. Wysocki
  2026-02-12  9:10                                                     ` Jaroslav Pulchart
  1 sibling, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-10 16:31 UTC (permalink / raw)
  To: corey, Jaroslav Pulchart
  Cc: Guenter Roeck, Igor Raits, linux-acpi, linux-hwmon, Daniel Secik,
	Zdenek Pesek, Jiri Jurica, Huisong Li

On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
>
> On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > Cc: Corey
> > > >
> > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > >
> > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > >
> > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > >
> > > > > > >
> > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > >
> > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > >
> > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > point of the above.
> > > > > > >
> > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > in common code seem inappropriate.
> > > > > >
> > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > my testing which fixed the hanging):
> > > > > >
> > > > >
> > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > change ? Would that be sufficient to solve the problem ?
> > > >
> > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > away?" as it most likely papers over the real problem.
> > > >
> > >
> > > Good point. Worse, it may result in UAF or memory leaks.
> > >
> > > > > Either case, the need for this change suggests that the ipmi change
> > > > > may not be complete, since it should send a completion with an error.
> > > >
> > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > handle if the BMC is non-functional") should also be considered as a
> > > > possible way forward because it clearly did not improve things as
> > > > expected, at least in this particular case.
> > > >
> > >
> > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > each time it finds new (and different) problems. The fact that the acpi
> > > patch is still needed even after applying the ipmi changes suggests that
> > > something is still missing in the ipmi code.
> > >
> > > > It evidently did something that confuses things quite a bit.  Either
> > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > the "hosed" state and refusing to accept messages.
> > > >
> > >
> > > More than that. My latest AI results are below, just for reference
> > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > for this run is:
> >
> > Well, I guess it's time to send a revert patch then.
>
> Thanks for the CC.
>
> Let's fix it right in the IPMI driver.
>
> >
> > > "
> > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > patch, identify the reason for the hung task problem, suggest and implement
> > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > the complete patch and do not stop after fiding the first regression.
> > > "
> > >
> > > I think that catches most of the problem, but not all of it.
> > >
> > > Guenter
> > >
> > > ---
> > >
> > > Summary of crash or warning:
> > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > >
> > > Kernel version if available:
> > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > >
> > > Machine type if available:
> > > Generic Server with BMC
> > >
> > > Cleaned up copy of oops or stack trace:
> > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > [  120.123460] Call Trace:
> > > [  120.123461]  <TASK>
> > > [  120.123462]  __schedule+0x123/0x456
> > > [  120.123463]  schedule+0x45/0x78
> > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > [  120.123467]  ...
> > > [  120.123468]  </TASK>
> > >
> > > Any other kernel messages you found relevant:
> > > N/A
> > >
> > > Explanation of the problem:
> > > 1. Hung Task:
> > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> >
> > That's quite convincing and it would explain the observed symptoms.
>
> Yes, and it's a fairly easy fix, I think.  The waiting message just
> needs to be returned in that case.  The following patch should do it:

Jaroslav, it would be good to test the patch below on top of 6.19.  I
can put it on a test git branch if need be, so please let me know.

> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index 5459ffdde8dc..ff159b1162b9 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
>                          */
>                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
>                 }
> +               if (smi_info->waiting_msg != NULL) {
> +                       /* Also handle if there was a message waiting. */
> +                       smi_info->curr_msg = smi_info->waiting_msg;
> +                       smi_info->waiting_msg = NULL;
> +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> +               }
>                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
>                 goto out;
>         }
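
For illustration only (this is not part of the posted patch), the case
the hunk above targets can be modelled in user space.  Everything in the
sketch is a hypothetical stand-in: the model_* names, the simplified
structures and the MODEL_BUS_ERR placeholder value are assumptions, not
the definitions in ipmi_si_intf.c.  It only shows that failing curr_msg
alone leaves a queued waiting_msg without a completion, and that also
draining waiting_msg completes it with an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's definitions. */
enum model_state { MODEL_NORMAL, MODEL_HOSED };

struct model_msg {
	int completed;	/* set once the sender has been notified */
	int error;	/* completion code handed back to the sender */
};

struct model_smi {
	enum model_state state;
	struct model_msg *curr_msg;	/* message being processed */
	struct model_msg *waiting_msg;	/* message queued behind it */
};

#define MODEL_BUS_ERR 0xff	/* placeholder error code for this model */

/* Fail the current message and hand it back to its sender. */
static void model_return_hosed_msg(struct model_smi *smi, int err)
{
	struct model_msg *msg = smi->curr_msg;

	assert(msg != NULL);
	smi->curr_msg = NULL;
	msg->error = err;
	msg->completed = 1;
}

/*
 * Enter the "hosed" state.  With fail_waiting == 0 only curr_msg is
 * failed; with fail_waiting != 0 a queued waiting_msg is failed as
 * well, mirroring the shape of the added hunk.
 */
static void model_enter_hosed(struct model_smi *smi, int fail_waiting)
{
	smi->state = MODEL_HOSED;
	if (smi->curr_msg != NULL)
		model_return_hosed_msg(smi, MODEL_BUS_ERR);
	if (fail_waiting && smi->waiting_msg != NULL) {
		smi->curr_msg = smi->waiting_msg;
		smi->waiting_msg = NULL;
		model_return_hosed_msg(smi, MODEL_BUS_ERR);
	}
}
```

Calling model_enter_hosed() with fail_waiting set to 0 while both slots
are populated leaves waiting_msg uncompleted, which is the hang described
in the analysis.  Whether this is the only path that can strand a message
in the real driver is not something the model can establish.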

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-10 16:31                                                   ` Rafael J. Wysocki
@ 2026-02-12  9:10                                                     ` Jaroslav Pulchart
  2026-02-12 12:27                                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Jaroslav Pulchart @ 2026-02-12  9:10 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: corey, Guenter Roeck, Igor Raits, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

>
> On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
> >
> > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > > Cc: Corey
> > > > >
> > > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > >
> > > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > >
> > > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > > >
> > > > > > > >
> > > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > > >
> > > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > > >
> > > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > > point of the above.
> > > > > > > >
> > > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > > in common code seem inappropriate.
> > > > > > >
> > > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > > my testing which fixed the hanging):
> > > > > > >
> > > > > >
> > > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > > change ? Would that be sufficient to solve the problem ?
> > > > >
> > > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > > away?" as it most likely papers over the real problem.
> > > > >
> > > >
> > > > Good point. Worse, it may result in UAF or memory leaks.
> > > >
> > > > > > Either case, the need for this change suggests that the ipmi change
> > > > > > may not be complete, since it should send a completion with an error.
> > > > >
> > > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > > handle if the BMC is non-functional") should also be considered as a
> > > > > possible way forward because it clearly did not improve things as
> > > > > expected, at least in this particular case.
> > > > >
> > > >
> > > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > > each time it finds new (and different) problems. The fact that the acpi
> > > > patch is still needed even after applying the ipmi changes suggests that
> > > > something is still missing in the ipmi code.
> > > >
> > > > > It evidently did something that confuses things quite a bit.  Either
> > > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > > the "hosed" state and refusing to accept messages.
> > > > >
> > > >
> > > > More than that. My latest AI results are below, just for reference
> > > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > > for this run is:
> > >
> > > Well, I guess it's time to send a revert patch then.
> >
> > Thanks for the CC.
> >
> > Let's fix it right in the IPMI driver.
> >
> > >
> > > > "
> > > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > > patch, identify the reason for the hung task problem, suggest and implement
> > > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > > the complete patch and do not stop after fiding the first regression.
> > > > "
> > > >
> > > > I think that catches most of the problem, but not all of it.
> > > >
> > > > Guenter
> > > >
> > > > ---
> > > >
> > > > Summary of crash or warning:
> > > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > > >
> > > > Kernel version if available:
> > > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > > >
> > > > Machine type if available:
> > > > Generic Server with BMC
> > > >
> > > > Cleaned up copy of oops or stack trace:
> > > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > > [  120.123460] Call Trace:
> > > > [  120.123461]  <TASK>
> > > > [  120.123462]  __schedule+0x123/0x456
> > > > [  120.123463]  schedule+0x45/0x78
> > > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > > [  120.123467]  ...
> > > > [  120.123468]  </TASK>
> > > >
> > > > Any other kernel messages you found relevant:
> > > > N/A
> > > >
> > > > Explanation of the problem:
> > > > 1. Hung Task:
> > > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> > >
> > > That's quite convincing and it would explain the observed symptoms.
> >
> > Yes, and it's a fairly easy fix, I think.  The waiting message just
> > needs to be returned in that case.  The following patch should do it:
>
> Jaroslav, it would be good to test the patch below on top of 6.19.  I
> can put it on a test git branch if need be, so please let me know.
>
> > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > index 5459ffdde8dc..ff159b1162b9 100644
> > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> >                          */
> >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> >                 }
> > +               if (smi_info->waiting_msg != NULL) {
> > +                       /* Also handle if there was a message waiting. */
> > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > +                       smi_info->waiting_msg = NULL;
> > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > +               }
> >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> >                 goto out;
> >         }

I applied the patch above to both 6.18.10 and 6.19 and reproduced the
issue on both, so it does not fix the problem.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-12  9:10                                                     ` Jaroslav Pulchart
@ 2026-02-12 12:27                                                       ` Rafael J. Wysocki
  2026-02-12 16:48                                                         ` Corey Minyard
  0 siblings, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-12 12:27 UTC (permalink / raw)
  To: Jaroslav Pulchart
  Cc: Rafael J. Wysocki, corey, Guenter Roeck, Igor Raits, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Feb 12, 2026 at 10:11 AM Jaroslav Pulchart
<jaroslav.pulchart@gooddata.com> wrote:
>
> >
> > On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
> > >
> > > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > > > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > >
> > > > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > > > Cc: Corey
> > > > > >
> > > > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > >
> > > > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > >
> > > > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > > > >
> > > > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > > > >
> > > > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > > > point of the above.
> > > > > > > > >
> > > > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > > > in common code seem inappropriate.
> > > > > > > >
> > > > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > > > my testing which fixed the hanging):
> > > > > > > >
> > > > > > >
> > > > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > > > change ? Would that be sufficient to solve the problem ?
> > > > > >
> > > > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > > > away?" as it most likely papers over the real problem.
> > > > > >
> > > > >
> > > > > Good point. Worse, it may result in UAF or memory leaks.
> > > > >
> > > > > > > Either case, the need for this change suggests that the ipmi change
> > > > > > > may not be complete, since it should send a completion with an error.
> > > > > >
> > > > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > > > handle if the BMC is non-functional") should also be considered as a
> > > > > > possible way forward because it clearly did not improve things as
> > > > > > expected, at least in this particular case.
> > > > > >
> > > > >
> > > > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > > > each time it finds new (and different) problems. The fact that the acpi
> > > > > patch is still needed even after applying the ipmi changes suggests that
> > > > > something is still missing in the ipmi code.
> > > > >
> > > > > > It evidently did something that confuses things quite a bit.  Either
> > > > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > > > the "hosed" state and refusing to accept messages.
> > > > > >
> > > > >
> > > > > More than that. My latest AI results are below, just for reference
> > > > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > > > for this run is:
> > > >
> > > > Well, I guess it's time to send a revert patch then.
> > >
> > > Thanks for the CC.
> > >
> > > Let's fix it right in the IPMI driver.
> > >
> > > >
> > > > > "
> > > > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > > > patch, identify the reason for the hung task problem, suggest and implement
> > > > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > > > the complete patch and do not stop after fiding the first regression.
> > > > > "
> > > > >
> > > > > I think that catches most of the problem, but not all of it.
> > > > >
> > > > > Guenter
> > > > >
> > > > > ---
> > > > >
> > > > > Summary of crash or warning:
> > > > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > > > >
> > > > > Kernel version if available:
> > > > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > > > >
> > > > > Machine type if available:
> > > > > Generic Server with BMC
> > > > >
> > > > > Cleaned up copy of oops or stack trace:
> > > > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > > > [  120.123460] Call Trace:
> > > > > [  120.123461]  <TASK>
> > > > > [  120.123462]  __schedule+0x123/0x456
> > > > > [  120.123463]  schedule+0x45/0x78
> > > > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > > > [  120.123467]  ...
> > > > > [  120.123468]  </TASK>
> > > > >
> > > > > Any other kernel messages you found relevant:
> > > > > N/A
> > > > >
> > > > > Explanation of the problem:
> > > > > 1. Hung Task:
> > > > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> > > >
> > > > That's quite convincing and it would explain the observed symptoms.
> > >
> > > Yes, and it's a fairly easy fix, I think.  The waiting message just
> > > needs to be returned in that case.  The following patch should do it:
> >
> > Jaroslav, it would be good to test the patch below on top of 6.19.  I
> > can put it on a test git branch if need be, so please let me know.
> >
> > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > index 5459ffdde8dc..ff159b1162b9 100644
> > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> > >                          */
> > >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > >                 }
> > > +               if (smi_info->waiting_msg != NULL) {
> > > +                       /* Also handle if there was a message waiting. */
> > > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > > +                       smi_info->waiting_msg = NULL;
> > > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > +               }
> > >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> > >                 goto out;
> > >         }
>
> I applied the patch above to both 6.18.10 and 6.19 and reproduced the
> issue on both, so it does not fix the problem.

Thanks!

With all due respect to everyone involved (including the AI), this
means that we are not anywhere close to fixing the problem and it
would be a shame to ship 7.0 with it.

I'm sending a revert patch shortly.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-12 12:27                                                       ` Rafael J. Wysocki
@ 2026-02-12 16:48                                                         ` Corey Minyard
  2026-02-12 17:22                                                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Corey Minyard @ 2026-02-12 16:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, Guenter Roeck, Igor Raits, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Feb 12, 2026 at 01:27:41PM +0100, Rafael J. Wysocki wrote:
> On Thu, Feb 12, 2026 at 10:11 AM Jaroslav Pulchart
> <jaroslav.pulchart@gooddata.com> wrote:
> >
> > >
> > > On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
> > > >
> > > > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > > > > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > >
> > > > > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > > > > Cc: Corey
> > > > > > >
> > > > > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > >
> > > > > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > >
> > > > > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > > > > >
> > > > > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > > > > >
> > > > > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > > > > point of the above.
> > > > > > > > > >
> > > > > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > > > > in common code seem inappropriate.
> > > > > > > > >
> > > > > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > > > > my testing which fixed the hanging):
> > > > > > > > >
> > > > > > > >
> > > > > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > > > > change ? Would that be sufficient to solve the problem ?
> > > > > > >
> > > > > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > > > > away?" as it most likely papers over the real problem.
> > > > > > >
> > > > > >
> > > > > > Good point. Worse, it may result in UAF or memory leaks.
> > > > > >
> > > > > > > > Either case, the need for this change suggests that the ipmi change
> > > > > > > > may not be complete, since it should send a completion with an error.
> > > > > > >
> > > > > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > > > > handle if the BMC is non-functional") should also be considered as a
> > > > > > > possible way forward because it clearly did not improve things as
> > > > > > > expected, at least in this particular case.
> > > > > > >
> > > > > >
> > > > > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > > > > each time it finds new (and different) problems. The fact that the acpi
> > > > > > patch is still needed even after applying the ipmi changes suggests that
> > > > > > something is still missing in the ipmi code.
> > > > > >
> > > > > > > It evidently did something that confuses things quite a bit.  Either
> > > > > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > > > > the "hosed" state and refusing to accept messages.
> > > > > > >
> > > > > >
> > > > > > More than that. My latest AI results are below, just for reference
> > > > > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > > > > for this run is:
> > > > >
> > > > > Well, I guess it's time to send a revert patch then.
> > > >
> > > > Thanks for the CC.
> > > >
> > > > Let's fix it right in the IPMI driver.
> > > >
> > > > >
> > > > > > "
> > > > > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > > > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > > > > patch, identify the reason for the hung task problem, suggest and implement
> > > > > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > > > > the complete patch and do not stop after fiding the first regression.
> > > > > > "
> > > > > >
> > > > > > I think that catches most of the problem, but not all of it.
> > > > > >
> > > > > > Guenter
> > > > > >
> > > > > > ---
> > > > > >
> > > > > > Summary of crash or warning:
> > > > > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > > > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > > > > >
> > > > > > Kernel version if available:
> > > > > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > > > > >
> > > > > > Machine type if available:
> > > > > > Generic Server with BMC
> > > > > >
> > > > > > Cleaned up copy of oops or stack trace:
> > > > > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > > > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > > > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > > > > [  120.123460] Call Trace:
> > > > > > [  120.123461]  <TASK>
> > > > > > [  120.123462]  __schedule+0x123/0x456
> > > > > > [  120.123463]  schedule+0x45/0x78
> > > > > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > > > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > > > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > > > > [  120.123467]  ...
> > > > > > [  120.123468]  </TASK>
> > > > > >
> > > > > > Any other kernel messages you found relevant:
> > > > > > N/A
> > > > > >
> > > > > > Explanation of the problem:
> > > > > > 1. Hung Task:
> > > > > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> > > > >
> > > > > That's quite convincing and it would explain the observed symptoms.
> > > >
> > > > Yes, and it's a fairly easy fix, I think.  The waiting message just
> > > > needs to be returned in that case.  The following patch should do it:
> > >
> > > Jaroslav, it would be good to test the patch below on top of 6.19.  I
> > > can put it on a test git branch if need be, so please let me know.
> > >
> > > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > > index 5459ffdde8dc..ff159b1162b9 100644
> > > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > > @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> > > >                          */
> > > >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > >                 }
> > > > +               if (smi_info->waiting_msg != NULL) {
> > > > +                       /* Also handle if there was a message waiting. */
> > > > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > > > +                       smi_info->waiting_msg = NULL;
> > > > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > +               }
> > > >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> > > >                 goto out;
> > > >         }
> >
> > I apply ^ patch to both 6.18.10 and 6.19 and reproduced the issue on
> > both, so it does not fix the problem.
> 
> Thanks!
> 
> With all due respect to everyone involved (including the AI), this
> means that we are not anywhere close to fixing the problem and it
> would be a shame to ship 7.0 with it.
> 
> I'm sending a revert patch shortly.

Unfortunately, that patch fixed an issue others were having.

I'm going to try to hack on qemu and see if I can reproduce this.  I
don't have any hardware where this is feasible.

-corey

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-12 16:48                                                         ` Corey Minyard
@ 2026-02-12 17:22                                                           ` Rafael J. Wysocki
  2026-02-12 18:34                                                             ` Corey Minyard
  0 siblings, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-12 17:22 UTC (permalink / raw)
  To: corey
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, Guenter Roeck, Igor Raits,
	linux-acpi, linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica,
	Huisong Li

On Thu, Feb 12, 2026 at 5:48 PM Corey Minyard <corey@minyard.net> wrote:
>
> On Thu, Feb 12, 2026 at 01:27:41PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Feb 12, 2026 at 10:11 AM Jaroslav Pulchart
> > <jaroslav.pulchart@gooddata.com> wrote:
> > >
> > > >
> > > > On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
> > > > >
> > > > > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > > > > > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > >
> > > > > > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > Cc: Corey
> > > > > > > >
> > > > > > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > >
> > > > > > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > > > > > >
> > > > > > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > > > > > >
> > > > > > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > > > > > point of the above.
> > > > > > > > > > >
> > > > > > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > > > > > in common code seem inappropriate.
> > > > > > > > > >
> > > > > > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > > > > > my testing which fixed the hanging):
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > > > > > change ? Would that be sufficient to solve the problem ?
> > > > > > > >
> > > > > > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > > > > > away?" as it most likely papers over the real problem.
> > > > > > > >
> > > > > > >
> > > > > > > Good point. Worse, it may result in UAF or memory leaks.
> > > > > > >
> > > > > > > > > Either case, the need for this change suggests that the ipmi change
> > > > > > > > > may not be complete, since it should send a completion with an error.
> > > > > > > >
> > > > > > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > > > > > handle if the BMC is non-functional") should also be considered as a
> > > > > > > > possible way forward because it clearly did not improve things as
> > > > > > > > expected, at least in this particular case.
> > > > > > > >
> > > > > > >
> > > > > > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > > > > > each time it finds new (and different) problems. The fact that the acpi
> > > > > > > patch is still needed even after applying the ipmi changes suggests that
> > > > > > > something is still missing in the ipmi code.
> > > > > > >
> > > > > > > > It evidently did something that confuses things quite a bit.  Either
> > > > > > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > > > > > the "hosed" state and refusing to accept messages.
> > > > > > > >
> > > > > > >
> > > > > > > More than that. My latest AI results are below, just for reference
> > > > > > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > > > > > for this run is:
> > > > > >
> > > > > > Well, I guess it's time to send a revert patch then.
> > > > >
> > > > > Thanks for the CC.
> > > > >
> > > > > Let's fix it right in the IPMI driver.
> > > > >
> > > > > >
> > > > > > > "
> > > > > > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > > > > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > > > > > patch, identify the reason for the hung task problem, suggest and implement
> > > > > > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > > > > > the complete patch and do not stop after finding the first regression.
> > > > > > > "
> > > > > > >
> > > > > > > I think that catches most of the problem, but not all of it.
> > > > > > >
> > > > > > > Guenter
> > > > > > >
> > > > > > > ---
> > > > > > >
> > > > > > > Summary of crash or warning:
> > > > > > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > > > > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > > > > > >
> > > > > > > Kernel version if available:
> > > > > > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > > > > > >
> > > > > > > Machine type if available:
> > > > > > > Generic Server with BMC
> > > > > > >
> > > > > > > Cleaned up copy of oops or stack trace:
> > > > > > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > > > > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > > > > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > > > > > [  120.123460] Call Trace:
> > > > > > > [  120.123461]  <TASK>
> > > > > > > [  120.123462]  __schedule+0x123/0x456
> > > > > > > [  120.123463]  schedule+0x45/0x78
> > > > > > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > > > > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > > > > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > > > > > [  120.123467]  ...
> > > > > > > [  120.123468]  </TASK>
> > > > > > >
> > > > > > > Any other kernel messages you found relevant:
> > > > > > > N/A
> > > > > > >
> > > > > > > Explanation of the problem:
> > > > > > > 1. Hung Task:
> > > > > > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> > > > > >
> > > > > > That's quite convincing and it would explain the observed symptoms.
> > > > >
> > > > > Yes, and it's a fairly easy fix, I think.  The waiting message just
> > > > > needs to be returned in that case.  The following patch should do it:
> > > >
> > > > Jaroslav, it would be good to test the patch below on top of 6.19.  I
> > > > can put it on a test git branch if need be, so please let me know.
> > > >
> > > > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > index 5459ffdde8dc..ff159b1162b9 100644
> > > > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> > > > >                          */
> > > > >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > >                 }
> > > > > +               if (smi_info->waiting_msg != NULL) {
> > > > > +                       /* Also handle if there was a message waiting. */
> > > > > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > > > > +                       smi_info->waiting_msg = NULL;
> > > > > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > > +               }
> > > > >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> > > > >                 goto out;
> > > > >         }
> > >
> > > I apply ^ patch to both 6.18.10 and 6.19 and reproduced the issue on
> > > both, so it does not fix the problem.
> >
> > Thanks!
> >
> > With all due respect to everyone involved (including the AI), this
> > means that we are not anywhere close to fixing the problem and it
> > would be a shame to ship 7.0 with it.
> >
> > I'm sending a revert patch shortly.
>
> Unfortunately, that patch fixed an issue others were having.

Granted, it broke something else, so it needs to be fixed or reverted.

Maybe there is a way to address the original problem fixed by it differently?

Do you have any pointers to any problem reports regarding that one?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-12 17:22                                                           ` Rafael J. Wysocki
@ 2026-02-12 18:34                                                             ` Corey Minyard
  2026-02-12 21:33                                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Corey Minyard @ 2026-02-12 18:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, Guenter Roeck, Igor Raits, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Feb 12, 2026 at 06:22:08PM +0100, Rafael J. Wysocki wrote:
> On Thu, Feb 12, 2026 at 5:48 PM Corey Minyard <corey@minyard.net> wrote:
> >
> > On Thu, Feb 12, 2026 at 01:27:41PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Feb 12, 2026 at 10:11 AM Jaroslav Pulchart
> > > <jaroslav.pulchart@gooddata.com> wrote:
> > > >
> > > > >
> > > > > On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
> > > > > >
> > > > > > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > > > > > > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > >
> > > > > > > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > > Cc: Corey
> > > > > > > > >
> > > > > > > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > > > > > > >
> > > > > > > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > > > > > > >
> > > > > > > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > > > > > > point of the above.
> > > > > > > > > > > >
> > > > > > > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > > > > > > in common code seem inappropriate.
> > > > > > > > > > >
> > > > > > > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > > > > > > my testing which fixed the hanging):
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > > > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > > > > > > change ? Would that be sufficient to solve the problem ?
> > > > > > > > >
> > > > > > > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > > > > > > away?" as it most likely papers over the real problem.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Good point. Worse, it may result in UAF or memory leaks.
> > > > > > > >
> > > > > > > > > > Either case, the need for this change suggests that the ipmi change
> > > > > > > > > > may not be complete, since it should send a completion with an error.
> > > > > > > > >
> > > > > > > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > > > > > > handle if the BMC is non-functional") should also be considered as a
> > > > > > > > > possible way forward because it clearly did not improve things as
> > > > > > > > > expected, at least in this particular case.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > > > > > > each time it finds new (and different) problems. The fact that the acpi
> > > > > > > > patch is still needed even after applying the ipmi changes suggests that
> > > > > > > > something is still missing in the ipmi code.
> > > > > > > >
> > > > > > > > > It evidently did something that confuses things quite a bit.  Either
> > > > > > > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > > > > > > the "hosed" state and refusing to accept messages.
> > > > > > > > >
> > > > > > > >
> > > > > > > > More than that. My latest AI results are below, just for reference
> > > > > > > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > > > > > > for this run is:
> > > > > > >
> > > > > > > Well, I guess it's time to send a revert patch then.
> > > > > >
> > > > > > Thanks for the CC.
> > > > > >
> > > > > > Let's fix it right in the IPMI driver.
> > > > > >
> > > > > > >
> > > > > > > > "
> > > > > > > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > > > > > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > > > > > > patch, identify the reason for the hung task problem, suggest and implement
> > > > > > > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > > > > > > the complete patch and do not stop after finding the first regression.
> > > > > > > > "
> > > > > > > >
> > > > > > > > I think that catches most of the problem, but not all of it.
> > > > > > > >
> > > > > > > > Guenter
> > > > > > > >
> > > > > > > > ---
> > > > > > > >
> > > > > > > > Summary of crash or warning:
> > > > > > > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > > > > > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > > > > > > >
> > > > > > > > Kernel version if available:
> > > > > > > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > > > > > > >
> > > > > > > > Machine type if available:
> > > > > > > > Generic Server with BMC
> > > > > > > >
> > > > > > > > Cleaned up copy of oops or stack trace:
> > > > > > > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > > > > > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > > > > > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > > > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > > > > > > [  120.123460] Call Trace:
> > > > > > > > [  120.123461]  <TASK>
> > > > > > > > [  120.123462]  __schedule+0x123/0x456
> > > > > > > > [  120.123463]  schedule+0x45/0x78
> > > > > > > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > > > > > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > > > > > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > > > > > > [  120.123467]  ...
> > > > > > > > [  120.123468]  </TASK>
> > > > > > > >
> > > > > > > > Any other kernel messages you found relevant:
> > > > > > > > N/A
> > > > > > > >
> > > > > > > > Explanation of the problem:
> > > > > > > > 1. Hung Task:
> > > > > > > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> > > > > > >
> > > > > > > That's quite convincing and it would explain the observed symptoms.
> > > > > >
> > > > > > Yes, and it's a fairly easy fix, I think.  The waiting message just
> > > > > > needs to be returned in that case.  The following patch should do it:
> > > > >
> > > > > Jaroslav, it would be good to test the patch below on top of 6.19.  I
> > > > > can put it on a test git branch if need be, so please let me know.
> > > > >
> > > > > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > index 5459ffdde8dc..ff159b1162b9 100644
> > > > > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> > > > > >                          */
> > > > > >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > > >                 }
> > > > > > +               if (smi_info->waiting_msg != NULL) {
> > > > > > +                       /* Also handle if there was a message waiting. */
> > > > > > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > > > > > +                       smi_info->waiting_msg = NULL;
> > > > > > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > > > +               }
> > > > > >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> > > > > >                 goto out;
> > > > > >         }
> > > >
> > > > I apply ^ patch to both 6.18.10 and 6.19 and reproduced the issue on
> > > > both, so it does not fix the problem.
> > >
> > > Thanks!
> > >
> > > With all due respect to everyone involved (including the AI), this
> > > means that we are not anywhere close to fixing the problem and it
> > > would be a shame to ship 7.0 with it.
> > >
> > > I'm sending a revert patch shortly.
> >
> > Unfortunately, that patch fixed an issue others were having.
> 
> Granted, it broke something else, so it needs to be fixed or reverted.

Yes, certainly.

> 
> Maybe there is a way to address the original problem fixed by it differently?

I'm not sure.  This is not the first attempt...

> 
> Do you have any pointers to any problem reports regarding that one?

The original problem came as a patch set:

https://lore.kernel.org/lkml/20221007092617.87597-1-zhangyuchen.lcr@bytedance.com/

That had a lockup problem, and it had some other issues.  So I reworked
the code to the current form.

I'm working on qemu now.  This needs to be added as part of the test
suite, anyway.

-corey

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-12 18:34                                                             ` Corey Minyard
@ 2026-02-12 21:33                                                               ` Rafael J. Wysocki
  2026-02-12 22:06                                                                 ` Corey Minyard
  2026-02-13  6:55                                                                 ` Corey Minyard
  0 siblings, 2 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-12 21:33 UTC (permalink / raw)
  To: corey
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, Guenter Roeck, Igor Raits,
	linux-acpi, linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica,
	Huisong Li

On Thu, Feb 12, 2026 at 7:35 PM Corey Minyard <corey@minyard.net> wrote:
>
> On Thu, Feb 12, 2026 at 06:22:08PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Feb 12, 2026 at 5:48 PM Corey Minyard <corey@minyard.net> wrote:
> > >
> > > On Thu, Feb 12, 2026 at 01:27:41PM +0100, Rafael J. Wysocki wrote:
> > > > On Thu, Feb 12, 2026 at 10:11 AM Jaroslav Pulchart
> > > > <jaroslav.pulchart@gooddata.com> wrote:
> > > > >
> > > > > >
> > > > > > On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
> > > > > > >
> > > > > > > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > >
> > > > > > > > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > > > Cc: Corey
> > > > > > > > > >
> > > > > > > > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > > > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > > > > > > > >
> > > > > > > > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > > > > > > > >
> > > > > > > > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > > > > > > > point of the above.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > > > > > > > in common code seem inappropriate.
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > > > > > > > my testing which fixed the hanging):
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > > > > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > > > > > > > change ? Would that be sufficient to solve the problem ?
> > > > > > > > > >
> > > > > > > > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > > > > > > > away?" as it most likely papers over the real problem.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Good point. Worse, it may result in UAF or memory leaks.
> > > > > > > > >
> > > > > > > > > > > Either case, the need for this change suggests that the ipmi change
> > > > > > > > > > > may not be complete, since it should send a completion with an error.
> > > > > > > > > >
> > > > > > > > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > > > > > > > handle if the BMC is non-functional") should also be considered as a
> > > > > > > > > > possible way forward because it clearly did not improve things as
> > > > > > > > > > expected, at least in this particular case.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > > > > > > > each time it finds new (and different) problems. The fact that the acpi
> > > > > > > > > patch is still needed even after applying the ipmi changes suggests that
> > > > > > > > > something is still missing in the ipmi code.
> > > > > > > > >
> > > > > > > > > > It evidently did something that confuses things quite a bit.  Either
> > > > > > > > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > > > > > > > the "hosed" state and refusing to accept messages.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > More than that. My latest AI results are below, just for reference
> > > > > > > > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > > > > > > > for this run is:
> > > > > > > >
> > > > > > > > Well, I guess it's time to send a revert patch then.
> > > > > > >
> > > > > > > Thanks for the CC.
> > > > > > >
> > > > > > > Let's fix it right in the IPMI driver.
> > > > > > >
> > > > > > > >
> > > > > > > > > "
> > > > > > > > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > > > > > > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > > > > > > > patch, identify the reason for the hung task problem, suggest and implement
> > > > > > > > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > > > > > > > the complete patch and do not stop after finding the first regression.
> > > > > > > > > "
> > > > > > > > >
> > > > > > > > > I think that catches most of the problem, but not all of it.
> > > > > > > > >
> > > > > > > > > Guenter
> > > > > > > > >
> > > > > > > > > ---
> > > > > > > > >
> > > > > > > > > Summary of crash or warning:
> > > > > > > > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > > > > > > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > > > > > > > >
> > > > > > > > > Kernel version if available:
> > > > > > > > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > > > > > > > >
> > > > > > > > > Machine type if available:
> > > > > > > > > Generic Server with BMC
> > > > > > > > >
> > > > > > > > > Cleaned up copy of oops or stack trace:
> > > > > > > > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > > > > > > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > > > > > > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > > > > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > > > > > > > [  120.123460] Call Trace:
> > > > > > > > > [  120.123461]  <TASK>
> > > > > > > > > [  120.123462]  __schedule+0x123/0x456
> > > > > > > > > [  120.123463]  schedule+0x45/0x78
> > > > > > > > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > > > > > > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > > > > > > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > > > > > > > [  120.123467]  ...
> > > > > > > > > [  120.123468]  </TASK>
> > > > > > > > >
> > > > > > > > > Any other kernel messages you found relevant:
> > > > > > > > > N/A
> > > > > > > > >
> > > > > > > > > Explanation of the problem:
> > > > > > > > > 1. Hung Task:
> > > > > > > > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> > > > > > > >
> > > > > > > > That's quite convincing and it would explain the observed symptoms.
> > > > > > >
> > > > > > > Yes, and it's a fairly easy fix, I think.  The waiting message just
> > > > > > > needs to be returned in that case.  The following patch should do it:
> > > > > >
> > > > > > Jaroslav, it would be good to test the patch below on top of 6.19.  I
> > > > > > can put it on a test git branch if need be, so please let me know.
> > > > > >
> > > > > > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > > index 5459ffdde8dc..ff159b1162b9 100644
> > > > > > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > > @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> > > > > > >                          */
> > > > > > >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > > > >                 }
> > > > > > > +               if (smi_info->waiting_msg != NULL) {
> > > > > > > +                       /* Also handle if there was a message waiting. */
> > > > > > > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > > > > > > +                       smi_info->waiting_msg = NULL;
> > > > > > > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > > > > +               }
> > > > > > >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> > > > > > >                 goto out;
> > > > > > >         }
> > > > >
> > > > > I apply ^ patch to both 6.18.10 and 6.19 and reproduced the issue on
> > > > > both, so it does not fix the problem.
> > > >
> > > > Thanks!
> > > >
> > > > With all due respect to everyone involved (including the AI), this
> > > > means that we are not anywhere close to fixing the problem and it
> > > > would be a shame to ship 7.0 with it.
> > > >
> > > > I'm sending a revert patch shortly.
> > >
> > > Unfortunately, that patch fixed an issue others were having.
> >
> > Granted, it broke something else, so it needs to be fixed or reverted.
>
> Yes, certainly.
>
> >
> > Maybe there is a way to address the original problem fixed by it differently?
>
> I'm not sure.  This is not the first attempt...

I see.

> >
> > Do you have any pointers to any problem reports regarding that one?
>
> The original problem came as a patch set:
>
> https://lore.kernel.org/lkml/20221007092617.87597-1-zhangyuchen.lcr@bytedance.com/
>
> That had a lockup problem, and it had some other issues.  So I reworked
> the code to the current form.

OK, thanks!

> I'm working on qemu now.  This needs to be added as part of the test suite, anyway.

There is something in the current code that seems to be problematic.

When acpi_ipmi_space_handler() runs, it calls ipmi_request_settime()
to queue up a message.  AFAICS, if all goes well, this ends up calling
smi_send() via i_ipmi_request().

If intf->curr_msg is NULL, the new message will not be added to a list
in there, but intf->curr_msg will be set to point to it instead and
handlers->sender() will be called on it.  But handlers->sender points
to sender() defined in ipmi_si_intf.c which returns IPMI_BUS_ERR
without doing anything if smi_info->si_state == SI_HOSED and its
return value is ignored.

The message is only pointed to by intf->curr_msg at that point and
AFAICS it will never get actually processed because intf->curr_msg is
never really dereferenced (it is only compared with other pointers and
checked against NULL if I'm not mistaken).

It looks like smi_send() needs to check the handlers->sender() return
value and maybe return it to the caller so i_ipmi_request() can return
an error if it fails.
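
The concern above can be illustrated with a small stand-alone model.  This
is only a sketch: the model_* names, the struct layouts and the
MODEL_BUS_ERR value are hypothetical stand-ins, not the kernel's
smi_send(), sender() or IPMI_BUS_ERR definitions.  It contrasts a send path
that drops the sender's return value with one that checks it and fails the
message right away.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver state; not the kernel definitions. */
enum model_state { MODEL_NORMAL, MODEL_HOSED };

#define MODEL_BUS_ERR 0xff	/* illustrative error code only */

struct model_msg {
	int completed;	/* set once somebody delivers a result to the waiter */
	int result;	/* completion code handed back to the waiter */
};

struct model_intf {
	enum model_state state;
	struct model_msg *curr_msg;
};

/* Models a low-level sender that refuses messages while "hosed". */
int model_sender(struct model_intf *intf, struct model_msg *msg)
{
	(void)msg;
	if (intf->state == MODEL_HOSED)
		return MODEL_BUS_ERR;	/* message is not queued anywhere */
	return 0;
}

/* Pattern described in the thread: the return value is dropped. */
void model_send_ignoring(struct model_intf *intf, struct model_msg *msg)
{
	intf->curr_msg = msg;
	(void)model_sender(intf, msg);	/* error lost, msg never completed */
}

/* Suggested direction: check the return value and report the failure. */
int model_send_checked(struct model_intf *intf, struct model_msg *msg)
{
	int rv;

	intf->curr_msg = msg;
	rv = model_sender(intf, msg);
	if (rv) {
		intf->curr_msg = NULL;	/* do not leave a stale pointer */
		msg->completed = 1;	/* waiter is released with an error */
		msg->result = rv;
	}
	return rv;
}
```

In the first variant the message stays parked in curr_msg with nobody left
to complete it, which is the kind of situation that would leave a waiter
blocked.  In the second, the caller gets the error back and can report it.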

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-12 21:33                                                               ` Rafael J. Wysocki
@ 2026-02-12 22:06                                                                 ` Corey Minyard
  2026-02-13  6:55                                                                 ` Corey Minyard
  1 sibling, 0 replies; 60+ messages in thread
From: Corey Minyard @ 2026-02-12 22:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, Guenter Roeck, Igor Raits, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Feb 12, 2026 at 10:33:15PM +0100, Rafael J. Wysocki wrote:
> There is something in the current code that seems to be problematic.
> 
> When acpi_ipmi_space_handler() runs, it calls ipmi_request_settime()
> to queue up a message.  AFAICS, if all goes well, this ends up calling
> smi_send() via i_ipmi_request().
> 
> If intf->curr_msg is NULL, the new message will not be added to a list
> in there, but intf->curr_msg will be set to point to it instead and
> handlers->sender() will be called on it.  But handlers->sender points
> to sender() defined in ipmi_si_intf.c which returns IPMI_BUS_ERR
> without doing anything if smi_info->si_state == SI_HOSED and its
> return value is ignored.
> 
> The message is only pointed to by intf->curr_msg at that point and
> AFAICS it will never get actually processed because intf->curr_msg is
> never really dereferenced (it is only compared with other pointers and
> checked against NULL if I'm not mistaken).
> 
> It looks like smi_send() needs to check the handlers->sender() return
> value and maybe return it to the caller so i_ipmi_request() can return
> an error if it fails.

Yes, I think you might be right.  I've just gotten qemu to a point where
I can test this.

Until that code was added, handlers->sender() never returned an error.
Hopefully I can figure this out soon.

-corey

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-12 21:33                                                               ` Rafael J. Wysocki
  2026-02-12 22:06                                                                 ` Corey Minyard
@ 2026-02-13  6:55                                                                 ` Corey Minyard
  2026-02-13 12:47                                                                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 60+ messages in thread
From: Corey Minyard @ 2026-02-13  6:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, Guenter Roeck, Igor Raits, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Thu, Feb 12, 2026 at 10:33:15PM +0100, Rafael J. Wysocki wrote:
> There is something in the current code that seems to be problematic.
> 
> When acpi_ipmi_space_handler() runs, it calls ipmi_request_settime()
> to queue up a message.  AFAICS, if all goes well, this ends up calling
> smi_send() via i_ipmi_request().
> 
> If intf->curr_msg is NULL, the new message will not be added to a list
> in there, but intf->curr_msg will be set to point to it instead and
> handlers->sender() will be called on it.  But handlers->sender points
> to sender() defined in ipmi_si_intf.c which returns IPMI_BUS_ERR
> without doing anything if smi_info->si_state == SI_HOSED and its
> return value is ignored.
> 
> The message is only pointed to by intf->curr_msg at that point and
> AFAICS it will never get actually processed because intf->curr_msg is
> never really dereferenced (it is only compared with other pointers and
> checked against NULL if I'm not mistaken).
> 
> It looks like smi_send() needs to check the handlers->sender() return
> value and maybe return it to the caller so i_ipmi_request() can return
> an error if it fails.

This was indeed an issue, and had to be handled correctly in a lot of
places.  And there were other issues.  I tested a lot of different
scenarios.

The patches are currently in my next tree:
https://github.com/cminyard/linux-ipmi.git for-next
I'll send them out by email, too.

I was able to reproduce the issue with qemu.  Assuming this works, the next
step is to automate the tests.

-corey

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-13  6:55                                                                 ` Corey Minyard
@ 2026-02-13 12:47                                                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-13 12:47 UTC (permalink / raw)
  To: corey
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, Guenter Roeck, Igor Raits,
	linux-acpi, linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica,
	Huisong Li

On Fri, Feb 13, 2026 at 7:55 AM Corey Minyard <corey@minyard.net> wrote:
>
> On Thu, Feb 12, 2026 at 10:33:15PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Feb 12, 2026 at 7:35 PM Corey Minyard <corey@minyard.net> wrote:
> > >
> > > On Thu, Feb 12, 2026 at 06:22:08PM +0100, Rafael J. Wysocki wrote:
> > > > On Thu, Feb 12, 2026 at 5:48 PM Corey Minyard <corey@minyard.net> wrote:
> > > > >
> > > > > On Thu, Feb 12, 2026 at 01:27:41PM +0100, Rafael J. Wysocki wrote:
> > > > > > On Thu, Feb 12, 2026 at 10:11 AM Jaroslav Pulchart
> > > > > > <jaroslav.pulchart@gooddata.com> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Feb 6, 2026 at 4:58 PM Corey Minyard <corey@minyard.net> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > > > On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Feb 05, 2026 at 08:04:12PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > > > > > Cc: Corey
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Feb 5, 2026 at 6:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Feb 05, 2026 at 08:25:57AM +0100, Igor Raits wrote:
> > > > > > > > > > > > > > On Wed, Feb 4, 2026 at 11:49 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On 2/4/26 11:54, Igor Raits wrote:
> > > > > > > > > > > > > > > > I have written a patch with the help of AI and it fixes the problem. Attached.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > "No MIME, no links, no compression, no attachments.  Just plain text"
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sorry for that, I had assumed that attaching the file would make it in-line.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ... which means I can not provide inline feedback, which is the whole
> > > > > > > > > > > > > > > point of the above.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Your patch crosses subsystems, so it will need to be split in two
> > > > > > > > > > > > > > > (assuming the ACPI side is even needed). Also, references to iDRAC
> > > > > > > > > > > > > > > in common code seem inappropriate.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yes, this I believe was the essential part (it was the last piece in
> > > > > > > > > > > > > > my testing which fixed the hanging):
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Then I'll need to ask differently: What happens if you drop the IPMI code,
> > > > > > > > > > > > > and just keep the wait_for_completion -> wait_for_completion_timeout
> > > > > > > > > > > > > change ? Would that be sufficient to solve the problem ?
> > > > > > > > > > > >
> > > > > > > > > > > > I'd rather say "Would that be sufficient to make the symptoms go
> > > > > > > > > > > > away?" as it most likely papers over the real problem.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Good point. Worse, it may result in UAF or memory leaks.
> > > > > > > > > > >
> > > > > > > > > > > > > Either case, the need for this change suggests that the ipmi change
> > > > > > > > > > > > > may not be complete, since it should send a completion with an error.
> > > > > > > > > > > >
> > > > > > > > > > > > I think that reverting commit bc3a9d217755 ("ipmi:si: Gracefully
> > > > > > > > > > > > handle if the BMC is non-functional") should also be considered as a
> > > > > > > > > > > > possible way forward because it clearly did not improve things as
> > > > > > > > > > > > expected, at least in this particular case.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I tend to agree. I ran a number of AI code reviews over the patch, and
> > > > > > > > > > > each time it finds new (and different) problems. The fact that the acpi
> > > > > > > > > > > patch is still needed even after applying the ipmi changes suggests that
> > > > > > > > > > > something is still missing in the ipmi code.
> > > > > > > > > > >
> > > > > > > > > > > > It evidently did something that confuses things quite a bit.  Either
> > > > > > > > > > > > it is returning IPMI_BUS_ERR instead of IPMI_ERR_UNSPECIFIED, or it is
> > > > > > > > > > > > the "hosed" state and refusing to accept messages.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > More than that. My latest AI results are below, just for reference
> > > > > > > > > > > (using Gemini 3 with Chris Mason's debug prompts). The prompt I used
> > > > > > > > > > > for this run is:
> > > > > > > > > >
> > > > > > > > > > Well, I guess it's time to send a revert patch then.
> > > > > > > > >
> > > > > > > > > Thanks for the CC.
> > > > > > > > >
> > > > > > > > > Let's fix it right in the IPMI driver.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > "
> > > > > > > > > > > The top commit in the linux/ directory results in hung tasks if the BMC
> > > > > > > > > > > stops responding. Using @review-prompts/kernel/debugging.md analyze the
> > > > > > > > > > > patch, identify the reason for the hung task problem, suggest and implement
> > > > > > > > > > > a fix. Note that there may be more than one problem in the patch, so analyze
> > > > > > > > > > > > the complete patch and do not stop after finding the first regression.
> > > > > > > > > > > "
> > > > > > > > > > >
> > > > > > > > > > > I think that catches most of the problem, but not all of it.
> > > > > > > > > > >
> > > > > > > > > > > Guenter
> > > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > >
> > > > > > > > > > > Summary of crash or warning:
> > > > > > > > > > > Hung task detected in ipmi_si driver when BMC becomes non-functional.
> > > > > > > > > > > Processes waiting for IPMI responses (e.g. ipmitool, monitoring agents) enter D state and never recover.
> > > > > > > > > > >
> > > > > > > > > > > Kernel version if available:
> > > > > > > > > > > Top of tree (commit bc3a9d217755f65c137f145600f23bf1d6c31ea9)
> > > > > > > > > > >
> > > > > > > > > > > Machine type if available:
> > > > > > > > > > > Generic Server with BMC
> > > > > > > > > > >
> > > > > > > > > > > Cleaned up copy of oops or stack trace:
> > > > > > > > > > > [  120.123456] INFO: task ipmitool:1234 blocked for more than 120 seconds.
> > > > > > > > > > > [  120.123457]       Not tainted 6.14.0-rc1 #1
> > > > > > > > > > > [  120.123458] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > > > > > > > [  120.123459] task:ipmitool        state:D stack:    0 pid: 1234 ppid:  100 flags:0x00000000
> > > > > > > > > > > [  120.123460] Call Trace:
> > > > > > > > > > > [  120.123461]  <TASK>
> > > > > > > > > > > [  120.123462]  __schedule+0x123/0x456
> > > > > > > > > > > [  120.123463]  schedule+0x45/0x78
> > > > > > > > > > > [  120.123464]  schedule_timeout+0x9a/0xbc
> > > > > > > > > > > [  120.123465]  wait_for_completion+0xde/0xf0
> > > > > > > > > > > [  120.123466]  ipmi_request_settime+0x123/0x145
> > > > > > > > > > > [  120.123467]  ...
> > > > > > > > > > > [  120.123468]  </TASK>
> > > > > > > > > > >
> > > > > > > > > > > Any other kernel messages you found relevant:
> > > > > > > > > > > N/A
> > > > > > > > > > >
> > > > > > > > > > > Explanation of the problem:
> > > > > > > > > > > 1. Hung Task:
> > > > > > > > > > > The patch "ipmi:si: Gracefully handle if the BMC is non-functional" introduces a new state `SI_HOSED` to handle BMC failures. When the driver detects that the BMC is not responding, it transitions to `SI_HOSED` and fails the currently processing message (`curr_msg`). However, if a new message is queued via `sender()` (populating `waiting_msg`) during a recovery probe (state `SI_GETTING_FLAGS`), and that probe subsequently fails, the state machine transitions back to `SI_HOSED`. In this transition, the driver checks and fails `curr_msg`, but it neglects to check or fail `waiting_msg`. As a result, the `waiting_msg` remains in the queue indefinitely, causing the waiting process to hang.
> > > > > > > > > >
> > > > > > > > > > That's quite convincing and it would explain the observed symptoms.
> > > > > > > > >
> > > > > > > > > Yes, and it's a fairly easy fix, I think.  The waiting message just
> > > > > > > > > needs to be returned in that case.  The following patch should do it:
> > > > > > > >
> > > > > > > > Jaroslav, it would be good to test the patch below on top of 6.19.  I
> > > > > > > > can put it on a test git branch if need be, so please let me know.
> > > > > > > >
> > > > > > > > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > > > > index 5459ffdde8dc..ff159b1162b9 100644
> > > > > > > > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > > > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > > > > > > > @@ -809,6 +809,12 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> > > > > > > > >                          */
> > > > > > > > >                         return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > > > > > >                 }
> > > > > > > > > +               if (smi_info->waiting_msg != NULL) {
> > > > > > > > > +                       /* Also handle if there was a message waiting. */
> > > > > > > > > +                       smi_info->curr_msg = smi_info->waiting_msg;
> > > > > > > > > +                       smi_info->waiting_msg = NULL;
> > > > > > > > > +                       return_hosed_msg(smi_info, IPMI_BUS_ERR);
> > > > > > > > > +               }
> > > > > > > > >                 smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_HOSED);
> > > > > > > > >                 goto out;
> > > > > > > > >         }
> > > > > > >
> > > > > > > I applied the patch above to both 6.18.10 and 6.19 and reproduced the issue on
> > > > > > > both, so it does not fix the problem.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > With all due respect to everyone involved (including the AI), this
> > > > > > means that we are not anywhere close to fixing the problem and it
> > > > > > would be a shame to ship 7.0 with it.
> > > > > >
> > > > > > I'm sending a revert patch shortly.
> > > > >
> > > > > Unfortunately, that patch fixed an issue others were having.
> > > >
> > > > Granted, it broke something else, so it needs to be fixed or reverted.
> > >
> > > Yes, certainly.
> > >
> > > >
> > > > Maybe there is a way to address the original problem fixed by it differently?
> > >
> > > I'm not sure.  This is not the first attempt...
> >
> > I see.
> >
> > > >
> > > > Do you have any pointers to any problem reports regarding that one?
> > >
> > > The original problem came as a patch set:
> > >
> > > https://lore.kernel.org/lkml/20221007092617.87597-1-zhangyuchen.lcr@bytedance.com/
> > >
> > > That had a lockup problem, and it had some other issues.  So I reworked
> > > the code to the current form.
> >
> > OK, thanks!
> >
> > > I'm working on qemu now.  This needs to be added as part of the test suite, anyway.
> >
> > There is something in the current code that seems to be problematic.
> >
> > When acpi_ipmi_space_handler() runs, it calls ipmi_request_settime()
> > to queue up a message.  AFAICS, if all goes well, this ends up calling
> > smi_send() via i_ipmi_request().
> >
> > If intf->curr_msg is NULL, the new message will not be added to a list
> > in there, but intf->curr_msg will be set to point to it instead and
> > handlers->sender() will be called on it.  But handlers->sender points
> > to sender() defined in ipmi_si_intf.c which returns IPMI_BUS_ERR
> > without doing anything if smi_info->si_state == SI_HOSED and its
> > return value is ignored.
> >
> > The message is only pointed to by intf->curr_msg at that point and
> > AFAICS it will never get actually processed because intf->curr_msg is
> > never really dereferenced (it is only compared with other pointers and
> > checked against NULL if I'm not mistaken).
> >
> > It looks like smi_send() needs to check the handlers->sender() return
> > value and maybe return it to the caller so i_ipmi_request() can return
> > an error if it fails.
>
> This was indeed an issue, and had to be handled correctly in a lot of
> places.  And there were other issues.  I tested a lot of different
> scenarios.
>
> Patches are currently in my next tree
> https://github.com/cminyard/linux-ipmi.git for-next
> and I'll send emails with them, too.
>
> I was able to reproduce the issue with qemu.  Assuming this works, next
> step is to automate the tests.

Many thanks for the fixes!
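
The point quoted above, that smi_send() ignores the handlers->sender()
return value, can be illustrated with a small user-space model. This is a
minimal sketch and not the IPMI code: every toy_* name and the TOY_BUS_ERR
value are hypothetical, and the only behaviour assumed is what the quoted
text describes (the lower layer refuses the message while hosed and the
caller drops that result).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not the IPMI code: names and values are hypothetical. */
enum toy_state { TOY_NORMAL, TOY_HOSED };
#define TOY_BUS_ERR (-1)

struct toy_msg { int id; };

struct toy_intf {
	enum toy_state state;
	struct toy_msg *curr_msg;
};

/* Lower layer: refuses messages while hosed, without touching them. */
static int toy_sender(struct toy_intf *intf, struct toy_msg *msg)
{
	(void)msg;
	return intf->state == TOY_HOSED ? TOY_BUS_ERR : 0;
}

/* Return value dropped: the message stays parked in curr_msg. */
int toy_send_ignoring(struct toy_intf *intf, struct toy_msg *msg)
{
	intf->curr_msg = msg;
	(void)toy_sender(intf, msg);
	return 0;
}

/* Return value checked: undo the bookkeeping and report the error. */
int toy_send_checking(struct toy_intf *intf, struct toy_msg *msg)
{
	int rv;

	assert(intf->curr_msg == NULL);
	intf->curr_msg = msg;
	rv = toy_sender(intf, msg);
	if (rv)
		intf->curr_msg = NULL;
	return rv;
}
```

With the interface in the hosed state, toy_send_ignoring() reports success
while the message is left in curr_msg with nobody to complete it, whereas
toy_send_checking() hands the error back so the caller can fail the request
instead of waiting on it.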

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-06 12:08                                               ` Rafael J. Wysocki
  2026-02-06 15:58                                                 ` Corey Minyard
@ 2026-02-06 16:08                                                 ` Corey Minyard
  2026-02-06 16:31                                                   ` Guenter Roeck
  1 sibling, 1 reply; 60+ messages in thread
From: Corey Minyard @ 2026-02-06 16:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Guenter Roeck, Igor Raits, Jaroslav Pulchart, linux-acpi,
	linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >

snip..

> 
> > 2. Excessive Polling (Timer Overwrite):
> > In `smi_timeout()`, the timer is unconditionally reset to a short timeout (e.g., 10ms) at the end of the function, unless the state machine is IDLE. When the state machine returns `SI_SM_HOSED`, `smi_event_handler()` correctly sets the timer to a long backoff (1 second). However, `smi_timeout()` subsequently overwrites this with the short timeout. This causes the driver to poll the hosed BMC every 10ms instead of backing off for 1 second, resulting in unnecessary CPU overhead.
> 
> Well, that's not nice, but it doesn't seem to be related to the
> observed symptoms.

I have a fix for this, too.  Thanks for the report.
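
For readers following along, the timer overwrite described in the quoted
item 2 can be modeled in a few lines of user-space C. This is only a sketch
under assumptions and is not the fix mentioned above: the toy_* names are
hypothetical, and the 10 ms and 1 second values are taken from the quoted
description rather than from the driver source.

```c
#include <assert.h>

/* Hypothetical names; the values come from the description above. */
enum toy_sm_result { TOY_SM_IDLE, TOY_SM_BUSY, TOY_SM_HOSED };

#define TOY_SHORT_TIMEOUT_MS 10
#define TOY_HOSED_TIMEOUT_MS 1000

struct toy_timer {
	long expires_in_ms;   /* 0 means "not armed" */
};

/* Event handler: arms the long backoff when the state machine is hosed. */
enum toy_sm_result toy_event_handler(struct toy_timer *t, enum toy_sm_result r)
{
	if (r == TOY_SM_HOSED)
		t->expires_in_ms = TOY_HOSED_TIMEOUT_MS;
	return r;
}

/* Behaviour as described: re-arm with the short timeout unless idle. */
void toy_timeout_overwriting(struct toy_timer *t, enum toy_sm_result r)
{
	r = toy_event_handler(t, r);
	if (r != TOY_SM_IDLE)
		t->expires_in_ms = TOY_SHORT_TIMEOUT_MS;
}

/* One possible fix: leave the timer alone if the handler armed the backoff. */
void toy_timeout_preserving(struct toy_timer *t, enum toy_sm_result r)
{
	r = toy_event_handler(t, r);
	if (r != TOY_SM_IDLE && r != TOY_SM_HOSED)
		t->expires_in_ms = TOY_SHORT_TIMEOUT_MS;
	assert(r != TOY_SM_HOSED || t->expires_in_ms == TOY_HOSED_TIMEOUT_MS);
}
```

In this model the overwriting variant ends up with a 10 ms expiry after a
hosed result, while the preserving variant keeps the 1 second backoff and
still uses the short timeout for ordinary busy polling.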

snip..

-corey

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-06 16:08                                                 ` Corey Minyard
@ 2026-02-06 16:31                                                   ` Guenter Roeck
  2026-02-06 19:35                                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-02-06 16:31 UTC (permalink / raw)
  To: corey, Rafael J. Wysocki
  Cc: Igor Raits, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On 2/6/26 08:08, Corey Minyard wrote:
> On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
>> On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>
> 
> snip..
> 
>>
>>> 2. Excessive Polling (Timer Overwrite):
>>> In `smi_timeout()`, the timer is unconditionally reset to a short timeout (e.g., 10ms) at the end of the function, unless the state machine is IDLE. When the state machine returns `SI_SM_HOSED`, `smi_event_handler()` correctly sets the timer to a long backoff (1 second). However, `smi_timeout()` subsequently overwrites this with the short timeout. This causes the driver to poll the hosed BMC every 10ms instead of backing off for 1 second, resulting in unnecessary CPU overhead.
>>
>> Well, that's not nice, but it doesn't seem to be related to the
>> observed symptoms.
> 
> I have a fix for this, too.  Thanks for the report.
> 

Thanks is due to Gemini 3, really (I would not have seen that problem),
and to the AI model used by Igor.

Guenter


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [BISECTED - impi related]: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-02-06 16:31                                                   ` Guenter Roeck
@ 2026-02-06 19:35                                                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-06 19:35 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: corey, Rafael J. Wysocki, Igor Raits, Jaroslav Pulchart,
	linux-acpi, linux-hwmon, Daniel Secik, Zdenek Pesek, Jiri Jurica,
	Huisong Li

On Fri, Feb 6, 2026 at 5:31 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 2/6/26 08:08, Corey Minyard wrote:
> > On Fri, Feb 06, 2026 at 01:08:56PM +0100, Rafael J. Wysocki wrote:
> >> On Thu, Feb 5, 2026 at 11:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >>>
> >
> > snip..
> >
> >>
> >>> 2. Excessive Polling (Timer Overwrite):
> >>> In `smi_timeout()`, the timer is unconditionally reset to a short timeout (e.g., 10ms) at the end of the function, unless the state machine is IDLE. When the state machine returns `SI_SM_HOSED`, `smi_event_handler()` correctly sets the timer to a long backoff (1 second). However, `smi_timeout()` subsequently overwrites this with the short timeout. This causes the driver to poll the hosed BMC every 10ms instead of backing off for 1 second, resulting in unnecessary CPU overhead.
> >>
> >> Well, that's not nice, but it doesn't seem to be related to the
> >> observed symptoms.
> >
> > I have a fix for this, too.  Thanks for the report.
> >
>
> Thanks is due to Gemini 3, really (I would not have seen that problem),
> and to the AI model used by Igor.

Well, Jaroslav reported the misbehavior, tested patches and then
bisected it down to the problematic commit.

Without that we wouldn't have known what to feed to the AI even.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-22 20:42     ` Rafael J. Wysocki
  2026-01-22 22:28       ` Guenter Roeck
@ 2026-01-23  2:53       ` lihuisong (C)
  2026-01-27 16:13         ` Guenter Roeck
  1 sibling, 1 reply; 60+ messages in thread
From: lihuisong (C) @ 2026-01-23  2:53 UTC (permalink / raw)
  To: Rafael J. Wysocki, Guenter Roeck, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica, lihuisong


On 1/23/2026 4:42 AM, Rafael J. Wysocki wrote:
> On Thu, Jan 22, 2026 at 8:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
>> On 1/22/26 10:55, Rafael J. Wysocki wrote:
>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>>> <jaroslav.pulchart@gooddata.com> wrote:
>>>> Hello,
>>>>
>>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
>>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
>>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
>>>> sleep (D state).
>>> The most recent change in the acpi_power_meter driver was made in
>>> 6.15, so this is not a regression in that driver.
>>>
>>> Also, nothing suspicious is done in power1_average_min_show() and
>>> power1_average_min_store() AFAICS.
>>>
>> I suspect a circular locking problem between the hwmon lock and the
>> resource lock. Unfortunately I don't immediately see it.
>>
>> Would it be possible to test this with a kernel which has lock debugging
>> enabled ? I'd test it myself but I don't have a system available
>> that supports the acpi power meter.
> One problem I found in acpi_power_meter through code inspection is
> calling hwmon_device_unregister() under resource->lock in
> acpi_power_meter_notify(), which may be responsible for the observed
> symptoms, but then I'm not sure why it started to be visible after
> 6.18.
hwmon_attr_show() first takes hwdev->lock and then takes resource->lock
in the acpi_power_meter driver.
Although the driver calls hwmon_device_unregister() under resource->lock in
acpi_power_meter_notify(),
hwmon_device_unregister() doesn't take hwdev->lock.
So this may not be the root cause of this problem.

On the 6.19 kernel, the ACPI power meter works well on my platform.
I don't know how the power meter differs between your platform and mine.

@Jaroslav Pulchart, does your platform support the notify feature?
Does this issue occur only intermittently on your platform?
I suggest trying to drop resource->lock before calling
hwmon_device_unregister() in the notify handler,
and then checking whether the issue is still there.

Best,
/Huisong

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-23  2:53       ` lihuisong (C)
@ 2026-01-27 16:13         ` Guenter Roeck
  0 siblings, 0 replies; 60+ messages in thread
From: Guenter Roeck @ 2026-01-27 16:13 UTC (permalink / raw)
  To: lihuisong (C)
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica

On Fri, Jan 23, 2026 at 10:53:02AM +0800, lihuisong (C) wrote:
> 
> On 1/23/2026 4:42 AM, Rafael J. Wysocki wrote:
> > On Thu, Jan 22, 2026 at 8:51 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > On 1/22/26 10:55, Rafael J. Wysocki wrote:
> > > > On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> > > > <jaroslav.pulchart@gooddata.com> wrote:
> > > > > Hello,
> > > > > 
> > > > > after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> > > > > in the ACPI power meter hwmon interface. Reading power*_average sysfs
> > > > > attributes blocks indefinitely and causes tasks to enter uninterruptible
> > > > > sleep (D state).
> > > > The most recent change in the acpi_power_meter driver was made in
> > > > 6.15, so this is not a regression in that driver.
> > > > 
> > > > Also, nothing suspicious is done in power1_average_min_show() and
> > > > power1_average_min_store() AFAICS.
> > > > 
> > > I suspect a circular locking problem between the hwmon lock and the
> > > resource lock. Unfortunately I don't immediately see it.
> > > 
> > > Would it be possible to test this with a kernel which has lock debugging
> > > enabled ? I'd test it myself but I don't have a system available
> > > that supports the acpi power meter.
> > One problem I found in acpi_power_meter through code inspection is
> > calling hwmon_device_unregister() under resource->lock in
> > acpi_power_meter_notify(), which may be responsible for the observed
> > symptoms, but then I'm not sure why it started to be visible after
> > 6.18.
> hwmon_attr_show() first takes hwdev->lock and then takes resource->lock in
> the acpi_power_meter driver.
> Although the driver calls hwmon_device_unregister() under resource->lock in
> acpi_power_meter_notify(),
> hwmon_device_unregister() doesn't take hwdev->lock.
> So this may not be the root cause of this problem.

It might be if userspace has one or more attribute files open. That might
be the case since other code in the driver calls sysfs_notify(), suggesting
that files are expected to remain open. Of course, that would mean that
userspace is all messed up, because an open file descriptor on a hwmon
device that has since been unregistered doesn't really help much. Worse,
if a read is executed after unregistering the hwmon driver, it would use a
_different_ instance of the hwmon lock but the same resource lock, which
would really mess things up.

We'd need some active debugging to figure out if that is the problem.
Either case, re-registering the hwmon driver from acpi_power_meter_notify()
is conceptually wrong, so I am a bit surprised that this ever worked in the
first place.

Guenter

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-22 18:55 ` Rafael J. Wysocki
  2026-01-22 19:51   ` Guenter Roeck
@ 2026-01-28 18:18   ` Guenter Roeck
  2026-01-28 18:45     ` Rafael J. Wysocki
  2026-01-29  1:55     ` acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y lihuisong (C)
  1 sibling, 2 replies; 60+ messages in thread
From: Guenter Roeck @ 2026-01-28 18:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

Hi all,

On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> <jaroslav.pulchart@gooddata.com> wrote:
> >
> > Hello,
> >
> > after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> > in the ACPI power meter hwmon interface. Reading power*_average sysfs
> > attributes blocks indefinitely and causes tasks to enter uninterruptible
> > sleep (D state).
> 
> The most recent change in the acpi_power_meter driver was made in
> 6.15, so this is not a regression in that driver.
> 
> Also, nothing suspicious is done in power1_average_min_show() and
> power1_average_min_store() AFAICS.
> 

I decided to ask AI (Gemini 3, more specifically) for an analysis of the
problem, using Chris Mason's prompts for guidance. Result is below.

The suggested fix may be a feasible workaround. Let me know what you think.

Thanks,
Guenter

---
Summary of crash or warning
Deadlock in acpi_power_meter driver between sysfs read and ACPI notify.

Kernel version if available
6.18.6-1.gdc.el9.x86_64

Cleaned up copy of oops or stack trace
[  740.721209] Call Trace:
[  741.467334]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
[  741.475320]  hwmon_attr_show+0x5e/0x130
[  741.481343]  dev_attr_show+0x19/0x60
[  741.487325]  sysfs_kf_seq_show+0xbf/0x140
[  741.494315]  seq_read_iter+0x112/0x510
[  741.503544]  vfs_read+0x215/0x340

[  740.286380] INFO: task alloy:5901 blocked for more than 122 seconds.
[  740.326290] Call Trace:
[  740.336311]  __schedule+0x2b5/0x690
[  740.341312]  schedule+0x23/0x80
[  740.354335]  __mutex_lock.constprop.0+0x3c9/0xa00
[  740.360318]  hwmon_attr_show+0x36/0x130

Explanation of the problem
The acpi_power_meter driver exhibits a deadlock when acpi_power_meter_notify
is called with METER_NOTIFY_CONFIG while a userspace process is concurrently
reading a sysfs attribute (e.g., power1_average).

The deadlock occurs because acpi_power_meter_notify acquires resource->lock
and then calls hwmon_device_unregister. The hwmon_device_unregister function
(via device_unregister and kernfs_remove) waits for all active sysfs callbacks
to complete. However, the active sysfs callback (power_meter_read) tries to
acquire resource->lock which is already held by acpi_power_meter_notify.

Sequence of events:
1. Userspace reads power1_average.
2. hwmon_attr_show is called. It calls power_meter_read.
3. power_meter_read attempts to acquire resource->lock.
4. Concurrently, ACPI triggers acpi_power_meter_notify with METER_NOTIFY_CONFIG.
5. acpi_power_meter_notify acquires resource->lock.
6. acpi_power_meter_notify calls hwmon_device_unregister.
7. hwmon_device_unregister waits for the active hwmon_attr_show (and thus
   power_meter_read) to return.
8. power_meter_read is blocked waiting for resource->lock.
9. acpi_power_meter_notify is blocked waiting for power_meter_read to finish.

This results in a deadlock and hung tasks. Other tasks trying to read sysfs
attributes also hang waiting for the hwmon core lock (hwdev->lock) which is
held by the blocked reader.
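
The sequence of events above can be replayed in a small user-space model.
This is a minimal single-threaded sketch, not kernel code: toy_meter, held
and active_readers are hypothetical stand-ins for resource->lock and for the
in-flight sysfs callbacks that the unregister path is described as waiting
for. It only replays the interleaving from steps 1-9 and checks whether
either side can make progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: all names here are hypothetical. */
struct toy_meter {
	bool held;           /* stands in for resource->lock being held */
	int active_readers;  /* sysfs callbacks currently in flight */
};

/* The lock can be taken only if nobody holds it. */
static bool toy_trylock(struct toy_meter *m)
{
	if (m->held)
		return false;
	m->held = true;
	return true;
}

/* Unregistering can complete only once no callback is in flight. */
static bool toy_unregister_can_finish(const struct toy_meter *m)
{
	return m->active_readers == 0;
}

/*
 * Replays steps 1-9: the reader is inside its callback, the notifier
 * takes the lock and then tries to unregister. Returns true if neither
 * side can make progress.
 */
bool toy_replay_deadlock(void)
{
	struct toy_meter m = { .held = false, .active_readers = 0 };
	bool notifier_stuck, reader_stuck;

	m.active_readers++;               /* steps 1-2: read callback entered */
	(void)toy_trylock(&m);            /* steps 4-5: notifier takes the lock */
	assert(m.held);
	notifier_stuck = !toy_unregister_can_finish(&m);  /* steps 6-7 and 9 */
	reader_stuck = !toy_trylock(&m);  /* steps 3 and 8: reader needs the lock */

	return notifier_stuck && reader_stuck;
}
```

In this model toy_replay_deadlock() returns true: the notifier cannot finish
unregistering while a reader is in flight, and the reader cannot finish
while the notifier holds the lock.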

Functions, snippets and call traces of code related to the problem

drivers/hwmon/acpi_power_meter.c:

static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
{
        struct acpi_power_meter_resource *resource;
        int res;

        if (!device || !acpi_driver_data(device))
                return;

        resource = acpi_driver_data(device);

        switch (event) {
        case METER_NOTIFY_CONFIG:
                mutex_lock(&resource->lock);
                free_capabilities(resource);
                remove_domain_devices(resource);
                hwmon_device_unregister(resource->hwmon_dev); /* DEADLOCK HERE */
                res = read_capabilities(resource);
                /* ... */
                mutex_unlock(&resource->lock);
                break;
        /* ... */
}

static int power_meter_read(struct device *dev, enum hwmon_sensor_types type,
                            u32 attr, int channel, long *val)
{
        struct acpi_power_meter_resource *res = dev_get_drvdata(dev);
        int ret = 0;

        if (type != hwmon_power)
                return -EINVAL;

        guard(mutex)(&res->lock); /* BLOCKED HERE */

        switch (attr) {
        case hwmon_power_average:
                ret = update_meter(res);
                /* ... */
        }
        return 0;
}

Suggested Fix
Move hwmon_device_unregister out of the critical section protected by
resource->lock. It should be called before acquiring the lock to modify
the resource structure.

drivers/hwmon/acpi_power_meter.c:

        case METER_NOTIFY_CONFIG:
-               mutex_lock(&resource->lock);
-               free_capabilities(resource);
-               remove_domain_devices(resource);
-               hwmon_device_unregister(resource->hwmon_dev);
+               if (resource->hwmon_dev) {
+                       hwmon_device_unregister(resource->hwmon_dev);
+                       resource->hwmon_dev = NULL;
+               }
+
+               mutex_lock(&resource->lock);
+               free_capabilities(resource);
+               remove_domain_devices(resource);
                res = read_capabilities(resource);

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-28 18:18   ` Guenter Roeck
@ 2026-01-28 18:45     ` Rafael J. Wysocki
  2026-01-28 19:52       ` Rafael J. Wysocki
  2026-01-29  1:55     ` acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y lihuisong (C)
  1 sibling, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-01-28 18:45 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Hi all,
>
> On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> > <jaroslav.pulchart@gooddata.com> wrote:
> > >
> > > Hello,
> > >
> > > after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> > > in the ACPI power meter hwmon interface. Reading power*_average sysfs
> > > attributes blocks indefinitely and causes tasks to enter uninterruptible
> > > sleep (D state).
> >
> > The most recent change in the acpi_power_meter driver was made in
> > 6.15, so this is not a regression in that driver.
> >
> > Also, nothing suspicious is done in power1_average_min_show() and
> > power1_average_min_store() AFAICS.
> >
>
> I decided to ask AI (Gemini 3, more specifically) for an analysis of the
> problem, using Chris Mason's prompts for guidance. Result is below.

So it agrees with me in the analysis part.

> The suggested fix may be a feasible workaround. Let me know what you think.

Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
notifications compete with each other because they may try to
unregister the hwmon device at the same time.

I would just add a separate lock for the notifier (a static one should
suffice) and make changes to "resource" only under resource->lock.

Let me cut a prototype patch for this.
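
As an illustration of the locking scheme proposed here (and not the
prototype patch itself), the idea can be sketched as a single-threaded
user-space model. All toy_* names are hypothetical, and re-registering the
device at the end of the handler is an assumption made only for the model.
The points it shows are that notifications are serialized by their own
lock, that unregistering happens without resource_lock held, and that the
resource state is changed only under resource_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: all names here are hypothetical. */
struct toy_resource {
	bool resource_lock;   /* stands in for resource->lock */
	bool notify_lock;     /* the separate notifier lock */
	bool registered;      /* is the hwmon device currently registered? */
	int unregister_calls; /* how often unregister actually ran */
};

/* Unregister with resource_lock NOT held, so readers can still finish. */
static void toy_unregister(struct toy_resource *r)
{
	assert(!r->resource_lock);
	r->unregister_calls++;
}

/*
 * One configuration-change notification. Returns false if another
 * notification is in progress (a real mutex would sleep here instead).
 */
bool toy_notify_config(struct toy_resource *r)
{
	if (r->notify_lock)
		return false;
	r->notify_lock = true;       /* serialize notifications */

	if (r->registered) {
		toy_unregister(r);   /* outside resource_lock */
		r->registered = false;
	}

	r->resource_lock = true;     /* only now touch the resource state */
	/* ... freeing and re-reading capabilities would go here ... */
	r->resource_lock = false;

	r->registered = true;        /* re-register, outside resource_lock */
	r->notify_lock = false;
	return true;
}
```

A second notification arriving while notify_lock is taken never reaches
toy_unregister(), which is the double-unregister case the separate lock is
meant to rule out.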

> ---
> Summary of crash or warning
> Deadlock in acpi_power_meter driver between sysfs read and ACPI notify.
>
> Kernel version if available
> 6.18.6-1.gdc.el9.x86_64
>
> Cleaned up copy of oops or stack trace
> [  740.721209] Call Trace:
> [  741.467334]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
> [  741.475320]  hwmon_attr_show+0x5e/0x130
> [  741.481343]  dev_attr_show+0x19/0x60
> [  741.487325]  sysfs_kf_seq_show+0xbf/0x140
> [  741.494315]  seq_read_iter+0x112/0x510
> [  741.503544]  vfs_read+0x215/0x340
>
> [  740.286380] INFO: task alloy:5901 blocked for more than 122 seconds.
> [  740.326290] Call Trace:
> [  740.336311]  __schedule+0x2b5/0x690
> [  740.341312]  schedule+0x23/0x80
> [  740.354335]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  740.360318]  hwmon_attr_show+0x36/0x130
>
> Explanation of the problem
> The acpi_power_meter driver exhibits a deadlock when acpi_power_meter_notify
> is called with METER_NOTIFY_CONFIG while a userspace process is concurrently
> reading a sysfs attribute (e.g., power1_average).
>
> The deadlock occurs because acpi_power_meter_notify acquires resource->lock
> and then calls hwmon_device_unregister. The hwmon_device_unregister function
> (via device_unregister and kernfs_remove) waits for all active sysfs callbacks
> to complete. However, the active sysfs callback (power_meter_read) tries to
> acquire resource->lock which is already held by acpi_power_meter_notify.
>
> Sequence of events:
> 1. Userspace reads power1_average.
> 2. hwmon_attr_show is called. It calls power_meter_read.
> 3. power_meter_read attempts to acquire resource->lock.
> 4. Concurrently, ACPI triggers acpi_power_meter_notify with METER_NOTIFY_CONFIG.
> 5. acpi_power_meter_notify acquires resource->lock.
> 6. acpi_power_meter_notify calls hwmon_device_unregister.
> 7. hwmon_device_unregister waits for the active hwmon_attr_show (and thus
>    power_meter_read) to return.
> 8. power_meter_read is blocked waiting for resource->lock.
> 9. acpi_power_meter_notify is blocked waiting for power_meter_read to finish.
>
> This results in a deadlock and hung tasks. Other tasks trying to read sysfs
> attributes also hang waiting for the hwmon core lock (hwdev->lock) which is
> held by the blocked reader.
>
> Functions, snippets and call traces of code related to the problem
>
> drivers/hwmon/acpi_power_meter.c:
>
> static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
> {
>         struct acpi_power_meter_resource *resource;
>         int res;
>
>         if (!device || !acpi_driver_data(device))
>                 return;
>
>         resource = acpi_driver_data(device);
>
>         switch (event) {
>         case METER_NOTIFY_CONFIG:
>                 mutex_lock(&resource->lock);
>                 free_capabilities(resource);
>                 remove_domain_devices(resource);
>                 hwmon_device_unregister(resource->hwmon_dev); /* DEADLOCK HERE */
>                 res = read_capabilities(resource);
>                 /* ... */
>                 mutex_unlock(&resource->lock);
>                 break;
>         /* ... */
> }
>
> static int power_meter_read(struct device *dev, enum hwmon_sensor_types type,
>                             u32 attr, int channel, long *val)
> {
>         struct acpi_power_meter_resource *res = dev_get_drvdata(dev);
>         int ret = 0;
>
>         if (type != hwmon_power)
>                 return -EINVAL;
>
>         guard(mutex)(&res->lock); /* BLOCKED HERE */
>
>         switch (attr) {
>         case hwmon_power_average:
>                 ret = update_meter(res);
>                 /* ... */
>         }
>         return 0;
> }
>
> Suggested Fix
> Move hwmon_device_unregister out of the critical section protected by
> resource->lock. It should be called before acquiring the lock to modify
> the resource structure.
>
> drivers/hwmon/acpi_power_meter.c:
>
>         case METER_NOTIFY_CONFIG:
> -               mutex_lock(&resource->lock);
> -               free_capabilities(resource);
> -               remove_domain_devices(resource);
> -               hwmon_device_unregister(resource->hwmon_dev);
> +               if (resource->hwmon_dev) {
> +                       hwmon_device_unregister(resource->hwmon_dev);
> +                       resource->hwmon_dev = NULL;
> +               }
> +
> +               mutex_lock(&resource->lock);
> +               free_capabilities(resource);
> +               remove_domain_devices(resource);
>                 res = read_capabilities(resource);

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-28 18:45     ` Rafael J. Wysocki
@ 2026-01-28 19:52       ` Rafael J. Wysocki
  2026-01-28 21:52         ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-01-28 19:52 UTC (permalink / raw)
  To: Guenter Roeck, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica, Huisong Li

On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki wrote:
> On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Hi all,
> >
> > On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> > > <jaroslav.pulchart@gooddata.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> > > > in the ACPI power meter hwmon interface. Reading power*_average sysfs
> > > > attributes blocks indefinitely and causes tasks to enter uninterruptible
> > > > sleep (D state).
> > >
> > > The most recent change in the acpi_power_meter driver was made in
> > > 6.15, so this is not a regression in that driver.
> > >
> > > Also, nothing suspicious is done in power1_average_min_show() and
> > > power1_average_min_store() AFAICS.
> > >
> >
> > I decided to ask AI (Gemini 3, more specifically) for an analysis of the
> > problem, using Chris Mason's prompts for guidance. Result is below.
> 
> So it agrees with me in the analysis part.
> 
> > The suggested fix may be a feasible workaround. Let me know what you think.
> 
> Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
> notifications compete with each other because they may try to
> unregister the hwmon device at the same time.
> 
> I would just add a separate lock for the notifier (a static one should
> suffice) and make changes to "resource" only under resource->lock.
> 
> Let me cut a prototype patch for this.
> 

Something like the below (untested).

Note that it also fixes the driver removal which is tangentially related to the
problem at hand.

---
 drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- a/drivers/hwmon/acpi_power_meter.c
+++ b/drivers/hwmon/acpi_power_meter.c
@@ -47,6 +47,8 @@
 static int cap_in_hardware;
 static bool force_cap_on;
 
+static DEFINE_MUTEX(acpi_notify_lock);
+
 static int can_cap_in_hardware(void)
 {
 	return force_cap_on || cap_in_hardware;
@@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
 
 	resource = acpi_driver_data(device);
 
+	guard(mutex)(&acpi_notify_lock);
+
 	switch (event) {
 	case METER_NOTIFY_CONFIG:
+		if (!IS_ERR(resource->hwmon_dev))
+			hwmon_device_unregister(resource->hwmon_dev);
+
 		mutex_lock(&resource->lock);
+
 		free_capabilities(resource);
 		remove_domain_devices(resource);
-		hwmon_device_unregister(resource->hwmon_dev);
 		res = read_capabilities(resource);
 		if (res)
 			dev_err_once(&device->dev, "read capabilities failed.\n");
 		res = read_domain_devices(resource);
 		if (res && res != -ENODEV)
 			dev_err_once(&device->dev, "read domain devices failed.\n");
+
+		mutex_unlock(&resource->lock);
+
 		resource->hwmon_dev =
 			hwmon_device_register_with_info(&device->dev,
 							ACPI_POWER_METER_NAME,
@@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
 							power_extra_groups);
 		if (IS_ERR(resource->hwmon_dev))
 			dev_err_once(&device->dev, "register hwmon device failed.\n");
-		mutex_unlock(&resource->lock);
+
 		break;
 	case METER_NOTIFY_TRIP:
 		sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
@@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
 		return;
 
 	resource = acpi_driver_data(device);
-	hwmon_device_unregister(resource->hwmon_dev);
+	if (!IS_ERR(resource->hwmon_dev))
+		hwmon_device_unregister(resource->hwmon_dev);
 
 	remove_domain_devices(resource);
 	free_capabilities(resource);





* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-28 19:52       ` Rafael J. Wysocki
@ 2026-01-28 21:52         ` Guenter Roeck
  2026-01-29 14:18           ` [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify() Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-28 21:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Wed, Jan 28, 2026 at 08:52:49PM +0100, Rafael J. Wysocki wrote:
> On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki wrote:
> > On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > Hi all,
> > >
> > > On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
> > > > On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> > > > <jaroslav.pulchart@gooddata.com> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> > > > > in the ACPI power meter hwmon interface. Reading power*_average sysfs
> > > > > attributes blocks indefinitely and causes tasks to enter uninterruptible
> > > > > sleep (D state).
> > > >
> > > > The most recent change in the acpi_power_meter driver was made in
> > > > 6.15, so this is not a regression in that driver.
> > > >
> > > > Also, nothing suspicious is done in power1_average_min_show() and
> > > > power1_average_min_store() AFAICS.
> > > >
> > >
> > > I decided to ask AI (Gemini 3, more specifically) for an analysis of the
> > > problem, using Chris Mason's prompts for guidance. Result is below.
> > 
> > So it agrees with me in the analysis part.
> > 
Yes.

> > > The suggested fix may be a feasible workaround. Let me know what you think.
> > 
> > Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
> > notifications compete with each other because they may try to
> > unregister the hwmon device at the same time.
> > 
Good point.

> > I would just add a separate lock for the notifier (a static one should
> > suffice) and make changes to "resource" only under resource->lock.
> > 
> > Let me cut a prototype patch for this.
> > 
> 
> Something like the below (untested).
> 
> Note that it also fixes the driver removal which is tangentially related to the
> problem at hand.

I can't test it either, but I ran it through Gemini and it tells me:

  Fixes: tag missing (y) [Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")]

  The commit addresses a deadlock regression introduced by the conversion to
  hwmon_device_register_with_info.

  CHANGE-1: New static mutex `acpi_notify_lock` introduces global serialization.
  This prevents concurrent notification handling for multiple power meter devices.
  While likely a minor impact given the nature of the device, it technically reduces parallelism compared to the per-device locking used previously.

Just for fun I tried again, this time using a model which is more prone to
false positives. It pretty much provided the same result in more detail.
It is a bit more verbose, so I attached it below. I also asked it to verify,
using the backtrace, if the patch is complete, and to suggest a patch
description. The result is also attached below.

I think this is good enough for a formal patch. WDYT ?

Thanks,
Guenter

---
commit bd510da679b28e0f7a3fed718a5bd5e87c2a46dc
Author: Rafael J. Wysocki <rafael@kernel.org>

acpi_power_meter: Fix deadlock in hwmon_attr_show

This commit fixes a deadlock in the ACPI power meter driver by reordering
locking and introducing a separate mutex for notification processing. It
addresses a regression where sysfs attribute reads would block
indefinitely.

> diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c
> index 29ccdc2fb7ff..de408df0c4d7 100644
> --- a/drivers/hwmon/acpi_power_meter.c
> +++ b/drivers/hwmon/acpi_power_meter.c
> @@ -47,6 +47,8 @@
>  static int cap_in_hardware;
>  static bool force_cap_on;
>  
> +static DEFINE_MUTEX(acpi_notify_lock);
> +
>  static int can_cap_in_hardware(void)
>  {
>  	return force_cap_on || cap_in_hardware;

This isn't a bug, but would it be better to use a per-device mutex in 
struct acpi_power_meter_resource instead of a static global one? A global
lock serializes notifications across all power meter devices in the
system, which might cause unnecessary contention if multiple devices are
present and one of them is slow to unregister.

[ ... ]

> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
>  
>  	resource = acpi_driver_data(device);
>  
> +	guard(mutex)(&acpi_notify_lock);
> +
>  	switch (event) {
>  	case METER_NOTIFY_CONFIG:
> +		if (!IS_ERR(resource->hwmon_dev))
> +			hwmon_device_unregister(resource->hwmon_dev);
> +
>  		mutex_lock(&resource->lock);

[ ... ]

> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(struct acpi_device *device)
>  		return;
>  
>  	resource = acpi_driver_data(device);
> -	hwmon_device_unregister(resource->hwmon_dev);
> +	if (!IS_ERR(resource->hwmon_dev))
> +		hwmon_device_unregister(resource->hwmon_dev);
>  
>  	remove_domain_devices(resource);
>  	free_capabilities(resource);

This commit appears to fix a major regression introduced in a previous
version, but it is missing a Fixes: tag. Based on the description, the
following tag seems appropriate:

Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")

The above commit switched the driver to the new hwmon API which introduced
the core locking that led to the deadlock fixed here.

---
Completeness and patch description:

The patch bd510da679b28e0f7a3fed718a5bd5e87c2a46dc completely resolves the reported regression.

  Analysis of the Regression and Fix
  The log at logs/acpi/acpi.log confirms a classic AB-BA deadlock between the hwmon core and the acpi_power_meter driver. This deadlock was introduced when the driver was converted to the new
  hwmon_device_register_with_info API, which uses internal core locking.

  The Deadlock Cycle:
   1. Task A (Sysfs Read): hwmon_attr_show() takes the hwmon core lock (hwdev->lock) and then calls power_meter_read(), which attempts to take the driver lock (resource->lock).
   2. Task B (ACPI Notify): acpi_power_meter_notify() (on a METER_NOTIFY_CONFIG event) takes the driver lock (resource->lock) and then calls hwmon_device_unregister(), which attempts to take
      the hwmon core lock (hwdev->lock).

  The patch breaks this cycle by moving hwmon_device_unregister() and hwmon_device_register_with_info() outside of the driver's resource->lock. To maintain safety against concurrent
  notifications, it introduces a new acpi_notify_lock.
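
  The lock-ordering argument above can be checked with a small user-space
  sketch. This is not kernel code and not the hwmon core's real locking: the
  names CORE_SIDE, DRIVER_LOCK and NOTIFY_LOCK are hypothetical stand-ins for
  "whatever unregistration waits on in the core", resource->lock and
  acpi_notify_lock. It only records which lock is taken while another is held
  and reports a pairwise (AB-BA) inversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins, not real kernel objects. */
enum lock_id { CORE_SIDE, DRIVER_LOCK, NOTIFY_LOCK, NR_LOCKS };

struct order_graph {
	/* edge[a][b] is true if b was acquired while a was held. */
	bool edge[NR_LOCKS][NR_LOCKS];
};

/* Record one code path; held[] lists locks in nested acquisition order. */
static void record_path(struct order_graph *g, const enum lock_id *held, int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		for (int j = i + 1; j < n; j++)
			g->edge[held[i]][held[j]] = true;
}

/* Detect only two-lock cycles (AB-BA); longer cycles are not checked. */
static bool has_inversion(const struct order_graph *g)
{
	for (int a = 0; a < NR_LOCKS; a++)
		for (int b = a + 1; b < NR_LOCKS; b++)
			if (g->edge[a][b] && g->edge[b][a])
				return true;
	return false;
}

/* Before the patch: unregistration happens under the driver lock. */
static bool old_notify_inverts(void)
{
	static const enum lock_id read_path[] = { CORE_SIDE, DRIVER_LOCK };
	static const enum lock_id notify_path[] = { DRIVER_LOCK, CORE_SIDE };
	struct order_graph g;

	memset(&g, 0, sizeof(g));
	record_path(&g, read_path, 2);
	record_path(&g, notify_path, 2);
	return has_inversion(&g);
}

/* After the patch: core calls nest only under the new notify lock. */
static bool patched_notify_inverts(void)
{
	static const enum lock_id read_path[] = { CORE_SIDE, DRIVER_LOCK };
	static const enum lock_id unregister_path[] = { NOTIFY_LOCK, CORE_SIDE };
	static const enum lock_id update_path[] = { NOTIFY_LOCK, DRIVER_LOCK };
	struct order_graph g;

	memset(&g, 0, sizeof(g));
	record_path(&g, read_path, 2);
	record_path(&g, unregister_path, 2);
	record_path(&g, update_path, 2);
	return has_inversion(&g);
}
```

  Under these assumptions old_notify_inverts() reports an inversion and
  patched_notify_inverts() does not, because the notify lock is always the
  outermost one and is never taken by the sysfs read path.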

  Verification against acpi.log
   * The log shows multiple tasks (alloy) blocked in hwmon_attr_show+0x36. This offset corresponds to the point where the hwmon core attempts to acquire its internal mutex.
   * One task (alloy:5921) is specifically shown inside power_meter_read, confirming it holds the core lock and is waiting for the driver lock.
   * By ensuring that the notification handler no longer holds resource->lock when calling into the hwmon core, the "B" side of the deadlock is eliminated.

  Suggested Improved Patch Description
  The current description is a bit sparse on the technical details of the lock inversion. Here is a more detailed version:

    hwmon: (acpi_power_meter) Fix deadlock between hwmon core and driver lock

    The conversion of the acpi_power_meter driver to the new hwmon registration
    API (hwmon_device_register_with_info) in commit 16746ce8adfe introduced a
    deadlock between the hwmon core's internal mutex (hwdev->lock) and the
    driver's internal mutex (resource->lock).

    The deadlock occurs during ACPI notification handling:
    1. hwmon_attr_show() (sysfs read) takes hwdev->lock and then calls
       power_meter_read(), which attempts to take resource->lock.
    2. acpi_power_meter_notify() (ACPI event) takes resource->lock and then
       calls hwmon_device_unregister(), which attempts to take hwdev->lock.

    Break this deadlock by moving the hwmon registration and unregistration
    calls in the notification handler outside of resource->lock. Introduce a
    new global acpi_notify_lock to serialize notification events, ensuring
    that the sequence of unregistering the old hwmon device and registering a
    new one remains atomic with respect to other notifications.

    Also add a check for valid hwmon_dev in acpi_power_meter_remove() and
    acpi_power_meter_notify() using !IS_ERR() to handle cases where
    registration might have failed.

    Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")


* [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-28 21:52         ` Guenter Roeck
@ 2026-01-29 14:18           ` Rafael J. Wysocki
  2026-01-30  1:07             ` Guenter Roeck
                               ` (2 more replies)
  0 siblings, 3 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-01-29 14:18 UTC (permalink / raw)
  To: Guenter Roeck, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica, Huisong Li

On Wednesday, January 28, 2026 10:52:15 PM CET Guenter Roeck wrote:
> On Wed, Jan 28, 2026 at 08:52:49PM +0100, Rafael J. Wysocki wrote:
> > On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki wrote:
> > > On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
> > > > > On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> > > > > <jaroslav.pulchart@gooddata.com> wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
> > > > > > in the ACPI power meter hwmon interface. Reading power*_average sysfs
> > > > > > attributes blocks indefinitely and causes tasks to enter uninterruptible
> > > > > > sleep (D state).
> > > > >
> > > > > The most recent change in the acpi_power_meter driver was made in
> > > > > 6.15, so this is not a regression in that driver.
> > > > >
> > > > > Also, nothing suspicious is done in power1_average_min_show() and
> > > > > power1_average_min_store() AFAICS.
> > > > >
> > > >
> > > > I decided to ask AI (Gemini 3, more specifically) for an analysis of the
> > > > problem, using Chris Mason's prompts for guidance. Result is below.
> > > 
> > > So it agrees with me in the analysis part.
> > > 
> Yes.
> 
> > > > The suggested fix may be a feasible workaround. Let me know what you think.
> > > 
> > > Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
> > > notifications compete with each other because they may try to
> > > unregister the hwmon device at the same time.
> > > 
> Good point.
> 
> > > I would just add a separate lock for the notifier (a static one should
> > > suffice) and make changes to "resource" only under resource->lock.
> > > 
> > > Let me cut a prototype patch for this.
> > > 
> > 
> > Something like the below (untested).
> > 
> > Note that it also fixes the driver removal which is tangentially related to the
> > problem at hand.
> 
> I can't test it either, but I ran it through Gemini and it tells me:
> 
>   Fixes: tag missing (y) [Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")]
> 
>   The commit addresses a deadlock regression introduced by the conversion to
>   hwmon_device_register_with_info.
> 
>   CHANGE-1: New static mutex `acpi_notify_lock` introduces global serialization.
>   This prevents concurrent notification handling for multiple power meter devices.
>   While likely a minor impact given the nature of the device, it technically reduces parallelism compared to the per-device locking used previously.
> 
> Just for fun I tried again, this time using a model which is more prone to
> false positives. It pretty much provided the same result in more detail.
> It is a bit more verbose, so I attached it below. I also asked it to verify,
> using the backtrace, if the patch is complete, and to suggest a patch
> description. The result is also attached below.
> 
> I think this is good enough for a formal patch. WDYT ?

With all due respect to the AI, I think that it has missed a couple of things,
so below is the patch with my version of the changelog.

Also, it really wants to be two patches IMV, one adding the
IS_ERR(resource->hwmon_dev) checks before hwmon device unregistration and
another one fixing the deadlock in question on top of it.  Please let me know
if you want me to split this one.
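
For readers unfamiliar with the error-pointer convention behind those
IS_ERR() checks, here is a minimal user-space sketch. It assumes the usual
scheme in which small negative errno values are encoded in the top 4095
addresses. The helpers err_ptr(), is_err() and ptr_err() are lower-case
re-implementations for illustration, and fake_dev, fake_unregister() and
safe_unregister() are hypothetical names, not driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the top MAX_ERRNO addresses, i.e. encodes an errno. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Decode the errno value back out of an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical device object standing in for the hwmon device. */
struct fake_dev {
	int registered;
};

static int unregister_calls;

static void fake_unregister(struct fake_dev *dev)
{
	dev->registered = 0;	/* would crash on an error pointer */
	unregister_calls++;
}

/* Skip unregistration when a previous registration attempt failed. */
static bool safe_unregister(struct fake_dev *dev)
{
	if (is_err(dev))
		return false;
	fake_unregister(dev);
	return true;
}
```

An error pointer is not NULL, so a plain NULL check would not catch a failed
registration; that is why the check has to test for the error range.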

Jaroslav, it would be nice to get some feedback on it from you as you seem to
be the only person here who can test it.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()

The acpi_power_meter driver's .notify() callback function,
acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
that is also acquired by callbacks in sysfs attributes of the device
being unregistered which is prone to deadlocks between sysfs access and
device removal.

Address this by moving the hwmon device removal in
acpi_power_meter_notify() outside the lock in question, but notice
that doing it alone is not sufficient because two concurrent
METER_NOTIFY_CONFIG notifications may be attempting to remove the
same device at the same time.  To prevent that from happening, add a
new lock serializing the execution of the switch () statement in
acpi_power_meter_notify().  For simplicity, it is a static mutex
which should not be a problem from the performance perspective.

The new lock also allows the hwmon_device_register_with_info()
in acpi_power_meter_notify() to be called outside the inner lock
because it prevents the other notifications handled by that function
from manipulating the "resource" object while the hwmon device based
on it is being registered.  The sending of ACPI netlink messages from
acpi_power_meter_notify() is serialized by the new lock too which
generally helps to ensure that the order of handling firmware
notifications is the same as the order of sending netlink messages
related to them.

In addition, notice that hwmon_device_register_with_info() may fail
in which case resource->hwmon_dev will become an error pointer,
so add checks to avoid attempting to unregister the hwmon device
pointed to by it in that case to acpi_power_meter_notify() and
acpi_power_meter_remove().

Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- a/drivers/hwmon/acpi_power_meter.c
+++ b/drivers/hwmon/acpi_power_meter.c
@@ -47,6 +47,8 @@
 static int cap_in_hardware;
 static bool force_cap_on;
 
+static DEFINE_MUTEX(acpi_notify_lock);
+
 static int can_cap_in_hardware(void)
 {
 	return force_cap_on || cap_in_hardware;
@@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
 
 	resource = acpi_driver_data(device);
 
+	guard(mutex)(&acpi_notify_lock);
+
 	switch (event) {
 	case METER_NOTIFY_CONFIG:
+		if (!IS_ERR(resource->hwmon_dev))
+			hwmon_device_unregister(resource->hwmon_dev);
+
 		mutex_lock(&resource->lock);
+
 		free_capabilities(resource);
 		remove_domain_devices(resource);
-		hwmon_device_unregister(resource->hwmon_dev);
 		res = read_capabilities(resource);
 		if (res)
 			dev_err_once(&device->dev, "read capabilities failed.\n");
 		res = read_domain_devices(resource);
 		if (res && res != -ENODEV)
 			dev_err_once(&device->dev, "read domain devices failed.\n");
+
+		mutex_unlock(&resource->lock);
+
 		resource->hwmon_dev =
 			hwmon_device_register_with_info(&device->dev,
 							ACPI_POWER_METER_NAME,
@@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
 							power_extra_groups);
 		if (IS_ERR(resource->hwmon_dev))
 			dev_err_once(&device->dev, "register hwmon device failed.\n");
-		mutex_unlock(&resource->lock);
+
 		break;
 	case METER_NOTIFY_TRIP:
 		sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
@@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
 		return;
 
 	resource = acpi_driver_data(device);
-	hwmon_device_unregister(resource->hwmon_dev);
+	if (!IS_ERR(resource->hwmon_dev))
+		hwmon_device_unregister(resource->hwmon_dev);
 
 	remove_domain_devices(resource);
 	free_capabilities(resource);





* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-29 14:18           ` [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify() Rafael J. Wysocki
@ 2026-01-30  1:07             ` Guenter Roeck
  2026-01-30 17:51               ` Rafael J. Wysocki
  2026-01-30  1:47             ` lihuisong (C)
  2026-01-31 15:29             ` Guenter Roeck
  2 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-30  1:07 UTC (permalink / raw)
  To: Rafael J. Wysocki, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica, Huisong Li

Hi Rafael,

On 1/29/26 06:18, Rafael J. Wysocki wrote:
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
> 
> The acpi_power_meter driver's .notify() callback function,
> acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
> that is also acquired by callbacks in sysfs attributes of the device
> being unregistered which is prone to deadlocks between sysfs access and
> device removal.
> 
> Address this by moving the hwmon device removal in
> acpi_power_meter_notify() outside the lock in question, but notice
> that doing it alone is not sufficient because two concurrent
> METER_NOTIFY_CONFIG notifications may be attempting to remove the
> same device at the same time.  To prevent that from happening, add a
> new lock serializing the execution of the switch () statement in
> acpi_power_meter_notify().  For simplicity, it is a static mutex
> which should not be a problem from the performance perspective.
> 
> The new lock also allows the hwmon_device_register_with_info()
> in acpi_power_meter_notify() to be called outside the inner lock
> because it prevents the other notifications handled by that function
> from manipulating the "resource" object while the hwmon device based
> on it is being registered.  The sending of ACPI netlink messages from
> acpi_power_meter_notify() is serialized by the new lock too which
> generally helps to ensure that the order of handling firmware
> notifications is the same as the order of sending netlink messages
> related to them.
> 
> In addition, notice that hwmon_device_register_with_info() may fail
> in which case resource->hwmon_dev will become an error pointer,
> so add checks to avoid attempting to unregister the hwmon device
> pointed to by it in that case to acpi_power_meter_notify() and
> acpi_power_meter_remove().
> 
> Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
>   1 file changed, 14 insertions(+), 3 deletions(-)
> 
> --- a/drivers/hwmon/acpi_power_meter.c
> +++ b/drivers/hwmon/acpi_power_meter.c
> @@ -47,6 +47,8 @@
>   static int cap_in_hardware;
>   static bool force_cap_on;
>   
> +static DEFINE_MUTEX(acpi_notify_lock);
> +
>   static int can_cap_in_hardware(void)
>   {
>   	return force_cap_on || cap_in_hardware;
> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
>   
>   	resource = acpi_driver_data(device);
>   
> +	guard(mutex)(&acpi_notify_lock);
> +
>   	switch (event) {
>   	case METER_NOTIFY_CONFIG:
> +		if (!IS_ERR(resource->hwmon_dev))
> +			hwmon_device_unregister(resource->hwmon_dev);
> +
>   		mutex_lock(&resource->lock);
> +
>   		free_capabilities(resource);
>   		remove_domain_devices(resource);
> -		hwmon_device_unregister(resource->hwmon_dev);
>   		res = read_capabilities(resource);
>   		if (res)
>   			dev_err_once(&device->dev, "read capabilities failed.\n");
>   		res = read_domain_devices(resource);
>   		if (res && res != -ENODEV)
>   			dev_err_once(&device->dev, "read domain devices failed.\n");
> +
> +		mutex_unlock(&resource->lock);
> +
>   		resource->hwmon_dev =
>   			hwmon_device_register_with_info(&device->dev,
>   							ACPI_POWER_METER_NAME,
> @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
>   							power_extra_groups);
>   		if (IS_ERR(resource->hwmon_dev))
>   			dev_err_once(&device->dev, "register hwmon device failed.\n");
> -		mutex_unlock(&resource->lock);
> +
>   		break;
>   	case METER_NOTIFY_TRIP:
>   		sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
>   		return;
>   
>   	resource = acpi_driver_data(device);
> -	hwmon_device_unregister(resource->hwmon_dev);
> +	if (!IS_ERR(resource->hwmon_dev))
> +		hwmon_device_unregister(resource->hwmon_dev);

Gemini says that this is still racy:

Description: Race condition between remove and notify. acpi_power_meter_remove()
unregisters the hwmon device and frees the resource structure without acquiring
acpi_notify_lock. If acpi_power_meter_notify() is running concurrently (e.g.
waiting on the lock), remove() can free the resource while notify() still holds
a pointer to it. When notify() acquires the lock, it will access the freed
resource (Use-After-Free).

Additionally, both functions may attempt to unregister the same hwmon device
concurrently (Double Unregister) because remove() reads resource->hwmon_dev
without protection. acpi_power_meter_remove() must acquire acpi_notify_lock
and clear device->driver_data, and acpi_power_meter_notify() must re-validate
the resource pointer after acquiring the lock.

Maybe I am missing something, but I think it has a point. It is not a new
problem, but still ...

Thanks,
Guenter



* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-30  1:07             ` Guenter Roeck
@ 2026-01-30 17:51               ` Rafael J. Wysocki
  2026-01-30 18:51                 ` Guenter Roeck
  0 siblings, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-01-30 17:51 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Fri, Jan 30, 2026 at 2:07 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Hi Rafael,
>
> On 1/29/26 06:18, Rafael J. Wysocki wrote:
> > ---
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
> >
> > The acpi_power_meter driver's .notify() callback function,
> > acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
> > that is also acquired by callbacks in sysfs attributes of the device
> > being unregistered which is prone to deadlocks between sysfs access and
> > device removal.
> >
> > Address this by moving the hwmon device removal in
> > acpi_power_meter_notify() outside the lock in question, but notice
> > that doing it alone is not sufficient because two concurrent
> > METER_NOTIFY_CONFIG notifications may be attempting to remove the
> > same device at the same time.  To prevent that from happening, add a
> > new lock serializing the execution of the switch () statement in
> > acpi_power_meter_notify().  For simplicity, it is a static mutex
> > which should not be a problem from the performance perspective.
> >
> > The new lock also allows the hwmon_device_register_with_info()
> > in acpi_power_meter_notify() to be called outside the inner lock
> > because it prevents the other notifications handled by that function
> > from manipulating the "resource" object while the hwmon device based
> > on it is being registered.  The sending of ACPI netlink messages from
> > acpi_power_meter_notify() is serialized by the new lock too which
> > generally helps to ensure that the order of handling firmware
> > notifications is the same as the order of sending netlink messages
> > related to them.
> >
> > In addition, notice that hwmon_device_register_with_info() may fail
> > in which case resource->hwmon_dev will become an error pointer,
> > so add checks to avoid attempting to unregister the hwmon device
> > pointed to by it in that case to acpi_power_meter_notify() and
> > acpi_power_meter_remove().
> >
> > Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
> > Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >   drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
> >   1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > --- a/drivers/hwmon/acpi_power_meter.c
> > +++ b/drivers/hwmon/acpi_power_meter.c
> > @@ -47,6 +47,8 @@
> >   static int cap_in_hardware;
> >   static bool force_cap_on;
> >
> > +static DEFINE_MUTEX(acpi_notify_lock);
> > +
> >   static int can_cap_in_hardware(void)
> >   {
> >       return force_cap_on || cap_in_hardware;
> > @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
> >
> >       resource = acpi_driver_data(device);
> >
> > +     guard(mutex)(&acpi_notify_lock);
> > +
> >       switch (event) {
> >       case METER_NOTIFY_CONFIG:
> > +             if (!IS_ERR(resource->hwmon_dev))
> > +                     hwmon_device_unregister(resource->hwmon_dev);
> > +
> >               mutex_lock(&resource->lock);
> > +
> >               free_capabilities(resource);
> >               remove_domain_devices(resource);
> > -             hwmon_device_unregister(resource->hwmon_dev);
> >               res = read_capabilities(resource);
> >               if (res)
> >                       dev_err_once(&device->dev, "read capabilities failed.\n");
> >               res = read_domain_devices(resource);
> >               if (res && res != -ENODEV)
> >                       dev_err_once(&device->dev, "read domain devices failed.\n");
> > +
> > +             mutex_unlock(&resource->lock);
> > +
> >               resource->hwmon_dev =
> >                       hwmon_device_register_with_info(&device->dev,
> >                                                       ACPI_POWER_METER_NAME,
> > @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
> >                                                       power_extra_groups);
> >               if (IS_ERR(resource->hwmon_dev))
> >                       dev_err_once(&device->dev, "register hwmon device failed.\n");
> > -             mutex_unlock(&resource->lock);
> > +
> >               break;
> >       case METER_NOTIFY_TRIP:
> >               sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
> > @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
> >               return;
> >
> >       resource = acpi_driver_data(device);
> > -     hwmon_device_unregister(resource->hwmon_dev);
> > +     if (!IS_ERR(resource->hwmon_dev))
> > +             hwmon_device_unregister(resource->hwmon_dev);
>
> Gemini says that this is still racy:
>
> Description: Race condition between remove and notify. acpi_power_meter_remove()
> unregisters the hwmon device and frees the resource structure without acquiring
> acpi_notify_lock. If acpi_power_meter_notify() is running concurrently (e.g.
> waiting on the lock), remove() can free the resource while notify() still holds
> a pointer to it. When notify() acquires the lock, it will access the freed
> resource (Use-After-Free).

This cannot happen because remove runs after unregistering the notify
handler by the ACPI core and that causes all events to drain.  See
acpi_device_remove() and acpi_device_remove_notify_handler().

Moreover, acpi_os_wait_events_complete() is effectively a full memory
barrier because of the two consecutive flush_workqueue() calls in it
that each acquire and release a mutex.
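
To make the ordering argument concrete, here is a small single-threaded
user-space model (not kernel code, and not part of the patch).  All names
in it (device_model, queue_event(), remove_notify_handler(),
driver_remove(), unbind_device()) are made up for illustration.  It
assumes that unregistering the handler stops new events and runs the
queued ones to completion before the driver's remove callback frees
anything:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct resource_model {
	int events_handled;
};

struct device_model {
	struct resource_model *driver_data;
	bool handler_installed;
	int pending;            /* notifications queued but not yet run */
	int handled_after_free; /* non-zero would indicate a use-after-free */
};

/* Deferred notification work: runs one queued event. */
static void run_one_event(struct device_model *dev)
{
	if (!dev->driver_data) {
		dev->handled_after_free++;
		return;
	}
	dev->driver_data->events_handled++;
}

/* An event is only queued while a handler is installed. */
static bool queue_event(struct device_model *dev)
{
	if (!dev->handler_installed)
		return false;
	dev->pending++;
	return true;
}

/* Step 1 of unbinding: stop accepting events, then drain the queued ones. */
static void remove_notify_handler(struct device_model *dev)
{
	dev->handler_installed = false;
	while (dev->pending > 0) {
		dev->pending--;
		run_one_event(dev);
	}
}

/* Step 2 of unbinding: the remove callback frees the driver data. */
static int driver_remove(struct device_model *dev)
{
	int handled;

	assert(dev->driver_data);
	handled = dev->driver_data->events_handled;
	free(dev->driver_data);
	dev->driver_data = NULL;
	return handled;
}

/* Unbind in the order described above; returns events handled before the free. */
static int unbind_device(struct device_model *dev)
{
	remove_notify_handler(dev);
	return driver_remove(dev);
}
```

In this model every queued event runs while driver_data is still valid,
and nothing can be queued once the handler is gone, so the remove step
never races with a notification.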

> Additionally, both functions may attempt to unregister the same hwmon device
> concurrently (Double Unregister) because remove() reads resource->hwmon_dev
> without protection. acpi_power_meter_remove() must acquire acpi_notify_lock
> and clear device->driver_data, and acpi_power_meter_notify() must re-validate
> the resource pointer after acquiring the lock.

Well, this also cannot happen AFAICS as per the above.

> Maybe I am missing something, but I think it has a point. it is not a new
> problem, but still ...

I don't think that there is a problem.

Thanks, Rafael


* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-30 17:51               ` Rafael J. Wysocki
@ 2026-01-30 18:51                 ` Guenter Roeck
  0 siblings, 0 replies; 60+ messages in thread
From: Guenter Roeck @ 2026-01-30 18:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On 1/30/26 09:51, Rafael J. Wysocki wrote:
[ ... ]

>> Gemini says that this is still racy:
>>
>> Description: Race condition between remove and notify. acpi_power_meter_remove()
>> unregisters the hwmon device and frees the resource structure without acquiring
>> acpi_notify_lock. If acpi_power_meter_notify() is running concurrently (e.g.
>> waiting on the lock), remove() can free the resource while notify() still holds
>> a pointer to it. When notify() acquires the lock, it will access the freed
>> resource (Use-After-Free).
> 
> This cannot happen because remove runs after unregistering the notify
> handler by the ACPI core and that causes all events to drain.  See
> acpi_device_remove() and acpi_device_remove_notify_handler().
> 
> Moreover, acpi_os_wait_events_complete() is effectively a full memory
> barrier because of the two consecutive flush_workqueue() calls in it
> that each acquire and release a mutex.
> 
>> Additionally, both functions may attempt to unregister the same hwmon device
>> concurrently (Double Unregister) because remove() reads resource->hwmon_dev
>> without protection. acpi_power_meter_remove() must acquire acpi_notify_lock
>> and clear device->driver_data, and acpi_power_meter_notify() must re-validate
>> the resource pointer after acquiring the lock.
> 
> Well, this also cannot happen AFAICS as per the above.
> 
>> Maybe I am missing something, but I think it has a point. it is not a new
>> problem, but still ...
> 
> I don't think that there is a problem.
> 

Excellent. I wasn't sure if a notify call could still be executing when
the remove function is called, but that does seem to be a bit far-fetched.

I am inclined to apply the patch as-is to (hopefully) resolve the immediate
problem. Getting rid of the hwmon device re-registration will need more work.

Thanks,
Guenter



* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-29 14:18           ` [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify() Rafael J. Wysocki
  2026-01-30  1:07             ` Guenter Roeck
@ 2026-01-30  1:47             ` lihuisong (C)
  2026-01-30  5:07               ` Guenter Roeck
  2026-01-31 15:29             ` Guenter Roeck
  2 siblings, 1 reply; 60+ messages in thread
From: lihuisong (C) @ 2026-01-30  1:47 UTC (permalink / raw)
  To: Rafael J. Wysocki, Guenter Roeck, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica


On 1/29/2026 10:18 PM, Rafael J. Wysocki wrote:
> On Wednesday, January 28, 2026 10:52:15 PM CET Guenter Roeck wrote:
>> On Wed, Jan 28, 2026 at 08:52:49PM +0100, Rafael J. Wysocki wrote:
>>> On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki wrote:
>>>> On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>>> Hi all,
>>>>>
>>>>> On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
>>>>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>>>>>> <jaroslav.pulchart@gooddata.com> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
>>>>>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
>>>>>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
>>>>>>> sleep (D state).
>>>>>> The most recent change in the acpi_power_meter driver was made in
>>>>>> 6.15, so this is not a regression in that driver.
>>>>>>
>>>>>> Also, nothing suspicious is done in power1_average_min_show() and
>>>>>> power1_average_min_store() AFAICS.
>>>>>>
>>>>> I decided to ask AI (Gemini 3, more specifically) for an analysis of the
>>>>> problem, using Chris Mason's prompts for guidance. Result is below.
>>>> So it agrees with me in the analysis part.
>>>>
>> Yes.
>>
>>>>> The suggested fix may be a feasible workaround. Let me know what you think.
>>>> Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
>>>> notifications compete with each other because they may try to
>>>> unregister the hwmon device at the same time.
>>>>
>> Good point.
>>
>>>> I would just add a separate lock for the notifier (a static one should
>>>> suffice) and make changes to "resource" only under resource->lock.
>>>>
>>>> Let me cut a prototype patch for this.
>>>>
>>> Something like the below (untested).
>>>
>>> Note that it also fixes the driver removal which is tangentially related to the
>>> problem at hand.
>> I can't test it either, but I ran it through Gemini and it tells me:
>>
>>    Fixes: tag missing (y) [Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")]
>>
>>    The commit addresses a deadlock regression introduced by the conversion to
>>    hwmon_device_register_with_info.
>>
>>    CHANGE-1: New static mutex `acpi_notify_lock` introduces global serialization.
>>    This prevents concurrent notification handling for multiple power meter devices.
>>    While likely a minor impact given the nature of the device, it technically reduces parallelism compared to the per-device locking used previously.
>>
>> Just for fun I tried again, this time using a model which is more prone to
>> false positives. It pretty much provided the same result in more detail.
>> It is a bit more verbose, so I attached it below. I also asked it to verify,
>> using the backtrace, if the patch is complete, and to suggest a patch
>> description. The result is also attached below.
>>
>> I think this is good enough for a formal patch. WDYT ?
> With all due respect to the AI, I think that it has missed a couple of things,
> so below it the patch with my version of the changelog.
>
> Also, it really wants to be two patches IMV, one adding the
> IS_ERR(resource->hwmon_dev) checks before hwmon device unregistration and
> another one fixing the deadlock in question on top of it.  Please let me know
> if you want me to split this one.
>
> Jaroslav, it would be nice to get some feedback on it from you as you seem to
> be the only person here who can test it.
>
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
>
> The acpi_power_meter driver's .notify() callback function,
> acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
> that is also acquired by callbacks in sysfs attributes of the device
> being unregistered which is prone to deadlocks between sysfs access and
> device removal.
>
> Address this by moving the hwmon device removal in
> acpi_power_meter_notify() outside the lock in question, but notice
> that doing it alone is not sufficient because two concurrent
> METER_NOTIFY_CONFIG notifications may be attempting to remove the
> same device at the same time.  To prevent that from happening, add a
> new lock serializing the execution of the switch () statement in
> acpi_power_meter_notify().  For simplicity, it is a static mutex
> which should not be a problem from the performance perspective.
>
> The new lock also allows the hwmon_device_register_with_info()
> in acpi_power_meter_notify() to be called outside the inner lock
> because it prevents the other notifications handled by that function
> from manipulating the "resource" object while the hwmon device based
> on it is being registered.  The sending of ACPI netlink messages from
> acpi_power_meter_notify() is serialized by the new lock too which
> generally helps to ensure that the order of handling firmware
> notifications is the same as the order of sending netlink messages
> related to them.
>
> In addition, notice that hwmon_device_register_with_info() may fail
> in which case resource->hwmon_dev will become an error pointer,
> so add checks to avoid attempting to unregister the hwmon device
> pointer to by it in that case to acpi_power_meter_notify() and
> acpi_power_meter_remove().
>
> Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
>   1 file changed, 14 insertions(+), 3 deletions(-)
>
> --- a/drivers/hwmon/acpi_power_meter.c
> +++ b/drivers/hwmon/acpi_power_meter.c
> @@ -47,6 +47,8 @@
>   static int cap_in_hardware;
>   static bool force_cap_on;
>   
> +static DEFINE_MUTEX(acpi_notify_lock);
> +
>   static int can_cap_in_hardware(void)
>   {
>   	return force_cap_on || cap_in_hardware;
> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
>   
>   	resource = acpi_driver_data(device);
>   
> +	guard(mutex)(&acpi_notify_lock);
> +
>   	switch (event) {
>   	case METER_NOTIFY_CONFIG:
> +		if (!IS_ERR(resource->hwmon_dev))
> +			hwmon_device_unregister(resource->hwmon_dev);
> +
>   		mutex_lock(&resource->lock);
> +
>   		free_capabilities(resource);
>   		remove_domain_devices(resource);
> -		hwmon_device_unregister(resource->hwmon_dev);
>   		res = read_capabilities(resource);
>   		if (res)
>   			dev_err_once(&device->dev, "read capabilities failed.\n");
>   		res = read_domain_devices(resource);
>   		if (res && res != -ENODEV)
>   			dev_err_once(&device->dev, "read domain devices failed.\n");
> +
> +		mutex_unlock(&resource->lock);
> +
>   		resource->hwmon_dev =
>   			hwmon_device_register_with_info(&device->dev,
>   							ACPI_POWER_METER_NAME,
> @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
>   							power_extra_groups);
>   		if (IS_ERR(resource->hwmon_dev))
>   			dev_err_once(&device->dev, "register hwmon device failed.\n");
> -		mutex_unlock(&resource->lock);
> +
>   		break;
>   	case METER_NOTIFY_TRIP:
>   		sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
>   		return;
>   
>   	resource = acpi_driver_data(device);
> -	hwmon_device_unregister(resource->hwmon_dev);
> +	if (!IS_ERR(resource->hwmon_dev))
> +		hwmon_device_unregister(resource->hwmon_dev);
>   
!IS_ERR(resource->hwmon_dev) may not be enough. There might be a UAF in
the concurrent case.
How about doing it like this:
if (!IS_ERR_OR_NULL(resource->hwmon_dev)) {
        hwmon_device_unregister(resource->hwmon_dev);
        resource->hwmon_dev = NULL;
}
>   	remove_domain_devices(resource);
>   	free_capabilities(resource);
>
>
>
>
>


* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-30  1:47             ` lihuisong (C)
@ 2026-01-30  5:07               ` Guenter Roeck
  2026-01-30  8:40                 ` lihuisong (C)
  0 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-30  5:07 UTC (permalink / raw)
  To: lihuisong (C), Rafael J. Wysocki, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica

On 1/29/26 17:47, lihuisong (C) wrote:
> 
> On 1/29/2026 10:18 PM, Rafael J. Wysocki wrote:
>> On Wednesday, January 28, 2026 10:52:15 PM CET Guenter Roeck wrote:
>>> On Wed, Jan 28, 2026 at 08:52:49PM +0100, Rafael J. Wysocki wrote:
>>>> On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki wrote:
>>>>> On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
>>>>>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>>>>>>> <jaroslav.pulchart@gooddata.com> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
>>>>>>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
>>>>>>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
>>>>>>>> sleep (D state).
>>>>>>> The most recent change in the acpi_power_meter driver was made in
>>>>>>> 6.15, so this is not a regression in that driver.
>>>>>>>
>>>>>>> Also, nothing suspicious is done in power1_average_min_show() and
>>>>>>> power1_average_min_store() AFAICS.
>>>>>>>
>>>>>> I decided to ask AI (Gemini 3, more specifically) for an analysis of the
>>>>>> problem, using Chris Mason's prompts for guidance. Result is below.
>>>>> So it agrees with me in the analysis part.
>>>>>
>>> Yes.
>>>
>>>>>> The suggested fix may be a feasible workaround. Let me know what you think.
>>>>> Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
>>>>> notifications compete with each other because they may try to
>>>>> unregister the hwmon device at the same time.
>>>>>
>>> Good point.
>>>
>>>>> I would just add a separate lock for the notifier (a static one should
>>>>> suffice) and make changes to "resource" only under resource->lock.
>>>>>
>>>>> Let me cut a prototype patch for this.
>>>>>
>>>> Something like the below (untested).
>>>>
>>>> Note that it also fixes the driver removal which is tangentially related to the
>>>> problem at hand.
>>> I can't test it either, but I ran it through Gemini and it tells me:
>>>
>>>    Fixes: tag missing (y) [Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")]
>>>
>>>    The commit addresses a deadlock regression introduced by the conversion to
>>>    hwmon_device_register_with_info.
>>>
>>>    CHANGE-1: New static mutex `acpi_notify_lock` introduces global serialization.
>>>    This prevents concurrent notification handling for multiple power meter devices.
>>>    While likely a minor impact given the nature of the device, it technically reduces parallelism compared to the per-device locking used previously.
>>>
>>> Just for fun I tried again, this time using a model which is more prone to
>>> false positives. It pretty much provided the same result in more detail.
>>> It is a bit more verbose, so I attached it below. I also asked it to verify,
>>> using the backtrace, if the patch is complete, and to suggest a patch
>>> description. The result is also attached below.
>>>
>>> I think this is good enough for a formal patch. WDYT ?
>> With all due respect to the AI, I think that it has missed a couple of things,
>> so below it the patch with my version of the changelog.
>>
>> Also, it really wants to be two patches IMV, one adding the
>> IS_ERR(resource->hwmon_dev) checks before hwmon device unregistration and
>> another one fixing the deadlock in question on top of it.  Please let me know
>> if you want me to split this one.
>>
>> Jaroslav, it would be nice to get some feedback on it from you as you seem to
>> be the only person here who can test it.
>>
>> ---
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
>>
>> The acpi_power_meter driver's .notify() callback function,
>> acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
>> that is also acquired by callbacks in sysfs attributes of the device
>> being unregistered which is prone to deadlocks between sysfs access and
>> device removal.
>>
>> Address this by moving the hwmon device removal in
>> acpi_power_meter_notify() outside the lock in question, but notice
>> that doing it alone is not sufficient because two concurrent
>> METER_NOTIFY_CONFIG notifications may be attempting to remove the
>> same device at the same time.  To prevent that from happening, add a
>> new lock serializing the execution of the switch () statement in
>> acpi_power_meter_notify().  For simplicity, it is a static mutex
>> which should not be a problem from the performance perspective.
>>
>> The new lock also allows the hwmon_device_register_with_info()
>> in acpi_power_meter_notify() to be called outside the inner lock
>> because it prevents the other notifications handled by that function
>> from manipulating the "resource" object while the hwmon device based
>> on it is being registered.  The sending of ACPI netlink messages from
>> acpi_power_meter_notify() is serialized by the new lock too which
>> generally helps to ensure that the order of handling firmware
>> notifications is the same as the order of sending netlink messages
>> related to them.
>>
>> In addition, notice that hwmon_device_register_with_info() may fail
>> in which case resource->hwmon_dev will become an error pointer,
>> so add checks to avoid attempting to unregister the hwmon device
>> pointer to by it in that case to acpi_power_meter_notify() and
>> acpi_power_meter_remove().
>>
>> Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>   drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
>>   1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> --- a/drivers/hwmon/acpi_power_meter.c
>> +++ b/drivers/hwmon/acpi_power_meter.c
>> @@ -47,6 +47,8 @@
>>   static int cap_in_hardware;
>>   static bool force_cap_on;
>> +static DEFINE_MUTEX(acpi_notify_lock);
>> +
>>   static int can_cap_in_hardware(void)
>>   {
>>       return force_cap_on || cap_in_hardware;
>> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
>>       resource = acpi_driver_data(device);
>> +    guard(mutex)(&acpi_notify_lock);
>> +
>>       switch (event) {
>>       case METER_NOTIFY_CONFIG:
>> +        if (!IS_ERR(resource->hwmon_dev))
>> +            hwmon_device_unregister(resource->hwmon_dev);
>> +
>>           mutex_lock(&resource->lock);
>> +
>>           free_capabilities(resource);
>>           remove_domain_devices(resource);
>> -        hwmon_device_unregister(resource->hwmon_dev);
>>           res = read_capabilities(resource);
>>           if (res)
>>               dev_err_once(&device->dev, "read capabilities failed.\n");
>>           res = read_domain_devices(resource);
>>           if (res && res != -ENODEV)
>>               dev_err_once(&device->dev, "read domain devices failed.\n");
>> +
>> +        mutex_unlock(&resource->lock);
>> +
>>           resource->hwmon_dev =
>>               hwmon_device_register_with_info(&device->dev,
>>                               ACPI_POWER_METER_NAME,
>> @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
>>                               power_extra_groups);
>>           if (IS_ERR(resource->hwmon_dev))
>>               dev_err_once(&device->dev, "register hwmon device failed.\n");
>> -        mutex_unlock(&resource->lock);
>> +
>>           break;
>>       case METER_NOTIFY_TRIP:
>>           sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
>> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
>>           return;
>>       resource = acpi_driver_data(device);
>> -    hwmon_device_unregister(resource->hwmon_dev);
>> +    if (!IS_ERR(resource->hwmon_dev))
>> +        hwmon_device_unregister(resource->hwmon_dev);
> !IS_ERR(resource->hwmon_dev) may be not enough. There might be UAF in concurrent case.
> How about do it like:
> if (!IS_ERR_OR_NULL(resource->hwmon_dev)) {

Not sure what you mean with 'concurrent' case. There is a potential race with the
notification code, as mentioned in my other mail. Question is if the acpi subsystem
disables notification calls before the remove function is called, and guarantees
that no notifications are pending. If so, I don't think there is a problem.
Also, resource->hwmon_dev should never be NULL.
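
For readers following along, the difference between the two checks can be
sketched in user space.  The helpers below are simplified
re-implementations of the kernel's ERR_PTR(), IS_ERR() and
IS_ERR_OR_NULL() written for this example only (the real ones live in
include/linux/err.h), and the two may_unregister*() functions are
hypothetical names for the check in the patch and the suggested
alternative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space approximations of the kernel helpers, for illustration only. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Check used by the patch: only an error pointer is rejected. */
static bool may_unregister(const void *hwmon_dev)
{
	return !IS_ERR(hwmon_dev);
}

/* Suggested alternative: NULL is rejected as well. */
static bool may_unregister_strict(const void *hwmon_dev)
{
	return !IS_ERR_OR_NULL(hwmon_dev);
}

/* The two checks only disagree for a NULL pointer. */
static void check_model(void)
{
	int dev;

	assert(may_unregister(&dev) && may_unregister_strict(&dev));
	assert(!may_unregister(ERR_PTR(-12)) && !may_unregister_strict(ERR_PTR(-12)));
	assert(may_unregister(NULL) && !may_unregister_strict(NULL));
}
```

So the stricter check only matters if the field can actually be NULL,
which is the case argued against above.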

Thanks,
Guenter

>         hwmon_device_unregister(resource->hwmon_dev);
>         resource->hwmon_dev = NULL;
> }
>>       remove_domain_devices(resource);
>>       free_capabilities(resource);
>>
>>
>>
>>
>>



* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-30  5:07               ` Guenter Roeck
@ 2026-01-30  8:40                 ` lihuisong (C)
  2026-01-30 17:53                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 60+ messages in thread
From: lihuisong (C) @ 2026-01-30  8:40 UTC (permalink / raw)
  To: Guenter Roeck, Rafael J. Wysocki, Jaroslav Pulchart
  Cc: linux-acpi, linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek,
	Jiri Jurica


On 1/30/2026 1:07 PM, Guenter Roeck wrote:
> On 1/29/26 17:47, lihuisong (C) wrote:
>>
>> On 1/29/2026 10:18 PM, Rafael J. Wysocki wrote:
>>> On Wednesday, January 28, 2026 10:52:15 PM CET Guenter Roeck wrote:
>>>> On Wed, Jan 28, 2026 at 08:52:49PM +0100, Rafael J. Wysocki wrote:
>>>>> On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki 
>>>>> wrote:
>>>>>> On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck 
>>>>>> <linux@roeck-us.net> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
>>>>>>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>>>>>>>> <jaroslav.pulchart@gooddata.com> wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> after upgrading from kernel 6.17.y to 6.18.y we started to 
>>>>>>>>> observe a regression
>>>>>>>>> in the ACPI power meter hwmon interface. Reading 
>>>>>>>>> power*_average sysfs
>>>>>>>>> attributes blocks indefinitely and causes tasks to enter 
>>>>>>>>> uninterruptible
>>>>>>>>> sleep (D state).
>>>>>>>> The most recent change in the acpi_power_meter driver was made in
>>>>>>>> 6.15, so this is not a regression in that driver.
>>>>>>>>
>>>>>>>> Also, nothing suspicious is done in power1_average_min_show() and
>>>>>>>> power1_average_min_store() AFAICS.
>>>>>>>>
>>>>>>> I decided to ask AI (Gemini 3, more specifically) for an 
>>>>>>> analysis of the
>>>>>>> problem, using Chris Mason's prompts for guidance. Result is below.
>>>>>> So it agrees with me in the analysis part.
>>>>>>
>>>> Yes.
>>>>
>>>>>>> The suggested fix may be a feasible workaround. Let me know what 
>>>>>>> you think.
>>>>>> Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
>>>>>> notifications compete with each other because they may try to
>>>>>> unregister the hwmon device at the same time.
>>>>>>
>>>> Good point.
>>>>
>>>>>> I would just add a separate lock for the notifier (a static one 
>>>>>> should
>>>>>> suffice) and make changes to "resource" only under resource->lock.
>>>>>>
>>>>>> Let me cut a prototype patch for this.
>>>>>>
>>>>> Something like the below (untested).
>>>>>
>>>>> Note that it also fixes the driver removal which is tangentially 
>>>>> related to the
>>>>> problem at hand.
>>>> I can't test it either, but I ran it through Gemini and it tells me:
>>>>
>>>>    Fixes: tag missing (y) [Fixes: 16746ce8adfe ("hwmon: 
>>>> (acpi_power_meter) Replace the deprecated hwmon_device_register")]
>>>>
>>>>    The commit addresses a deadlock regression introduced by the 
>>>> conversion to
>>>>    hwmon_device_register_with_info.
>>>>
>>>>    CHANGE-1: New static mutex `acpi_notify_lock` introduces global 
>>>> serialization.
>>>>    This prevents concurrent notification handling for multiple 
>>>> power meter devices.
>>>>    While likely a minor impact given the nature of the device, it 
>>>> technically reduces parallelism compared to the per-device locking 
>>>> used previously.
>>>>
>>>> Just for fun I tried again, this time using a model which is more 
>>>> prone to
>>>> false positives. It pretty much provided the same result in more 
>>>> detail.
>>>> It is a bit more verbose, so I attached it below. I also asked it 
>>>> to verify,
>>>> using the backtrace, if the patch is complete, and to suggest a patch
>>>> description. The result is also attached below.
>>>>
>>>> I think this is good enough for a formal patch. WDYT ?
>>> With all due respect to the AI, I think that it has missed a couple 
>>> of things,
>>> so below it the patch with my version of the changelog.
>>>
>>> Also, it really wants to be two patches IMV, one adding the
>>> IS_ERR(resource->hwmon_dev) checks before hwmon device 
>>> unregistration and
>>> another one fixing the deadlock in question on top of it. Please let 
>>> me know
>>> if you want me to split this one.
>>>
>>> Jaroslav, it would be nice to get some feedback on it from you as 
>>> you seem to
>>> be the only person here who can test it.
>>>
>>> ---
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related 
>>> to acpi_power_meter_notify()
>>>
>>> The acpi_power_meter driver's .notify() callback function,
>>> acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
>>> that is also acquired by callbacks in sysfs attributes of the device
>>> being unregistered which is prone to deadlocks between sysfs access and
>>> device removal.
>>>
>>> Address this by moving the hwmon device removal in
>>> acpi_power_meter_notify() outside the lock in question, but notice
>>> that doing it alone is not sufficient because two concurrent
>>> METER_NOTIFY_CONFIG notifications may be attempting to remove the
>>> same device at the same time.  To prevent that from happening, add a
>>> new lock serializing the execution of the switch () statement in
>>> acpi_power_meter_notify().  For simplicity, it is a static mutex
>>> which should not be a problem from the performance perspective.
>>>
>>> The new lock also allows the hwmon_device_register_with_info()
>>> in acpi_power_meter_notify() to be called outside the inner lock
>>> because it prevents the other notifications handled by that function
>>> from manipulating the "resource" object while the hwmon device based
>>> on it is being registered.  The sending of ACPI netlink messages from
>>> acpi_power_meter_notify() is serialized by the new lock too which
>>> generally helps to ensure that the order of handling firmware
>>> notifications is the same as the order of sending netlink messages
>>> related to them.
>>>
>>> In addition, notice that hwmon_device_register_with_info() may fail
>>> in which case resource->hwmon_dev will become an error pointer,
>>> so add checks to avoid attempting to unregister the hwmon device
>>> pointer to by it in that case to acpi_power_meter_notify() and
>>> acpi_power_meter_remove().
>>>
>>> Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the 
>>> deprecated hwmon_device_register")
>>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> ---
>>>   drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
>>>   1 file changed, 14 insertions(+), 3 deletions(-)
>>>
>>> --- a/drivers/hwmon/acpi_power_meter.c
>>> +++ b/drivers/hwmon/acpi_power_meter.c
>>> @@ -47,6 +47,8 @@
>>>   static int cap_in_hardware;
>>>   static bool force_cap_on;
>>> +static DEFINE_MUTEX(acpi_notify_lock);
>>> +
>>>   static int can_cap_in_hardware(void)
>>>   {
>>>       return force_cap_on || cap_in_hardware;
>>> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
>>>       resource = acpi_driver_data(device);
>>> +    guard(mutex)(&acpi_notify_lock);
>>> +
>>>       switch (event) {
>>>       case METER_NOTIFY_CONFIG:
>>> +        if (!IS_ERR(resource->hwmon_dev))
>>> +            hwmon_device_unregister(resource->hwmon_dev);
>>> +
>>>           mutex_lock(&resource->lock);
>>> +
>>>           free_capabilities(resource);
>>>           remove_domain_devices(resource);
>>> -        hwmon_device_unregister(resource->hwmon_dev);
>>>           res = read_capabilities(resource);
>>>           if (res)
>>>               dev_err_once(&device->dev, "read capabilities 
>>> failed.\n");
>>>           res = read_domain_devices(resource);
>>>           if (res && res != -ENODEV)
>>>               dev_err_once(&device->dev, "read domain devices 
>>> failed.\n");
>>> +
>>> +        mutex_unlock(&resource->lock);
>>> +
>>>           resource->hwmon_dev =
>>> hwmon_device_register_with_info(&device->dev,
>>>                               ACPI_POWER_METER_NAME,
>>> @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
>>>                               power_extra_groups);
>>>           if (IS_ERR(resource->hwmon_dev))
>>>               dev_err_once(&device->dev, "register hwmon device 
>>> failed.\n");
>>> -        mutex_unlock(&resource->lock);
>>> +
>>>           break;
>>>       case METER_NOTIFY_TRIP:
>>>           sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
>>> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
>>>           return;
>>>       resource = acpi_driver_data(device);
>>> -    hwmon_device_unregister(resource->hwmon_dev);
>>> +    if (!IS_ERR(resource->hwmon_dev))
>>> +        hwmon_device_unregister(resource->hwmon_dev);
>> !IS_ERR(resource->hwmon_dev) may be not enough. There might be UAF in 
>> concurrent case.
>> How about do it like:
>> if (!IS_ERR_OR_NULL(resource->hwmon_dev)) {
>
> Not sure what you mean with 'concurrent' case. There is a potential 
> race with the
Apologies for the confusion. I mean the case mentioned in your other
mail.
> notification code, as mentioned in my other mail. Question is if the 
> acpi subsystem
> disables notification calls before the remove function is called, and 
> guarantees
> that no notifications are pending. If so, I don't think there is a 
> problem.
> Also, resource->hwmon_dev should never be NULL.
>
All right,
if remove() is called first, the resource has already been released and
this notify should not be entered again.

But the .remove() callback of the acpi_power_meter driver must clear
'driver_data' in the acpi_device.
That way, even if the ACPI subsystem doesn't disable notifications,
notify doesn't perform any action after .remove() has been called.
>>  hwmon_device_unregister(resource->hwmon_dev);
>>         resource->hwmon_dev = NULL;
>> }
>>>       remove_domain_devices(resource);
>>>       free_capabilities(resource);
>>>
>>>
>>>
>>>
>>>
>
>
>


* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-30  8:40                 ` lihuisong (C)
@ 2026-01-30 17:53                   ` Rafael J. Wysocki
  2026-01-31 10:06                     ` lihuisong (C)
  0 siblings, 1 reply; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-01-30 17:53 UTC (permalink / raw)
  To: lihuisong (C)
  Cc: Guenter Roeck, Rafael J. Wysocki, Jaroslav Pulchart, linux-acpi,
	linux-hwmon, Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica

On Fri, Jan 30, 2026 at 9:40 AM lihuisong (C) <lihuisong@huawei.com> wrote:
>
>
> On 1/30/2026 1:07 PM, Guenter Roeck wrote:
> > On 1/29/26 17:47, lihuisong (C) wrote:
> >>
> >> On 1/29/2026 10:18 PM, Rafael J. Wysocki wrote:
> >>> On Wednesday, January 28, 2026 10:52:15 PM CET Guenter Roeck wrote:
> >>>> On Wed, Jan 28, 2026 at 08:52:49PM +0100, Rafael J. Wysocki wrote:
> >>>>> On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki
> >>>>> wrote:
> >>>>>> On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck
> >>>>>> <linux@roeck-us.net> wrote:
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
> >>>>>>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
> >>>>>>>> <jaroslav.pulchart@gooddata.com> wrote:
> >>>>>>>>> Hello,
> >>>>>>>>>
> >>>>>>>>> after upgrading from kernel 6.17.y to 6.18.y we started to
> >>>>>>>>> observe a regression
> >>>>>>>>> in the ACPI power meter hwmon interface. Reading
> >>>>>>>>> power*_average sysfs
> >>>>>>>>> attributes blocks indefinitely and causes tasks to enter
> >>>>>>>>> uninterruptible
> >>>>>>>>> sleep (D state).
> >>>>>>>> The most recent change in the acpi_power_meter driver was made in
> >>>>>>>> 6.15, so this is not a regression in that driver.
> >>>>>>>>
> >>>>>>>> Also, nothing suspicious is done in power1_average_min_show() and
> >>>>>>>> power1_average_min_store() AFAICS.
> >>>>>>>>
> >>>>>>> I decided to ask AI (Gemini 3, more specifically) for an
> >>>>>>> analysis of the
> >>>>>>> problem, using Chris Mason's prompts for guidance. Result is below.
> >>>>>> So it agrees with me in the analysis part.
> >>>>>>
> >>>> Yes.
> >>>>
> >>>>>>> The suggested fix may be a feasible workaround. Let me know what
> >>>>>>> you think.
> >>>>>> Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
> >>>>>> notifications compete with each other because they may try to
> >>>>>> unregister the hwmon device at the same time.
> >>>>>>
> >>>> Good point.
> >>>>
> >>>>>> I would just add a separate lock for the notifier (a static one
> >>>>>> should
> >>>>>> suffice) and make changes to "resource" only under resource->lock.
> >>>>>>
> >>>>>> Let me cut a prototype patch for this.
> >>>>>>
> >>>>> Something like the below (untested).
> >>>>>
> >>>>> Note that it also fixes the driver removal which is tangentially
> >>>>> related to the
> >>>>> problem at hand.
> >>>> I can't test it either, but I ran it through Gemini and it tells me:
> >>>>
> >>>>    Fixes: tag missing (y) [Fixes: 16746ce8adfe ("hwmon:
> >>>> (acpi_power_meter) Replace the deprecated hwmon_device_register")]
> >>>>
> >>>>    The commit addresses a deadlock regression introduced by the
> >>>> conversion to
> >>>>    hwmon_device_register_with_info.
> >>>>
> >>>>    CHANGE-1: New static mutex `acpi_notify_lock` introduces global
> >>>> serialization.
> >>>>    This prevents concurrent notification handling for multiple
> >>>> power meter devices.
> >>>>    While likely a minor impact given the nature of the device, it
> >>>> technically reduces parallelism compared to the per-device locking
> >>>> used previously.
> >>>>
> >>>> Just for fun I tried again, this time using a model which is more
> >>>> prone to
> >>>> false positives. It pretty much provided the same result in more
> >>>> detail.
> >>>> It is a bit more verbose, so I attached it below. I also asked it
> >>>> to verify,
> >>>> using the backtrace, if the patch is complete, and to suggest a patch
> >>>> description. The result is also attached below.
> >>>>
> >>>> I think this is good enough for a formal patch. WDYT ?
> >>> With all due respect to the AI, I think that it has missed a couple
> >>> of things,
> >>> so below is the patch with my version of the changelog.
> >>>
> >>> Also, it really wants to be two patches IMV, one adding the
> >>> IS_ERR(resource->hwmon_dev) checks before hwmon device
> >>> unregistration and
> >>> another one fixing the deadlock in question on top of it. Please let
> >>> me know
> >>> if you want me to split this one.
> >>>
> >>> Jaroslav, it would be nice to get some feedback on it from you as
> >>> you seem to
> >>> be the only person here who can test it.
> >>>
> >>> ---
> >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>> Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related
> >>> to acpi_power_meter_notify()
> >>>
> >>> The acpi_power_meter driver's .notify() callback function,
> >>> acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
> >>> that is also acquired by callbacks in sysfs attributes of the device
> >>> being unregistered which is prone to deadlocks between sysfs access and
> >>> device removal.
> >>>
> >>> Address this by moving the hwmon device removal in
> >>> acpi_power_meter_notify() outside the lock in question, but notice
> >>> that doing it alone is not sufficient because two concurrent
> >>> METER_NOTIFY_CONFIG notifications may be attempting to remove the
> >>> same device at the same time.  To prevent that from happening, add a
> >>> new lock serializing the execution of the switch () statement in
> >>> acpi_power_meter_notify().  For simplicity, it is a static mutex
> >>> which should not be a problem from the performance perspective.
> >>>
> >>> The new lock also allows the hwmon_device_register_with_info()
> >>> in acpi_power_meter_notify() to be called outside the inner lock
> >>> because it prevents the other notifications handled by that function
> >>> from manipulating the "resource" object while the hwmon device based
> >>> on it is being registered.  The sending of ACPI netlink messages from
> >>> acpi_power_meter_notify() is serialized by the new lock too which
> >>> generally helps to ensure that the order of handling firmware
> >>> notifications is the same as the order of sending netlink messages
> >>> related to them.
> >>>
> >>> In addition, notice that hwmon_device_register_with_info() may fail
> >>> in which case resource->hwmon_dev will become an error pointer,
> >>> so add checks to avoid attempting to unregister the hwmon device
> >>> pointed to by it in that case to acpi_power_meter_notify() and
> >>> acpi_power_meter_remove().
> >>>
> >>> Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the
> >>> deprecated hwmon_device_register")
> >>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> >>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>> ---
> >>>   drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
> >>>   1 file changed, 14 insertions(+), 3 deletions(-)
> >>>
> >>> --- a/drivers/hwmon/acpi_power_meter.c
> >>> +++ b/drivers/hwmon/acpi_power_meter.c
> >>> @@ -47,6 +47,8 @@
> >>>   static int cap_in_hardware;
> >>>   static bool force_cap_on;
> >>> +static DEFINE_MUTEX(acpi_notify_lock);
> >>> +
> >>>   static int can_cap_in_hardware(void)
> >>>   {
> >>>       return force_cap_on || cap_in_hardware;
> >>> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
> >>>       resource = acpi_driver_data(device);
> >>> +    guard(mutex)(&acpi_notify_lock);
> >>> +
> >>>       switch (event) {
> >>>       case METER_NOTIFY_CONFIG:
> >>> +        if (!IS_ERR(resource->hwmon_dev))
> >>> +            hwmon_device_unregister(resource->hwmon_dev);
> >>> +
> >>>           mutex_lock(&resource->lock);
> >>> +
> >>>           free_capabilities(resource);
> >>>           remove_domain_devices(resource);
> >>> -        hwmon_device_unregister(resource->hwmon_dev);
> >>>           res = read_capabilities(resource);
> >>>           if (res)
> >>>               dev_err_once(&device->dev, "read capabilities failed.\n");
> >>>           res = read_domain_devices(resource);
> >>>           if (res && res != -ENODEV)
> >>>               dev_err_once(&device->dev, "read domain devices failed.\n");
> >>> +
> >>> +        mutex_unlock(&resource->lock);
> >>> +
> >>>           resource->hwmon_dev =
> >>>               hwmon_device_register_with_info(&device->dev,
> >>>                               ACPI_POWER_METER_NAME,
> >>> @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
> >>>                               power_extra_groups);
> >>>           if (IS_ERR(resource->hwmon_dev))
> >>>               dev_err_once(&device->dev, "register hwmon device failed.\n");
> >>> -        mutex_unlock(&resource->lock);
> >>> +
> >>>           break;
> >>>       case METER_NOTIFY_TRIP:
> >>>           sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
> >>> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
> >>>           return;
> >>>       resource = acpi_driver_data(device);
> >>> -    hwmon_device_unregister(resource->hwmon_dev);
> >>> +    if (!IS_ERR(resource->hwmon_dev))
> >>> +        hwmon_device_unregister(resource->hwmon_dev);
> >> !IS_ERR(resource->hwmon_dev) may be not enough. There might be UAF in
> >> concurrent case.
> >> How about do it like:
> >> if (!IS_ERR_OR_NULL(resource->hwmon_dev)) {
> >
> > Not sure what you mean with 'concurrent' case. There is a potential
> > race with the
> Apologize for your confusion. What I mean is as mentioned in your other
> mail.
> > notification code, as mentioned in my other mail. Question is if the
> > acpi subsystem
> > disables notification calls before the remove function is called, and
> > guarantees
> > that no notifications are pending. If so, I don't think there is a
> > problem.
> > Also, resource->hwmon_dev should never be NULL.
> >
> All right,
> the resource is already released and this notify should not be entered
> again if remove() is called first.
>
> But the .remove() of acpi_power_meter driver must clear the
> 'driver_data' in acpi_device.
> In this way, even if ACPI subsystem doesn't disable notification,

But the ACPI subsystem does disable the notification, so what's the problem?

> the notify also doesn't perform any action after calling .remove().

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-30 17:53                   ` Rafael J. Wysocki
@ 2026-01-31 10:06                     ` lihuisong (C)
  0 siblings, 0 replies; 60+ messages in thread
From: lihuisong (C) @ 2026-01-31 10:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Guenter Roeck, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica


On 1/31/2026 1:53 AM, Rafael J. Wysocki wrote:
> On Fri, Jan 30, 2026 at 9:40 AM lihuisong (C) <lihuisong@huawei.com> wrote:
>>
>> On 1/30/2026 1:07 PM, Guenter Roeck wrote:
>>> On 1/29/26 17:47, lihuisong (C) wrote:
>>>> On 1/29/2026 10:18 PM, Rafael J. Wysocki wrote:
>>>>> On Wednesday, January 28, 2026 10:52:15 PM CET Guenter Roeck wrote:
>>>>>> On Wed, Jan 28, 2026 at 08:52:49PM +0100, Rafael J. Wysocki wrote:
>>>>>>> On Wednesday, January 28, 2026 7:45:32 PM CET Rafael J. Wysocki
>>>>>>> wrote:
>>>>>>>> On Wed, Jan 28, 2026 at 7:18 PM Guenter Roeck
>>>>>>>> <linux@roeck-us.net> wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
>>>>>>>>>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>>>>>>>>>> <jaroslav.pulchart@gooddata.com> wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> after upgrading from kernel 6.17.y to 6.18.y we started to
>>>>>>>>>>> observe a regression
>>>>>>>>>>> in the ACPI power meter hwmon interface. Reading
>>>>>>>>>>> power*_average sysfs
>>>>>>>>>>> attributes blocks indefinitely and causes tasks to enter
>>>>>>>>>>> uninterruptible
>>>>>>>>>>> sleep (D state).
>>>>>>>>>> The most recent change in the acpi_power_meter driver was made in
>>>>>>>>>> 6.15, so this is not a regression in that driver.
>>>>>>>>>>
>>>>>>>>>> Also, nothing suspicious is done in power1_average_min_show() and
>>>>>>>>>> power1_average_min_store() AFAICS.
>>>>>>>>>>
>>>>>>>>> I decided to ask AI (Gemini 3, more specifically) for an
>>>>>>>>> analysis of the
>>>>>>>>> problem, using Chris Mason's prompts for guidance. Result is below.
>>>>>>>> So it agrees with me in the analysis part.
>>>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>>>> The suggested fix may be a feasible workaround. Let me know what
>>>>>>>>> you think.
>>>>>>>> Well, I'm afraid it won't work if two METER_NOTIFY_CONFIG
>>>>>>>> notifications compete with each other because they may try to
>>>>>>>> unregister the hwmon device at the same time.
>>>>>>>>
>>>>>> Good point.
>>>>>>
>>>>>>>> I would just add a separate lock for the notifier (a static one
>>>>>>>> should
>>>>>>>> suffice) and make changes to "resource" only under resource->lock.
>>>>>>>>
>>>>>>>> Let me cut a prototype patch for this.
>>>>>>>>
>>>>>>> Something like the below (untested).
>>>>>>>
>>>>>>> Note that it also fixes the driver removal which is tangentially
>>>>>>> related to the
>>>>>>> problem at hand.
>>>>>> I can't test it either, but I ran it through Gemini and it tells me:
>>>>>>
>>>>>>     Fixes: tag missing (y) [Fixes: 16746ce8adfe ("hwmon:
>>>>>> (acpi_power_meter) Replace the deprecated hwmon_device_register")]
>>>>>>
>>>>>>     The commit addresses a deadlock regression introduced by the
>>>>>> conversion to
>>>>>>     hwmon_device_register_with_info.
>>>>>>
>>>>>>     CHANGE-1: New static mutex `acpi_notify_lock` introduces global
>>>>>> serialization.
>>>>>>     This prevents concurrent notification handling for multiple
>>>>>> power meter devices.
>>>>>>     While likely a minor impact given the nature of the device, it
>>>>>> technically reduces parallelism compared to the per-device locking
>>>>>> used previously.
>>>>>>
>>>>>> Just for fun I tried again, this time using a model which is more
>>>>>> prone to
>>>>>> false positives. It pretty much provided the same result in more
>>>>>> detail.
>>>>>> It is a bit more verbose, so I attached it below. I also asked it
>>>>>> to verify,
>>>>>> using the backtrace, if the patch is complete, and to suggest a patch
>>>>>> description. The result is also attached below.
>>>>>>
>>>>>> I think this is good enough for a formal patch. WDYT ?
>>>>> With all due respect to the AI, I think that it has missed a couple
>>>>> of things,
>>>>> so below is the patch with my version of the changelog.
>>>>>
>>>>> Also, it really wants to be two patches IMV, one adding the
>>>>> IS_ERR(resource->hwmon_dev) checks before hwmon device
>>>>> unregistration and
>>>>> another one fixing the deadlock in question on top of it. Please let
>>>>> me know
>>>>> if you want me to split this one.
>>>>>
>>>>> Jaroslav, it would be nice to get some feedback on it from you as
>>>>> you seem to
>>>>> be the only person here who can test it.
>>>>>
>>>>> ---
>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>> Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related
>>>>> to acpi_power_meter_notify()
>>>>>
>>>>> The acpi_power_meter driver's .notify() callback function,
>>>>> acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
>>>>> that is also acquired by callbacks in sysfs attributes of the device
>>>>> being unregistered which is prone to deadlocks between sysfs access and
>>>>> device removal.
>>>>>
>>>>> Address this by moving the hwmon device removal in
>>>>> acpi_power_meter_notify() outside the lock in question, but notice
>>>>> that doing it alone is not sufficient because two concurrent
>>>>> METER_NOTIFY_CONFIG notifications may be attempting to remove the
>>>>> same device at the same time.  To prevent that from happening, add a
>>>>> new lock serializing the execution of the switch () statement in
>>>>> acpi_power_meter_notify().  For simplicity, it is a static mutex
>>>>> which should not be a problem from the performance perspective.
>>>>>
>>>>> The new lock also allows the hwmon_device_register_with_info()
>>>>> in acpi_power_meter_notify() to be called outside the inner lock
>>>>> because it prevents the other notifications handled by that function
>>>>> from manipulating the "resource" object while the hwmon device based
>>>>> on it is being registered.  The sending of ACPI netlink messages from
>>>>> acpi_power_meter_notify() is serialized by the new lock too which
>>>>> generally helps to ensure that the order of handling firmware
>>>>> notifications is the same as the order of sending netlink messages
>>>>> related to them.
>>>>>
>>>>> In addition, notice that hwmon_device_register_with_info() may fail
>>>>> in which case resource->hwmon_dev will become an error pointer,
>>>>> so add checks to avoid attempting to unregister the hwmon device
>>>>> pointed to by it in that case to acpi_power_meter_notify() and
>>>>> acpi_power_meter_remove().
>>>>>
>>>>> Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the
>>>>> deprecated hwmon_device_register")
>>>>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>> ---
>>>>>    drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
>>>>>    1 file changed, 14 insertions(+), 3 deletions(-)
>>>>>
>>>>> --- a/drivers/hwmon/acpi_power_meter.c
>>>>> +++ b/drivers/hwmon/acpi_power_meter.c
>>>>> @@ -47,6 +47,8 @@
>>>>>    static int cap_in_hardware;
>>>>>    static bool force_cap_on;
>>>>> +static DEFINE_MUTEX(acpi_notify_lock);
>>>>> +
>>>>>    static int can_cap_in_hardware(void)
>>>>>    {
>>>>>        return force_cap_on || cap_in_hardware;
>>>>> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
>>>>>        resource = acpi_driver_data(device);
>>>>> +    guard(mutex)(&acpi_notify_lock);
>>>>> +
>>>>>        switch (event) {
>>>>>        case METER_NOTIFY_CONFIG:
>>>>> +        if (!IS_ERR(resource->hwmon_dev))
>>>>> +            hwmon_device_unregister(resource->hwmon_dev);
>>>>> +
>>>>>            mutex_lock(&resource->lock);
>>>>> +
>>>>>            free_capabilities(resource);
>>>>>            remove_domain_devices(resource);
>>>>> -        hwmon_device_unregister(resource->hwmon_dev);
>>>>>            res = read_capabilities(resource);
>>>>>            if (res)
>>>>>                dev_err_once(&device->dev, "read capabilities failed.\n");
>>>>>            res = read_domain_devices(resource);
>>>>>            if (res && res != -ENODEV)
>>>>>                dev_err_once(&device->dev, "read domain devices failed.\n");
>>>>> +
>>>>> +        mutex_unlock(&resource->lock);
>>>>> +
>>>>>            resource->hwmon_dev =
>>>>>                hwmon_device_register_with_info(&device->dev,
>>>>>                                ACPI_POWER_METER_NAME,
>>>>> @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
>>>>>                                power_extra_groups);
>>>>>            if (IS_ERR(resource->hwmon_dev))
>>>>>                dev_err_once(&device->dev, "register hwmon device failed.\n");
>>>>> -        mutex_unlock(&resource->lock);
>>>>> +
>>>>>            break;
>>>>>        case METER_NOTIFY_TRIP:
>>>>>            sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
>>>>> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
>>>>>            return;
>>>>>        resource = acpi_driver_data(device);
>>>>> -    hwmon_device_unregister(resource->hwmon_dev);
>>>>> +    if (!IS_ERR(resource->hwmon_dev))
>>>>> +        hwmon_device_unregister(resource->hwmon_dev);
>>>> !IS_ERR(resource->hwmon_dev) may be not enough. There might be UAF in
>>>> concurrent case.
>>>> How about do it like:
>>>> if (!IS_ERR_OR_NULL(resource->hwmon_dev)) {
>>> Not sure what you mean with 'concurrent' case. There is a potential
>>> race with the
>> Apologize for your confusion. What I mean is as mentioned in your other
>> mail.
>>> notification code, as mentioned in my other mail. Question is if the
>>> acpi subsystem
>>> disables notification calls before the remove function is called, and
>>> guarantees
>>> that no notifications are pending. If so, I don't think there is a
>>> problem.
>>> Also, resource->hwmon_dev should never be NULL.
>>>
>> All right,
>> the resource is already released and this notify should not be entered
>> again if remove() is called first.
>>
>> But the .remove() of acpi_power_meter driver must clear the
>> 'driver_data' in acpi_device.
>> In this way, even if ACPI subsystem doesn't disable notification,
> But the ACPI subsystem does disable the notification, so what's the problem?
OK, so that's not a problem.
>
>> the notify also doesn't perform any action after calling .remove().
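
For reference, the difference between the IS_ERR() and IS_ERR_OR_NULL()
checks discussed above can be illustrated with a small userspace sketch.
The sketch_* names are hypothetical stand-ins for the helpers in
<linux/err.h>; this is not the kernel implementation itself, only a model
of the convention that the top 4095 address values encode negative errno
codes. A failed hwmon_device_register_with_info() returns such an error
pointer, never NULL, which is why the patch checks !IS_ERR() only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's MAX_ERRNO; the name is illustrative. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (models ERR_PTR()). */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the error range (models IS_ERR()). */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* True for NULL as well as error pointers (models IS_ERR_OR_NULL()). */
static inline bool sketch_is_err_or_null(const void *ptr)
{
	return !ptr || sketch_is_err(ptr);
}
```

With these helpers, NULL is not an error pointer for sketch_is_err(), so
a NULL check is only needed if some code path can actually store NULL in
the pointer being tested.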

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-29 14:18           ` [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify() Rafael J. Wysocki
  2026-01-30  1:07             ` Guenter Roeck
  2026-01-30  1:47             ` lihuisong (C)
@ 2026-01-31 15:29             ` Guenter Roeck
  2026-02-01 11:40               ` Rafael J. Wysocki
  2 siblings, 1 reply; 60+ messages in thread
From: Guenter Roeck @ 2026-01-31 15:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jaroslav Pulchart, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

Hi all,

On Thu, Jan 29, 2026 at 03:18:11PM +0100, Rafael J. Wysocki wrote:

I decided to apply the patch below as-is. While we don't have confirmation
that it fixes the immediate problem, it is a starting point and won't make
the situation worse. A complete fix which does not require re-registering
the hardware monitoring device will require more work and will not be
suitable for LTS backports.

Thanks,
Guenter

> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
> 
> The acpi_power_meter driver's .notify() callback function,
> acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
> that is also acquired by callbacks in sysfs attributes of the device
> being unregistered which is prone to deadlocks between sysfs access and
> device removal.
> 
> Address this by moving the hwmon device removal in
> acpi_power_meter_notify() outside the lock in question, but notice
> that doing it alone is not sufficient because two concurrent
> METER_NOTIFY_CONFIG notifications may be attempting to remove the
> same device at the same time.  To prevent that from happening, add a
> new lock serializing the execution of the switch () statement in
> acpi_power_meter_notify().  For simplicity, it is a static mutex
> which should not be a problem from the performance perspective.
> 
> The new lock also allows the hwmon_device_register_with_info()
> in acpi_power_meter_notify() to be called outside the inner lock
> because it prevents the other notifications handled by that function
> from manipulating the "resource" object while the hwmon device based
> on it is being registered.  The sending of ACPI netlink messages from
> acpi_power_meter_notify() is serialized by the new lock too which
> generally helps to ensure that the order of handling firmware
> notifications is the same as the order of sending netlink messages
> related to them.
> 
> In addition, notice that hwmon_device_register_with_info() may fail
> in which case resource->hwmon_dev will become an error pointer,
> so add checks to avoid attempting to unregister the hwmon device
> pointed to by it in that case to acpi_power_meter_notify() and
> acpi_power_meter_remove().
> 
> Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> --- a/drivers/hwmon/acpi_power_meter.c
> +++ b/drivers/hwmon/acpi_power_meter.c
> @@ -47,6 +47,8 @@
>  static int cap_in_hardware;
>  static bool force_cap_on;
>  
> +static DEFINE_MUTEX(acpi_notify_lock);
> +
>  static int can_cap_in_hardware(void)
>  {
>  	return force_cap_on || cap_in_hardware;
> @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
>  
>  	resource = acpi_driver_data(device);
>  
> +	guard(mutex)(&acpi_notify_lock);
> +
>  	switch (event) {
>  	case METER_NOTIFY_CONFIG:
> +		if (!IS_ERR(resource->hwmon_dev))
> +			hwmon_device_unregister(resource->hwmon_dev);
> +
>  		mutex_lock(&resource->lock);
> +
>  		free_capabilities(resource);
>  		remove_domain_devices(resource);
> -		hwmon_device_unregister(resource->hwmon_dev);
>  		res = read_capabilities(resource);
>  		if (res)
>  			dev_err_once(&device->dev, "read capabilities failed.\n");
>  		res = read_domain_devices(resource);
>  		if (res && res != -ENODEV)
>  			dev_err_once(&device->dev, "read domain devices failed.\n");
> +
> +		mutex_unlock(&resource->lock);
> +
>  		resource->hwmon_dev =
>  			hwmon_device_register_with_info(&device->dev,
>  							ACPI_POWER_METER_NAME,
> @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
>  							power_extra_groups);
>  		if (IS_ERR(resource->hwmon_dev))
>  			dev_err_once(&device->dev, "register hwmon device failed.\n");
> -		mutex_unlock(&resource->lock);
> +
>  		break;
>  	case METER_NOTIFY_TRIP:
>  		sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
> @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
>  		return;
>  
>  	resource = acpi_driver_data(device);
> -	hwmon_device_unregister(resource->hwmon_dev);
> +	if (!IS_ERR(resource->hwmon_dev))
> +		hwmon_device_unregister(resource->hwmon_dev);
>  
>  	remove_domain_devices(resource);
>  	free_capabilities(resource);
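
The lock ordering described in the changelog above can be modelled in
userspace roughly as follows. This is a simplified sketch under stated
assumptions, not the driver code: all sketch_* names are hypothetical, a
reader count plus a condition variable stands in for the way device
removal waits for in-flight sysfs callbacks, and pthread mutexes stand in
for the kernel mutexes. The point it shows is that "unregister" runs
before the inner lock is taken, while an outer lock serializes whole
notifications.

```c
#include <pthread.h>
#include <stdbool.h>

struct sketch_resource {
	pthread_mutex_t lock;       /* stands in for resource->lock */
	pthread_mutex_t state_lock; /* protects registered and active */
	pthread_cond_t drained;     /* signalled when active drops to 0 */
	bool registered;
	int active;                 /* attribute reads currently in flight */
	long power_average;
};

/* Stands in for the static acpi_notify_lock added by the patch. */
static pthread_mutex_t sketch_notify_lock = PTHREAD_MUTEX_INITIALIZER;

static void sketch_init(struct sketch_resource *r, long average)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_mutex_init(&r->state_lock, NULL);
	pthread_cond_init(&r->drained, NULL);
	r->registered = true;
	r->active = 0;
	r->power_average = average;
}

/* Attribute read: takes the inner lock while counted as in flight. */
static bool sketch_attr_show(struct sketch_resource *r, long *val)
{
	pthread_mutex_lock(&r->state_lock);
	if (!r->registered) {
		pthread_mutex_unlock(&r->state_lock);
		return false;
	}
	r->active++;
	pthread_mutex_unlock(&r->state_lock);

	pthread_mutex_lock(&r->lock);
	*val = r->power_average;
	pthread_mutex_unlock(&r->lock);

	pthread_mutex_lock(&r->state_lock);
	if (--r->active == 0)
		pthread_cond_broadcast(&r->drained);
	pthread_mutex_unlock(&r->state_lock);
	return true;
}

/* Removal: blocks new reads, then waits for in-flight reads to finish. */
static void sketch_unregister(struct sketch_resource *r)
{
	pthread_mutex_lock(&r->state_lock);
	r->registered = false;
	while (r->active > 0)
		pthread_cond_wait(&r->drained, &r->state_lock);
	pthread_mutex_unlock(&r->state_lock);
}

static void sketch_register(struct sketch_resource *r)
{
	pthread_mutex_lock(&r->state_lock);
	r->registered = true;
	pthread_mutex_unlock(&r->state_lock);
}

/* Config notification, ordered the way the patch orders it. */
static void sketch_notify_config(struct sketch_resource *r, long average)
{
	pthread_mutex_lock(&sketch_notify_lock);

	/* Unregister first, without holding r->lock, so readers can drain. */
	sketch_unregister(r);

	pthread_mutex_lock(&r->lock);
	r->power_average = average;
	pthread_mutex_unlock(&r->lock);

	sketch_register(r);
	pthread_mutex_unlock(&sketch_notify_lock);
}
```

If sketch_unregister() were instead called with r->lock held, a reader
already counted in "active" and blocked on r->lock could never finish,
and the wait for "active" to reach zero would never return. That is the
shape of the deadlock the changelog describes.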

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  2026-01-31 15:29             ` Guenter Roeck
@ 2026-02-01 11:40               ` Rafael J. Wysocki
  0 siblings, 0 replies; 60+ messages in thread
From: Rafael J. Wysocki @ 2026-02-01 11:40 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Jaroslav Pulchart, linux-acpi, linux-hwmon,
	Igor Raits, Daniel Secik, Zdenek Pesek, Jiri Jurica, Huisong Li

On Sat, Jan 31, 2026 at 4:29 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Hi all,
>
> On Thu, Jan 29, 2026 at 03:18:11PM +0100, Rafael J. Wysocki wrote:
>
> I decided to apply the patch below as-is. While we don't have confirmation
> that it fixes the immediate problem, it is a starting point and won't make
> the situation worse. A complete fix which does not require re-registering
> the hardware monitoring device will require more work and will not be
> suitable for LTS backports.

Sounds good to me, thank you!

> > ---
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
> >
> > The acpi_power_meter driver's .notify() callback function,
> > acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
> > that is also acquired by callbacks in sysfs attributes of the device
> > being unregistered which is prone to deadlocks between sysfs access and
> > device removal.
> >
> > Address this by moving the hwmon device removal in
> > acpi_power_meter_notify() outside the lock in question, but notice
> > that doing it alone is not sufficient because two concurrent
> > METER_NOTIFY_CONFIG notifications may be attempting to remove the
> > same device at the same time.  To prevent that from happening, add a
> > new lock serializing the execution of the switch () statement in
> > acpi_power_meter_notify().  For simplicity, it is a static mutex
> > which should not be a problem from the performance perspective.
> >
> > The new lock also allows the hwmon_device_register_with_info()
> > in acpi_power_meter_notify() to be called outside the inner lock
> > because it prevents the other notifications handled by that function
> > from manipulating the "resource" object while the hwmon device based
> > on it is being registered.  The sending of ACPI netlink messages from
> > acpi_power_meter_notify() is serialized by the new lock too which
> > generally helps to ensure that the order of handling firmware
> > notifications is the same as the order of sending netlink messages
> > related to them.
> >
> > In addition, notice that hwmon_device_register_with_info() may fail
> > in which case resource->hwmon_dev will become an error pointer,
> > so add checks to avoid attempting to unregister the hwmon device
> > pointed to by it in that case to acpi_power_meter_notify() and
> > acpi_power_meter_remove().
> >
> > Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
> > Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/hwmon/acpi_power_meter.c |   17 ++++++++++++++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > --- a/drivers/hwmon/acpi_power_meter.c
> > +++ b/drivers/hwmon/acpi_power_meter.c
> > @@ -47,6 +47,8 @@
> >  static int cap_in_hardware;
> >  static bool force_cap_on;
> >
> > +static DEFINE_MUTEX(acpi_notify_lock);
> > +
> >  static int can_cap_in_hardware(void)
> >  {
> >       return force_cap_on || cap_in_hardware;
> > @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(stru
> >
> >       resource = acpi_driver_data(device);
> >
> > +     guard(mutex)(&acpi_notify_lock);
> > +
> >       switch (event) {
> >       case METER_NOTIFY_CONFIG:
> > +             if (!IS_ERR(resource->hwmon_dev))
> > +                     hwmon_device_unregister(resource->hwmon_dev);
> > +
> >               mutex_lock(&resource->lock);
> > +
> >               free_capabilities(resource);
> >               remove_domain_devices(resource);
> > -             hwmon_device_unregister(resource->hwmon_dev);
> >               res = read_capabilities(resource);
> >               if (res)
> >                       dev_err_once(&device->dev, "read capabilities failed.\n");
> >               res = read_domain_devices(resource);
> >               if (res && res != -ENODEV)
> >                       dev_err_once(&device->dev, "read domain devices failed.\n");
> > +
> > +             mutex_unlock(&resource->lock);
> > +
> >               resource->hwmon_dev =
> >                       hwmon_device_register_with_info(&device->dev,
> >                                                       ACPI_POWER_METER_NAME,
> > @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(stru
> >                                                       power_extra_groups);
> >               if (IS_ERR(resource->hwmon_dev))
> >                       dev_err_once(&device->dev, "register hwmon device failed.\n");
> > -             mutex_unlock(&resource->lock);
> > +
> >               break;
> >       case METER_NOTIFY_TRIP:
> >               sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
> > @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(stru
> >               return;
> >
> >       resource = acpi_driver_data(device);
> > -     hwmon_device_unregister(resource->hwmon_dev);
> > +     if (!IS_ERR(resource->hwmon_dev))
> > +             hwmon_device_unregister(resource->hwmon_dev);
> >
> >       remove_domain_devices(resource);
> >       free_capabilities(resource);

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y
  2026-01-28 18:18   ` Guenter Roeck
  2026-01-28 18:45     ` Rafael J. Wysocki
@ 2026-01-29  1:55     ` lihuisong (C)
  1 sibling, 0 replies; 60+ messages in thread
From: lihuisong (C) @ 2026-01-29  1:55 UTC (permalink / raw)
  To: Guenter Roeck, Rafael J. Wysocki
  Cc: Jaroslav Pulchart, linux-acpi, linux-hwmon, Igor Raits,
	Daniel Secik, Zdenek Pesek, Jiri Jurica


On 1/29/2026 2:18 AM, Guenter Roeck wrote:
> Hi all,
>
> On Thu, Jan 22, 2026 at 07:55:35PM +0100, Rafael J. Wysocki wrote:
>> On Thu, Jan 22, 2026 at 7:21 PM Jaroslav Pulchart
>> <jaroslav.pulchart@gooddata.com> wrote:
>>> Hello,
>>>
>>> after upgrading from kernel 6.17.y to 6.18.y we started to observe a regression
>>> in the ACPI power meter hwmon interface. Reading power*_average sysfs
>>> attributes blocks indefinitely and causes tasks to enter uninterruptible
>>> sleep (D state).
>> The most recent change in the acpi_power_meter driver was made in
>> 6.15, so this is not a regression in that driver.
>>
>> Also, nothing suspicious is done in power1_average_min_show() and
>> power1_average_min_store() AFAICS.
>>
> I decided to ask AI (Gemini 3, more specifically) for an analysis of the
> problem, using Chris Mason's prompts for guidance. Result is below.
>
> The suggested fix may be a feasible workaround. Let me know what you think.
>
> Thanks,
> Guenter
>
> ---
> Summary of crash or warning
> Deadlock in acpi_power_meter driver between sysfs read and ACPI notify.
>
> Kernel version if available
> 6.18.6-1.gdc.el9.x86_64
>
> Cleaned up copy of oops or stack trace
> [  740.721209] Call Trace:
> [  741.467334]  power_meter_read+0x1ea/0x2c0 [acpi_power_meter]
> [  741.475320]  hwmon_attr_show+0x5e/0x130
> [  741.481343]  dev_attr_show+0x19/0x60
> [  741.487325]  sysfs_kf_seq_show+0xbf/0x140
> [  741.494315]  seq_read_iter+0x112/0x510
> [  741.503544]  vfs_read+0x215/0x340
>
> [  740.286380] INFO: task alloy:5901 blocked for more than 122 seconds.
> [  740.326290] Call Trace:
> [  740.336311]  __schedule+0x2b5/0x690
> [  740.341312]  schedule+0x23/0x80
> [  740.354335]  __mutex_lock.constprop.0+0x3c9/0xa00
> [  740.360318]  hwmon_attr_show+0x36/0x130
>
> Explanation of the problem
> The acpi_power_meter driver exhibits a deadlock when acpi_power_meter_notify
> is called with METER_NOTIFY_CONFIG while a userspace process is concurrently
> reading a sysfs attribute (e.g., power1_average).
>
> The deadlock occurs because acpi_power_meter_notify acquires resource->lock
> and then calls hwmon_device_unregister. The hwmon_device_unregister function
> (via device_unregister and kernfs_remove) waits for all active sysfs callbacks
> to complete. However, the active sysfs callback (power_meter_read) tries to
> acquire resource->lock which is already held by acpi_power_meter_notify.
>
> Sequence of events:
> 1. Userspace reads power1_average.
> 2. hwmon_attr_show is called. It calls power_meter_read.
> 3. power_meter_read attempts to acquire resource->lock.
> 4. Concurrently, ACPI triggers acpi_power_meter_notify with METER_NOTIFY_CONFIG.
> 5. acpi_power_meter_notify acquires resource->lock.
> 6. acpi_power_meter_notify calls hwmon_device_unregister.
> 7. hwmon_device_unregister waits for the active hwmon_attr_show (and thus
>     power_meter_read) to return.
> 8. power_meter_read is blocked waiting for resource->lock.
> 9. acpi_power_meter_notify is blocked waiting for power_meter_read to finish.
Great analysis. It fully explains this issue.
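
The lock ordering described in steps 5 to 9 can be modelled in user space.
What follows is only a minimal sketch under stated assumptions, not the
driver code: pthreads stand in for the kernel primitives, and struct meter,
reader(), wait_for_readers() and run_fixed_ordering() are hypothetical names
made up for illustration. The counter active_readers plays the role of the
active sysfs callbacks that hwmon_device_unregister() waits for. The sketch
forces the race (a reader is already active while the notify side holds the
lock) and then uses the ordering that avoids the hang: drop the lock first,
wait for readers second.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for the driver state; not kernel code. */
struct meter {
	pthread_mutex_t lock;      /* plays the role of resource->lock */
	pthread_mutex_t ref_lock;  /* protects active_readers */
	pthread_cond_t  ref_cond;  /* signalled whenever active_readers changes */
	int active_readers;        /* plays the role of active sysfs callbacks */
	int value;                 /* data protected by lock */
	int last_read;             /* what the reader observed */
};

/* Models hwmon_attr_show() calling power_meter_read(). */
static void *reader(void *arg)
{
	struct meter *m = arg;

	pthread_mutex_lock(&m->ref_lock);
	m->active_readers++;                  /* callback becomes active */
	pthread_cond_broadcast(&m->ref_cond);
	pthread_mutex_unlock(&m->ref_lock);

	pthread_mutex_lock(&m->lock);         /* blocks while notify holds it */
	m->last_read = m->value;
	pthread_mutex_unlock(&m->lock);

	pthread_mutex_lock(&m->ref_lock);
	m->active_readers--;                  /* callback finished */
	pthread_cond_broadcast(&m->ref_cond);
	pthread_mutex_unlock(&m->ref_lock);
	return NULL;
}

/* Models the wait for active callbacks done during unregistration. */
static void wait_for_readers(struct meter *m)
{
	pthread_mutex_lock(&m->ref_lock);
	while (m->active_readers > 0)
		pthread_cond_wait(&m->ref_cond, &m->ref_lock);
	pthread_mutex_unlock(&m->ref_lock);
}

/* Returns the value the reader saw. Swapping the last two steps of the
 * notify side (wait first, unlock second) would hang forever. */
static int run_fixed_ordering(void)
{
	struct meter m = { .value = 0, .last_read = -1 };
	pthread_t t;

	pthread_mutex_init(&m.lock, NULL);
	pthread_mutex_init(&m.ref_lock, NULL);
	pthread_cond_init(&m.ref_cond, NULL);

	pthread_mutex_lock(&m.lock);          /* notify side takes the lock */
	pthread_create(&t, NULL, reader, &m);

	/* Force the race: wait until the reader is active and will block. */
	pthread_mutex_lock(&m.ref_lock);
	while (m.active_readers == 0)
		pthread_cond_wait(&m.ref_cond, &m.ref_lock);
	pthread_mutex_unlock(&m.ref_lock);

	m.value = 42;                         /* refresh state under the lock */
	pthread_mutex_unlock(&m.lock);        /* fixed ordering: unlock first */
	wait_for_readers(&m);                 /* then wait for callbacks */

	pthread_join(t, NULL);
	assert(m.active_readers == 0);        /* every callback has drained */

	pthread_cond_destroy(&m.ref_cond);
	pthread_mutex_destroy(&m.ref_lock);
	pthread_mutex_destroy(&m.lock);
	return m.last_read;
}
```

Calling run_fixed_ordering() from a small main() and building with
-pthread is enough to try it. The point of the sketch is only the ordering:
the side that waits for readers must not hold a lock those readers need.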


end of thread, other threads:[~2026-02-13 12:47 UTC | newest]

Thread overview: 60+ messages
2026-01-22 18:21 acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y Jaroslav Pulchart
2026-01-22 18:55 ` Rafael J. Wysocki
2026-01-22 19:51   ` Guenter Roeck
2026-01-22 20:42     ` Rafael J. Wysocki
2026-01-22 22:28       ` Guenter Roeck
2026-01-23 10:19         ` Jaroslav Pulchart
2026-01-23 18:23           ` Guenter Roeck
2026-01-27 11:58             ` Jaroslav Pulchart
2026-01-27 14:24               ` Guenter Roeck
2026-01-29 15:26                 ` Jaroslav Pulchart
2026-01-29 18:22                   ` Guenter Roeck
2026-02-02 16:48                     ` Jaroslav Pulchart
2026-02-02 18:00                       ` Guenter Roeck
2026-02-02 18:14                         ` Jaroslav Pulchart
2026-02-02 19:26                           ` Rafael J. Wysocki
2026-02-03  0:26                             ` Guenter Roeck
2026-02-03  8:23                             ` Jaroslav Pulchart
2026-02-03 23:21                               ` [BISECTED]: " Jaroslav Pulchart
2026-02-04  1:01                                 ` Guenter Roeck
2026-02-04  8:20                                   ` Jaroslav Pulchart
2026-02-04 19:54                                   ` Igor Raits
2026-02-04 22:48                                     ` Guenter Roeck
2026-02-05  7:25                                       ` Igor Raits
2026-02-05 17:51                                         ` Guenter Roeck
2026-02-05 19:04                                           ` Rafael J. Wysocki
2026-02-05 20:57                                             ` [BISECTED - impi related]: " Guenter Roeck
2026-02-06 12:08                                               ` Rafael J. Wysocki
2026-02-06 15:58                                                 ` Corey Minyard
2026-02-06 19:33                                                   ` Rafael J. Wysocki
2026-02-10 16:31                                                   ` Rafael J. Wysocki
2026-02-12  9:10                                                     ` Jaroslav Pulchart
2026-02-12 12:27                                                       ` Rafael J. Wysocki
2026-02-12 16:48                                                         ` Corey Minyard
2026-02-12 17:22                                                           ` Rafael J. Wysocki
2026-02-12 18:34                                                             ` Corey Minyard
2026-02-12 21:33                                                               ` Rafael J. Wysocki
2026-02-12 22:06                                                                 ` Corey Minyard
2026-02-13  6:55                                                                 ` Corey Minyard
2026-02-13 12:47                                                                   ` Rafael J. Wysocki
2026-02-06 16:08                                                 ` Corey Minyard
2026-02-06 16:31                                                   ` Guenter Roeck
2026-02-06 19:35                                                     ` Rafael J. Wysocki
2026-01-23  2:53       ` lihuisong (C)
2026-01-27 16:13         ` Guenter Roeck
2026-01-28 18:18   ` Guenter Roeck
2026-01-28 18:45     ` Rafael J. Wysocki
2026-01-28 19:52       ` Rafael J. Wysocki
2026-01-28 21:52         ` Guenter Roeck
2026-01-29 14:18           ` [PATCH v1] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify() Rafael J. Wysocki
2026-01-30  1:07             ` Guenter Roeck
2026-01-30 17:51               ` Rafael J. Wysocki
2026-01-30 18:51                 ` Guenter Roeck
2026-01-30  1:47             ` lihuisong (C)
2026-01-30  5:07               ` Guenter Roeck
2026-01-30  8:40                 ` lihuisong (C)
2026-01-30 17:53                   ` Rafael J. Wysocki
2026-01-31 10:06                     ` lihuisong (C)
2026-01-31 15:29             ` Guenter Roeck
2026-02-01 11:40               ` Rafael J. Wysocki
2026-01-29  1:55     ` acpi_power_meter: power*_average sysfs read hangs, mutex deadlock in hwmon_attr_show since v6.18.y lihuisong (C)
