From: Matt Fleming <matt@readmodwrite.com>
To: Corey Minyard <corey@minyard.net>
Cc: Tony Camuso <tcamuso@redhat.com>,
openipmi-developer@lists.sourceforge.net,
linux-kernel@vger.kernel.org, kernel-team@cloudflare.com,
Matt Fleming <mfleming@cloudflare.com>
Subject: Re: [PATCH] ipmi: Add timeout to unconditional wait in __get_device_id()
Date: Fri, 17 Apr 2026 16:41:10 +0100
Message-ID: <aeJNOc0YgQ4GIyLK@matt-Precision-5490>
In-Reply-To: <ad-BtS5b3qiowqb7@mail.minyard.net>

On Wed, Apr 15, 2026 at 07:16:53AM -0500, Corey Minyard wrote:
>
> I've seen this before in several scenarios, including a system that put
> IPMI in the ACPI tree and it sort of worked but there was no BMC
> present. I had to disable that particular device.
>
> What hardware is involved here?

I'm fairly sure we've seen this across a bunch of different BMCs, so
it's not a BMC hardware thing. Almost certainly a driver issue.

> Can you give a more detailed example of what's happening in the
> low-level hardware? If it's KCS there's a debug flag in the
> drivers/char/ipmi/ipmi_kcs_sm.c file that should help.

Yep, it's KCS. Unfortunately I haven't found a way to reproduce this
reliably yet.

Looking at a wedged machine (cat /sys/class/ipmi/.../firmware_revision)
with drgn I can see that there are 99,846 messages sitting on
intf->xmit_msgs and the KCS SM is idle (it's responding to internal
traffic like Get Global Enables and Get Msg Flags). So it looks like
completions are getting dropped.

We're running a 6.18.18 kernel which includes c08ec55617cb ("ipmi: Fix
use-after-free and list corruption on sender error"), so it's not that.

Here's a dump of some of the data structures.

intf = 0xffff9d2f4a5a0000
intf->curr_msg = 0xffff9d34f21a9c00
intf->xmit_msgs.next = 0xffff9d30c28e3c80
intf->waiting_rcv_msgs = empty
intf->maintenance_mode = 0
intf->maintenance_mode_state = 0
intf->in_shutdown = false
intf->seq_table = 0/64 slots used
intf->smi_work.pending = 0

The stuck message itself, intf->curr_msg:

msg @ 0xffff9d34f21a9c00
.data = { 0x18, 0x01 } # NetFn 0x06 (App), cmd 0x01 = Get Device ID
.data_size = 2
.rsp_size = 38
.rsp[0..7] = 2c 01 00 00 ...
.done = free_smi_msg
.user_data = NULL
.msgid = (internal GDI poll)
.type = IPMI_SMI_MSG_TYPE_NORMAL

smi_info = 0xffff9d2f4a010000
smi_info->si_state = SI_NORMAL (0)
smi_info->curr_msg = 0xffff9d2f48c7b800
smi_info->waiting_msg = NULL
smi_info->interrupt_disabled = false
smi_info->supports_event_msg_buff = true
smi_info->io.irq = 0
smi_info->run_to_completion = false
smi_info->in_maintenance_mode = 0
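
For anyone reading the .data bytes in the curr_msg dump above: the first
byte of an IPMI message on the system interface packs the NetFn into the
upper six bits and the LUN into the lower two, and the second byte is the
command. A minimal Python sketch of that decoding follows. The helper name
decode_netfn_lun is made up for illustration and is not part of drgn or
the kernel.

```python
def decode_netfn_lun(byte0):
    """Split the first byte of an IPMI message into (NetFn, LUN).

    Hypothetical helper for illustration only: NetFn is the upper six
    bits of the byte, LUN is the lower two bits.
    """
    return byte0 >> 2, byte0 & 0x03


# Bytes copied from the .data field of the dumped message.
data = [0x18, 0x01]

netfn, lun = decode_netfn_lun(data[0])
cmd = data[1]

# NetFn 0x06 is the App network function; cmd 0x01 is Get Device ID.
print(f"NetFn 0x{netfn:02x}, LUN {lun}, cmd 0x{cmd:02x}")
# prints: NetFn 0x06, LUN 0, cmd 0x01
```

The same helper can be applied to the first byte of .rsp when comparing a
response against the request it is supposed to answer.
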
Let me know if you want any other info. I'll try to trace this and
catch it reproducing.