From: Matt Fleming <matt@readmodwrite.com>
To: Corey Minyard <corey@minyard.net>
Cc: Tony Camuso <tcamuso@redhat.com>,
	 openipmi-developer@lists.sourceforge.net,
	linux-kernel@vger.kernel.org, kernel-team@cloudflare.com,
	 Matt Fleming <mfleming@cloudflare.com>
Subject: Re: [PATCH] ipmi: Add timeout to unconditional wait in __get_device_id()
Date: Fri, 17 Apr 2026 16:41:10 +0100
Message-ID: <aeJNOc0YgQ4GIyLK@matt-Precision-5490>
In-Reply-To: <ad-BtS5b3qiowqb7@mail.minyard.net>

On Wed, Apr 15, 2026 at 07:16:53AM -0500, Corey Minyard wrote:
> 
> I've seen this before in several scenarios, including a system that put
> IPMI in the ACPI tree and it sort of worked but there was no BMC
> present.  I had to disable that particular device.
> 
> What hardware is involved here?
 
I'm fairly sure we've seen this across a bunch of different BMCs, so
it's not a BMC hardware thing. Almost certainly a driver issue.

> Can you give a more detailed example of what's happening in the
> low-level hardware?  If it's KCS there's a debug flag in the
> drivers/char/ipmi/ipmi_kcs_sm.c file that should help.

Yep, it's KCS. Unfortunately I haven't found a way to reproduce this
reliably yet.

Looking at a wedged machine (cat /sys/class/ipmi/.../firmware_revision)
with drgn, I can see that there are 99,846 messages sitting on
intf->xmit_msgs and the KCS SM is idle (it's responding to internal
traffic like Get Global Enables and Get Msg Flags). So it looks like
completions are getting dropped.

We're running a 6.18.18 kernel which includes c08ec55617cb ("ipmi: Fix
use-after-free and list corruption on sender error"), so it's not that.

Here's a dump of some of the data structures.

intf                       = 0xffff9d2f4a5a0000
intf->curr_msg             = 0xffff9d34f21a9c00    
intf->xmit_msgs.next       = 0xffff9d30c28e3c80 
intf->waiting_rcv_msgs     = empty
intf->maintenance_mode     = 0
intf->maintenance_mode_state = 0
intf->in_shutdown          = false
intf->seq_table            = 0/64 slots used
intf->smi_work.pending     = 0

The stuck message itself — intf->curr_msg:

msg @ 0xffff9d34f21a9c00
  .data      = { 0x18, 0x01 }           # NetFn 0x06 (App), cmd 0x01 = Get Device ID
  .data_size = 2
  .rsp_size  = 38
  .rsp[0..7] = 2c 01 00 00 ...
  .done      = free_smi_msg
  .user_data = NULL
  .msgid     = (internal GDI poll)
  .type      = IPMI_SMI_MSG_TYPE_NORMAL
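
For anyone reading the dump without the spec to hand: the first request
byte packs NetFn into the upper six bits and LUN into the lower two,
which is how 0x18 comes out as NetFn 0x06. A minimal Python sketch of
that decode follows. It is only an illustration, not kernel or drgn
code, and the name decode_netfn_lun is made up for this example.

```python
def decode_netfn_lun(first_byte: int) -> tuple[int, int]:
    """Split the first byte of an IPMI request into (NetFn, LUN)."""
    netfn = (first_byte >> 2) & 0x3F  # upper six bits carry the NetFn
    lun = first_byte & 0x03           # lower two bits carry the LUN
    return netfn, lun


# First byte of .data from the stuck message above.
netfn, lun = decode_netfn_lun(0x18)
print(f"NetFn 0x{netfn:02x}, LUN {lun}")  # prints "NetFn 0x06, LUN 0"
```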


smi_info                   = 0xffff9d2f4a010000
smi_info->si_state         = SI_NORMAL (0)
smi_info->curr_msg         = 0xffff9d2f48c7b800 
smi_info->waiting_msg      = NULL
smi_info->interrupt_disabled = false
smi_info->supports_event_msg_buff = true
smi_info->io.irq           = 0                     
smi_info->run_to_completion = false
smi_info->in_maintenance_mode = 0

Let me know if you want any other info. I'll try to trace this and
catch it reproducing.
