From: Corey Minyard <corey@minyard.net>
To: Matt Fleming <matt@readmodwrite.com>, Tony Camuso <tcamuso@redhat.com>
Cc: openipmi-developer@lists.sourceforge.net,
linux-kernel@vger.kernel.org, kernel-team@cloudflare.com,
Matt Fleming <mfleming@cloudflare.com>
Subject: Re: [PATCH] ipmi: Add timeout to unconditional wait in __get_device_id()
Date: Wed, 15 Apr 2026 07:16:53 -0500
Message-ID: <ad-BtS5b3qiowqb7@mail.minyard.net>
In-Reply-To: <20260415115930.3428942-1-matt@readmodwrite.com>
On Wed, Apr 15, 2026 at 12:59:30PM +0100, Matt Fleming wrote:
> From: Matt Fleming <mfleming@cloudflare.com>
>
> When the BMC does not respond to a "Get Device ID" command, the
> wait_event() in __get_device_id() blocks forever in TASK_UNINTERRUPTIBLE
> while holding bmc->dyn_mutex. Every subsequent sysfs reader then piles
> up in D state. Replace with wait_event_timeout() to return -EIO after 1
> second.

This is the second report I have of something like this.  So something
is up.  I'm adding Tony, who reported something like this dealing with
the watchdog.

The lower level driver should never fail to return an answer; it is
supposed to guarantee that it returns an error if the BMC doesn't respond.

So the bug is not here; the bug is elsewhere.  My guess is that there
is some new failure mode where a BMC is not working but it responds well
enough that it sort of works and fools the driver.  But that's only a
guess.

I've seen this before in several scenarios, including a system that put
IPMI in the ACPI tree and it sort of worked but there was no BMC
present.  I had to disable that particular device.

What hardware is involved here?

Can you give a more detailed example of what's happening in the
low-level hardware?  If it's KCS there's a debug flag in the
drivers/char/ipmi/ipmi_kcs_sm.c file that should help.

Thanks,

-corey
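
[Editor's sketch, not part of the original message.]  The guarantee
described above, that the lower layer always hands back either a response
or an error, can be illustrated in plain userspace C.  This is only a
minimal model under stated assumptions: struct request and the
request_*() helpers are hypothetical names, not code from ipmi_si or
ipmi_msghandler.  The completion code 0xC3 is the IPMI "timeout while
processing command" value.

```c
#include <stdbool.h>

#define CC_OK      0x00 /* IPMI completion code: command completed normally */
#define CC_TIMEOUT 0xC3 /* IPMI completion code: timeout while processing command */

/* Hypothetical request tracked by a low-level transport (assumption). */
struct request {
	bool pending;             /* true until one completion is recorded */
	unsigned int ticks_left;  /* timer ticks remaining before giving up */
	unsigned char cc;         /* completion code reported upward */
	unsigned int completions; /* how many times the upper layer was notified */
};

/* Record a completion; only the first one for a request counts. */
static void request_complete(struct request *req, unsigned char cc)
{
	if (!req->pending)
		return; /* late or duplicate completion: ignore */
	req->pending = false;
	req->cc = cc;
	req->completions++;
}

static void request_start(struct request *req, unsigned int timeout_ticks)
{
	req->pending = true;
	req->ticks_left = timeout_ticks;
	req->cc = CC_OK;
	req->completions = 0;
}

/* Called when the hardware produces a response. */
static void request_reply(struct request *req, unsigned char cc)
{
	request_complete(req, cc);
}

/* Called from a periodic timer, whether or not the hardware is alive. */
static void request_tick(struct request *req)
{
	if (!req->pending)
		return;
	if (req->ticks_left > 0)
		req->ticks_left--;
	if (req->ticks_left == 0)
		request_complete(req, CC_TIMEOUT);
}
```

In this model an upper layer that waits for "completions != 0" cannot
block forever as long as the timer keeps ticking.  A hang therefore
points at a path where neither request_reply() nor request_tick() ends
up completing the request.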
>
> Signed-off-by: Matt Fleming <matt@readmodwrite.com>
> ---
> drivers/char/ipmi/ipmi_msghandler.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index c41f51c82edd..efa9588e8210 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -2599,7 +2599,13 @@ static int __get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc)
>  	if (rv)
>  		goto out_reset_handler;
>
> -	wait_event(intf->waitq, bmc->dyn_id_set != 2);
> +	if (!wait_event_timeout(intf->waitq, bmc->dyn_id_set != 2,
> +				msecs_to_jiffies(1000))) {
> +		dev_warn(intf->si_dev,
> +			 "Timed out waiting for get bmc device id response\n");
> +		rv = -EIO;
> +		goto out_reset_handler;
> +	}
>
>  	if (!bmc->dyn_id_set) {
>  		if (bmc->cc != IPMI_CC_NO_ERROR &&
> --
> 2.43.0
>
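
[Editor's sketch, not part of the original message.]  For readers less
familiar with the kernel wait API: wait_event_timeout() returns 0 when
the timeout elapses with the condition still false, and a non-zero
remaining-jiffies value otherwise, which is why the patch tests the
result with "!".  A rough userspace analogue of that bounded wait, using
pthread_cond_timedwait(), might look like the following.  The names
reply_waiter, wait_reply_timeout() and complete_reply() are hypothetical
and exist only for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for intf->waitq plus the "reply arrived" flag. */
struct reply_waiter {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/*
 * Wait until w->done is set or timeout_ms milliseconds have passed.
 * Returns 0 if the reply arrived, -EIO otherwise.
 */
static int wait_reply_timeout(struct reply_waiter *w, long timeout_ms)
{
	struct timespec deadline;
	int rv = 0;

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline
	 * by default. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&w->lock);
	while (!w->done) {
		int err = pthread_cond_timedwait(&w->cond, &w->lock, &deadline);

		if (err) {
			/* Deadline passed (or the wait failed); recheck the flag once. */
			if (!w->done)
				rv = -EIO;
			break;
		}
	}
	pthread_mutex_unlock(&w->lock);
	return rv;
}

/* Called by whoever delivers the reply. */
static void complete_reply(struct reply_waiter *w)
{
	pthread_mutex_lock(&w->lock);
	w->done = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}
```

As with the patch, the timeout only bounds how long the caller sleeps.
It does not explain why the reply never arrived.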