From: Corey Minyard <cminyard@mvista.com>
To: Suresh Marisetty <smarisetty@yahoo.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Peter Maydell <peter.maydell@linaro.org>,
Philippe Mathieu-Daude <philmd@linaro.org>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH 3/3] hw/ipmi: Accept any extern BMC msg_id to fix SMM firmware deadlock
Date: Sun, 14 Jun 2026 18:08:19 -0500
Message-ID: <ai80Y2TVcU3tmoBj@mail.minyard.net>
In-Reply-To: <703047508.1680926.1781456306974@mail.yahoo.com>
A number of problems.
First, I'm having to top-post because you sent this as an HTML message,
which is not allowed on this mailing list. There are a number of rules
and tools for validating patches; please read the documentation on how
to submit patches.
Second, the coding style is bad. It's just a hack.
Third, this whole fix is a hack. The BMC code here is designed to
allow only one outstanding message to an IPMI BMC, and the BT interface
capabilities response says so. The changes you are making break
the checks around this, which is asking for trouble. If you want
to fix this right, you will need to redesign this to handle
multiple outstanding messages.
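To make the single-outstanding-message point concrete, here is a minimal sketch of the check quoted in the patch below, reduced to standalone C. The struct bt_state type, its dropped counter, and the bt_accept_rsp() name are hypothetical and exist only for illustration; this is not the actual code in hw/ipmi/ipmi_bt.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the BT interface state. */
struct bt_state {
    uint8_t waiting_rsp;  /* id of the one request currently outstanding */
    unsigned dropped;     /* responses rejected because the id did not match */
};

/*
 * Accept a response only if it answers the single outstanding request.
 * 0xFF is the special id the quoted patch describes as the extern init
 * timeout. Returns true if the response was accepted.
 */
static bool bt_accept_rsp(struct bt_state *bt, uint8_t msg_id)
{
    if (bt->waiting_rsp == msg_id || msg_id == 0xFF) {
        bt->waiting_rsp++;  /* the next request will carry the next id */
        return true;
    }
    bt->dropped++;          /* stale or unexpected response */
    return false;
}
```

Under that design, a response whose id does not match the expected one is treated as unexpected and dropped. Removing the comparison removes that protection rather than adding support for more than one message in flight.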
Really, what's broken here is the firmware. Having two separate pieces
of code talk to the same device without a shared device driver is just
broken. Plus, the firmware is not querying the capabilities of the device.
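As a rough sketch of what querying the capabilities could look like on the firmware side: the IPMI Get BT Interface Capabilities response reports how many outstanding requests the BMC supports. The field layout below follows the IPMI 2.0 specification as best understood here and should be checked against it. The names struct bt_caps, bt_parse_caps() and bt_may_send() are hypothetical, and the buffer is assumed to start at the completion code byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the parsed capabilities. */
struct bt_caps {
    uint8_t outstanding_reqs;  /* number of outstanding requests supported */
    uint8_t input_buf_size;    /* request buffer size in bytes */
    uint8_t output_buf_size;   /* response buffer size in bytes */
    uint8_t rsp_time_secs;     /* BMC request-to-response time in seconds */
    uint8_t retries;           /* recommended retries */
};

/*
 * Parse response data that starts at the completion code.
 * Returns 0 on success, -1 if the data is too short or the
 * completion code is not 0x00 (success).
 */
static int bt_parse_caps(const uint8_t *data, size_t len, struct bt_caps *caps)
{
    if (len < 6 || data[0] != 0x00) {
        return -1;
    }
    caps->outstanding_reqs = data[1];
    caps->input_buf_size = data[2];
    caps->output_buf_size = data[3];
    caps->rsp_time_secs = data[4];
    caps->retries = data[5];
    return 0;
}

/* A sender should not exceed the advertised number of outstanding requests. */
static int bt_may_send(const struct bt_caps *caps, unsigned in_flight)
{
    return in_flight < caps->outstanding_reqs;
}
```

If the device advertises one outstanding request, every sender sharing the interface has to wait for the previous response before issuing the next request, which is why independent DXE and SMM senders need some shared arbitration.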
It's strange that SMM holds the big QEMU lock. I don't know much about
that, but it doesn't sound right to me.
Anyway, this certainly isn't the right fix. I'm not sure what the
right fix is because I don't completely understand what's going on.
-corey
On Sun, Jun 14, 2026 at 04:58:26PM +0000, Suresh Marisetty wrote:
> From smarisetty@yahoo.com Fri Jun 13 2026
> From: Suresh Marisetty <smarisetty@yahoo.com>
> Date: Fri, 13 Jun 2026 18:00:03 -0700
> Subject: [PATCH 3/3] hw/ipmi: Accept any extern BMC msg_id to fix SMM firmware deadlock
> To: qemu-devel@nongnu.org
> Cc: Corey Minyard <cminyard@mvista.com>, Peter Maydell <peter.maydell@linaro.org>, Philippe Mathieu-Daude <philmd@linaro.org>, Michael S. Tsirkin <mst@redhat.com>
> Message-Id: <20260613180000.1-3-smarisetty@yahoo.com>
> In-Reply-To: <20260613180000.1-0-smarisetty@yahoo.com>
> References: <20260613180000.1-0-smarisetty@yahoo.com>
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
>
> ipmi_bt_handle_rsp() accepts a response from the extern BMC only when:
>
>     ib->waiting_rsp == msg_id || msg_id == 0xFF
>
> This strict msg_id matching cannot be satisfied by firmware that sends
> IPMI BT commands from System Management Mode (SMM).
>
> Root cause - SMM and the QEMU main loop are mutually exclusive:
>
> When a guest vCPU enters SMM, the vCPU thread holds the Big QEMU Lock
> (BQL) for the duration of the SMM handler. The ipmi-bmc-extern chardev
> receive callback (ipmi_bmc_extern_receive) runs in the QEMU main event
> loop, which cannot execute while the BQL is held. Consequently, any ACK
> sent by the extern BMC in response to an IPMI BT command issued from
> SMM can only be received and processed by QEMU *after* the SMM handler
> has exited and released the BQL.
>
> This creates an unresolvable deadlock if SMM firmware polls for B_BUSY
> to clear after sending a BT frame: the firmware waits inside SMM for
> B_BUSY to deassert, but QEMU cannot process the extern BMC ACK (which
> would deassert B_BUSY) until SMM exits. SMM never exits because it is
> waiting for B_BUSY.
>
> Secondary cause - sequence counter divergence between DXE and SMM:
>
> UEFI platforms typically have multiple firmware modules sharing the same
> IPMI BT interface: DXE-phase drivers (e.g. TFLiteDxe) and SMM handlers
> (e.g. TrustForgeSmmDxe) each have independent instances of the BT
> library with independent msg_id counters starting from 0. DXE sends
> frames with msg_id=0,1,... advancing QEMU's waiting_rsp counter. When
> SMM subsequently sends frames starting from msg_id=0 or a different
> base, the msg_id values no longer match waiting_rsp, and all SMM ACKs
> are silently dropped. B_BUSY remains set permanently.
>
> Correct design for SMM BT usage:
>
> Firmware that sends IPMI BT from SMM must use a "fire-and-forget"
> pattern: write the BT frame, assert H2B_ATN, and return from SMM
> immediately without polling for B_BUSY or B2H_ATN. QEMU then processes
> the frame after SMM exits, the extern BMC sends an ACK, and QEMU clears
> B_BUSY. The next SMI entry can drain the pending ACK from the BT FIFO
> before sending the next frame. This pattern requires QEMU to accept the
> extern BMC ACK regardless of msg_id.
>
> Fix:
>
> Remove the waiting_rsp == msg_id check. Accept any response from the
> extern BMC unconditionally. waiting_rsp continues to increment so that
> the outgoing BT sequence presented to the guest firmware (via the
> GET_BT_INTERFACE_CAPABILITIES "outstanding requests" field) remains
> consistent.
>
> The existing msg_id == 0xFF special case for the extern init timeout is
> subsumed by this change.
>
> This bug affects any QEMU version where firmware sends IPMI BT commands
> from SMM, including all versions through 10.0.0.
>
> Reported-by: Suresh Marisetty <smarisetty@yahoo.com>
> Signed-off-by: Suresh Marisetty <smarisetty@yahoo.com>
> ---
>  hw/ipmi/ipmi_bt.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/hw/ipmi/ipmi_bt.c b/hw/ipmi/ipmi_bt.c
> index c3f7e25a91..8b4d2e7f03 100644
> --- a/hw/ipmi/ipmi_bt.c
> +++ b/hw/ipmi/ipmi_bt.c
> @@ -152,9 +152,15 @@ static void ipmi_bt_handle_rsp(IPMIInterface *ii, uint8_t msg_id,
>      IPMIInterfaceClass *iic = IPMI_INTERFACE_GET_CLASS(ii);
>      IPMIBT *ib = iic->get_backend_data(ii);
>
> -    if (ib->waiting_rsp == msg_id || msg_id == 0xFF) { /* 0xFF = extern init timeout */
> -        ib->waiting_rsp++;
> -        if (rsp_len > (sizeof(ib->outmsg) - 2)) {
> +    /*
> +     * Accept any msg_id from the extern BMC unconditionally.
> +     *
> +     * SMM firmware cannot guarantee msg_id matches waiting_rsp: the QEMU
> +     * main loop is paused while the vCPU executes SMM (BQL held), so the
> +     * extern chardev receive callback only runs after SMM exits. Firmware
> +     * using fire-and-forget BT sends from SMM has independent msg_id
> +     * counters (per DXE/SMM module instance) that diverge from
> +     * waiting_rsp, causing all SMM responses to be silently dropped.
> +     *
> +     * waiting_rsp increments unconditionally to maintain BT sequencing.
> +     */
> +    (void)msg_id;
> +    {
> +        ib->waiting_rsp++;
> +        if (rsp_len > (sizeof(ib->outmsg) - 2)) {
> --
> 2.39.0