From: "Carlos López" <clopez@suse.de>
To: kvm@vger.kernel.org, seanjc@google.com, pbonzini@redhat.com
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, tglx@kernel.org,
	mingo@redhat.com, dave.hansen@linux.intel.com, hpa@zytor.com,
	"Carlos López" <clopez@suse.de>, "Borislav Petkov" <bp@alien8.de>,
	"Ying Huang" <huang.ying.caritas@gmail.com>,
	"Avi Kivity" <avi@redhat.com>
Subject: [PATCH v3 2/2] KVM: x86: Fix MCE logging rules for KVM_X86_SET_MCE
Date: Tue,  9 Jun 2026 15:18:56 +0200
Message-ID: <20260609131856.2562222-4-clopez@suse.de>
In-Reply-To: <20260609131856.2562222-2-clopez@suse.de>

When userspace issues KVM_X86_SET_MCE, kvm_vcpu_ioctl_x86_set_mce()
decides whether to log an uncorrectable MCE by looking at the
corresponding IA32_MCi_CTL MSR. This is not the behavior specified in
the Intel SDM (17.3.2.1 IA32_MCi_CTL MSRs):

  Setting an EEj flag enables signaling #MC of the associated error and
  clearing it disables signaling of the error. Error logging happens
  regardless of the setting of these bits. The processor drops writes to
  bits that are not implemented.

Perform the logging before checking MCi_CTL, unless there is already
a valid UC error logged for the bank, in which case the SDM (17.3.2.2
"IA32_MCi_STATUS MSRS") specifies that error information should not
be overwritten.

To avoid even more complex control flow, hoist the logging logic into a
separate function. This also allows removing the non-UC branch in
kvm_vcpu_ioctl_x86_set_mce(), which existed only to perform logging.

Note that the SDM is ambiguous regarding the effects of IA32_MCG_CTL on
logging, so preserve the existing logic (i.e. do not log a UC error if
MCG_CTL is supported and not all 1s).

Fixes: 890ca9aefa78 ("KVM: Add MCE support")
Signed-off-by: Carlos López <clopez@suse.de>
---
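Illustration only, not part of the patch: a minimal userspace sketch of
the logging rule that kvm_log_mce() applies, assuming the per-bank layout
of four u64 values (CTL, STATUS, ADDR, MISC) used by the diff below. The
names log_mce_model() and struct mce_model are hypothetical, and the
MCI_STATUS_* constants are redefined locally to mirror the architectural
bit positions (VAL = 63, OVER = 62, UC = 61).

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the IA32_MCi_STATUS bits (assumed to match the kernel). */
#define MCI_STATUS_VAL  (1ULL << 63)	/* bank holds a valid error log */
#define MCI_STATUS_OVER (1ULL << 62)	/* an earlier log was overflowed */
#define MCI_STATUS_UC   (1ULL << 61)	/* logged error is uncorrected */

/* Hypothetical stand-in for the fields of struct kvm_x86_mce used here. */
struct mce_model {
	uint64_t status;
	uint64_t addr;
	uint64_t misc;
};

/*
 * banks[0] = CTL, banks[1] = STATUS, banks[2] = ADDR, banks[3] = MISC.
 * CTL is deliberately not consulted: logging is independent of it.
 */
static void log_mce_model(uint64_t *banks, const struct mce_model *mce)
{
	int overflow = !!(banks[1] & MCI_STATUS_VAL);

	/* Replace the log unless a valid UC error is already recorded. */
	if (!overflow || !(banks[1] & MCI_STATUS_UC)) {
		banks[2] = mce->addr;
		banks[3] = mce->misc;
		banks[1] = mce->status;
	}

	/* Any prior valid log, kept or replaced, results in OVER. */
	if (overflow)
		banks[1] |= MCI_STATUS_OVER;
}
```

In this model, injecting a second UC error into a bank that already holds
a valid UC log leaves ADDR and MISC untouched and only sets OVER, while a
valid non-UC log is replaced by the new error with OVER set.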
 arch/x86/kvm/x86.c | 62 +++++++++++++++++++++++++++++-----------------
 1 file changed, 39 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 77a780177c4e..af3662aa3ce3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5462,6 +5462,28 @@ static int kvm_vcpu_x86_set_ucna(struct kvm_vcpu *vcpu, struct kvm_x86_mce *mce,
 	return 0;
 }
 
+/*
+ * Record @mce into @banks per the SDM logging rules:
+ *   - Per 17.3.2.1, logging happens regardless of IA32_MCi_CTL.
+ *   - Per 17.3.2.2.1 and 17.3.2.2, a UC error does not overwrite a
+ *     previous valid UC log in the same bank; only OVER is set.
+ *     Any other prior state (invalid, or valid non-UC) is replaced,
+ *     with OVER set if a prior valid log was overwritten.
+ */
+static void kvm_log_mce(u64 *banks, struct kvm_x86_mce *mce)
+{
+	bool overflow = banks[1] & MCI_STATUS_VAL;
+
+	if (!overflow || !(banks[1] & MCI_STATUS_UC)) {
+		banks[2] = mce->addr;
+		banks[3] = mce->misc;
+		banks[1] = mce->status;
+	}
+
+	if (overflow)
+		banks[1] |= MCI_STATUS_OVER;
+}
+
 static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu,
 				      struct kvm_x86_mce *mce)
 {
@@ -5485,34 +5507,28 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu,
 	if ((mce->status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) &&
 	    vcpu->arch.mcg_ctl != ~(u64)0)
 		return 0;
+
+	kvm_log_mce(banks, mce);
+
+	if (!(mce->status & MCI_STATUS_UC))
+		return 0;
+
 	/*
 	 * if IA32_MCi_CTL is not all 1s, the uncorrected error
 	 * reporting is disabled for the bank
 	 */
-	if ((mce->status & MCI_STATUS_UC) && banks[0] != ~(u64)0)
+	if (banks[0] != ~(u64)0)
 		return 0;
-	if (mce->status & MCI_STATUS_UC) {
-		if ((vcpu->arch.mcg_status & MCG_STATUS_MCIP) ||
-		    !kvm_is_cr4_bit_set(vcpu, X86_CR4_MCE)) {
-			kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
-			return 0;
-		}
-		if (banks[1] & MCI_STATUS_VAL)
-			mce->status |= MCI_STATUS_OVER;
-		banks[2] = mce->addr;
-		banks[3] = mce->misc;
-		vcpu->arch.mcg_status = mce->mcg_status;
-		banks[1] = mce->status;
-		kvm_queue_exception(vcpu, MC_VECTOR);
-	} else if (!(banks[1] & MCI_STATUS_VAL)
-		   || !(banks[1] & MCI_STATUS_UC)) {
-		if (banks[1] & MCI_STATUS_VAL)
-			mce->status |= MCI_STATUS_OVER;
-		banks[2] = mce->addr;
-		banks[3] = mce->misc;
-		banks[1] = mce->status;
-	} else
-		banks[1] |= MCI_STATUS_OVER;
+
+	if ((vcpu->arch.mcg_status & MCG_STATUS_MCIP) ||
+	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_MCE)) {
+		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+		return 0;
+	}
+
+	vcpu->arch.mcg_status = mce->mcg_status;
+	kvm_queue_exception(vcpu, MC_VECTOR);
+
 	return 0;
 }
 
-- 
2.51.0

