* [PATCH v2 0/2] KVM: x86: MCE fixes @ 2026-06-09 10:55 Carlos López 2026-06-09 10:55 ` [PATCH v2 1/2] KVM: x86: Fix array_index_nospec() protection in kvm_vcpu_ioctl_x86_set_mce() Carlos López 2026-06-09 10:55 ` [PATCH v2 2/2] KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL Carlos López 0 siblings, 2 replies; 5+ messages in thread From: Carlos López @ 2026-06-09 10:55 UTC (permalink / raw) To: kvm, seanjc, pbonzini Cc: linux-kernel, x86, tglx, mingo, dave.hansen, hpa, Carlos López These two patches are somewhat unrelated, but patch 2 came out of Sean's suggestions from reviewing the first patch (see [1]). Patch 1 is unchanged. Note that the SDM is ambiguous regarding the logging behavior for IA32_MCG_CTL, so additional correctness fixes may be warranted on top. [1] https://lore.kernel.org/all/20260516163412.601908-1-clopez@suse.de/ Carlos López (2): KVM: x86: Fix array_index_nospec() protection in kvm_vcpu_ioctl_x86_set_mce() KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL arch/x86/kvm/x86.c | 32 ++++++++++++++++++++------------ 1 file changed, 20 insertions(+), 12 deletions(-) base-commit: de3a35be92d2391ece4bf3143ef2887192625fd0 -- 2.51.0 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v2 1/2] KVM: x86: Fix array_index_nospec() protection in kvm_vcpu_ioctl_x86_set_mce() 2026-06-09 10:55 [PATCH v2 0/2] KVM: x86: MCE fixes Carlos López @ 2026-06-09 10:55 ` Carlos López 2026-06-09 10:55 ` [PATCH v2 2/2] KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL Carlos López 1 sibling, 0 replies; 5+ messages in thread From: Carlos López @ 2026-06-09 10:55 UTC (permalink / raw) To: kvm, seanjc, pbonzini Cc: linux-kernel, x86, tglx, mingo, dave.hansen, hpa, Carlos López, Borislav Petkov, Jue Wang Commit aebc3ca19063 ("KVM: x86: Enable CMCI capability by default and handle injected UCNA errors") introduced kvm_vcpu_x86_set_ucna(), which accesses @vcpu->arch.mci_ctl2_banks[] using @mce->bank as the index. The @mce struct is user-controlled, provided via the KVM_X86_SET_MCE ioctl. The caller of this function, kvm_vcpu_ioctl_x86_set_mce(), bounds-checks @mce->bank and applies array_index_nospec() to advance the @banks pointer, but @mce->bank itself is passed through unclamped. On a speculative path that bypasses the bounds check, the raw @mce->bank value can index mci_ctl2_banks[] out-of-bounds. In practice this is a very weak gadget, and would at most allow leaking a single bit in a 64-bit integer, but prevent potential future issues by clamping @mce->bank in place with array_index_nospec(), before passing the struct to kvm_vcpu_x86_set_ucna(). Fixes: aebc3ca19063 ("KVM: x86: Enable CMCI capability by default and handle injected UCNA errors") Signed-off-by: Carlos López <clopez@suse.de> --- arch/x86/kvm/x86.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cf122b8c3210..77a780177c4e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5472,7 +5472,8 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu, if (mce->bank >= bank_num || !(mce->status & MCI_STATUS_VAL)) return -EINVAL; - banks += array_index_nospec(4 * mce->bank, 4 * bank_num); + mce->bank = array_index_nospec(mce->bank, bank_num); + banks += 4 * mce->bank; if (is_ucna(mce)) return kvm_vcpu_x86_set_ucna(vcpu, mce, banks); -- 2.51.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v2 2/2] KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL 2026-06-09 10:55 [PATCH v2 0/2] KVM: x86: MCE fixes Carlos López 2026-06-09 10:55 ` [PATCH v2 1/2] KVM: x86: Fix array_index_nospec() protection in kvm_vcpu_ioctl_x86_set_mce() Carlos López @ 2026-06-09 10:55 ` Carlos López 2026-06-09 11:10 ` sashiko-bot 1 sibling, 1 reply; 5+ messages in thread From: Carlos López @ 2026-06-09 10:55 UTC (permalink / raw) To: kvm, seanjc, pbonzini Cc: linux-kernel, x86, tglx, mingo, dave.hansen, hpa, Carlos López, Borislav Petkov, Ying Huang, Avi Kivity When userspace issues KVM_X86_SET_MCE, kvm_vcpu_ioctl_x86_set_mce() decides whether to log an uncorrectable MCE by looking at the corresponding IA32_MCi_CTL MSR. This is not the behavior specified in the Intel SDM (17.3.2.1 IA32_MCi_CTL MSRs): Setting an EEj flag enables signaling #MC of the associated error and clearing it disables signaling of the error. Error logging happens regardless of the setting of these bits. The processor drops writes to bits that are not implemented. Hoist the logging code, which writes to the corresponding error banks, above the MCi_CTL check. Keep the #MC delivery after the check, as this is correct behavior. Fixes: 890ca9aefa78 ("KVM: Add MCE support") Signed-off-by: Carlos López <clopez@suse.de> --- arch/x86/kvm/x86.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 77a780177c4e..722bead405b5 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5485,24 +5485,31 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu, if ((mce->status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) && vcpu->arch.mcg_ctl != ~(u64)0) return 0; - /* - * if IA32_MCi_CTL is not all 1s, the uncorrected error - * reporting is disabled for the bank - */ - if ((mce->status & MCI_STATUS_UC) && banks[0] != ~(u64)0) - return 0; + if (mce->status & MCI_STATUS_UC) { + /* + * Per the SDM, error logging always happens independently of + * IA32_MCi_CTL + */ + if (banks[1] & MCI_STATUS_VAL) + mce->status |= MCI_STATUS_OVER; + banks[2] = mce->addr; + banks[3] = mce->misc; + banks[1] = mce->status; + + /* + * if IA32_MCi_CTL is not all 1s, the uncorrected error + * reporting is disabled for the bank + */ + if (banks[0] != ~(u64)0) + return 0; + if ((vcpu->arch.mcg_status & MCG_STATUS_MCIP) || !kvm_is_cr4_bit_set(vcpu, X86_CR4_MCE)) { kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); return 0; } - if (banks[1] & MCI_STATUS_VAL) - mce->status |= MCI_STATUS_OVER; - banks[2] = mce->addr; - banks[3] = mce->misc; vcpu->arch.mcg_status = mce->mcg_status; - banks[1] = mce->status; kvm_queue_exception(vcpu, MC_VECTOR); } else if (!(banks[1] & MCI_STATUS_VAL) || !(banks[1] & MCI_STATUS_UC)) { -- 2.51.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2 2/2] KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL 2026-06-09 10:55 ` [PATCH v2 2/2] KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL Carlos López @ 2026-06-09 11:10 ` sashiko-bot 2026-06-09 12:56 ` Carlos López 0 siblings, 1 reply; 5+ messages in thread From: sashiko-bot @ 2026-06-09 11:10 UTC (permalink / raw) To: Carlos López; +Cc: kvm Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider: New issues: - [High] The patch incorrectly overwrites the MCE banks when a double/triple fault occurs (MCIP flag already set). Pre-existing issues: - [High] The patch fails to apply the MCE logging fix to the global control register IA32_MCG_CTL, leaving MCE emulation architecturally broken. -- commit 146247f45f3a8d3f982b364bba19a51f76828ce0 Author: Carlos López <clopez@suse.de> KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL This patch hoists the MCE logging logic above the IA32_MCi_CTL check. This ensures that KVM logs uncorrectable machine check errors even if reporting is disabled for that specific bank, conforming to the Intel SDM. > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -5486,24 +5486,31 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu, > if ((mce->status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) && > vcpu->arch.mcg_ctl != ~(u64)0) > return 0; [Severity: High] This isn't a bug introduced by this patch, but does this global control register check also need to be adjusted? Just like IA32_MCi_CTL, the Intel SDM states that error logging happens regardless of the setting of the IA32_MCG_CTL bits. By returning early here, KVM skips logging the uncorrectable error to the bank if the global reporting is disabled. Additionally, KVM currently suppresses generation and logging for all banks if any bit in IA32_MCG_CTL is 0, rather than checking the specific bit for the bank. Should the logging logic be hoisted above this check as well? > - /* > - * if IA32_MCi_CTL is not all 1s, the uncorrected error > - * reporting is disabled for the bank > - */ > - if ((mce->status & MCI_STATUS_UC) && banks[0] != ~(u64)0) > - return 0; > + > if (mce->status & MCI_STATUS_UC) { > + /* > + * Per the SDM, error logging always happens independently of > + * IA32_MCi_CTL > + */ > + if (banks[1] & MCI_STATUS_VAL) > + mce->status |= MCI_STATUS_OVER; > + banks[2] = mce->addr; > + banks[3] = mce->misc; > + banks[1] = mce->status; > + > + /* > + * if IA32_MCi_CTL is not all 1s, the uncorrected error > + * reporting is disabled for the bank > + */ > + if (banks[0] != ~(u64)0) > + return 0; > + > if ((vcpu->arch.mcg_status & MCG_STATUS_MCIP) || > !kvm_is_cr4_bit_set(vcpu, X86_CR4_MCE)) { > kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); > return 0; > } [Severity: High] By hoisting the error logging block above the IA32_MCi_CTL check, does this now incorrectly overwrite the MCE banks when a double or triple fault occurs? According to the SDM, if the processor encounters another machine-check error while the MCIP flag is already set, it causes a triple fault and does not update the status registers for the second error. Since the logging block in kvm_vcpu_ioctl_x86_set_mce() now executes before checking for MCG_STATUS_MCIP, it looks like it will overwrite the original addr, misc, and status of the first error before requesting the triple fault. > - if (banks[1] & MCI_STATUS_VAL) > - mce->status |= MCI_STATUS_OVER; > - banks[2] = mce->addr; > - banks[3] = mce->misc; > vcpu->arch.mcg_status = mce->mcg_status; > - banks[1] = mce->status; > kvm_queue_exception(vcpu, MC_VECTOR); > } else if (!(banks[1] & MCI_STATUS_VAL) [ ... ] -- Sashiko AI review · https://sashiko.dev/#/patchset/20260609105525.2545638-2-clopez@suse.de?part=2 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2 2/2] KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL 2026-06-09 11:10 ` sashiko-bot @ 2026-06-09 12:56 ` Carlos López 0 siblings, 0 replies; 5+ messages in thread From: Carlos López @ 2026-06-09 12:56 UTC (permalink / raw) To: sashiko-reviews; +Cc: kvm On 6/9/26 1:10 PM, sashiko-bot@kernel.org wrote: >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -5486,24 +5486,31 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu, >> if ((mce->status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) && >> vcpu->arch.mcg_ctl != ~(u64)0) >> return 0; > > [Severity: High] > This isn't a bug introduced by this patch, but does this global control > register check also need to be adjusted? > > Just like IA32_MCi_CTL, the Intel SDM states that error logging happens > regardless of the setting of the IA32_MCG_CTL bits. By returning early > here, KVM skips logging the uncorrectable error to the bank if the global > reporting is disabled. Does the SDM specify so, though? As Sean noted, and as echoed in the cover letter, this is pretty ambiguous. > Additionally, KVM currently suppresses generation and logging for all > banks if any bit in IA32_MCG_CTL is 0, rather than checking the specific > bit for the bank. Should the logging logic be hoisted above this check > as well? > >> - /* >> - * if IA32_MCi_CTL is not all 1s, the uncorrected error >> - * reporting is disabled for the bank >> - */ >> - if ((mce->status & MCI_STATUS_UC) && banks[0] != ~(u64)0) >> - return 0; >> + >> if (mce->status & MCI_STATUS_UC) { >> + /* >> + * Per the SDM, error logging always happens independently of >> + * IA32_MCi_CTL >> + */ >> + if (banks[1] & MCI_STATUS_VAL) >> + mce->status |= MCI_STATUS_OVER; >> + banks[2] = mce->addr; >> + banks[3] = mce->misc; >> + banks[1] = mce->status; >> + >> + /* >> + * if IA32_MCi_CTL is not all 1s, the uncorrected error >> + * reporting is disabled for the bank >> + */ >> + if (banks[0] != ~(u64)0) >> + return 0; >> + >> if ((vcpu->arch.mcg_status & MCG_STATUS_MCIP) || >> !kvm_is_cr4_bit_set(vcpu, X86_CR4_MCE)) { >> kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); >> return 0; >> } > > [Severity: High] > By hoisting the error logging block above the IA32_MCi_CTL check, does > this now incorrectly overwrite the MCE banks when a double or triple fault > occurs? > > According to the SDM, if the processor encounters another machine-check > error while the MCIP flag is already set, it causes a triple fault and > does not update the status registers for the second error. 17.3.1.2 "IA32_MCG_STATUS MSR" does not say anything about updating or not addr/misc/status depending on MCIP, it just says: The occurrence of a second Machine-Check Event while MCIP is set will cause the processor to enter a shutdown state. Claude helped me find a good counterexample to what Sashiko says regarding MCIP. Table 18-40 "Machine Check Error Codes for IA32_MC4_STATUS" says this about bits 19:16: Model specific error code bits 19:16. If MACOD = 40CH, MSCOD encoding should be interpreted as: 01H: MCE when CR4.MCE is clear. 02H: MCE when MCIP bit is set. MSCOD=02H is precisely a case of an update to an MCi_STATUS MSR while MCIP is set. > Since the logging block in kvm_vcpu_ioctl_x86_set_mce() now executes > before checking for MCG_STATUS_MCIP, it looks like it will overwrite the > original addr, misc, and status of the first error before requesting the > triple fault. 17.3.2.2 "IA32_MCi_STATUS MSRS" does say that a second UC *to the same bank* should not overwrite the first: OVER (machine check overflow) flag, bit 62 - (...). Uncorrected errors are not written over previous valid uncorrected errors. (...) But this has nothing to do with MCIP. So, IIUC, the the new behavior is still not completely correct because a second UC to the *same* bank would overwrite the first. I can prepare a v3 with this. > >> - if (banks[1] & MCI_STATUS_VAL) >> - mce->status |= MCI_STATUS_OVER; >> - banks[2] = mce->addr; >> - banks[3] = mce->misc; >> vcpu->arch.mcg_status = mce->mcg_status; >> - banks[1] = mce->status; >> kvm_queue_exception(vcpu, MC_VECTOR); >> } else if (!(banks[1] & MCI_STATUS_VAL) > [ ... ] > ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-09 12:56 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-09 10:55 [PATCH v2 0/2] KVM: x86: MCE fixes Carlos López 2026-06-09 10:55 ` [PATCH v2 1/2] KVM: x86: Fix array_index_nospec() protection in kvm_vcpu_ioctl_x86_set_mce() Carlos López 2026-06-09 10:55 ` [PATCH v2 2/2] KVM: x86: Do not gate MCE logging based on IA32_MCi_CTL Carlos López 2026-06-09 11:10 ` sashiko-bot 2026-06-09 12:56 ` Carlos López
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox