From: Lev Kujawski <lkujaw@member.fsf.org>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, lkujaw@member.fsf.org
Subject: Re: [PATCH] KVM: set_msr_mce: Permit guests to ignore single-bit ECC errors
Date: Mon, 23 May 2022 17:48:37 -0400
Message-ID: <874k1gnlre.fsf@iridium.uucp>
In-Reply-To: <You/kms+AnKE1t0L@google.com>


Sean Christopherson writes:

> "KVM: x86:" for the shortlog scope.
>
> On Sat, May 21, 2022, Lev Kujawski wrote:
>> Certain guest operating systems (e.g., UNIXWARE) clear bit 0 of
>> MC1_CTL to ignore single-bit ECC data errors.
>
> Not that it really matters, but is this behavior documented anywhere?  I've searched
> a variety of SDMs, APMs, and PPRs, and can't find anything that documents this exact
> behavior.  I totally believe that some CPUs behave this way, but it'd be nice to
> document exactly which generations of whose CPUs allow clearing bit zero.

Intel's coverage of IA32_MC1_CTL appears to be proprietary (perhaps
Appendix H material), but AMD helpfully documented it on page 204 of
their BIOS and Kernel Developer's Guide:

https://www.amd.com/system/files/TechDocs/26094.PDF

I experimentally determined that UNIXWARE writes MC1_CTL on QEMU models
"pentium2" or newer, but my guess is that this functionality was
actually introduced with the Pentium Pro.
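
As a rough illustration of how the failing write can be identified from
the register dump in the patch description (ECX=00000404,
EDX:EAX=ffffffff:fffffffe, faulting bytes 0f 30, i.e. WRMSR), the
following user-space sketch decodes a machine-check bank MSR index.  The
helper names are hypothetical and not taken from KVM; the only
assumption is the architectural layout of four MSRs per bank starting at
IA32_MC0_CTL (0x400).

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural index of IA32_MC0_CTL, the first machine-check bank MSR. */
#define MC0_CTL_BASE 0x400u

/* Each bank occupies four consecutive MSRs, in this order. */
static const char *const mc_reg_names[] = { "CTL", "STATUS", "ADDR", "MISC" };

/* Hypothetical helper: bank number for a machine-check bank MSR index. */
static inline unsigned mc_bank(uint32_t msr)
{
	return (msr - MC0_CTL_BASE) >> 2;
}

/* Hypothetical helper: register within the bank (0 = CTL ... 3 = MISC). */
static inline unsigned mc_reg(uint32_t msr)
{
	return (msr - MC0_CTL_BASE) & 0x3;
}

/* WRMSR takes the 64-bit value from EDX (high half) and EAX (low half). */
static inline uint64_t wrmsr_value(uint32_t edx, uint32_t eax)
{
	return ((uint64_t)edx << 32) | eax;
}

/* Print a one-line description of a WRMSR to a machine-check bank MSR. */
static inline void describe_wrmsr(uint32_t ecx, uint32_t edx, uint32_t eax)
{
	printf("MC%u_%s <- 0x%016llx\n", mc_bank(ecx),
	       mc_reg_names[mc_reg(ecx)],
	       (unsigned long long)wrmsr_value(edx, eax));
}
```

Calling describe_wrmsr(0x404, 0xffffffff, 0xfffffffe) prints
"MC1_CTL <- 0xfffffffffffffffe", that is, all 1s with only bit 0 clear.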

>> Single-bit ECC data errors are always correctable and thus are safe to ignore
>> because they are informational in nature rather than signaling a loss of data
>> integrity.
>> 
>> Prior to this patch, these guests would crash upon writing MC1_CTL,
>> with resultant error messages like the following:
>> 
>> error: kvm run failed Operation not permitted
>> EAX=fffffffe EBX=fffffffe ECX=00000404 EDX=ffffffff
>> ESI=ffffffff EDI=00000001 EBP=fffdaba4 ESP=fffdab20
>> EIP=c01333a5 EFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0108 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> CS =0100 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>> SS =0108 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> DS =0108 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> FS =0000 00000000 ffffffff 00c00000
>> GS =0000 00000000 ffffffff 00c00000
>> LDT=0118 c1026390 00000047 00008200 DPL=0 LDT
>> TR =0110 ffff5af0 00000067 00008b00 DPL=0 TSS32-busy
>> GDT=     ffff5020 000002cf
>> IDT=     ffff52f0 000007ff
>> CR0=8001003b CR2=00000000 CR3=0100a000 CR4=00000230
>> DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
>> DR6=ffff0ff0 DR7=00000400
>> EFER=0000000000000000
>> Code=08 89 01 89 51 04 c3 8b 4c 24 08 8b 01 8b 51 04 8b 4c 24 04 <0f>
>> 30 c3 f7 05 a4 6d ff ff 10 00 00 00 74 03 0f 31 c3 33 c0 33 d2 c3 8d
>> 74 26 00 0f 31 c3
>> 
>> Signed-off-by: Lev Kujawski <lkujaw@member.fsf.org>
>> ---
>>  arch/x86/kvm/x86.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 4790f0d7d40b..128dca4e7bb7 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3215,10 +3215,13 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>  			/* only 0 or all 1s can be written to IA32_MCi_CTL
>>  			 * some Linux kernels though clear bit 10 in bank 4 to
>>  			 * workaround a BIOS/GART TBL issue on AMD K8s, ignore
>> -			 * this to avoid an uncatched #GP in the guest
>> +			 * this to avoid an uncatched #GP in the guest.
>> +			 *
>> +			 * UNIXWARE clears bit 0 of MC1_CTL to ignore
>> +			 * correctable, single-bit ECC data errors.
>>  			 */
>>  			if ((offset & 0x3) == 0 &&
>> -			    data != 0 && (data | (1 << 10)) != ~(u64)0)
>> +			    data != 0 && (data | (1 << 10) | 1) != ~(u64)0)
>>  				return -1;
>
> If KVM injects a #GP like it's supposed to[*], will UNIXWARE eat the #GP and continue
> on, or will it explode?  If it continues on, I'd prefer to avoid more special casing in
> KVM.
>
> If it explodes, I think my preference would be to just drop the MCi_CTL checks
> entirely.  AFAICT, P4-based and P5-based Intel CPUs, and all? AMD CPUs allow
> setting/clearing arbitrary bits.  The checks really aren't buying us anything,
> and it seems like Intel retroactively defined the "architectural" behavior of
> only 0s/1s.
>
> [*] https://lore.kernel.org/all/20220512222716.4112548-2-seanjc@google.com

Unfortunately, I cannot say whether the UNIXWARE kernel would panic,
because QEMU enters a STOP state from which attempts to continue are
met with "Error: Resetting the Virtual Machine is required."
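
For reference, the MCi_CTL value check under discussion can be modelled
in user space as below.  This is only a sketch: the function names are
hypothetical, and it assumes the MSR being written has already been
identified as a bank's CTL register (the "(offset & 0x3) == 0" part of
the condition in set_msr_mce).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Condition before the patch: a write is rejected unless the value is
 * 0, all 1s, or all 1s with bit 10 clear (the AMD K8 GART workaround).
 */
static inline bool mci_ctl_rejected_old(uint64_t data)
{
	return data != 0 &&
	       (data | (UINT64_C(1) << 10)) != ~UINT64_C(0);
}

/*
 * Condition with the patch: bit 0 may additionally be clear, which is
 * the value UNIXWARE writes to MC1_CTL.
 */
static inline bool mci_ctl_rejected_new(uint64_t data)
{
	return data != 0 &&
	       (data | (UINT64_C(1) << 10) | UINT64_C(1)) != ~UINT64_C(0);
}
```

With the value from the register dump (0xfffffffffffffffe), the old
condition rejects the write and the patched one accepts it.  Note that,
like the existing bit 10 exception, the new exception is not restricted
to a particular bank.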

Thanks for the feedback, Lev

