public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org,  rcu@vger.kernel.org,
	linux-kernel@vger.kernel.org,  Kevin Tian <kevin.tian@intel.com>,
	Yan Zhao <yan.y.zhao@intel.com>,
	 Yiwei Zhang <zzyiwei@google.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	 "Paul E. McKenney" <paulmck@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>
Subject: Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop
Date: Fri, 30 Aug 2024 09:13:03 -0700	[thread overview]
Message-ID: <ZtHvjzBFUbG3fcMc@google.com> (raw)
In-Reply-To: <87plpqt6uh.fsf@redhat.com>

On Fri, Aug 30, 2024, Vitaly Kuznetsov wrote:
> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> 
> > Sean Christopherson <seanjc@google.com> writes:
> >
> >> On Fri, Aug 30, 2024, Vitaly Kuznetsov wrote:
> >>> Gerd Hoffmann <kraxel@redhat.com> writes:
> >>> 
> >>> >> Necroposting!
> >>> >> 
> >>> >> Turns out that this change broke "bochs-display" driver in QEMU even
> >>> >> when the guest is modern (don't ask me 'who the hell uses bochs for
> >>> >> modern guests', it was basically a configuration error :-). E.g:
> >>> >
> >>> > qemu stdvga (the default display device) is affected too.
> >>> >
> >>> 
> >>> So far, I was only able to verify that the issue has nothing to do with
> >>> OVMF and multi-vcpu, it reproduces very well with
> >>> 
> >>> $ qemu-kvm -machine q35,accel=kvm,kernel-irqchip=split -name guest=c10s
> >>> -cpu host -smp 1 -m 16384 -drive file=/var/lib/libvirt/images/c10s-bios.qcow2,if=none,id=drive-ide0-0-0
> >>> -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
> >>> -vnc :0 -device VGA -monitor stdio --no-reboot
> >>> 
> >>> Comparing traces of working and broken cases, I couldn't find anything
> >>> suspicious but I may had missed something of course. For now, it seems
> >>> like a userspace misbehavior resulting in a segfault.
> >>
> >> Guest userspace?
> >>
> >
> > Yes? :-) As Gerd described, video memory is "mapped into userspace so
> > the wayland / X11 display server can software-render into the buffer"
> > and it seems that wayland gets something unexpected in this memory and
> > crashes. 
> 
> Also, I don't know if it helps or not, but out of two hunks in
> 377b2f359d1f, it is the vmx_get_mt_mask() one which brings the
> issue. I.e. the following is enough to fix things:
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f18c2d8c7476..733a0c45d1a6 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7659,13 +7659,11 @@ u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>  
>         /*
>          * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
> -        * device attached and the CPU doesn't support self-snoop.  Letting the
> -        * guest control memory types on Intel CPUs without self-snoop may
> -        * result in unexpected behavior, and so KVM's (historical) ABI is to
> -        * trust the guest to behave only as a last resort.
> +        * device attached.  Letting the guest control memory types on Intel
> +        * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
> +        * the guest to behave only as a last resort.
>          */
> -       if (!static_cpu_has(X86_FEATURE_SELFSNOOP) &&
> -           !kvm_arch_has_noncoherent_dma(vcpu->kvm))
> +       if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
>                 return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
>  
>         return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT);

Hmm, that suggests the guest kernel maps the buffer as WC.  And looking at the
bochs driver, IIUC, the kernel mappings via ioremap() are UC-, not WC.  So it
could be that userspace doesn't play nice with WC, but could it also be that the
QEMU backend doesn't play nice with WC (on Intel)?

Given that this is a purely synthetic device, is there any reason to use UC or WC?
I.e. can the bochs driver configure its VRAM buffers to be WB?  It doesn't look
super easy (the DRM/TTM code has so. many. layers), but it appears doable.  Since
the device only exists in VMs, it's possible the bochs driver has never run on
Intel CPUs with WC memtype.

The one thing that confuses and concerns me is that this broke in the first place.
KVM has honored guest PAT on SVM since forever, which is why I/we had decent
confidence KVM could honor guest PAT on VMX without breaking anything.  SVM (NPT)
has an explicitlyed document special "WC+" memtype, where guest=WC && host=WB == WC+,
and WC+ accesses snoop caches on all CPUs.

But per Intel engineers, Intel CPUs with self-snoop are supposed to snoop caches
on all processors too.

I assume this same setup works fine on AMD/SVM?  If so, we probably need to do
more digging before fudging around this in the guest.

  reply	other threads:[~2024-08-30 16:13 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-09  1:09 [PATCH 0/5] KVM: VMX: Drop MTRR virtualization, honor guest PAT Sean Christopherson
2024-03-09  1:09 ` [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes Sean Christopherson
2024-03-11  7:44   ` Yan Zhao
2024-03-12  0:08     ` Sean Christopherson
2024-03-12  1:10   ` Dongli Zhang
2024-03-12 17:08     ` Sean Christopherson
2024-03-14 10:31       ` Dongli Zhang
2024-03-14 14:47         ` Sean Christopherson
2024-03-09  1:09 ` [PATCH 2/5] KVM: VMX: Drop support for forcing UC memory when guest CR0.CD=1 Sean Christopherson
2024-03-09  1:09 ` [PATCH 3/5] srcu: Add an API for a memory barrier after SRCU read lock Sean Christopherson
2024-03-09  1:09 ` [PATCH 4/5] KVM: x86: Ensure a full memory barrier is emitted in the VM-Exit path Sean Christopherson
2024-06-20 22:38   ` Paolo Bonzini
2024-06-20 23:42     ` Paul E. McKenney
2024-06-21  0:52     ` Yan Zhao
2024-03-09  1:09 ` [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop Sean Christopherson
2024-03-11  1:16   ` Yan Zhao
2024-03-12  0:25     ` Sean Christopherson
2024-03-12  7:30       ` Tian, Kevin
2024-03-12 16:07         ` Sean Christopherson
2024-03-13  1:18           ` Yan Zhao
2024-03-13  8:52             ` Tian, Kevin
2024-03-13  8:55               ` Yan Zhao
2024-03-13 15:09                 ` Sean Christopherson
2024-03-14  0:12                   ` Yan Zhao
2024-03-14  1:00                     ` Sean Christopherson
2024-03-25  3:43   ` Chao Gao
2024-04-01 22:29     ` Sean Christopherson
2024-08-30  9:35   ` Vitaly Kuznetsov
2024-08-30 11:05     ` Gerd Hoffmann
2024-08-30 13:47       ` Vitaly Kuznetsov
2024-08-30 13:52         ` Sean Christopherson
2024-08-30 14:06           ` Vitaly Kuznetsov
2024-08-30 14:37             ` Vitaly Kuznetsov
2024-08-30 16:13               ` Sean Christopherson [this message]
2024-09-02  8:23                 ` Gerd Hoffmann
2024-09-02  1:44         ` Yan Zhao
2024-09-02  9:49           ` Vitaly Kuznetsov
2024-09-03  0:25             ` Yan Zhao
2024-09-03 15:30             ` Sean Christopherson
2024-09-03 16:20               ` Vitaly Kuznetsov
2024-09-04  2:28                 ` Yan Zhao
2024-09-04 12:17                   ` Yan Zhao
2024-09-05  0:41                     ` Sean Christopherson
2024-09-05  9:43                       ` Yan Zhao
2024-09-09  5:30                         ` Yan Zhao
2024-09-09 13:24                           ` Paolo Bonzini
2024-09-09 16:04                             ` Sean Christopherson
2024-09-10  1:05                             ` Yan Zhao
2024-09-04 11:47                 ` Vitaly Kuznetsov
2024-10-07 13:28     ` Linux regression tracking (Thorsten Leemhuis)
2024-10-07 13:38       ` Vitaly Kuznetsov
2024-10-07 14:04         ` Linux regression tracking (Thorsten Leemhuis)
2024-03-22  9:29 ` [PATCH 0/5] KVM: VMX: Drop MTRR virtualization, honor guest PAT Ma, Yongwei
2024-03-22 13:08 ` Yan Zhao
2024-03-25  6:56   ` Ma, XiangfeiX
2024-03-25  8:02     ` Ma, XiangfeiX
2024-06-05 23:20 ` Sean Christopherson
2024-06-06  0:03   ` Paul E. McKenney
  -- strict thread matches above, loose matches on Subject: below --
2025-04-10  1:13 [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop Myrsky Lintu
2025-04-10  5:12 ` Yan Zhao
2025-04-10 10:05   ` Myrsky Lintu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZtHvjzBFUbG3fcMc@google.com \
    --to=seanjc@google.com \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=kevin.tian@intel.com \
    --cc=kraxel@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rcu@vger.kernel.org \
    --cc=vkuznets@redhat.com \
    --cc=yan.y.zhao@intel.com \
    --cc=zzyiwei@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox