From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	 Gerd Hoffmann <kraxel@redhat.com>,
	kvm@vger.kernel.org, rcu@vger.kernel.org,
	 linux-kernel@vger.kernel.org, Kevin Tian <kevin.tian@intel.com>,
	 Yiwei Zhang <zzyiwei@google.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	 "Paul E. McKenney" <paulmck@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>
Subject: Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop
Date: Mon, 9 Sep 2024 09:04:17 -0700
Message-ID: <Zt8cgUASZCN6gP8H@google.com>
In-Reply-To: <c1d420ba-13de-48dd-abee-473988172d07@redhat.com>

On Mon, Sep 09, 2024, Paolo Bonzini wrote:
> On 9/9/24 07:30, Yan Zhao wrote:
> > On Thu, Sep 05, 2024 at 05:43:17PM +0800, Yan Zhao wrote:
> > > On Wed, Sep 04, 2024 at 05:41:06PM -0700, Sean Christopherson wrote:
> > > > On Wed, Sep 04, 2024, Yan Zhao wrote:
> > > > > On Wed, Sep 04, 2024 at 10:28:02AM +0800, Yan Zhao wrote:
> > > > > > On Tue, Sep 03, 2024 at 06:20:27PM +0200, Vitaly Kuznetsov wrote:
> > > > > > > Sean Christopherson <seanjc@google.com> writes:
> > > > > > > 
> > > > > > > > On Mon, Sep 02, 2024, Vitaly Kuznetsov wrote:
> > > > > > > > > FWIW, I use QEMU-9.0 from the same C10S (qemu-kvm-9.0.0-7.el10.x86_64)
> > > > > > > > > but I don't think it matters in this case. My CPU is "Intel(R) Xeon(R)
> > > > > > > > > Silver 4410Y".
> > > > > > > > 
> > > > > > > > Has this been reproduced on any other hardware besides SPR?  I.e. did we stumble
> > > > > > > > on another hardware issue?
> > > > > > > 
> > > > > > > Very possible, as according to Yan Zhao this doesn't reproduce on at
> > > > > > > least "Coffee Lake-S". Let me try to grab some random hardware around
> > > > > > > and I'll be back with my observations.
> > > > > > 
> > > > > > Update some new findings from my side:
> > > > > > 
> > > > > > BAR 0 of bochs VGA (fb_map) is used for frame buffer, covering phys range
> > > > > > from 0xfd000000 to 0xfe000000.
> > > > > > 
> > > > > > On "Sapphire Rapids XCC":
> > > > > > 
> > > > > > 1. If KVM forces this fb_map range to be WC+IPAT, installer/gdm can launch
> > > > > >     correctly.
> > > > > >     i.e.
> > > > > >     if (gfn >= 0xfd000 && gfn < 0xfe000) {
> > > > > >     	return (MTRR_TYPE_WRCOMB << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
> > > > > >     }
> > > > > >     return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
> > > > > > 
> > > > > > > 2. If KVM forces this fb_map range to be UC+IPAT, installer fails to show / gdm
> > > > > >     restarts endlessly. (though on Coffee Lake-S, installer/gdm can launch
> > > > > >     correctly in this case).
> > > > > > 
> > > > > > 3. On starting GDM, ttm_kmap_iter_linear_io_init() in guest is called to set
> > > > > >     this fb_map range as WC, with
> > > > > >     iosys_map_set_vaddr_iomem(&iter_io->dmap, ioremap_wc(mem->bus.offset, mem->size));
> > > > > > 
> > > > > >     However, during bochs_pci_probe()-->bochs_load()-->bochs_hw_init(), pfns for
> > > > > >     this fb_map has been reserved as uc- by ioremap().
> > > > > >     Then, the ioremap_wc() during starting GDM will only map guest PAT with UC-.
> > > > > > 
> > > > > >     So, with KVM setting WB (no IPAT) to this fb_map range, the effective
> > > > > >     memory type is UC- and installer/gdm restarts endlessly.
> > > > > > 
> > > > > > 4. If KVM sets WB (no IPAT) to this fb_map range, and changes guest bochs driver
> > > > > >     to call ioremap_wc() instead in bochs_hw_init(), gdm can launch correctly.
> > > > > >     (didn't verify the installer's case as I can't update the driver in that case).
> > > > > > 
> > > > > >     The reason is that the ioremap_wc() called during starting GDM will no longer
> > > > > >     meet conflict and can map guest PAT as WC.
> > > > 
> > > > Huh.  The upside of this is that it sounds like there's nothing broken with WC
> > > > or self-snoop.
> > > Considering a different perspective, the fb_map range is used as frame buffer
> > > (vram), with the guest writing to this range and the host reading from it.
> > > If the issue were related to self-snooping, we would expect the VNC window to
> > > display distorted data. However, the observed behavior is that the GDM window
> > > shows up correctly for a sec and restarts over and over.
> > > 
> > > So, do you think we can simply fix this issue by calling ioremap_wc() for the
> > > frame buffer/vram range in bochs driver, as is commonly done in other gpu
> > > drivers?
> > > 
> > > --- a/drivers/gpu/drm/tiny/bochs.c
> > > +++ b/drivers/gpu/drm/tiny/bochs.c
> > > @@ -261,7 +261,9 @@ static int bochs_hw_init(struct drm_device *dev)
> > >          if (pci_request_region(pdev, 0, "bochs-drm") != 0)
> > >                  DRM_WARN("Cannot request framebuffer, boot fb still active?\n");
> > > 
> > > -       bochs->fb_map = ioremap(addr, size);
> > > +       bochs->fb_map = ioremap_wc(addr, size);
> > >          if (bochs->fb_map == NULL) {
> > >                  DRM_ERROR("Cannot map framebuffer\n");
> > >                  return -ENOMEM;
> 
> While this is a fix for future kernels, it doesn't change the result for VMs
> already in existence.

I would prefer to bottom out on exactly whether or not the SPR/CLX behavior is
working as intended.  Maybe the ~8x slowdown is just a side effect of any Intel
multi-socket/node system, but I think we should get confirmation (inasmuch as
possible) that that is indeed the case.  E.g. if this is actually a bug in CLX+,
then the actions we need to take are different.

> I don't think there's an alternative to putting this behind a quirk.

This gets a bit weird, which is why I want to bottom out on whether or not CLX
and SPR are working as intended.  If non-coherent DMA is attached to the VM, then
even before this patch KVM would honor guest PAT.  I agree that we don't want to
break existing setups, but if CLX+SPR are working as intended, then this is
inarguably a bochs driver bug, and I would prefer to have the quirk explicitly
reference bochs-compatible devices, e.g. in the name and documentation, so that
userspace can disable the quirk by default and only leave it enabled if a bochs
device is being exposed to the guest.
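
For readers following the quoted findings, the four experiments can be
summarized with a small model. This is only a sketch under stated assumptions,
not KVM code: the constants mirror the kernel's VMX_EPT_MT_EPTE_SHIFT,
VMX_EPT_IPAT_BIT and MTRR type values, while is_fb_gfn(), ept_memtype_bits()
and effective_memtype() are hypothetical helpers. Only the two situations
discussed in the thread are modeled: IPAT set (the EPT type wins) and a WB EPT
type without IPAT (the guest PAT type is honored).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Memory type encodings; UC- exists only as a PAT encoding. */
enum memtype {
	MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6, MT_UC_MINUS = 7
};

#define EPT_MT_SHIFT	3		/* memtype lives in EPTE bits 5:3 */
#define EPT_IPAT_BIT	(1ull << 6)	/* "ignore guest PAT" */

/* bochs VGA BAR 0 from the thread: 0xfd000000..0xfe000000, 4KiB pages. */
static bool is_fb_gfn(uint64_t gfn)
{
	return gfn >= 0xfd000 && gfn < 0xfe000;
}

/*
 * Hypothetical stand-in for the experimental hack quoted above.
 * forced_fb_mt < 0 means "no override": WB without IPAT everywhere.
 */
static uint64_t ept_memtype_bits(uint64_t gfn, int forced_fb_mt)
{
	if (forced_fb_mt >= 0 && is_fb_gfn(gfn))
		return ((uint64_t)forced_fb_mt << EPT_MT_SHIFT) | EPT_IPAT_BIT;
	return (uint64_t)MT_WB << EPT_MT_SHIFT;
}

/*
 * Effective memory type for the cases in the thread only.
 * Returns -1 for combinations this sketch does not model.
 */
static int effective_memtype(uint64_t ept_bits, enum memtype guest_pat)
{
	int ept_mt = (int)((ept_bits >> EPT_MT_SHIFT) & 0x7);

	assert(guest_pat <= MT_UC_MINUS);

	if (ept_bits & EPT_IPAT_BIT)
		return ept_mt;		/* guest PAT is ignored */
	if (ept_mt != MT_WB)
		return -1;		/* not modeled here */
	return guest_pat;		/* WB in EPT: guest PAT decides */
}
```

With gfn = 0xfd000000 >> 12, forcing MT_WC or MT_UC reproduces cases 1 and 2
regardless of guest PAT. With no override, a guest PAT of MT_UC_MINUS gives the
UC- result of case 3, and MT_WC gives the WC result of case 4.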

Thread overview: 61+ messages
2024-03-09  1:09 [PATCH 0/5] KVM: VMX: Drop MTRR virtualization, honor guest PAT Sean Christopherson
2024-03-09  1:09 ` [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes Sean Christopherson
2024-03-11  7:44   ` Yan Zhao
2024-03-12  0:08     ` Sean Christopherson
2024-03-12  1:10   ` Dongli Zhang
2024-03-12 17:08     ` Sean Christopherson
2024-03-14 10:31       ` Dongli Zhang
2024-03-14 14:47         ` Sean Christopherson
2024-03-09  1:09 ` [PATCH 2/5] KVM: VMX: Drop support for forcing UC memory when guest CR0.CD=1 Sean Christopherson
2024-03-09  1:09 ` [PATCH 3/5] srcu: Add an API for a memory barrier after SRCU read lock Sean Christopherson
2024-03-09  1:09 ` [PATCH 4/5] KVM: x86: Ensure a full memory barrier is emitted in the VM-Exit path Sean Christopherson
2024-06-20 22:38   ` Paolo Bonzini
2024-06-20 23:42     ` Paul E. McKenney
2024-06-21  0:52     ` Yan Zhao
2024-03-09  1:09 ` [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop Sean Christopherson
2024-03-11  1:16   ` Yan Zhao
2024-03-12  0:25     ` Sean Christopherson
2024-03-12  7:30       ` Tian, Kevin
2024-03-12 16:07         ` Sean Christopherson
2024-03-13  1:18           ` Yan Zhao
2024-03-13  8:52             ` Tian, Kevin
2024-03-13  8:55               ` Yan Zhao
2024-03-13 15:09                 ` Sean Christopherson
2024-03-14  0:12                   ` Yan Zhao
2024-03-14  1:00                     ` Sean Christopherson
2024-03-25  3:43   ` Chao Gao
2024-04-01 22:29     ` Sean Christopherson
2024-08-30  9:35   ` Vitaly Kuznetsov
2024-08-30 11:05     ` Gerd Hoffmann
2024-08-30 13:47       ` Vitaly Kuznetsov
2024-08-30 13:52         ` Sean Christopherson
2024-08-30 14:06           ` Vitaly Kuznetsov
2024-08-30 14:37             ` Vitaly Kuznetsov
2024-08-30 16:13               ` Sean Christopherson
2024-09-02  8:23                 ` Gerd Hoffmann
2024-09-02  1:44         ` Yan Zhao
2024-09-02  9:49           ` Vitaly Kuznetsov
2024-09-03  0:25             ` Yan Zhao
2024-09-03 15:30             ` Sean Christopherson
2024-09-03 16:20               ` Vitaly Kuznetsov
2024-09-04  2:28                 ` Yan Zhao
2024-09-04 12:17                   ` Yan Zhao
2024-09-05  0:41                     ` Sean Christopherson
2024-09-05  9:43                       ` Yan Zhao
2024-09-09  5:30                         ` Yan Zhao
2024-09-09 13:24                           ` Paolo Bonzini
2024-09-09 16:04                             ` Sean Christopherson [this message]
2024-09-10  1:05                             ` Yan Zhao
2024-09-04 11:47                 ` Vitaly Kuznetsov
2024-10-07 13:28     ` Linux regression tracking (Thorsten Leemhuis)
2024-10-07 13:38       ` Vitaly Kuznetsov
2024-10-07 14:04         ` Linux regression tracking (Thorsten Leemhuis)
2024-03-22  9:29 ` [PATCH 0/5] KVM: VMX: Drop MTRR virtualization, honor guest PAT Ma, Yongwei
2024-03-22 13:08 ` Yan Zhao
2024-03-25  6:56   ` Ma, XiangfeiX
2024-03-25  8:02     ` Ma, XiangfeiX
2024-06-05 23:20 ` Sean Christopherson
2024-06-06  0:03   ` Paul E. McKenney
  -- strict thread matches above, loose matches on Subject: below --
2025-04-10  1:13 [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop Myrsky Lintu
2025-04-10  5:12 ` Yan Zhao
2025-04-10 10:05   ` Myrsky Lintu
