From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Derek Yerger <derek@djy.llc>
Cc: Alex Williamson <alex.williamson@redhat.com>,
kvm@vger.kernel.org, "Bonzini, Paolo" <pbonzini@redhat.com>
Subject: Re: PROBLEM: Regression of MMU causing guest VM application errors
Date: Tue, 22 Oct 2019 13:28:47 -0700
Message-ID: <20191022202847.GO2343@linux.intel.com>
In-Reply-To: <53f506b3-e864-b3ca-f18f-f8e9a1612072@djy.llc>

On Thu, Oct 17, 2019 at 07:57:35PM -0400, Derek Yerger wrote:
> On 10/16/19 1:49 PM, Sean Christopherson wrote:
> >On Wed, Oct 16, 2019 at 11:28:57AM -0600, Alex Williamson wrote:
> >>On Wed, 16 Oct 2019 00:49:51 -0400
> >>Derek Yerger<derek@djy.llc> wrote:
> >>
> >>>In at least Linux 5.2.7 via Fedora, up to 5.2.18, guest OS applications
> >>>repeatedly crash with segfaults. The problem does not occur on 5.1.16.
> >>>
> >>>System is running Fedora 29 with kernel 5.2.18. Guest OS is Windows 10 with an
> >>>AMD Radeon 540 GPU passthrough. When on 5.2.7 or 5.2.18, specific windows
> >>>applications frequently and repeatedly crash, throwing exceptions in random
> >>>libraries. Going back to 5.1.16, the issue does not occur.
> >>>
> >>>The host system is unaffected by the regression.
> >>>
> >>>Keywords: kvm mmu pci passthrough vfio vfio-pci amdgpu
> >>>
> >>>Possibly related: Unmerged [PATCH] KVM: x86/MMU: Zap all when removing memslot
> >>>if VM has assigned device
> >>That was never merged because it was superseded by:
> >>
> >>d012a06ab1d2 Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
> >>
> >>That revert also induced this commit:
> >>
> >>002c5f73c508 KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot
> >>
> >>Both of these were merged to stable, showing up in 5.2.11 and 5.2.16
> >>respectively, so seeing these sorts of issues might be considered a
> >>known issue on 5.2.7, but not 5.2.18 afaik. Do you have a specific
> >>test that reliably reproduces the issue? Thanks,
> Test case 1: Kernel 5.2.18, PCI passthrough, Windows 10 guest, error condition.
> Error 1: Application error in Firefox, restarting firefox and restoring tabs
> reliably causes application crash with stack overflow error.
> Error 2: Guest BSOD by the morning if left idle
> Error 3: Guest BSOD within 1 minute of using SolidWorks CAD software
>
> Test case 2: Kernel 5.2.18, no PCI passthrough, same environment. Guest BSOD
> encountered.
>
> Test case 3: Kernel 5.1.16, no PCI passthrough, same environment. Worked in
> Solidworks for 10 minutes without BSOD. Opened firefox and restored tabs, no
> crash.
>
> Test case 4: Kernel 5.1.16, with PCI passthrough, same environment. Worked
> in Solidworks for a half hour. Opened firefox and restored tabs, no crash.
>
> Other factors: The guest does not change between tests. Same drivers,
> software, etc. I have reliably switched between 5.2.x and 5.1.x multiple
> times in the past month and repeatably see issues with 5.2.x. At this point
> I'm unsure if it's PCI passthrough causing the problem.
>
> I know I should probably start from fresh host and guest, but time isn't
> really permitting.
> >Also, does the failure reproduce on 5.2.1 - 5.2.6? The memslot debacle
> >exists on all flavors of 5.2.x, if the errors showed up in 5.2.7 then they
> >are being caused by something else.
> After experiencing the issue in absence of PCI passthrough, I believe the
> problem is unrelated to the memslot debacle.

Heh, should've checked from the get go... It's definitely not the memslot
issue, because the memslot bug is in 5.1.16 as well. :-)

> I'm stuck on 5.1.x for now, maybe I'll give up and get a dedicated windows
> machine /s

What hardware are you running on? I was thinking this was AMD specific,
but then realized you said "AMD Radeon 540 GPU" and not "AMD CPU".