From: Sean Christopherson <seanjc@google.com>
To: Brian Cowan <brcowan@gmail.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: A really weird guest crash, that ONLY happens on KVM, and ONLY on 6th gen+ Intel Core CPU's
Date: Wed, 18 May 2022 21:26:42 +0000
Message-ID: <YoVkkrXbGFz3PmVY@google.com>
In-Reply-To: <CAPUGS=oTTzn+HjXMdSK7jsysCagfipmnj25ofNFKD03rq=3Brw@mail.gmail.com>
On Wed, May 18, 2022, Brian Cowan wrote:
> Hi all, looking for hints on a wild crash.
>
> The company I work for has a kernel driver used to literally make a db
> query result look like a filesystem… The “database” in question being
> a proprietary SCM repository… (ClearCase, for those who have been
> around forever… Like me…)
>
> We have a crash on mounting the remote repository ONE way (ClearCase
> “Automatic views”) but not another (ClearCase “Dynamic views”) where
> both use the same kernel driver… The guest OS is RHEL 7.8, not
> registered with RH (since the VM is only supposed to last a couple of
> days.) The host OS is Ubuntu 20.04.2 LTS, though that does not seem to
> matter.
>
> The wild part is that this only happens when the ClearCase host is a
> KVM guest, and only on 6th-generation or newer Intel Core CPUs. It does NOT happen
> on:
> * VMWare Virtual machines configured identically
> * VirtualBox Virtual machines Configured identically
> * 2nd generation intel core hosts running the same KVM release.
> (because OF COURSE my office "secondary desktop" host is ancient...
Heh, Sandy Bridge isn't ancient, we still get bug reports for Core2 :-)
> * A 4th generation I7 host running Ubuntu 22.04 and that version’s
> default KVM. (Because I am a laptop packrat. That laptop had been
> sitting on a bookshelf for 3+ years and I went "what if...")
What kernel version is the 6th gen (Skylake) host running under 20.04.2? Same
question for the 4th gen (Haswell) host on 22.04. And if it's not too much trouble,
can you try running the Skylake host with the 22.04 kernel, or vice versa? Not super high priority if it's a
pain, the fact that the bug goes away based on what's advertised to the guest
suggests this might be a guest bug. But, it could also be a KVM bug that's
specific to a feature that's only supported in Skylake+.
> If I edit the KVM configuration and change the “mirror host CPU”
> option to use the 2nd or 4th generation CPU options, the crash stops
> happening… If this was happening on physical machines, the VM crash
> would make sense, but it's literally a hypervisor-specific crash.
>
> Any hints, tips, or comments would be most appreciated... Never
> thought I'd be trying to debug kernel/hypervisor interactions, but
> here I am...
It might be that there's a guest bug. And even if it's not a guest bug, you can
likely identify exactly what feature is problematic, though it might require
invoking QEMU directly (I don't know exactly what level of vCPU customization
libvirt allows).
First thing to try: does it repro by explicitly specifying "Skylake-Client" as the
vCPU model? No idea what libvirt calls that. If the crash goes away with that
model, then I think XSAVES would be to blame; AFAICT that's the only thing that
might be exposed by "mirror host CPU" and not the explicit "Skylake-Client". XSAVES
being to blame seems unlikely though.
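To see which of these features a given vCPU model actually exposes, one option is to check the "flags" line of /proc/cpuinfo inside the guest. A minimal sketch, assuming a Linux guest; the helper names guest_flags and report are made up for illustration, and the flag spellings are the ones Linux prints in /proc/cpuinfo:

```python
def guest_flags(cpuinfo_text):
    """Return the set of CPU flags from the first "flags" line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text, wanted=("xsaves", "xsavec", "xgetbv1", "smap")):
    """Map each flag of interest to True/False depending on whether the guest sees it."""
    flags = guest_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    # Run inside the guest, once per vCPU model, and compare the results.
    with open("/proc/cpuinfo") as f:
        print(report(f.read()))
```

Running it once with "mirror host CPU" and once with "Skylake-Client" should show whether xsaves is really the only difference.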
Assuming "Skylake-Client" fails, the next step would be to disable features that
are in "Skylake-Client" but not "Haswell", one by one, to figure out what's to
blame.
In QEMU, the features I see being in Skylake but not Haswell are:
3dnowprefetch, rdseed, adx, smap, xsavec, xgetbv1
Again, no idea if/how libvirt exposes that level of granularity. For running
QEMU directly, removing all those features would be:
-cpu Skylake-Client,-3dnowprefetch,-rdseed,-adx,-smap,-xsavec,-xgetbv1
My money is on SMAP :-)
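
For the one-by-one approach, the -cpu arguments can be generated instead of typed by hand. A rough sketch, assuming the feature list above is complete for your QEMU version; cpu_arg and one_at_a_time are hypothetical helper names, and the rest of the QEMU command line is up to you:

```python
# Features listed above as being in QEMU's "Skylake-Client" but not "Haswell".
FEATURES = ["3dnowprefetch", "rdseed", "adx", "smap", "xsavec", "xgetbv1"]


def cpu_arg(disabled, model="Skylake-Client"):
    """Build the value for QEMU's -cpu option with the given features turned off."""
    return ",".join([model] + ["-" + feature for feature in disabled])


def one_at_a_time(features=FEATURES):
    """Yield (feature, -cpu value) pairs, disabling exactly one feature per run."""
    for feature in features:
        yield feature, cpu_arg([feature])


if __name__ == "__main__":
    for feature, arg in one_at_a_time():
        print(f"{feature:>14}: -cpu {arg}")
```

Boot the guest once per printed line; the run where the crash disappears points at the feature to look at.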