From: "Kani, Toshimitsu" <toshi.kani@hpe.com>
To: "boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
"eswierk@skyportsystems.com" <eswierk@skyportsystems.com>
Cc: "jgross@suse.com" <jgross@suse.com>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
"Kani, Toshimitsu" <toshi.kani@hpe.com>,
"x86@kernel.org" <x86@kernel.org>,
"david.vrabel@citrix.com" <david.vrabel@citrix.com>,
"bp@suse.de" <bp@suse.de>
Subject: Re: PAT-related crash booting Linux 4.4 + Xen 4.5 on VMware ESXi
Date: Tue, 24 May 2016 16:59:47 +0000
Message-ID: <1464108603.3504.40.camel@hpe.com>
In-Reply-To: <57447953.9060005@oracle.com>
On Tue, 2016-05-24 at 11:54 -0400, Boris Ostrovsky wrote:
> On 05/24/2016 10:53 AM, Kani, Toshimitsu wrote:
> >
> > On Mon, 2016-05-23 at 15:52 -0700, Ed Swierk wrote:
> > >
> > > Good question. I ran my tests again, and found I'd misinterpreted the
> > > Fusion behavior.
> > >
> > > On Fusion 8.1.1, MSR_IA32_CR_PAT returns a reasonable value:
> > >
> > > (XEN) Freed 308kB init memory.
> > > mapping kernel into physical memory
> > > cpu_has_pat=0 cpuid_edx(1)=f89cbf5 pat=65536
> > > pat_init_cache_modes pat=50100070406
> > > pat_init_cache_modes i=7 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=6 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=5 pat_val=5 cache=5
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=4 pat_val=1 cache=1
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=3 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=2 pat_val=7 cache=2
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=1 pat_val=4 cache=4
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=0 pat_val=6 cache=0
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes pat_msg=WB WT UC- UC WC WP UC UC
> > > about to get started...
> > > > [ 0.000000] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC
> > >
> > > On ESXi 5.5.0, MSR_IA32_CR_PAT returns 0, and we are indeed hitting
> > > the BUG_ON in update_cache_mode_entry():
> > >
> > > (XEN) Freed 312kB init memory.
> > > mapping kernel into physical memory
> > > cpu_has_pat=0 cpuid_edx(1)=f89cbf5 pat=65536
> > > pat_init_cache_modes pat=0
> > > pat_init_cache_modes i=7 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=6 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=5 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=4 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=3 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=2 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=1 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=0 pat_val=0 cache=3
> > > > (XEN) traps.c:459:d0v0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
> > > > (XEN) domain_crash_sync called from entry.S: fault at ffff82d0802276c3 create_bounce_frame+0x12b/0x13a
> > >
> > > In both cases, the PAT CPUID feature bit is set, and cpu_has_pat is
> > > always 0 at this early point (so my RFC patch is wrong). The simplest
> > > fix is to call pat_init_cache_modes(pat) only if pat != 0.
> > >
> > > > This is starting to look like the same logic that's in pat_bsp_init(),
> > > > which doesn't seem to be called when booting on Xen. Should it be? Was
> > > > Xen deliberately excluded from this PAT emulation change?
> > > https://groups.google.com/d/msg/linux.kernel/JoJKbCOxV0U/PM0I9d1v60kJ
> >
> > Calling pat_init() requires the CPU rendezvous handler in MTRR, which
> > is disabled in Xen. This PAT initialization has been problematic, and
> > the following patches addressed it in 4.6. This will fix your problem
> > as well.
> > https://lkml.org/lkml/2016/3/23/500
> >
> > In particular, patch 6/7 removed the Xen code in question.
> > https://lkml.org/lkml/2016/3/23/503
> >
> > Do you need to fix this issue in 4.4? If so, we should be able to
> > request backporting the patches to 4.4 stable.
>
> Would disabling PAT when the MSR is clearly broken (and not trying to
> emulate it) not work?
That should work, but the patches above also fix the qemu32 issue found in
4.4, so they need to be backported to 4.4 in any case.
https://lkml.org/lkml/2016/3/3/828
-Toshi
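
For readers following the logs quoted above, here is a minimal userspace sketch (not the kernel code) of how the eight PAT entries fall out of the MSR value: entry i is the low three bits of byte i. The type encodings follow the Intel SDM; the function names decode_pat() and pat_usable() are hypothetical and exist only for this illustration. The guard mirrors the "only if pat != 0" idea discussed in the thread and is an assumption about one possible workaround, not the fix that was merged.

```python
# PAT memory-type encodings per the Intel SDM; values 2 and 3 are reserved.
PAT_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB", 7: "UC-"}


def decode_pat(pat):
    """Return the memory-type names for PAT entries 0..7 (hypothetical helper)."""
    names = []
    for i in range(8):
        pat_val = (pat >> (i * 8)) & 0x7  # low 3 bits of byte i
        names.append(PAT_TYPES.get(pat_val, "reserved"))
    return names


def pat_usable(pat):
    """Hypothetical guard: treat an all-zero MSR read as 'PAT not usable'."""
    return pat != 0


if __name__ == "__main__":
    # MSR values taken from the two logs in this thread.
    for label, pat in (("Fusion 8.1.1", 0x50100070406), ("ESXi 5.5.0", 0x0)):
        print(label, "usable" if pat_usable(pat) else "unusable",
              " ".join(decode_pat(pat)))
```

With the ESXi value of 0, every entry decodes to UC, including entry 0. As far as I can tell from the 4.4 source, update_cache_mode_entry() requires entry 0 to be WB, which would explain why the crash appears right after the i=0 line in the log.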
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Thread overview: 8+ messages
2016-05-20 23:58 PAT-related crash booting Linux 4.4 + Xen 4.5 on VMware ESXi Ed Swierk
2016-05-23 14:15 ` Konrad Rzeszutek Wilk
2016-05-23 20:13 ` Boris Ostrovsky
2016-05-23 22:52 ` Ed Swierk
2016-05-24 14:53 ` Kani, Toshimitsu
2016-05-24 15:25 ` Ed Swierk
2016-05-24 15:54 ` Boris Ostrovsky
2016-05-24 16:59 ` Kani, Toshimitsu [this message]