From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: "Kani, Toshimitsu" <toshi.kani@hpe.com>,
	"eswierk@skyportsystems.com" <eswierk@skyportsystems.com>
Cc: "jgross@suse.com" <jgross@suse.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"toshi.kani@hp.com" <toshi.kani@hp.com>,
	"david.vrabel@citrix.com" <david.vrabel@citrix.com>,
	"bp@suse.de" <bp@suse.de>
Subject: Re: PAT-related crash booting Linux 4.4 + Xen 4.5 on VMware ESXi
Date: Tue, 24 May 2016 11:54:59 -0400
Message-ID: <57447953.9060005@oracle.com>
In-Reply-To: <1464101000.3504.32.camel@hpe.com>

On 05/24/2016 10:53 AM, Kani, Toshimitsu wrote:
> On Mon, 2016-05-23 at 15:52 -0700, Ed Swierk wrote:
>> Good question. I ran my tests again, and found I'd misinterpreted the
>> Fusion behavior.
>>
>> On Fusion 8.1.1, MSR_IA32_CR_PAT returns a reasonable value:
>>
>> (XEN) Freed 308kB init memory.
>> mapping kernel into physical memory
>> cpu_has_pat=0 cpuid_edx(1)=f89cbf5 pat=65536
>> pat_init_cache_modes pat=50100070406
>> pat_init_cache_modes i=7 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=6 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=5 pat_val=5 cache=5
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=4 pat_val=1 cache=1
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=3 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=2 pat_val=7 cache=2
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=1 pat_val=4 cache=4
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=0 pat_val=6 cache=0
>> pat_init_cache_modes ok
>> pat_init_cache_modes pat_msg=WB  WT  UC- UC  WC  WP  UC  UC
>> about to get started...
>> [    0.000000] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WC  WP  UC  UC
>>
>> On ESXi 5.5.0, MSR_IA32_CR_PAT returns 0, and we are indeed hitting
>> the BUG_ON in update_cache_mode_entry():
>>
>> (XEN) Freed 312kB init memory.
>> mapping kernel into physical memory
>> cpu_has_pat=0 cpuid_edx(1)=f89cbf5 pat=65536
>> pat_init_cache_modes pat=0
>> pat_init_cache_modes i=7 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=6 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=5 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=4 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=3 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=2 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=1 pat_val=0 cache=3
>> pat_init_cache_modes ok
>> pat_init_cache_modes i=0 pat_val=0 cache=3
>> (XEN) traps.c:459:d0v0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d0802276c3 create_bounce_frame+0x12b/0x13a
>>
>> In both cases, the PAT CPUID feature bit is set, and cpu_has_pat is
>> always 0 at this early point (so my RFC patch is wrong). The simplest
>> fix is to call pat_init_cache_modes(pat) only if pat != 0.
>>
>> This is starting to look like the same logic that's in pat_bsp_init(),
>> which doesn't seem to be called when booting on Xen. Should it be? Was
>> Xen deliberately excluded from this PAT emulation change?
>> https://groups.google.com/d/msg/linux.kernel/JoJKbCOxV0U/PM0I9d1v60kJ
> Calling pat_init() requires the CPU rendezvous handler in MTRR, which is
> disabled in Xen.  This PAT initialization has been problematic, and the
> following patches addressed it in 4.6.  This will fix your problem as
> well. 
> https://lkml.org/lkml/2016/3/23/500
>
> In particular, patch 6/7 removed the Xen code in question.
> https://lkml.org/lkml/2016/3/23/503
>
> Do you need to fix this issue in 4.4?  If so, we should be able to request
> backporting the patches to 4.4 stable.


Wouldn't it work to simply disable PAT when the MSR is clearly broken
(e.g. it reads back as 0, as on ESXi above), rather than trying to
emulate it?
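
For illustration only, here is a minimal userspace sketch of that kind of
check. It is not the kernel code: pat_entry(), pat_describe() and
pat_msr_usable() are hypothetical names, and treating an all-zero MSR
value as "broken" is an assumption taken from the ESXi log quoted above.
The memory-type encodings are the architectural ones (0=UC, 1=WC, 4=WT,
5=WP, 6=WB, 7=UC-; 2 and 3 are reserved).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural PAT memory-type encodings; values 2 and 3 are reserved. */
static const char *pat_type_name(unsigned int type)
{
    switch (type) {
    case 0: return "UC";
    case 1: return "WC";
    case 4: return "WT";
    case 5: return "WP";
    case 6: return "WB";
    case 7: return "UC-";
    default: return "??";
    }
}

/* Entry i lives in the low 3 bits of byte i of the 64-bit MSR value. */
static unsigned int pat_entry(uint64_t pat, int i)
{
    return (unsigned int)((pat >> (i * 8)) & 7);
}

/*
 * Hypothetical policy (an assumption, not the kernel's logic): an MSR
 * that reads back as 0 is treated as broken, so the caller would
 * disable PAT instead of building cache-mode tables from it.
 */
static int pat_msr_usable(uint64_t pat)
{
    return pat != 0;
}

/* Hypothetical helper: format the eight entries, space-separated. */
static void pat_describe(uint64_t pat, char *buf, size_t len)
{
    size_t off = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (int i = 0; i < 8 && off < len; i++) {
        int n = snprintf(buf + off, len - off, "%s%s",
                         i ? " " : "", pat_type_name(pat_entry(pat, i)));
        if (n < 0)
            break;
        off += (size_t)n;
    }
}
```

With the Fusion value from the log (0x50100070406), pat_describe() yields
"WB WT UC- UC WC WP UC UC", matching the pat_msg line above, while the
ESXi value (0) fails pat_msr_usable() before any table is built.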

-boris


