* [RFH]: AMD SVM #PF error code with P and RSVD bit....
From: Mukesh Rathor @ 2014-06-14  1:03 UTC
  To: Xen-devel@lists.xensource.com, suravee.suthikulpanit,
	Aravind.Gopalakrishnan, boris.ostrovsky@oracle.com

Hi,

I am trying to debug a triple fault that occurs while bringing up a PVH
Linux domU on AMD.

Instruction:
ffffffff81d2d976: 8:dmi_scan_machine+b7          mov (%r12),%rax
r12: ffffffffff46e000
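
A minimal sketch, assuming standard 4-level long-mode paging with 4 KiB pages, of how the faulting address in r12 splits into the table indices used for the walk. The helper name pt_indices is hypothetical and only for illustration.

```python
def pt_indices(va):
    """Split a long-mode virtual address into its 4-level table indices."""
    return {
        "l4": (va >> 39) & 0x1FF,   # bits 47..39
        "l3": (va >> 30) & 0x1FF,   # bits 38..30
        "l2": (va >> 21) & 0x1FF,   # bits 29..21
        "l1": (va >> 12) & 0x1FF,   # bits 20..12
        "offset": va & 0xFFF,       # bits 11..0
    }

idx = pt_indices(0xFFFFFFFFFF46E000)
print(idx)
```

These indices select one entry per level; the four entries listed further down in this mail are the ones reached by that walk.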

This first causes #PF:
(XEN) exitcode = 0x4e exitintinfo = 0
(XEN) exitinfo1 = 0x9 exitinfo2 = 0xffffffffff46e000 

error_code == 0x9 => RSVD bit set. According to the APM:

   RSV—Bit 3. If this bit is set to 1, the page fault is a result
   of the processor reading a 1 from a reserved field within a
   page-translation-table entry. This type of page fault occurs only
   when CR4.PSE=1 or CR4.PAE=1.
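
As a rough sketch of how the exit information above can be read: SVM reports an intercepted exception as exit code 0x40 plus the vector, and for #PF exitinfo1 carries the error code while exitinfo2 carries the faulting address. The function names below are hypothetical; the bit names follow the APM.

```python
# SVM exit codes 0x40..0x5f report intercepted exceptions: 0x40 + vector.
VMEXIT_EXCP_BASE = 0x40

# Page-fault error code bits, using the AMD APM names.
PF_ERROR_BITS = {0: "P", 1: "R/W", 2: "U/S", 3: "RSV", 4: "I/D"}

def exception_vector(exitcode):
    """Return the exception vector for an exception-intercept exit code."""
    if not 0x40 <= exitcode <= 0x5F:
        raise ValueError("not an exception intercept")
    return exitcode - VMEXIT_EXCP_BASE

def decode_pf_error(code):
    """List the names of the bits set in a #PF error code."""
    return [name for bit, name in PF_ERROR_BITS.items() if code & (1 << bit)]

vector = exception_vector(0x4E)   # exitcode from the dump
bits = decode_pf_error(0x9)       # exitinfo1 from the dump
print(vector, bits)
```

With R/W and U/S clear, the code describes a supervisor read, which matches a load executed at cpl 0.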

My CR4 == 0x0000000000000060 == PAE MCE (Full vmcb below). 
However, all PTEs seem OK, all NPT entries seem OK too.
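
For reference, a small sketch that decodes the CR4 value quoted above. Only the low, long-established CR4 bits are listed; the table and the helper name decode_cr4 are illustrative assumptions, not taken from Xen.

```python
# Low CR4 bits (bit position -> architectural name).
CR4_BITS = {
    0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
    6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT",
}

def decode_cr4(value):
    """List the names of the known bits set in a CR4 value."""
    return [name for bit, name in CR4_BITS.items() if value & (1 << bit)]

cr4_flags = decode_cr4(0x60)
print(cr4_flags)
```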

PTE entries (L4 thru L1):

0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463 
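
A minimal sketch for decoding the four guest entries above, assuming the usual long-mode entry layout (flag bits in 11..0, physical address in bits 51..12, NX in bit 63). The flag names are generic (bit 6 and bit 7 have different meanings in non-leaf entries), and decode_pte is a hypothetical helper.

```python
PTE_FLAGS = {0: "P", 1: "RW", 2: "US", 3: "PWT", 4: "PCD",
             5: "A", 6: "D", 7: "PS/PAT", 8: "G", 63: "NX"}
ADDR_MASK = 0x000FFFFFFFFFF000  # physical address, bits 51..12

def decode_pte(pte):
    """Return (physical address, list of set flag names) for one entry."""
    flags = [name for bit, name in PTE_FLAGS.items() if (pte >> bit) & 1]
    return pte & ADDR_MASK, flags

entries = [0x0000000001C16067, 0x0000000001C18067,
           0x0000000001E8D067, 0x80000000000F0463]
for level, pte in zip(("L4", "L3", "L2", "L1"), entries):
    addr, flags = decode_pte(pte)
    print(level, hex(addr), flags)
```

Decoded this way, the L1 entry is the only one with bit 63 set, and it maps physical address 0xf0000.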

P2M (L4 thru L1):

6000000102b75667 6000000102b74467 6000000102b7f267 6000000101a41067

The P bit being set in the error code doesn't make sense either.

Appreciate any help.

thanks
Mukesh


VMCB:
(XEN) general1_intercepts = 0xbd44000f general2_intercepts = 0x2e7f
(XEN) iopm_base_pa = 0xcfce9000 msrpm_base_pa = 0x100e10000 tsc_offset = 0
(XEN) tlb_control = 0 vintr = 0x1000000 interrupt_shadow = 0
(XEN) exitcode = 0x4e exitintinfo = 0
(XEN) exitinfo1 = 0x9 exitinfo2 = 0xffffffffff46e000 
(XEN) np_enable = 1 guest_asid = 0x5
(XEN) cpl = 0 efer = 0x1500 star = 0 lstar = 0
(XEN) CR0 = 0x0000000080050033 CR2 = 0x0000000000000000
(XEN) CR3 = 0x0000000001c13000 CR4 = 0x0000000000000060
(XEN) RSP = 0xffffffff81c01df8  RIP = 0xffffffff81d2d976
(XEN) RAX = 0x0000000000000000  RFLAGS=0x0000000000000087
(XEN) DR6 = 0x00000000ffff0ff0, DR7 = 0x0000000000000400
(XEN) CSTAR = 0x0000000000000000 SFMask = 0x0000000000000000
(XEN) KernGSBase = 0x0000000000000000 PAT = 0x0007040600070406 
(XEN) H_CR3 = 0x0000000100faa000 CleanBits = 0
(XEN) CS: sel=0x0010, attr=0x029b, limit=0xffffffff, base=0x0000000000000000
(XEN) DS: sel=0x0000, attr=0x0000, limit=0xffffffff, base=0x0000000000000000
(XEN) SS: sel=0x0000, attr=0x0c93, limit=0xffffffff, base=0x0000000000000000
(XEN) ES: sel=0x0000, attr=0x0000, limit=0xffffffff, base=0x0000000000000000
(XEN) FS: sel=0x0000, attr=0x0000, limit=0xffffffff, base=0x0000000000000000
(XEN) GS: sel=0x0000, attr=0x0000, limit=0xffffffff, base=0xffffffff81ccf000
(XEN) GDTR: sel=0x0000, attr=0x0000, limit=0x0000007f, base=0xffffffff81cd3000
(XEN) LDTR: sel=0x0000, attr=0x0000, limit=0x00000000, base=0x0000000000000000
(XEN) IDTR: sel=0x0000, attr=0x0000, limit=0x00000fff, base=0xffffffff81e8a000
(XEN) TR: sel=0x0000, attr=0x008b, limit=0x000000ff, base=0x0000000000000000


GDT:
ffffffff81cd3000:  0000000000000000 00cf9b000000ffff
ffffffff81cd3010:  00af9b000000ffff 00cf93000000ffff
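
A sketch of how the GDT dump above can be decoded, assuming the standard 8-byte segment descriptor layout. decode_descriptor is a hypothetical helper; the selector used is the CS value from the VMCB dump.

```python
def decode_descriptor(d):
    """Decode one 8-byte code/data segment descriptor."""
    access = (d >> 40) & 0xFF
    flags = (d >> 52) & 0xF
    return {
        "base": ((d >> 16) & 0xFFFFFF) | (((d >> 56) & 0xFF) << 24),
        "limit": (d & 0xFFFF) | (((d >> 48) & 0xF) << 16),
        "present": bool(access & 0x80),
        "code": bool(access & 0x08),   # executable segment
        "long": bool(flags & 0x2),     # L: 64-bit code segment
        "db": bool(flags & 0x4),       # D/B: default operand size
        "gran": bool(flags & 0x8),     # G: limit counted in 4 KiB units
    }

gdt = [0x0000000000000000, 0x00CF9B000000FFFF,
       0x00AF9B000000FFFF, 0x00CF93000000FFFF]

cs_selector = 0x0010
cs = decode_descriptor(gdt[cs_selector >> 3])  # selector index = sel >> 3
print(cs)
```

Under this reading, selector 0x10 picks entry 2, a present 64-bit code segment, so the segment state itself looks consistent with long mode.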



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* Re: [RFH]: AMD SVM #PF error code with P and RSVD bit....
From: Jan Beulich @ 2014-06-16  9:24 UTC
  To: Mukesh Rathor
  Cc: xen-devel, boris.ostrovsky@oracle.com, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

>>> On 14.06.14 at 03:03, <mukesh.rathor@oracle.com> wrote:
> I am trying to debug this triple fault bringing up PVH linux domU on
> AMD.
> 
> Instruction:
> ffffffff81d2d976: 8:dmi_scan_machine+b7          mov (%r12),
> %rax r12: ffffffffff46e000
> 
> This first causes #PF:
> (XEN) exitcode = 0x4e exitintinfo = 0
> (XEN) exitinfo1 = 0x9 exitinfo2 = 0xffffffffff46e000 
> 
> erro_code == 0x9 => RSVD bit set. according to the APM:
> 
>    RSV—Bit 3. If this bit is set to 1, the page fault is a result
>    of the processor reading a 1 from a reserved field within a
>    page-translation-table entry. This type of page fault occurs only
>    when CR4.PSE=1 or CR4.PAE=1.
> 
> My CR4 == 0x0000000000000060 == PAE MCE (Full vmcb below). 
> However, all PTEs seem OK, all NPT entries seem OK too.
> 
> PTE entries (l4 thru L1):
> 
> 0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463 

EFER.NX is clear, and hence the NX bit on the L1 entry is wrong.
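
To make the check concrete, here is a minimal sketch using the efer value from the VMCB dump (0x1500) and the four guest entries. It assumes the architectural EFER bit positions (NXE is bit 11) and models only this one reserved-bit rule; the helper names are hypothetical.

```python
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE", 12: "SVME"}
EFER_NXE = 1 << 11
PTE_NX = 1 << 63

def decode_efer(efer):
    """List the names of the known bits set in an EFER value."""
    return [name for bit, name in EFER_BITS.items() if efer & (1 << bit)]

def nx_is_reserved_violation(pte, efer):
    """Bit 63 of an entry is reserved (must be 0) while EFER.NXE is clear."""
    return bool(pte & PTE_NX) and not (efer & EFER_NXE)

guest_efer = 0x1500
ptes = [0x0000000001C16067, 0x0000000001C18067,
        0x0000000001E8D067, 0x80000000000F0463]

efer_flags = decode_efer(guest_efer)
offenders = [hex(p) for p in ptes if nx_is_reserved_violation(p, guest_efer)]
print(efer_flags, offenders)
```

Only the L1 entry is flagged, which is the entry that produces the RSV bit in the error code.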

> P2M (L4 thru L1):
> 
> 6000000102b75667 6000000102b74467 6000000102b7f267 6000000101a41067
> 
> The P bit being set in error code doesn't make sense either.. 

I don't think this is surprising: All the levels have P set, so the fault
was clearly one on a present entry.

Jan


* Re: [RFH]: AMD SVM #PF error code with P and RSVD bit....
From: Mukesh Rathor @ 2014-06-16 22:44 UTC
  To: Jan Beulich
  Cc: xen-devel, boris.ostrovsky@oracle.com, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

On Mon, 16 Jun 2014 10:24:15 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 14.06.14 at 03:03, <mukesh.rathor@oracle.com> wrote:
> > I am trying to debug this triple fault bringing up PVH linux domU on
> > AMD.
> > 
> > Instruction:
> > ffffffff81d2d976: 8:dmi_scan_machine+b7          mov (%r12),
> > %rax r12: ffffffffff46e000
> > 
> > This first causes #PF:
> > (XEN) exitcode = 0x4e exitintinfo = 0
> > (XEN) exitinfo1 = 0x9 exitinfo2 = 0xffffffffff46e000 
> > 
> > erro_code == 0x9 => RSVD bit set. according to the APM:
> > 
> >    RSV—Bit 3. If this bit is set to 1, the page fault is a result
> >    of the processor reading a 1 from a reserved field within a
> >    page-translation-table entry. This type of page fault occurs only
> >    when CR4.PSE=1 or CR4.PAE=1.
> > 
> > My CR4 == 0x0000000000000060 == PAE MCE (Full vmcb below). 
> > However, all PTEs seem OK, all NPT entries seem OK too.
> > 
> > PTE entries (l4 thru L1):
> > 
> > 0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463 
> 
> EFER.NX is clear, and hence the NX bit on the L1 entry is wrong.

Ah, interesting, I didn't realize it would complain about NX during load/store.

BTW on:

Intel:
    Guest EFER = 0x0000000000000000

    Ptes:
       0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463

L1 has XD set. Maybe Intel just ignores the bit if EFER.NX is 0!


So, that leads me to wonder next whether it's better to set EFER.NX in SVM
when setting LME/LMA, or to require the guest to set it itself. Both seem OK
to me...

thanks
mukesh



* Re: [RFH]: AMD SVM #PF error code with P and RSVD bit....
From: Jan Beulich @ 2014-06-17  7:01 UTC
  To: Mukesh Rathor
  Cc: xen-devel, boris.ostrovsky@oracle.com, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

>>> On 17.06.14 at 00:44, <mukesh.rathor@oracle.com> wrote:
> On Mon, 16 Jun 2014 10:24:15 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 14.06.14 at 03:03, <mukesh.rathor@oracle.com> wrote:
>> > I am trying to debug this triple fault bringing up PVH linux domU on
>> > AMD.
>> > 
>> > Instruction:
>> > ffffffff81d2d976: 8:dmi_scan_machine+b7          mov (%r12),
>> > %rax r12: ffffffffff46e000
>> > 
>> > This first causes #PF:
>> > (XEN) exitcode = 0x4e exitintinfo = 0
>> > (XEN) exitinfo1 = 0x9 exitinfo2 = 0xffffffffff46e000 
>> > 
>> > erro_code == 0x9 => RSVD bit set. according to the APM:
>> > 
>> >    RSV—Bit 3. If this bit is set to 1, the page fault is a result
>> >    of the processor reading a 1 from a reserved field within a
>> >    page-translation-table entry. This type of page fault occurs only
>> >    when CR4.PSE=1 or CR4.PAE=1.
>> > 
>> > My CR4 == 0x0000000000000060 == PAE MCE (Full vmcb below). 
>> > However, all PTEs seem OK, all NPT entries seem OK too.
>> > 
>> > PTE entries (l4 thru L1):
>> > 
>> > 0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463 
>> 
>> EFER.NX is clear, and hence the NX bit on the L1 entry is wrong.
> 
> Ah, interesting, I didn't realize it would complain about NX during 
> load/store.
> 
> BTW on:
> 
> Intel:
>     Guest EFER = 0x0000000000000000
> 
>     Ptes:
>        0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463
> 
> L1 has XD set. Maybe Intel just ignores the bit if EFER.NX is 0!

Which would be a bug imo.

> So, that leads me to wonder next whether it's better to set EFER.NX in SVM
> when setting LME/LMA, or impose on the guest to do it itself. Both seem OK
> to me...

No, this should be entirely under guest control.

Jan


* Re: [RFH]: AMD SVM #PF error code with P and RSVD bit....
From: Andrew Cooper @ 2014-06-17 13:32 UTC
  To: Jan Beulich
  Cc: xen-devel, Aravind.Gopalakrishnan, boris.ostrovsky@oracle.com,
	suravee.suthikulpanit

On 17/06/14 08:01, Jan Beulich wrote:
>>>> On 17.06.14 at 00:44, <mukesh.rathor@oracle.com> wrote:
>> On Mon, 16 Jun 2014 10:24:15 +0100
>> "Jan Beulich" <JBeulich@suse.com> wrote:
>>
>>>>>> On 14.06.14 at 03:03, <mukesh.rathor@oracle.com> wrote:
>>>> I am trying to debug this triple fault bringing up PVH linux domU on
>>>> AMD.
>>>>
>>>> Instruction:
>>>> ffffffff81d2d976: 8:dmi_scan_machine+b7          mov (%r12),
>>>> %rax r12: ffffffffff46e000
>>>>
>>>> This first causes #PF:
>>>> (XEN) exitcode = 0x4e exitintinfo = 0
>>>> (XEN) exitinfo1 = 0x9 exitinfo2 = 0xffffffffff46e000 
>>>>
>>>> erro_code == 0x9 => RSVD bit set. according to the APM:
>>>>
>>>>    RSV—Bit 3. If this bit is set to 1, the page fault is a result
>>>>    of the processor reading a 1 from a reserved field within a
>>>>    page-translation-table entry. This type of page fault occurs only
>>>>    when CR4.PSE=1 or CR4.PAE=1.
>>>>
>>>> My CR4 == 0x0000000000000060 == PAE MCE (Full vmcb below). 
>>>> However, all PTEs seem OK, all NPT entries seem OK too.
>>>>
>>>> PTE entries (l4 thru L1):
>>>>
>>>> 0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463 
>>> EFER.NX is clear, and hence the NX bit on the L1 entry is wrong.
>> Ah, interesting, I didn't realize it would complain about NX during 
>> load/store.
>>
>> BTW on:
>>
>> Intel:
>>     Guest EFER = 0x0000000000000000
>>
>>     Ptes:
>>        0000000001c16067 0000000001c18067 0000000001e8d067 80000000000f0463
>>
>> L1 has XD set. Maybe Intel just ignores the bit if EFER.NX is 0!
> Which would be a bug imo.

Intel Manual vol 3, sections 4.4.2 (32-bit PAE) and 4.5 (64-bit), state that
when EFER.NXE = 0 and L1.P = 1, the L1.NX bit is reserved and must be 0.

I would expect this to fail with a #PF indicating RSVD on Intel as
well as AMD.

I wonder whether there are some interaction issues with the non-root
paging mode?

~Andrew


* Re: [RFH]: AMD SVM #PF error code with P and RSVD bit....
From: Mukesh Rathor @ 2014-06-17 21:43 UTC
  To: Andrew Cooper
  Cc: kevin.tian, suravee.suthikulpanit, eddie.dong, Jan Beulich,
	Aravind.Gopalakrishnan, jun.nakajima, xen-devel,
	boris.ostrovsky@oracle.com

On Tue, 17 Jun 2014 14:32:44 +0100
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> On 17/06/14 08:01, Jan Beulich wrote:
> >>>> On 17.06.14 at 00:44, <mukesh.rathor@oracle.com> wrote:
> >> On Mon, 16 Jun 2014 10:24:15 +0100
> >> "Jan Beulich" <JBeulich@suse.com> wrote:
> >>
> >>>>>> On 14.06.14 at 03:03, <mukesh.rathor@oracle.com> wrote:
> >>>> I am trying to debug this triple fault bringing up PVH linux
> >>>> domU on AMD.
> >>>>
> >>>> Instruction:
> >>>> ffffffff81d2d976: 8:dmi_scan_machine+b7          mov (%r12),
> >>>> %rax r12: ffffffffff46e000
> >>>>
> >>>> This first causes #PF:
> >>>> (XEN) exitcode = 0x4e exitintinfo = 0
> >>>> (XEN) exitinfo1 = 0x9 exitinfo2 = 0xffffffffff46e000 
> >>>>
> >>>> erro_code == 0x9 => RSVD bit set. according to the APM:
> >>>>
> >>>>    RSV—Bit 3. If this bit is set to 1, the page fault is a result
> >>>>    of the processor reading a 1 from a reserved field within a
> >>>>    page-translation-table entry. This type of page fault occurs
> >>>> only when CR4.PSE=1 or CR4.PAE=1.
> >>>>
> >>>> My CR4 == 0x0000000000000060 == PAE MCE (Full vmcb below). 
> >>>> However, all PTEs seem OK, all NPT entries seem OK too.
> >>>>
> >>>> PTE entries (l4 thru L1):
> >>>>
> >>>> 0000000001c16067 0000000001c18067 0000000001e8d067
> >>>> 80000000000f0463 
> >>> EFER.NX is clear, and hence the NX bit on the L1 entry is wrong.
> >> Ah, interesting, I didn't realize it would complain about NX
> >> during load/store.
> >>
> >> BTW on:
> >>
> >> Intel:
> >>     Guest EFER = 0x0000000000000000
> >>
> >>     Ptes:
> >>        0000000001c16067 0000000001c18067 0000000001e8d067
> >> 80000000000f0463
> >>
> >> L1 has XD set. Maybe Intel just ignores the bit if EFER.NX is 0!
> > Which would be a bug imo.
> 
> Intel Manual vol 3, 4.4.2 (32bit PAE) and 4.5 (64bit) states that
> EFER.NXE = 0 and L1.P = 1 causes the L1.NX to be reserved, and must
> be 0.
> 
> I would expect this to fail with with a #PF indicating RSVD on Intel
> as well as AMD.
> 
> I wonder whether there are some interaction issues with the non-root
> paging mode?

Hmm.. can't think of any, appears to be a processor issue.


[Adding Intel folks]

Hi Intel folks,

Can you please confirm whether this is a processor issue and, if so,
whether a bug needs to be opened, and how?

thanks
Mukesh



