xen-devel.lists.xenproject.org archive mirror
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jeff_Zimmerman@McAfee.com, asit.k.mallick@intel.com
Cc: xen-devel@lists.xenproject.org
Subject: Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
Date: Wed, 6 Nov 2013 00:23:25 +0000
Message-ID: <52798BFD.3010608@citrix.com>
In-Reply-To: <CBAEFD34-320E-45F6-AA9F-AC65E4393E6F@McAfee.com>


On 05/11/2013 22:46, Jeff_Zimmerman@McAfee.com wrote:
> Asit,
> I've attached two files: one is from dmesg | grep microcode, the second is
> the first processor entry from /proc/cpuinfo.
> Jeff
>
> On Nov 5, 2013, at 2:29 PM, "Mallick, Asit K" <asit.k.mallick@intel.com>
>  wrote:
>
> > Jeff,
> > Could you check if you have the latest microcode updates installed
> on this system? Or, could you send me the microcode rev and I can check.
> >
> > Thanks,
> > Asit
> >
> >
> > From: "Jeff_Zimmerman@McAfee.com" <Jeff_Zimmerman@McAfee.com>
> > Date: Tuesday, November 5, 2013 2:55 PM
> > To: "lars.kurth@xen.org" <lars.kurth@xen.org>
> > Cc: "lars.kurth.xen@gmail.com" <lars.kurth.xen@gmail.com>,
> "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
> "JBeulich@suse.com" <JBeulich@suse.com>
> > Subject: Re: [Xen-devel] Intermittent fatal page fault with XEN
> 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
> >
> > Lars,
> > I understand the mailing list limits attachment size to 512K. Where
> can I post the xen binary and symbols file?
> > Jeff
> >
> > On Nov 5, 2013, at 7:46 AM, Lars Kurth
> <lars.kurth@xen.org> wrote:
> >
> > Jan, Andrew, Ian,
> >
> > pulling in Jeff who raised the question. Snippets from misc replies
> attached. Jeff, please look through these (in particular Jan's answer)
> and answer any further questions on this thread.
> >
> > On 05/11/2013 09:53, Ian Campbell wrote:
> >> TBH I think for this kind of thing (i.e. a bug not a user question)
> the most appropriate thing to
> >> do would be to redirect them to xen-devel themselves (with a
> reminder that they do not need
> >> to subscribe to post).
> > Agreed. Another option is for me to start the thread and pull in the
> raiser of the thread into it, if it is a bug. Was not sure this was a
> real bug at first, but it seems it is.
> >
> > On 04/11/2013 20:00, Andrew Cooper wrote:
> >> Which version of Xen were these images saved on?
> > [Jeff] We were careful to regenerate all the images after upgrading
> to 4.3.1. Also saw the same problem on 4.3.0.
> >
> >> Are you expecting to be using nested-virt? (It is still very
> definitely experimental)
> > [Jeff] Not using nested-virt.
> >
> > On 05/11/2013 10:04, Jan Beulich wrote:
> >
> > On 04.11.13 at 20:54, Lars Kurth
> <lars.kurth.xen@gmail.com> wrote:
> >
> >
> > See
> >
> > http://xenproject.org/help/questions-and-answers/hypervisor-fatal-page-fault-xen-4-3-1.html
> > ---
> > I have a 32 core system running XEN 4.3.1 with 30 Windows XP VM's.
> > DOM0 is Centos 6.3 based with linux kernel 3.10.16.
> > In my configuration all of the windows HVMs are running having been
> > restored from xl save.
> > VM's are destroyed or restored in an on-demand fashion. After some
> time XEN
> > will experience a fatal page fault while restoring one of the
> windows HVM
> > subjects. This does not happen very often, perhaps once in a 16 to
> 48 hour
> > period.
> > The stack trace from xen follows. Thanks in advance for any help.
> >
> > (XEN) ----[ Xen-4.3.1 x86_64 debug=n Tainted: C ]----
> > (XEN) CPU: 52
> > (XEN) RIP: e008:[] domain_page_map_to_mfn+0x86/0xc0
> >
> >
> > Zapping addresses (here and below in the stack trace) is never
> > helpful when someone asks for help with a crash. Also, in order
> > to not just guess, the matching xen-syms or xen.efi should be
> > made available or pointed to.
> >
> >
> >
> > (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
> > (XEN) rax: 000ffffffffff000 rbx: ffff8300bb163760 rcx: 0000000000000000
> > (XEN) rdx: ffff810000000000 rsi: 0000000000000000 rdi: 0000000000000000
> > (XEN) rbp: ffff8300bb163000 rsp: ffff8310333e7cd8 r8: 0000000000000000
> > (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> > (XEN) r12: ffff8310333e7f18 r13: 0000000000000000 r14: 0000000000000000
> > (XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000426f0
> > (XEN) cr3: 000000211bee5000 cr2: ffff810000000000
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> > (XEN) Xen stack trace from rsp=ffff8310333e7cd8:
> > (XEN) 0000000000000001 ffff82c4c01de869 ffff82c4c0182c70
> ffff8300bb163000
> > (XEN) 0000000000000014 ffff8310333e7f18 0000000000000000
> ffff82c4c01d7548
> > (XEN) ffff8300bb163490 ffff8300bb163000 ffff82c4c01c65b8
> ffff8310333e7e60
> > (XEN) ffff82c4c01badef ffff8300bb163000 0000000000000003
> ffff833144d8e000
> > (XEN) ffff82c4c01b4885 ffff8300bb163000 ffff8300bb163000
> ffff8300bdff1000
> > (XEN) 0000000000000001 ffff82c4c02f2880 ffff82c4c02f2880
> ffff82c4c0308440
> > (XEN) ffff82c4c01d0ea8 ffff8300bb163000 ffff82c4c015ad6c
> ffff82c4c02f2880
> > (XEN) ffff82c4c02cf800 00000000ffffffff ffff8310333f5060
> ffff82c4c02f2880
> > (XEN) 0000000000000282 0010000000000000 0000000000000000
> 0000000000000000
> > (XEN) 0000000000000000 ffff82c4c02f2880 ffff8300bdff1000
> ffff8300bb163000
> > (XEN) 000031a10f2b16ca 0000000000000001 ffff82c4c02f2880
> ffff82c4c0308440
> > (XEN) ffff82c4c0124444 0000000000000034 ffff8310333f5060
> 0000000001c9c380
> > (XEN) 00000000c0155965 ffff82c4c01c6146 0000000001c9c380
> ffffffffffffff00
> > (XEN) ffff82c4c0128fa8 ffff8300bb163000 ffff8327d50e9000
> ffff82c4c01bc490
> > (XEN) 0000000000000000 ffff82c4c01dd254 0000000080549ae0
> ffff82c4c01cfc3c
> > (XEN) ffff8300bb163000 ffff82c4c01d6128 ffff82c4c0125db9
> ffff82c4c0125db9
> > (XEN) ffff8310333e0000 ffff8300bb163000 000000000012ffc0
> 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000
> ffff82c4c01deaa3
> > (XEN) 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> > (XEN) 000000000012ffc0 000000007ffdf000 0000000000000000
> 0000000000000000
> > (XEN) Xen call trace:
> > (XEN) [] domain_page_map_to_mfn+0x86/0xc0
> > (XEN) [] nvmx_handle_vmlaunch+0x49/0x160
> > (XEN) [] __update_vcpu_system_time+0x240/0x310
> > (XEN) [] vmx_vmexit_handler+0xb58/0x18c0
> > (XEN) [] pt_restore_timer+0xa8/0xc0
> > (XEN) [] hvm_io_assist+0xef/0x120
> > (XEN) [] hvm_do_resume+0x195/0x1c0
> > (XEN) [] vmx_do_resume+0x148/0x210
> > (XEN) [] context_switch+0x1bc/0xfc0
> > (XEN) [] schedule+0x254/0x5f0
> > (XEN) [] pt_update_irq+0x256/0x2b0
> > (XEN) [] timer_softirq_action+0x168/0x210
> > (XEN) [] hvm_vcpu_has_pending_irq+0x50/0xb0
> > (XEN) [] nvmx_switch_guest+0x54/0x1560
> > (XEN) [] vmx_intr_assist+0x6c/0x490
> > (XEN) [] vmx_vmenter_helper+0x88/0x160
> > (XEN) [] __do_softirq+0x69/0xa0
> > (XEN) [] __do_softirq+0x69/0xa0
> > (XEN) [] vmx_asm_do_vmentry+0/0xed
> > (XEN)
> > (XEN) Pagetable walk from ffff810000000000:
> > (XEN) L4[0x102] = 000000211bee5063 ffffffffffffffff
> > (XEN) L3[0x000] = 0000000000000000 ffffffffffffffff
> >
> >
> > This makes me suspect that domain_page_map_to_mfn() gets a
> > NULL pointer passed here. As said above, this is only guesswork
> > at this point, and as Ian already pointed out, directing the
> > reporter to xen-devel would seem to be the right thing to do
> > here anyway.
> >
> > Jan
> >
> >
> >
>
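
The page-table walk quoted above can be cross-checked by hand: the
faulting address in cr2 (ffff810000000000, which is also the value in
rdx) should decode to L4 slot 0x102 and L3 slot 0, and the walk shows
the L3 entry there is empty. A minimal sketch in Python, assuming
standard 4-level x86-64 paging with 9 index bits per level; the function
name is illustrative and not something taken from Xen:

```python
# Split an x86-64 virtual address into its 4-level page-table indices.
# Each level consumes 9 bits; the low 12 bits are the offset in a 4 KiB page.

def pagetable_indices(vaddr: int) -> dict:
    return {
        "L4": (vaddr >> 39) & 0x1FF,
        "L3": (vaddr >> 30) & 0x1FF,
        "L2": (vaddr >> 21) & 0x1FF,
        "L1": (vaddr >> 12) & 0x1FF,
        "offset": vaddr & 0xFFF,
    }

fault_addr = 0xFFFF810000000000  # cr2 from the register dump quoted above
idx = pagetable_indices(fault_addr)
print({name: hex(value) for name, value in idx.items()})
```

The L4 and L3 values agree with the "L4[0x102]" and "L3[0x000]" lines
in the quoted walk.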

As Jan said, censoring the addresses as above almost completely defeats
the purpose of trying to help you.

However, while you are not expecting to be using nested-virt, the stack
trace suggests that you are (nvmx_handle_vmlaunch and nvmx_switch_guest
both appear in it), so something is clearly up.
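
One quick way to spot this in a report is to pull the symbol names out of
the "Xen call trace" lines and keep those carrying the nested-VMX prefix.
The following is only a sketch: the regular expression and the helper
name are assumptions made for illustration, and on a debug=n build the
call trace may include stale entries, so a match is a hint rather than
proof.

```python
import re

# Matches lines such as "(XEN) [] nvmx_handle_vmlaunch+0x49/0x160",
# with or without an address inside the brackets.
FRAME_RE = re.compile(
    r"\(XEN\)\s+\[[^\]]*\]\s+(\w+)\+(?:0x)?[0-9a-f]+/0x[0-9a-f]+"
)

def nested_virt_frames(trace: str) -> list:
    """Return call-trace symbols that start with the nested-VMX prefix."""
    symbols = FRAME_RE.findall(trace)
    return [s for s in symbols if s.startswith("nvmx_")]

# A few frames copied from the trace quoted earlier in this message.
trace = """\
(XEN) [] domain_page_map_to_mfn+0x86/0xc0
(XEN) [] nvmx_handle_vmlaunch+0x49/0x160
(XEN) [] vmx_vmexit_handler+0xb58/0x18c0
(XEN) [] nvmx_switch_guest+0x54/0x1560
(XEN) [] vmx_asm_do_vmentry+0/0xed
"""
print(nested_virt_frames(trace))
```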

Which toolstack are you using for VMs?  What is the configuration for
the affected VM?

~Andrew
