From: Olaf Hering <olaf@aepfle.de>
To: Keir Fraser <keir.xen@gmail.com>
Cc: xen-devel@lists.xensource.com
Subject: Re: Need help with fixing the Xen waitqueue feature
Date: Tue, 22 Nov 2011 12:40:57 +0100
Message-ID: <20111122114057.GA28583@aepfle.de>
In-Reply-To: <CAE3CA1B.24BD2%keir.xen@gmail.com>

On Sat, Nov 12, Keir Fraser wrote:

> On 11/11/2011 22:56, "Olaf Hering" <olaf@aepfle.de> wrote:
> 
> > Keir,
> > 
> > just to dump my findings to the list:
> > 
> > On Tue, Nov 08, Keir Fraser wrote:
> > 
> >> Tbh I wonder anyway whether stale hypercall context would be likely to cause
> >> a silent machine reboot. Booting with max_cpus=1 would eliminate moving
> >> between CPUs as a cause of inconsistencies, or pin the guest under test.
> >> Another problem could be sleeping with locks held, but we do test for that
> >> (in debug builds at least) and I'd expect crash/hang rather than silent
> >> reboot. Another problem could be if the vcpu has its own state in an
> >> inconsistent/invalid state temporarily (e.g., its pagetable base pointers)
> >> which then is attempted to be restored during a waitqueue wakeup. That could
> >> certainly cause a reboot, but I don't know of an example where this might
> >> happen.
> > 
> > The crashes also happen with maxcpus=1 and a single guest cpu.
> > Today I added wait_event to ept_get_entry and this works.
> > 
> > But at some point the codepath below is executed; after that wake_up the
> > host hangs hard. I will trace it further next week, maybe the backtrace
> > gives a clue what the cause could be.
> 
> So you run with a single CPU, and with wait_event() in one location, and
> that works for a while (actually doing full waitqueue work: executing wait()
> and wake_up()), but then hangs? That's weird, but pretty interesting if I've
> understood correctly.

Yes, that's what happens with a single cpu in dom0 and domU.
I have added some more debug output. After the backtrace below I see one
more call to check_wakeup_from_wait() for dom0, then the host hangs hard.

> > Also, the 3K stacksize is still too small, this path uses 3096.
> 
> I'll allocate a whole page for the stack then.

Thanks.
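
For reference, the arithmetic behind the stack figure can be checked with a
short Python sketch. The variable names are made up for illustration and do
not come from wait.c; only the numbers (3K, 3096, c18) are taken from this
thread, and the 4 KiB page size is an assumption about the x86 host.

```python
# Hypothetical names; only the numbers come from this thread.
STACK_LIMIT = 3 * 1024   # the "3K" saved-stack size discussed above
PAGE_SIZE = 4096         # assumption: 4 KiB x86 page

# "max stacksize c18" in the debug output is hexadecimal.
used = int("c18", 16)
overflow = used - STACK_LIMIT

print(f"used={used} limit={STACK_LIMIT} overflow={overflow}")
print(f"fits in one page: {used <= PAGE_SIZE}")
```

So this path exceeds the 3K limit by a small margin and would fit in a
whole page.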


Olaf

> > (XEN) prep 127a 30 0
> > (XEN) wake 127a 30
> > (XEN) prep 1cf71 30 0
> > (XEN) wake 1cf71 30
> > (XEN) prep 1cf72 30 0
> > (XEN) wake 1cf72 30
> > (XEN) prep 1cee9 30 0
> > (XEN) wake 1cee9 30
> > (XEN) prep 121a 30 0
> > (XEN) wake 121a 30
> > 
> > (The fields are: gfn, (p2m_unshare << 4), in_atomic.)
> > 
> > (XEN) prep 1ee61 20 0
> > (XEN) max stacksize c18
> > (XEN) Xen WARN at wait.c:126
> > (XEN) ----[ Xen-4.2.24114-20111111.221356  x86_64  debug=y  Tainted:    C ]----
> > (XEN) CPU:    0
> > (XEN) RIP:    e008:[<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
> > (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
> > (XEN) rax: 0000000000000000   rbx: ffff830201f76000   rcx: 0000000000000000
> > (XEN) rdx: ffff82c4802b7f18   rsi: 000000000000000a   rdi: ffff82c4802673f0
> > (XEN) rbp: ffff82c4802b73a8   rsp: ffff82c4802b7378   r8:  0000000000000000
> > (XEN) r9:  ffff82c480221da0   r10: 00000000fffffffa   r11: 0000000000000003
> > (XEN) r12: ffff82c4802b7f18   r13: ffff830201f76000   r14: ffff83003ea5c000
> > (XEN) r15: 000000000001ee61   cr0: 000000008005003b   cr4: 00000000000026f0
> > (XEN) cr3: 000000020336d000   cr2: 00007fa88ac42000
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > (XEN) Xen stack trace from rsp=ffff82c4802b7378:
> > (XEN)    0000000000000020 000000000001ee61 0000000000000002 ffff830201aa9e90
> > (XEN)    ffff830201aa9f60 0000000000000020 ffff82c4802b7428 ffff82c4801e02f9
> > (XEN)    ffff830000000002 0000000000000000 ffff82c4802b73f8 ffff82c4802b73f4
> > (XEN)    0000000000000000 ffff82c4802b74e0 ffff82c4802b74e4 0000000101aa9e90
> > (XEN)    000000ffffffffff ffff830201aa9e90 000000000001ee61 ffff82c4802b74e4
> > (XEN)    0000000000000002 0000000000000000 ffff82c4802b7468 ffff82c4801d810f
> > (XEN)    ffff82c4802b74e0 000000000001ee61 ffff830201aa9e90 ffff82c4802b75bc
> > (XEN)    00000000002167f5 ffff88001ee61900 ffff82c4802b7518 ffff82c480211b80
> > (XEN)    ffff8302167f5000 ffff82c4801c168c 0000000000000000 ffff83003ea5c000
> > (XEN)    ffff88001ee61900 0000000001805063 0000000001809063 000000001ee001e3
> > (XEN)    000000001ee61067 00000000002167f5 000000000022ee70 000000000022ed10
> > (XEN)    ffffffffffffffff 0000000a00000007 0000000000000004 ffff82c48025db80
> > (XEN)    ffff83003ea5c000 ffff82c4802b75bc ffff88001ee61900 ffff830201aa9e90
> > (XEN)    ffff82c4802b7528 ffff82c480211cb1 ffff82c4802b7568 ffff82c4801da97f
> > (XEN)    ffff82c4801be053 0000000000000008 ffff82c4802b7b58 ffff88001ee61900
> > (XEN)    0000000000000000 ffff82c4802b78b0 ffff82c4802b75f8 ffff82c4801aaec8
> > (XEN)    0000000000000003 ffff88001ee61900 ffff82c4802b78b0 ffff82c4802b7640
> > (XEN)    ffff83003ea5c000 00000000000000a0 0000000000000900 0000000000000008
> > (XEN)    00000003802b7650 0000000000000004 00000003802b7668 0000000000000000
> > (XEN)    ffff82c4802b7b58 0000000000000001 0000000000000003 ffff82c4802b78b0
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
> > (XEN)    [<ffff82c4801e02f9>] ept_get_entry+0x81/0xd8
> > (XEN)    [<ffff82c4801d810f>] gfn_to_mfn_type_p2m+0x55/0x114
> > (XEN)    [<ffff82c480211b80>] hap_p2m_ga_to_gfn_4_levels+0x1c4/0x2d6
> > (XEN)    [<ffff82c480211cb1>] hap_gva_to_gfn_4_levels+0x1f/0x2e
> > (XEN)    [<ffff82c4801da97f>] paging_gva_to_gfn+0xae/0xc4
> > (XEN)    [<ffff82c4801aaec8>] hvmemul_linear_to_phys+0xf1/0x25c
> > (XEN)    [<ffff82c4801ab762>] hvmemul_rep_movs+0xe8/0x31a
> > (XEN)    [<ffff82c48018de07>] x86_emulate+0x4e01/0x10fde
> > (XEN)    [<ffff82c4801aab3c>] hvm_emulate_one+0x12d/0x1c5
> > (XEN)    [<ffff82c4801b68a9>] handle_mmio+0x4e/0x1d8
> > (XEN)    [<ffff82c4801b3a1e>] hvm_hap_nested_page_fault+0x1e7/0x302
> > (XEN)    [<ffff82c4801d1ff6>] vmx_vmexit_handler+0x12cf/0x1594
> > (XEN)
> > (XEN) wake 1ee61 20
> > 
> > 
> > 
> 
> 
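
The prep/wake debug lines quoted above can be decoded and paired up
mechanically. The following Python sketch is one way to do that. It assumes
that all fields are printed in hexadecimal and that the second field is the
query type shifted left by 4, as described in the quoted note; the names
Event, parse and unmatched are hypothetical helpers, not part of Xen.

```python
import re
from typing import Iterable, NamedTuple, Optional, Set


class Event(NamedTuple):
    kind: str                 # "prep" or "wake"
    gfn: int                  # guest frame number
    query: int                # second field shifted right by 4 (assumption)
    in_atomic: Optional[int]  # only present on "prep" lines


# Matches e.g. "(XEN) prep 1ee61 20 0" or "(XEN) wake 1ee61 20",
# with or without leading "> > " quote markers.
LINE = re.compile(
    r"\(XEN\) (prep|wake) ([0-9a-f]+) ([0-9a-f]+)(?: ([0-9a-f]+))?\s*$"
)


def parse(line: str) -> Optional[Event]:
    """Return the decoded event, or None for any other log line."""
    m = LINE.search(line)
    if m is None:
        return None
    kind, gfn, shifted, atomic = m.groups()
    return Event(
        kind,
        int(gfn, 16),
        int(shifted, 16) >> 4,
        None if atomic is None else int(atomic, 16),
    )


def unmatched(lines: Iterable[str]) -> Set[int]:
    """Return the gfns that were prepared but never woken."""
    pending: Set[int] = set()
    for line in lines:
        ev = parse(line)
        if ev is None:
            continue
        if ev.kind == "prep":
            pending.add(ev.gfn)
        else:
            pending.discard(ev.gfn)
    return pending


log = [
    "> > (XEN) prep 121a 30 0",
    "> > (XEN) wake 121a 30",
    "> > (XEN) prep 1ee61 20 0",
    "> > (XEN) max stacksize c18",
    "> > (XEN) wake 1ee61 20",
]
print(parse(log[2]))
print(unmatched(log))
```

On the excerpt above every prep has a matching wake, including gfn 1ee61,
which is consistent with the hang occurring after the wake_up rather than
because a wake_up was missed.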
