From: Keir Fraser <keir@xen.org>
To: Olaf Hering <olaf@aepfle.de>
Cc: xen-devel@lists.xensource.com
Subject: Re: Need help with fixing the Xen waitqueue feature
Date: Tue, 22 Nov 2011 15:40:47 +0000
Message-ID: <CAF172FF.34839%keir@xen.org>
In-Reply-To: <20111122150755.GA18727@aepfle.de>

On 22/11/2011 15:07, "Olaf Hering" <olaf@aepfle.de> wrote:
> On Tue, Nov 22, Keir Fraser wrote:
>
>> On 22/11/2011 13:54, "Olaf Hering" <olaf@aepfle.de> wrote:
>>
>>> On Tue, Nov 22, Keir Fraser wrote:
>>>
>>>> I think I checked before, but: also unresponsive to serial debug keys?
>>>
>>> Good point, I will check that. So far I haven't used these keys.
>>
>> If they work then 'd' will give you a backtrace on every CPU, and 'q' will
>> dump domain and vcpu states. That should make things easier!
>
> They do indeed work. The backtrace below is from another system.
> Looks like hpet_broadcast_exit() is involved.
>
> Does that output below give any good hints?

It tells us that the hypervisor itself is in good shape. The RIP is always
at hpet_broadcast_exit() simply because the serial rx interrupt is always
waking us from the idle loop: that RIP value is just the first possible
interruption point after the HLT instruction.

I have a new theory, which is that if we go round the for-loop in
wait_event() more than once, the vcpu's pause counter gets messed up and
goes negative, condemning it to sleep forever.
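
To make the theory concrete, here is a toy model in C. It is only a sketch
of one way a pause counter could become unbalanced across loop iterations.
It is not the actual wait-queue code, and every name in it (toy_vcpu,
toy_prepare_to_wait, toy_wake, toy_wait_event) is made up for illustration.
The assumption is that the pause happens once but the unpause happens on
every pass round the loop:

```c
#include <stdbool.h>

/* Toy stand-in for a vcpu. This is NOT Xen's struct vcpu. */
struct toy_vcpu {
    int pause_count;  /* 0 means runnable, anything else means paused */
    bool queued;      /* already placed on the (imaginary) wait queue? */
};

/* Hypothetical prepare step: only pauses the first time it is called. */
static void toy_prepare_to_wait(struct toy_vcpu *v)
{
    if (!v->queued) {
        v->queued = true;
        v->pause_count++;
    }
}

/* Hypothetical wake step: unpauses unconditionally on every call. */
static void toy_wake(struct toy_vcpu *v)
{
    v->pause_count--;
}

/* Go round the wait loop `iterations` times; return the final counter. */
static int toy_wait_event(int iterations)
{
    struct toy_vcpu v = { 0, false };

    for (int i = 0; i < iterations; i++) {
        toy_prepare_to_wait(&v);  /* pauses on the first pass only */
        toy_wake(&v);             /* unpauses on every pass */
    }
    return v.pause_count;
}
```

In this model one pass leaves the counter balanced at 0, while a second
pass leaves it at -1. If the scheduler's test is of the form
"pause_count != 0" (again an assumption here), a negative value would keep
the vcpu paused indefinitely.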

I have *just* pushed a change to the debug 'q' key (ignore the changeset
comment referring to the 'd' key, I got that wrong!) which will print
per-vcpu and per-domain pause_count values. Please get the system stuck
again, and send the output of the 'q' key with that new changeset
(c/s 24178).

Finally, I don't really know what the prep/wake/done messages from your logs
mean, as you didn't send the patch that prints them.

 -- Keir
>> Try the attached patch (please also try reducing the size of the new
>> parameter to the inline asm from PAGE_SIZE down to e.g. 2000 to force the
>> domain-crashing path).
>
> Thanks, I will try it.
>
>
> Olaf
>
>
> ..........
>
> (XEN) 'q' pressed -> dumping domain info (now=0x5E:F50D77F8)
> (XEN) General information for domain 0:
> (XEN) refcnt=3 dying=0 nr_pages=1852873 xenheap_pages=5 dirty_cpus={}
> max_pages=4294967295
> (XEN) handle=00000000-0000-0000-0000-000000000000 vm_assist=00000004
> (XEN) Rangesets belonging to domain 0:
> (XEN) I/O Ports { 0-1f, 22-3f, 44-60, 62-9f, a2-3f7, 400-807, 80c-cfb,
> d00-ffff }
> (XEN) Interrupts { 0-207 }
> (XEN) I/O Memory { 0-febff, fec01-fedff, fee01-ffffffffffffffff }
> (XEN) Memory pages belonging to domain 0:
> (XEN) DomPage list too long to display
> (XEN) XenPage 000000000021e6d9: caf=c000000000000002, taf=7400000000000002
> (XEN) XenPage 000000000021e6d8: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000021e6d7: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000021e6d6: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 00000000000db2fe: caf=c000000000000002, taf=7400000000000002
> (XEN) VCPU information and callbacks for domain 0:
> (XEN) VCPU0: CPU0 [has=F] flags=0 poll=0 upcall_pend = 01, upcall_mask =
> 00 dirty_cpus={} cpu_affinity={0}
> (XEN) 250 Hz periodic timer (period 4 ms)
> (XEN) General information for domain 1:
> (XEN) refcnt=3 dying=0 nr_pages=3645 xenheap_pages=6 dirty_cpus={}
> max_pages=131328
> (XEN) handle=d80155e4-8f8b-94e1-8382-94084b7f1e51 vm_assist=00000000
> (XEN) paging assistance: hap refcounts log_dirty translate external
> (XEN) Rangesets belonging to domain 1:
> (XEN) I/O Ports { }
> (XEN) Interrupts { }
> (XEN) I/O Memory { }
> (XEN) Memory pages belonging to domain 1:
> (XEN) DomPage list too long to display
> (XEN) PoD entries=0 cachesize=0
> (XEN) XenPage 000000000020df70: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020e045: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020c58c: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020c5a4: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 0000000000019f1e: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020eb23: caf=c000000000000001, taf=7400000000000001
> (XEN) VCPU information and callbacks for domain 1:
> (XEN) VCPU0: CPU0 [has=F] flags=4 poll=0 upcall_pend = 00, upcall_mask =
> 00 dirty_cpus={} cpu_affinity={0}
> (XEN) paging assistance: hap, 4 levels
> (XEN) No periodic timer
> (XEN) Notifying guest 0:0 (virq 1, port 0, stat 0/-1/-1)
> (XEN) Notifying guest 1:0 (virq 1, port 0, stat 0/0/0)
> (XEN) 'q' pressed -> dumping domain info (now=0x60:A7DD8B08)
> (XEN) General information for domain 0:
> (XEN) refcnt=3 dying=0 nr_pages=1852873 xenheap_pages=5 dirty_cpus={}
> max_pages=4294967295
> (XEN) handle=00000000-0000-0000-0000-000000000000 vm_assist=00000004
> (XEN) Rangesets belonging to domain 0:
> (XEN) I/O Ports { 0-1f, 22-3f, 44-60, 62-9f, a2-3f7, 400-807, 80c-cfb,
> d00-ffff }
> (XEN) Interrupts { 0-207 }
> (XEN) I/O Memory { 0-febff, fec01-fedff, fee01-ffffffffffffffff }
> (XEN) Memory pages belonging to domain 0:
> (XEN) DomPage list too long to display
> (XEN) XenPage 000000000021e6d9: caf=c000000000000002, taf=7400000000000002
> (XEN) XenPage 000000000021e6d8: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000021e6d7: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000021e6d6: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 00000000000db2fe: caf=c000000000000002, taf=7400000000000002
> (XEN) VCPU information and callbacks for domain 0:
> (XEN) VCPU0: CPU0 [has=F] flags=0 poll=0 upcall_pend = 01, upcall_mask =
> 00 dirty_cpus={} cpu_affinity={0}
> (XEN) 250 Hz periodic timer (period 4 ms)
> (XEN) General information for domain 1:
> (XEN) refcnt=3 dying=0 nr_pages=3645 xenheap_pages=6 dirty_cpus={}
> max_pages=131328
> (XEN) handle=d80155e4-8f8b-94e1-8382-94084b7f1e51 vm_assist=00000000
> (XEN) paging assistance: hap refcounts log_dirty translate external
> (XEN) Rangesets belonging to domain 1:
> (XEN) I/O Ports { }
> (XEN) Interrupts { }
> (XEN) I/O Memory { }
> (XEN) Memory pages belonging to domain 1:
> (XEN) DomPage list too long to display
> (XEN) PoD entries=0 cachesize=0
> (XEN) XenPage 000000000020df70: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020e045: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020c58c: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020c5a4: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 0000000000019f1e: caf=c000000000000001, taf=7400000000000001
> (XEN) XenPage 000000000020eb23: caf=c000000000000001, taf=7400000000000001
> (XEN) VCPU information and callbacks for domain 1:
> (XEN) VCPU0: CPU0 [has=F] flags=4 poll=0 upcall_pend = 00, upcall_mask =
> 00 dirty_cpus={} cpu_affinity={0}
> (XEN) paging assistance: hap, 4 levels
> (XEN) No periodic timer
> (XEN) Notifying guest 0:0 (virq 1, port 0, stat 0/-1/-1)
> (XEN) Notifying guest 1:0 (virq 1, port 0, stat 0/0/0)
> (XEN) 'd' pressed -> dumping registers
> (XEN)
> (XEN) *** Dumping CPU0 host state: ***
> (XEN) ----[ Xen-4.2.24169-20111122.144218 x86_64 debug=y Tainted: C
> ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82c48019bfe6>] hpet_broadcast_exit+0x0/0x1f9
> (XEN) RFLAGS: 0000000000000246 CONTEXT: hypervisor
> (XEN) rax: 0000000000003b40 rbx: 000000674742e72d rcx: 0000000000000001
> (XEN) rdx: 0000000000000000 rsi: ffff82c48030f000 rdi: ffff82c4802bfea0
> (XEN) rbp: ffff82c4802bfee0 rsp: ffff82c4802bfe78 r8: 000000008c858211
> (XEN) r9: 0000000000000003 r10: ffff82c4803064e0 r11: 000000676bf885a3
> (XEN) r12: ffff83021e70e840 r13: ffff83021e70e8d0 r14: 00000067471bdb62
> (XEN) r15: ffff82c48030e440 cr0: 000000008005003b cr4: 00000000000026f0
> (XEN) cr3: 00000000db4c4000 cr2: 0000000000beb000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802bfe78:
> (XEN) ffff82c48019f0ca ffff82c4802bff18 ffffffffffffffff ffff82c4802bfed0
> (XEN) 0000000180124b57 0000000000000000 0000000000000000 ffff82c48025b200
> (XEN) 0000152900006fe3 ffff82c4802bff18 ffff82c48025b200 ffff82c4802bff18
> (XEN) ffff82c48030e468 ffff82c4802bff10 ffff82c48015a88d 0000000000000000
> (XEN) ffff8300db6c6000 ffff8300db6c6000 ffffffffffffffff ffff82c4802bfe00
> (XEN) 0000000000000000 0000000000001000 0000000000001000 0000000000000000
> (XEN) 8000000000000427 ffff8801d8579010 0000000000000246 00000000deadbeef
> (XEN) ffff8801d8579000 ffff8801d8579000 00000000fffffffe ffffffff8000302a
> (XEN) 00000000deadbeef 00000000deadbeef 00000000deadbeef 0000010000000000
> (XEN) ffffffff8000302a 000000000000e033 0000000000000246 ffff8801a515bd10
> (XEN) 000000000000e02b 000000000000beef 000000000000beef 000000000000beef
> (XEN) 000000000000beef 0000000000000000 ffff8300db6c6000 0000000000000000
> (XEN) 0000000000000000
> (XEN) Xen call trace:
> (XEN) [<ffff82c48019bfe6>] hpet_broadcast_exit+0x0/0x1f9
> (XEN) [<ffff82c48015a88d>] idle_loop+0x6c/0x7b
> (XEN)
> (XEN) 'd' pressed -> dumping registers
> (XEN)
> (XEN) *** Dumping CPU0 host state: ***
> (XEN) ----[ Xen-4.2.24169-20111122.144218 x86_64 debug=y Tainted: C
> ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82c48019bfe6>] hpet_broadcast_exit+0x0/0x1f9
> (XEN) RFLAGS: 0000000000000246 CONTEXT: hypervisor
> (XEN) rax: 0000000000003b40 rbx: 00000078f4fbe7ed rcx: 0000000000000001
> (XEN) rdx: 0000000000000000 rsi: ffff82c48030f000 rdi: ffff82c4802bfea0
> (XEN) rbp: ffff82c4802bfee0 rsp: ffff82c4802bfe78 r8: 00000000cd4f8db6
> (XEN) r9: 0000000000000002 r10: ffff82c480308780 r11: 000000790438291d
> (XEN) r12: ffff83021e70e840 r13: ffff83021e70e8d0 r14: 00000078f412a61c
> (XEN) r15: ffff82c48030e440 cr0: 000000008005003b cr4: 00000000000026f0
> (XEN) cr3: 00000000db4c4000 cr2: 0000000000beb000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802bfe78:
> (XEN) ffff82c48019f0ca ffff82c4802bff18 ffffffffffffffff ffff82c4802bfed0
> (XEN) 0000000180124b57 0000000000000000 0000000000000000 ffff82c48025b200
> (XEN) 0000239e00007657 ffff82c4802bff18 ffff82c48025b200 ffff82c4802bff18
> (XEN) ffff82c48030e468 ffff82c4802bff10 ffff82c48015a88d 0000000000000000
> (XEN) ffff8300db6c6000 ffff8300db6c6000 ffffffffffffffff ffff82c4802bfe00
> (XEN) 0000000000000000 0000000000001000 0000000000001000 0000000000000000
> (XEN) 8000000000000427 ffff8801d8579010 0000000000000246 00000000deadbeef
> (XEN) ffff8801d8579000 ffff8801d8579000 00000000fffffffe ffffffff8000302a
> (XEN) 00000000deadbeef 00000000deadbeef 00000000deadbeef 0000010000000000
> (XEN) ffffffff8000302a 000000000000e033 0000000000000246 ffff8801a515bd10
> (XEN) 000000000000e02b 000000000000beef 000000000000beef 000000000000beef
> (XEN) 000000000000beef 0000000000000000 ffff8300db6c6000 0000000000000000
> (XEN) 0000000000000000
> (XEN) Xen call trace:
> (XEN) [<ffff82c48019bfe6>] hpet_broadcast_exit+0x0/0x1f9
> (XEN) [<ffff82c48015a88d>] idle_loop+0x6c/0x7b
> (XEN)
>