From: Joanna Rutkowska <joanna@invisiblethingslab.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: xen-devel@lists.xensource.com
Subject: Re: Re: DomU lockups after resume from S3 on Core i5 processors
Date: Tue, 06 Jul 2010 00:52:49 +0200
Message-ID: <4C326241.2030503@invisiblethingslab.com>
In-Reply-To: <4C32602A.8070305@goop.org>
On 07/06/10 00:43, Jeremy Fitzhardinge wrote:
> On 07/05/2010 03:07 PM, Joanna Rutkowska wrote:
>> On 07/05/10 23:28, Joanna Rutkowska wrote:
>>
>>> On 07/05/10 12:38, Joanna Rutkowska wrote:
>>>
>>>> I'm experiencing very reproducible DomU lockups that occur after I
>>>> resume the system from an S3 sleep. Strangely, this seems to happen only
>>>> on my Core i5 systems (tested on two different machines), but not on
>>>> older Core 2 Duo systems.
>>>>
>>>> Usually this causes the apps (e.g. Firefox) running in DomUs to become
>>>> unresponsive, but sometimes I see that some very limited functionality
>>>> of the app is still available (e.g. I can open/close Tabs in Firefox,
>>>> but cannot do much more). Also, when I log in to the DomU via
>>>> xm console, I usually can see the login prompt, can enter the username,
>>>> but then the console hangs.
>>>>
>>>> I tried to attach to such a hung DomU using gdbserver-xen, but when I
>>>> subsequently try to attach to the server from gdb (via the target
>>>> 127.0.0.1:9999 command), my gdb segfaults (how funny!).
>>>>
>>>> I'm running Xen 3.4.3, and fairly recent pvops0 kernel in DomU. In Dom0
>>>> I run 2.6.34-xenlinux kernel (opensuse patches), but I doubt it is
>>>> relevant in any way.
>>>>
>>>> This seems like a scheduling problem, and because it seems to affect
>>>> Core i5 processors but not Core 2 Duos, perhaps it has something to do
>>>> with Hyperthreading?
>>>>
>>>>
>>> Ok, finally got the gdbserver working. This is the backtrace I get when
>>> attaching to a locked-up DomU after resume:
>>>
>>> #0 0xffffffff810093aa in ?? ()
>>> #1 0xffffffff8168be18 in ?? ()
>>> #2 0xffff880003a21600 in ?? ()
>>> #3 0xffffffff8100ee63 in HYPERVISOR_sched_op ()
>>> at
>>> /usr/src/debug/kernel-2.6.32/linux-2.6.32.x86_64/arch/x86/include/asm/xen/hypercall.h:292
>>> #4 xen_safe_halt () at arch/x86/xen/irq.c:104
>>> #5 0xffffffff8100c33e in raw_safe_halt () at
>>> /usr/src/debug/kernel-2.6.32/linux-2.6.32.x86_64/arch/x86/include/asm/paravirt.h:110
>>> #6 xen_idle () at arch/x86/xen/setup.c:193
>>> #7 0xffffffff81011cdd in cpu_idle () at arch/x86/kernel/process_64.c:143
>>> #8 0xffffffff8144b997 in rest_init () at init/main.c:445
>>> #9 0xffffffff81824ddc in start_kernel () at init/main.c:695
>>> #10 0xffffffff818242c1 in x86_64_start_reservations
>>> (real_mode_data=<value optimized out>) at arch/x86/kernel/head64.c:123
>>> #11 0xffffffff81828160 in xen_start_kernel () at
>>> arch/x86/xen/enlighten.c:1300
>>> #12 0xffffffff838f3000 in ?? ()
>>> #13 0xffffffff838f4000 in ?? ()
>>> #14 0xffffffff838f5000 in ?? ()
>>>
>>> Any ideas?
>>>
>>>
>> ... and when I disabled Hyperthreading in the BIOS, the problem seems to
>> be gone. Obviously this is not a desired solution...
>>
>
> HT has historically been very good at flushing out race conditions which
> would normally be tricky to hit on SMP systems. I assume your domain is
> single CPU?
Actually, no. It used to be, but then I thought that might be the issue
and assigned 2 vcpus to it; they were still locking up, though.
> Do you know what's going on it in that it might be waiting
> for?
No idea. My guess would be that it's a different kernel subsystem each
time -- e.g. when I'm lucky and the apps get only "partially" locked up,
I can still open new tabs in Google Chrome and see the thumbnails of my
popular websites, but without their contents. This would suggest the
networking subsystem is dead, but at the same time Chrome is apparently
communicating fine with the X server in the DomU (which in turn talks
fine with Dom0 over Xen shared memory/event channels).
I experienced the above behavior also when I had only one VCPU per DomU.
> Is it not longer getting timer events or something? Does the Xen
> 'q' debug-key make it do anything?
Ah, that's some secret option I've never heard of... Is it available in
gdb when used with gdbserver-xen?
joanna.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Thread overview: 10+ messages
2010-07-05 10:38 DomU lockups after resume from S3 on Core i5 processors Joanna Rutkowska
2010-07-05 21:28 ` Joanna Rutkowska
2010-07-05 22:07 ` Joanna Rutkowska
2010-07-05 22:43 ` Jeremy Fitzhardinge
2010-07-05 22:52 ` Joanna Rutkowska [this message]
2010-07-05 23:17 ` Jeremy Fitzhardinge
2010-07-06 8:41 ` Jan Beulich
2010-07-06 8:59 ` Joanna Rutkowska
2010-07-06 9:57 ` Jan Beulich
2010-07-08 14:04 ` Joanna Rutkowska