* Trouble with Xen 2.0
@ 2004-11-29 15:02 Jérôme Petazzoni
2004-11-29 23:43 ` Keir Fraser
0 siblings, 1 reply; 2+ messages in thread
From: Jérôme Petazzoni @ 2004-11-29 15:02 UTC
To: Xen-devel
[-- Attachment #1: Type: text/plain, Size: 1925 bytes --]
I tested Xen 2.0. Basic behaviour looks solid and stable, but the
"dynamic" features didn't work at all for me. More specifically, with
the precompiled build:
- saving a domain on a P4 computer was erratic (the save process
hung; dom0 and the to-be-saved VM kept running fine); it worked
"sometimes" (with no apparent influence of memory load, CPU load, etc.).
The logs showed the line indicating a save, but nothing more (no
more than when the save succeeds)
- using the balloon interface to reduce the memory worked, but reducing
it by a large increment caused the affected VM to crash, repeatedly
spitting the same message over the console (I didn't have a way to
save those messages yet, sorry, but I'll try to get them ASAP)
- using the balloon interface to increase the memory between the initial
value and the maximum value didn't work: sometimes it did nothing at
all, sometimes it crashed as before
- OTOH, we could migrate a Linux VM from a P4 machine to a Celeron one,
and it kept working for a while (but /proc/cpuinfo showed the wrong flags,
so maybe some optimized code would have crashed)
Now, on my old Celeron testbed, with the -testing branch that I compiled
myself:
- saving a domain seems to work, but when I restore it, the domain crashes
(see attached kernel messages; I got them after reattaching the console
after the restore)
- xend.log and xend-debug show nothing exciting; what should I do to
increase verbosity here? I tried "xend trace_start" but it didn't
change anything. I have certainly forgotten something here.
- I recompiled the xen0 and xenU kernels for a Celeron CPU (since the
default config was P4) but it didn't change anything.
I don't know in which direction I should dig. I could of course try
different Xen versions, tweak debug flags, or try different
hardware, but I don't know what the smartest starting point would be.
Regards
[-- Attachment #2: bug-200411291251.txt --]
[-- Type: text/plain, Size: 1818 bytes --]
************ REMOTE CONSOLE: CTRL-] TO QUIT ********
invalid operand: 0000 [#1]
PREEMPT
Modules linked in:
CPU: 0
EIP: 0061:[<c03003f8>] Not tainted VLI
EFLAGS: 00010206 (2.6.9-xenU)
EIP is at init_tsc+0x36/0xa0
eax: c02bad10 ebx: 00018000 ecx: fbffc000 edx: 00000001
esi: 00000020 edi: c0102000 ebp: 00000000 esp: c1149f20
ds: 007b es: 007b ss: 0069
Process events/0 (pid: 3, threadinfo=c1148000 task=c0368020)
Stack: c010e4df 00000000 c0109df5 fbffc000 000001ee 00000063 c7a03000 c1148000
00000000 c02b9d80 00000000 c012b8d6 00000000 c1149f74 00000000 c0359278
c0109efd c1148000 c0359268 ffffffff ffffffff 00000001 00000000 c0118cc2
Call Trace:
[<c010e4df>] time_resume+0x12/0x53
[<c0109df5>] __do_suspend+0x1a0/0x1e1
[<c012b8d6>] worker_thread+0x1ea/0x2e0
[<c0109efd>] __shutdown_handler+0x0/0x48
[<c0118cc2>] default_wake_function+0x0/0x12
[<c0118cc2>] default_wake_function+0x0/0x12
[<c012b6ec>] worker_thread+0x0/0x2e0
[<c012f915>] kthread+0xa5/0xab
[<c012f870>] kthread+0x0/0xab
[<c010ed89>] kernel_thread_helper+0x5/0xb
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <f0> fb ff bf 00 00 00 00 88 f4 ff bf 02 00 00 00 01 00 00 00 00
[-- Attachment #3: bug-200411291300.txt --]
[-- Type: text/plain, Size: 1818 bytes --]
************ REMOTE CONSOLE: CTRL-] TO QUIT ********
invalid operand: 0000 [#1]
PREEMPT
Modules linked in:
CPU: 0
EIP: 0061:[<c03003d3>] Not tainted VLI
EFLAGS: 00010206 (2.6.9-xenU)
EIP is at init_tsc+0x11/0xa0
eax: 2cac6d30 ebx: 00018000 ecx: fbffc000 edx: 00000001
esi: 00000020 edi: c0102000 ebp: 00000000 esp: c1149f1c
ds: 007b es: 007b ss: 0069
Process events/0 (pid: 3, threadinfo=c1148000 task=c0368020)
Stack: 00000246 c010e4df 00000000 c0109df5 fbffc000 000001e8 00000063 c7d82000
c1148000 00000000 c02b9d80 00000000 c012b8d6 00000000 c1149f74 00000000
c0359278 c0109efd c1148000 c0359268 ffffffff ffffffff 00000001 00000000
Call Trace:
[<c010e4df>] time_resume+0x12/0x53
[<c0109df5>] __do_suspend+0x1a0/0x1e1
[<c012b8d6>] worker_thread+0x1ea/0x2e0
[<c0109efd>] __shutdown_handler+0x0/0x48
[<c0118cc2>] default_wake_function+0x0/0x12
[<c0118cc2>] default_wake_function+0x0/0x12
[<c012b6ec>] worker_thread+0x0/0x2e0
[<c012f915>] kthread+0xa5/0xab
[<c012f870>] kthread+0x0/0xab
[<c010ed89>] kernel_thread_helper+0x5/0xb
Code: f8 13 30 c0 f8 13 30 c0 4c 14 30 c0 4c 14 30 c0 f4 14 30 c0 f4 14 30 c0 9c 15 30 c0 9c 15 30 c0 80 6c dc c7 80 6c dc c7 38 29 eb <c7> 38 29 eb c7 a8 62 dc c7 a8 62 dc c7 80 0c d3 c7 80 0c d3 c7
[-- Attachment #4: bug-200411291500.txt --]
[-- Type: text/plain, Size: 1752 bytes --]
************ REMOTE CONSOLE: CTRL-] TO QUIT ********
invalid operand: 0000 [#1]
PREEMPT
Modules linked in:
CPU: 0
EIP: 0061:[<c03172f0>] Not tainted VLI
EFLAGS: 00010286 (2.6.9-xenU)
EIP is at init_tsc+0x0/0xa0
eax: c02d6ef0 ebx: 00000020 ecx: fbffc000 edx: fbffc000
esi: c74ab140 edi: c0102000 ebp: 00000000 esp: c1169f20
ds: 007b es: 007b ss: 0069
Process events/0 (pid: 3, threadinfo=c1168000 task=c1146020)
Stack: c010ed42 00000000 c010a28e fbffc000 000001e8 00000063 c74ab000 c1168000
00000000 c02d6000 00000000 c012db7a 00000000 c1169f74 00000000 c11448d8
c010a390 c1168000 c11448c8 ffffffff ffffffff 00000001 00000000 c0119e00
Call Trace:
[<c010ed42>] time_resume+0x12/0x50
[<c010a28e>] __do_suspend+0x19e/0x1e0
[<c012db7a>] worker_thread+0x1fa/0x2f0
[<c010a390>] __shutdown_handler+0x0/0x50
[<c0119e00>] default_wake_function+0x0/0x20
[<c0119e00>] default_wake_function+0x0/0x20
[<c012d980>] worker_thread+0x0/0x2f0
[<c013203a>] kthread+0xaa/0xb0
[<c0131f90>] kthread+0x0/0xb0
[<c010f675>] kernel_thread_helper+0x5/0x10
Code: 00 00 00 17 00 00 00 58 8f 04 08 11 00 00 00 30 8f 04 08 12 00 00 00 28 00 00 00 13 00 00 00 08 00 00 00 fe ff ff 6f e0 8e 04 08 <ff> ff ff 6f 01 00 00 00 f0 ff ff 6f 12 8e 04 08 00 00 00 00 00
* Re: Trouble with Xen 2.0
2004-11-29 15:02 Trouble with Xen 2.0 Jérôme Petazzoni
@ 2004-11-29 23:43 ` Keir Fraser
0 siblings, 0 replies; 2+ messages in thread
From: Keir Fraser @ 2004-11-29 23:43 UTC
To: Jérôme Petazzoni; +Cc: Xen-devel
>
> I tested Xen 2.0. Basic behaviour looks solid and stable, but the
> "dynamic" features didn't work at all for me. More specifically, with
> the precompiled build:
>
> - saving a domain on a P4 computer was erratic (the save process
> hung; dom0 and the to-be-saved VM kept running fine); it worked
> "sometimes" (with no apparent influence of memory load, CPU load, etc.).
> The logs showed the line indicating a save, but nothing more (no
> more than when the save succeeds)
I tested save/restore a little myself recently and couldn't get it to
crash. But various people have seen the crash in time_resume(). Maybe
if I could have a crash dump + vmlinux image file I might be able to
work out what's going on. It'd be even better if I could reproduce the
problem though.
> - using the balloon interface to reduce the memory worked, but reducing
> it by a large increment caused the affected VM to crash, repeatedly
> spitting the same message over the console (I didn't have a way to
> save those messages yet, sorry, but I'll try to get them ASAP)
The balloon driver needs some cleaning and better integration with all
the places in the kernel that increase/decrease the domain's memory
reservation.
> - OTOH, we could migrate a Linux VM from a P4 machine to a Celeron one,
> and it kept working for a while (but /proc/cpuinfo showed the wrong flags,
> so maybe some optimized code would have crashed)
That just sounds like a bad idea. Suspend/resume or migrating across
very different hardware configurations is asking for trouble. It's
intended for canning images or transferring workloads across a
homogeneous cluster. In particular, 'downgrading' to a lesser CPU is
asking for trouble -- e.g., software RAID trying to use non-existent
SSE instructions, to pull a random example out of the air.
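One way to spot this kind of 'downgrade' before migrating might be to compare the "flags" lines of /proc/cpuinfo from the two hosts. What follows is only a minimal Python sketch: the function names and the sample flag strings are made up for illustration, and nothing here is part of Xen or its tools.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags on the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


def missing_on_target(source_cpuinfo, target_cpuinfo):
    """Sorted list of flags the source CPU advertises but the target lacks."""
    source_flags = parse_cpu_flags(source_cpuinfo)
    target_flags = parse_cpu_flags(target_cpuinfo)
    return sorted(source_flags - target_flags)


# Illustrative input only: these are not the real flag lists of any CPU.
sample_source = "processor\t: 0\nflags\t\t: fpu tsc mmx sse sse2\n"
sample_target = "processor\t: 0\nflags\t\t: fpu tsc mmx sse\n"

missing = missing_on_target(sample_source, sample_target)
if missing:
    print("target CPU lacks:", " ".join(missing))
else:
    print("target CPU advertises every flag of the source")
```

An empty result only says that the advertised flag sets are compatible; it does not show that a guest which has already probed its CPU would survive the move.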
-- Keir