* Test report for Xen-3.3.0-rc4 (#18314)
@ 2008-08-13 7:43 Li, Haicheng
2008-08-13 8:30 ` Keir Fraser
0 siblings, 1 reply; 19+ messages in thread
From: Li, Haicheng @ 2008-08-13 7:43 UTC (permalink / raw)
To: 'xen-devel@lists.xensource.com'
All,
We've finished a round of nightly testing and bug verification on RC4 (#18314). There are still 5 open P1 bugs. Bugs #1322 and #1323 were found by extended testing with Solaris HVM guests.
New P1 bugs:
==============================================
1. Xen HV crashes while booting up Indiana HVM guest
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1322
2. Booting Nevada 81 PAE HVM may cause Xen crash.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1323
Old P1 bugs:
==============================================
1. On 32e, hotplug attaching a VT-d NIC to a guest failed.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1316
2. On PAE, hotplug attaching a USB EHCI device to a Linux guest failed.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1318
3. UHCI hotplug does not work on the Montevina platform.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1319
-- haicheng
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 7:43 Test report for Xen-3.3.0-rc4 (#18314) Li, Haicheng
@ 2008-08-13 8:30 ` Keir Fraser
2008-08-13 8:43 ` Keir Fraser
2008-08-13 9:32 ` Xu, Jiajun
0 siblings, 2 replies; 19+ messages in thread
From: Keir Fraser @ 2008-08-13 8:30 UTC (permalink / raw)
To: Li, Haicheng, 'xen-devel@lists.xensource.com'
Do the new P1s still occur if you change SHADOW_OPTIMIZATIONS in
arch/x86/mm/shadow/private.h to 0xff? (i.e., disable out-of-sync
optimisation).
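As a rough illustration of what that change does (a sketch, not the Xen source): SHADOW_OPTIMIZATIONS is treated here as a bitmask of optimisation flags, so 0xff keeps the low eight bits set and clears anything above them. The flag name and the values 0x100 and 0x1ff below are assumptions made for the example; the thread itself only gives the 0xff value.

```python
# Illustrative sketch only. The flag name and both values are assumptions,
# chosen so that clearing one bit yields the 0xff mentioned in the thread.
SHOPT_OUT_OF_SYNC = 0x100   # assumed bit for the out-of-sync (OOS) optimisation
ALL_OPTIMIZATIONS = 0x1ff   # assumed default mask with every flag enabled


def disable(mask: int, flag: int) -> int:
    """Return the mask with the given flag bit cleared."""
    return mask & ~flag


def is_enabled(mask: int, flag: int) -> bool:
    """Report whether the given flag bit is set in the mask."""
    return (mask & flag) != 0


without_oos = disable(ALL_OPTIMIZATIONS, SHOPT_OUT_OF_SYNC)
print(hex(without_oos))                            # 0xff
print(is_enabled(without_oos, SHOPT_OUT_OF_SYNC))  # False
```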
-- Keir
On 13/8/08 08:43, "Li, Haicheng" <haicheng.li@intel.com> wrote:
> All,
>
> We've finished a round of nightly testing and bug verification on RC4
> (#18314). There are still 5 P1 open bugs. Bug #1322 and bug #1323 were found
> by extended testing with Solaris HVM guest.
>
> New P1 bugs:
> ==============================================
> 1. Xen HV crashes while booting up Indiana HVM guest
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1322
>
> 2. Booting Nevada 81 PAE HVM may cause Xen crash.
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1323
>
> Old P1 bugs:
> ==============================================
> 1. One 32e, hotplug attaching VT-d NIC to guest failed.
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1316.
>
> 2. On PAE, failed to hotplug attach USB EHCI device to linux guest.
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1318
>
> 3. UHCI hotplug can not work on Montevina platform.
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1319
>
>
>
> -- haicheng
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 8:30 ` Keir Fraser
@ 2008-08-13 8:43 ` Keir Fraser
2008-08-13 9:32 ` Xu, Jiajun
1 sibling, 0 replies; 19+ messages in thread
From: Keir Fraser @ 2008-08-13 8:43 UTC (permalink / raw)
To: Keir Fraser, Li, Haicheng,
'xen-devel@lists.xensource.com'
It goes without saying of course that these new P1s are unfortunately rather
likely to delay 3.3.0, unless we choose to disable optimisations causing
these crashes. Still, we are in deep freeze mode and I won't be taking any
patches that aren't obvious fixes for very serious issues and regressions.
-- Keir
On 13/8/08 09:30, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> Do the new P1s still occur if you change SHADOW_OPTIMIZATIONS in
> arch/x86/mm/shadow/private.h to 0xff? (i.e., disable out-of-sync
> optimisation).
>
> -- Keir
>
> On 13/8/08 08:43, "Li, Haicheng" <haicheng.li@intel.com> wrote:
>
>> All,
>>
>> We've finished a round of nightly testing and bug verification on RC4
>> (#18314). There are still 5 P1 open bugs. Bug #1322 and bug #1323 were found
>> by extended testing with Solaris HVM guest.
>>
>> New P1 bugs:
>> ==============================================
>> 1. Xen HV crashes while booting up Indiana HVM guest
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1322
>>
>> 2. Booting Nevada 81 PAE HVM may cause Xen crash.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1323
>>
>> Old P1 bugs:
>> ==============================================
>> 1. One 32e, hotplug attaching VT-d NIC to guest failed.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1316.
>>
>> 2. On PAE, failed to hotplug attach USB EHCI device to linux guest.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1318
>>
>> 3. UHCI hotplug can not work on Montevina platform.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1319
>>
>>
>>
>> -- haicheng
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 8:30 ` Keir Fraser
2008-08-13 8:43 ` Keir Fraser
@ 2008-08-13 9:32 ` Xu, Jiajun
2008-08-13 9:48 ` Keir Fraser
1 sibling, 1 reply; 19+ messages in thread
From: Xu, Jiajun @ 2008-08-13 9:32 UTC (permalink / raw)
To: Keir Fraser, Li, Haicheng, xen-devel
On Wednesday, August 13, 2008 4:31 PM
xen-devel-bounces@lists.xensource.com wrote:
> Do the new P1s still occur if you change SHADOW_OPTIMIZATIONS in
> arch/x86/mm/shadow/private.h to 0xff? (i.e., disable out-of-sync
> optimisation).
After changing SHADOW_OPTIMIZATIONS to 0xff, the two issues disappear.
But we found two other problems:
1. The Indiana HVM guest may reboot when loading grub; the serial log shows:
(XEN) sh error: sh_remove_shadows(): can't find all shadows of mfn 1e4490 (shadow_flags=00000040)
(XEN) domain_crash called from common.c:2714
(XEN) Domain 10 (vcpu#0) crashed on cpu#4:
(XEN) ----[ Xen-3.3.0-rc1 x86_32p debug=n Not tainted ]----
(XEN) CPU: 4
(XEN) EIP: 0158:[<fa58e185>]
(XEN) EFLAGS: 00000282 CONTEXT: hvm
(XEN) eax: 0000000c ebx: 00000007 ecx: d947eba0 edx: 00000000
(XEN) esi: d947e7a0 edi: fa5886c0 ebp: d947e768 esp: d947e748
(XEN) cr0: 8005003b cr4: 000006b8 cr3: 023cb020 cr2: d32daea5
(XEN) ds: 0160 es: 0160 fs: 0000 gs: 01b0 ss: 0160 cs: 0158
(XEN) sh error: sh_remove_shadows(): can't find all shadows of mfn 1e45de (shadow_flags=00000080)
(XEN) domain_crash called from common.c:2714
2. "xm destroy" on the Indiana HVM guest may produce a Xen call trace, but
there is no Xen crash.
(XEN) Xen WARN at domain.c:1814
(XEN) ----[ Xen-3.3.0-rc1 x86_32p debug=n Not tainted ]----
(XEN) CPU: 2
(XEN) EIP: e008:[<ff12f70f>] domain_relinquish_resources+0x17f/0x1a0
(XEN) EFLAGS: 00210202 CONTEXT: hypervisor
(XEN) eax: 00000001 ebx: ff1c2090 ecx: ff1c2080 edx: 00000000
(XEN) esi: ff1c2080 edi: ff1c2080 ebp: ffbf7e44 esp: ffbf7dcc
(XEN) cr0: 8005003b cr4: 000026f0 cr3: 00bdcc80 cr2: 080554c8
(XEN) ds: e010 es: e010 fs: 0000 gs: 0033 ss: e010 cs: e008
(XEN) Xen stack trace from esp=ffbf7dcc:
(XEN) 00000020 ff1c2080 fffffff5 fffffff3 ff1c2080 00000000 ffbf7e44
ff10416d
(XEN) ff1c2080 00000005 ff116a57 2709497e 0000012f 513c6fff 00000202
000000ff
(XEN) fffffff3 b33fc518 0000007b ff1030ad ff1c2080 b33fc518 00000090
00000020
(XEN) ff1d9100 00000296 8d654ad1 ff1c2080 ff1d0080 ff1c2326 00000002
00000005
(XEN) b79a000b b79d04fc b79d952c 447c7f77 b33fc54c 46257b48 b7f116a0
0000007f
(XEN) 00000000 b8b08ac8 081215a8 447c7f77 b7f116a0 b7b59d74 b33fc578
447c77d3
(XEN) b7b59d74 a5dba1ee b7ba8140 0000001f a5dba1ee 00000000 0836e4f0
448738e4
(XEN) b7ba8140 08399b54 b33fc5a8 447c77d3 08399b54 b7ba8140 a5dba1ee
448738e4
(XEN) b7ba8140 b7ba8140 43841a1c ff1d9100 00200296 00200296 43841a15
00000006
(XEN) 00000003 00000004 ffbf7fb4 ffbf7f5c ff149cfd ffbf7fb4 43841a16
00000001
(XEN) 0000f800 ff111284 ff1dd044 ffbe6900 ff1dd104 000f0003 0000000f
ff1fa43c
(XEN) 000002e0 00000002 ffbdc080 ffbdc080 00200296 00200296 00000004
00000033
(XEN) 00009695 0000012f 00000004 43841a17 5a2f1fe8 909090ff 90909090
c3900390
(XEN) f95f30d8 ff10f4d2 ff1dd044 53d22d15 0000012f ffbf7f03 5328e111
ffbdc080
(XEN) 0000007b 0000007b 00305000 ff19a7a4 b33fc518 357f4700 b7f49430
b33fc5e8
(XEN) 00000000 00305000 b33fc518 357f4700 b7f49430 b33fc5e8 00000000
00305000
(XEN) 00000024 000d0000 c0101487 00000061 00200282 ca1ebe94 00000069
0000007b
(XEN) 0000007b 00000000 00000033 00000002 ffbdc080
(XEN) Xen call trace:
(XEN) [<ff12f70f>] domain_relinquish_resources+0x17f/0x1a0
(XEN) [<ff1c2080>] get_edd+0x4/0x10
(XEN) [<ff1c2080>] get_edd+0x4/0x10
(XEN) [<ff10416d>] domain_kill+0x6d/0x160
(XEN) [<ff1c2080>] get_edd+0x4/0x10
(XEN) [<ff116a57>] add_entry+0x57/0x140
(XEN) [<ff1030ad>] do_domctl+0x10d/0xc40
(XEN) [<ff1c2080>] get_edd+0x4/0x10
(XEN) [<ff1c2080>] get_edd+0x4/0x10
(XEN) [<ff1d0080>] smp_prepare_cpus+0x2d0/0x800
(XEN) [<ff1c2326>] boot_edd_info+0x170/0x200
(XEN) [<ff149cfd>] do_general_protection+0x42d/0x1310
(XEN) [<ff111284>] csched_tick+0x154/0x5e0
(XEN) [<ff10f4d2>] page_scrub_softirq+0x132/0x160
(XEN) [<ff19a7a4>] hypercall+0x94/0x9b
>
> On 13/8/08 08:43, "Li, Haicheng" <haicheng.li@intel.com> wrote:
>
>> All,
>>
>> We've finished a round of nightly testing and bug verification on RC4
>> (#18314). There are still 5 P1 open bugs. Bug #1322 and bug #1323
>> were found by extended testing with Solaris HVM guest.
>>
>> New P1 bugs:
>> ==============================================
>> 1. Xen HV crashes while booting up Indiana HVM guest
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1322
>>
>> 2. Booting Nevada 81 PAE HVM may cause Xen crash.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1323
>>
>> Old P1 bugs:
>> ==============================================
>> 1. One 32e, hotplug attaching VT-d NIC to guest failed.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1316.
>>
>> 2. On PAE, failed to hotplug attach USB EHCI device to linux guest.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1318
>>
>> 3. UHCI hotplug can not work on Montevina platform.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1319
>>
>>
>>
>> -- haicheng
Best Regards
Jiajun
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 9:32 ` Xu, Jiajun
@ 2008-08-13 9:48 ` Keir Fraser
2008-08-13 12:25 ` Xu, Jiajun
0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2008-08-13 9:48 UTC (permalink / raw)
To: Xu, Jiajun, Li, Haicheng, xen-devel
On 13/8/08 10:32, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
> 1. Indiana HVM may reboot when loading grub, serial grub shows:
> (XEN) sh error: sh_remove_shadows(): can't find all shadows of mfn
> 1e4490 (shadow_flags=00000040)
At least it's not a hv crash. At this late stage I could perhaps live with
this.
> 2. "xm destroy" Indiana HVM may cause xen call trace. But there is no
> xen crash.
Both backtraces are from 3.3.0-rc1. You modified and tested the wrong tree.
:-) This second backtrace can no longer happen.
We're going to dig into the OOS bug a bit and decide what to do...
-- Keir
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 9:48 ` Keir Fraser
@ 2008-08-13 12:25 ` Xu, Jiajun
2008-08-13 14:48 ` Keir Fraser
0 siblings, 1 reply; 19+ messages in thread
From: Xu, Jiajun @ 2008-08-13 12:25 UTC (permalink / raw)
To: Keir Fraser, Li, Haicheng, xen-devel
On Wednesday, August 13, 2008 5:49 PM Keir Fraser wrote:
> On 13/8/08 10:32, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>
>> 1. Indiana HVM may reboot when loading grub, serial grub shows:
>> (XEN) sh error: sh_remove_shadows(): can't find all shadows of mfn
>> 1e4490 (shadow_flags=00000040)
>
> At least it's not a hv crash. At this late stage I could
> perhaps live with
> this.
>
>> 2. "xm destroy" Indiana HVM may cause xen call trace. But there is no
>> xen crash.
>
> Both backtraces are from 3.3.0-rc1. You modified and tested
> the wrong tree.
> :-) This second backtrace can no longer happen.
>
> We're going to dig into the OOS bug a bit and decide what to do...
Oh, sorry, I made a mistake. Thanks for pointing that out.
I will try rc4 to see whether any issue still exists after this modification
and send you an update.
Best Regards
Jiajun
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 12:25 ` Xu, Jiajun
@ 2008-08-13 14:48 ` Keir Fraser
2008-08-13 15:18 ` Keir Fraser
2008-08-14 7:09 ` Xu, Jiajun
0 siblings, 2 replies; 19+ messages in thread
From: Keir Fraser @ 2008-08-13 14:48 UTC (permalink / raw)
To: Xu, Jiajun, Li, Haicheng, xen-devel
On 13/8/08 13:25, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>> Both backtraces are from 3.3.0-rc1. You modified and tested
>> the wrong tree.
>> :-) This second backtrace can no longer happen.
>>
>> We're going to dig into the OOS bug a bit and decide what to do...
>
> Oh, Sorry, I made a mistake. Thanks for your pointing out.
> I will try rc4 to see if any issue still exists after this modification
> and send you the update.
As of c/s 18326 I've not been able to reproduce the hypervisor crash in
around 40 attempts. Could you give that a go (don't remove OOS from
SHADOW_OPTIMIZATIONS -- just test the tree as it is)?
Thanks,
Keir
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 14:48 ` Keir Fraser
@ 2008-08-13 15:18 ` Keir Fraser
2008-08-14 7:09 ` Xu, Jiajun
1 sibling, 0 replies; 19+ messages in thread
From: Keir Fraser @ 2008-08-13 15:18 UTC (permalink / raw)
To: Xu, Jiajun, Li, Haicheng, xen-devel
On 13/8/08 15:48, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>> Oh, Sorry, I made a mistake. Thanks for your pointing out.
>> I will try rc4 to see if any issue still exists after this modification
>> and send you the update.
>
> As of c/s 18326 I've not been able to reproduce the hypervisor crash in
> around 40 attempts. Could you give that a go (don't remove OOS from
> SHADOW_OPTIMIZATIONS -- just test the tree as it is)?
If this works okay for you then I'll make a fifth release candidate
tomorrow, and plan to release early next week.
-- Keir
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-13 14:48 ` Keir Fraser
2008-08-13 15:18 ` Keir Fraser
@ 2008-08-14 7:09 ` Xu, Jiajun
2008-08-14 7:26 ` Keir Fraser
2008-08-14 10:13 ` Test report for Xen-3.3.0-rc4 (#18314) Keir Fraser
1 sibling, 2 replies; 19+ messages in thread
From: Xu, Jiajun @ 2008-08-14 7:09 UTC (permalink / raw)
To: Keir Fraser, Li, Haicheng, xen-devel
On Wednesday, August 13, 2008 10:48 PM
xen-devel-bounces@lists.xensource.com wrote:
> As of c/s 18326 I've not been able to reproduce the hypervisor crash
> in around 40 attempts. Could you give that a go (don't remove OOS from
> SHADOW_OPTIMIZATIONS -- just test the tree as it is)?
We tried c/s 18326; the two issues still exist. We found they are easier
to reproduce on a 32pae host than on a 32e host. If OOS is removed, the
two issues disappear and no other errors are found. See the following
logs:
Booting Indiana (2008.05) causes a Xen crash:
###################
(XEN) sh error: sh_remove_shadows(): can't find all shadows of mfn 2291c6 (shadow_flags=60000010)
(XEN) domain_crash called from common.c:2714
(XEN) Domain 2 (vcpu#1) crashed on cpu#1:
(XEN) ----[ Xen-3.3.0-rc5-pre x86_32p debug=n Not tainted ]----
(XEN) CPU: 1
(XEN) EIP: 0158:[<fe832375>]
(XEN) EFLAGS: 00010202 CONTEXT: hvm guest
(XEN) eax: fe832381 ebx: d996dc40 ecx: 00000004 edx: d826c800
(XEN) esi: dc285114 edi: da67511c ebp: d996daec esp: d996dae0
(XEN) cr0: 8005003b cr4: 000006b8 cr3: 023cb040 cr2: 08047e54
(XEN) ds: 0160 es: 0160 fs: 0000 gs: 01b0 ss: 0160 cs: 0158
(XEN) sh error: oos_snapshot_lookup(): gmfn 2291c6 was OOS but not in hash table
(XEN) Xen BUG at common.c:817
(XEN) ----[ Xen-3.3.0-rc5-pre x86_32p debug=n Not tainted ]----
(XEN) CPU: 1
(XEN) EIP: e008:[<ff18a7bc>] oos_snapshot_lookup+0xcc/0xf0
(XEN) EFLAGS: 00010286 CONTEXT: hypervisor
(XEN) eax: 00000000 ebx: 00000000 ecx: 0000000a edx: 00000000
(XEN) esi: ffbcf034 edi: 002291c6 ebp: 00000008 esp: ff2abd90
(XEN) cr0: 80050033 cr4: 000026f0 cr3: 00bced20 cr2: 08047e54
(XEN) ds: e010 es: e010 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from esp=ff2abd90:
(XEN) ff1b5a68 ff1a4d78 002291c6 00000002 00000000 00000000 ffbcf040
fe6e1428
(XEN) ffbce080 fdff3708 ffbd0080 ff197b6b ffbce080 002291c6 0022a235
00000001
(XEN) 00000001 ff2abe50 ff2abfb4 ff167149 ffbce080 ffbd0080 033a9d58
00000054
(XEN) 0000002c 00227139 0022a235 00000002 00000000 00000428 00000708
00000000
(XEN) fefe2238 00000020 fe6a4240 000000f8 ffbd0080 000dc285 2a0bf001
00000002
(XEN) fe040338 00000020 ffbd0080 ff13acd3 001e9e65 ff2abfb4 00000020
00000020
(XEN) ff2abfb4 00000020 00000020 ff190001 00b09089 08eb0000 b90843ff
ffffffff
(XEN) feade1e5 00000010 0c9b0158 ffffffff 00000000 00000000 0c930160
ffffffff
(XEN) 00000000 00000000 0c930160 ffffffff 00000000 00000000 00000000
ff2abfb4
(XEN) ffbce080 00000000 ffbd0080 ff194f79 00000000 00000000 0022a03f
00000001
(XEN) 00000001 00003708 f6800000 001e9e65 ffffffff 00000000 dc285114
06062001
(XEN) 00000000 025c6027 00000000 01539361 80000000 001e9262 002291c6
000aa289
(XEN) 2a235067 00000002 27139021 80000002 ff17e469 00000001 ffbce080
dc285114
(XEN) 00000000 ffbce080 ff2abfb4 ff1839eb ffbce080 dc285114 ff2abfb4
c8589e63
(XEN) c8d856e1 000000d6 c85961b3 000000d6 ffbce080 ffbd0080 00000003
0000d900
(XEN) 0000e002 c8d856e1 000000d6 00000000 ff2abfb4 ff1dc180 ff1de100
ff2abfb4
(XEN) 00000001 ff2abfb4 ffbce080 ffbce080 dc285114 ff2abfd4 d996daec
ff17e2d9
(XEN) ff2abfb4 d996dc40 00000004 d826c800 dc285114 da67511c d996daec
fe832381
(XEN) 00f00001 fe832375 00000000 00010202 d996dae0 00000000 00000000
00000000
(XEN) 00000000 00000000 00000001 ffbce080
(XEN) Xen call trace:
(XEN) [<ff18a7bc>] oos_snapshot_lookup+0xcc/0xf0
(XEN) [<ff197b6b>] sh_page_fault__guest_3+0x117b/0x1420
(XEN) [<ff167149>] hvmemul_get_seg_reg+0x49/0x60
(XEN) [<ff13acd3>] put_page_from_l1e+0x63/0xf0
(XEN) [<ff190001>] shadow_set_l2e+0x341/0x400
(XEN) [<ff194f79>] sh_invlpg__guest_3+0x2c9/0x2e0
(XEN) [<ff17e469>] vmx_intr_assist+0x89/0x380
(XEN) [<ff1839eb>] vmx_vmexit_handler+0x63b/0x1230
(XEN) [<ff17e2d9>] vmx_asm_vmexit_handler+0x49/0x4c
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Xen BUG at common.c:817
(XEN) ****************************************
################
Destroying a Nevada 81 HVM guest causes a Xen crash:
################
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b0
(XEN) mm.c:706:d1 Error getting mfn 2235b0 (pfn 23b0) from L1 entry 00000002235b0023 for dom1
(XEN) mm.c:1941:d1 Type count overflow on pfn 2235b(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 7:
(XEN) Xen BUG at page_alloc.c:839
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
#######################
Best Regards
Jiajun
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 7:09 ` Xu, Jiajun
@ 2008-08-14 7:26 ` Keir Fraser
2008-08-14 8:44 ` Keir Fraser
` (2 more replies)
2008-08-14 10:13 ` Test report for Xen-3.3.0-rc4 (#18314) Keir Fraser
1 sibling, 3 replies; 19+ messages in thread
From: Keir Fraser @ 2008-08-14 7:26 UTC (permalink / raw)
To: Xu, Jiajun, Li, Haicheng, xen-devel; +Cc: Gianluca Guida
On 14/8/08 08:09, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>> As of c/s 18326 I've not been able to reproduce the hypervisor crash
>> in around 40 attempts. Could you give that a go (don't remove OOS from
>> SHADOW_OPTIMIZATIONS -- just test the tree as it is)?
>
> We tried c/s 18326, the two issues still exist. We found it is more easy
> to reproduce these issues on 32pae host than on 32e host. See following
> log:
> If remove OOS, the two issues will disapear and no other error found.
Thanks Jiajun,
That's disappointing. :-( I think the second of your crashes is the one that
I've been able to reproduce (but infrequently -- maybe one time in 100). The
symptoms are a bit different for me since I run a debug build and crash
earlier, well before domain destruction.
There's a chance that Gianluca's new patch will fix your first host crash
(although the domain crash would probably still remain).
We still need to decide whether to fix the second issue or disable OOS.
We're not decided on that just yet. If it reproed more reliably for us then
I'd be more optimistic about fixing it. Perhaps I will switch to 32pae as so
far I've been running 32e host.
-- Keir
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 7:26 ` Keir Fraser
@ 2008-08-14 8:44 ` Keir Fraser
2008-08-14 9:03 ` Xu, Jiajun
2008-08-14 15:14 ` [PATCH] Fix OOS typecounting [was: Test report for Xen-3.3.0-rc4 (#18314)] Gianluca Guida
2 siblings, 0 replies; 19+ messages in thread
From: Keir Fraser @ 2008-08-14 8:44 UTC (permalink / raw)
To: Keir Fraser, Xu, Jiajun, Li, Haicheng, xen-devel; +Cc: Gianluca Guida
On 14/8/08 08:26, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> That's disappointing. :-( I think the second of your crashes is the one that
> I've been able to reproduce (but infrequently -- maybe one time in 100). The
> symptoms are a bit different for me since I run a debug build and crash
> earlier, well before domain destruction.
Here's my crash:
(XEN) Assertion '(x & ((1U<<26)-1)) != 0' failed at mm.c:1891
(XEN) ----[ Xen-3.3.0-rc5-pre x86_64 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e008:[<ffff828c80150ae2>] put_page_type+0x39/0x13c
(XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
(XEN) rax: 00000000e8000000 rbx: 00000000e8000000 rcx: 0000000000000000
(XEN) rdx: ffff83003e1e6100 rsi: ffff83003e1e6100 rdi: ffff828400563808
(XEN) rbp: ffff83003e1f7ae8 rsp: ffff83003e1f7ab8 r8: 000000003e1e6100
(XEN) r9: ffff83003e1e6100 r10: 0000000000000000 r11: 80000000227cd063
(XEN) r12: ffff828400563808 r13: 00000000e7ffffff r14: ffff828400563820
(XEN) r15: ffff828400563820 cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) cr3: 000000003e4e4000 cr2: 0000000008047ff4
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff83003e1f7ab8:
(XEN) ffff83003e1f7b18 ffff828400563808 80000000227cd063 ffff83003e1e6100
(XEN) ffff83003e1e6100 000000000003efd6 ffff83003e1f7b18 ffff828c8014da74
(XEN) ffff83003efd6078 ffff83003e1f7b50 ffff83003efd6070 0000000000000001
(XEN) ffff83003e1f7b78 ffff828c801bbc80 80000000227cd063 ffff83003e1e6100
(XEN) 000000013e1f7b68 000000000003efd6 80000000227cd061 0000000000000000
(XEN) ffff83003d2edb38 0000000000000000 00000000000227cd ffff83003d2ec100
(XEN) ffff83003e1f7b88 ffff828c801c0b74 ffff83003e1f7b98 ffff828c801ac765
(XEN) ffff83003e1f7be8 ffff828c801a828a 00000000000227cd ffff83003d2ec100
(XEN) 000000003e1e6100 00000000000227cd ffff83003d2ec100 ffff828400563808
(XEN) 000000000003e4e1 0000000000000000 ffff83003e1f7c18 ffff828c801a8465
(XEN) 0000000000000000 ffff83003d2edb08 ffff8140c0003520 0000000000000002
(XEN) ffff83003e1f7c38 ffff828c801a89ef ffff8140c0003520 000000000003efe2
(XEN) ffff83003e1f7c98 ffff828c801bb496 000000000000a9cd 000000003efe2520
(XEN) 000000003e1f7c98 ffff83003d210100 000000003efc5067 ffff83003d210100
(XEN) ffff83003e1f7e28 ffff8140c0003520 0000000000000002 ffff83003e1f7d28
(XEN) ffff83003e1f7ce8 ffff828c801bc309 000000010f6a3000 000000003efc5067
(XEN) 000000000003efe2 ffff83003e1f7e28 ffff83003d210100 ffff83003e1e6100
(XEN) ffff83003e1f7e28 ffff83003d210100 ffff83003e1f7e98 ffff828c801be326
(XEN) ffff83003e1f7d68 00000000d4990be8 0000000200000000 000000000000f67e
(XEN) ffff83003e1f7f28 00000000d4990be8 000000000003efc5 0000000100000206
(XEN) Xen call trace:
(XEN) [<ffff828c80150ae2>] put_page_type+0x39/0x13c
(XEN) [<ffff828c8014da74>] put_page_from_l1e+0x102/0x16b
(XEN) [<ffff828c801bbc80>] shadow_set_l1e+0x53f/0x551
(XEN) [<ffff828c801c0b74>] sh_rm_write_access_from_sl1p__guest_3+0xd2/0xfd
(XEN) [<ffff828c801ac765>] sh_remove_write_access_from_sl1p+0x8d/0xaf
(XEN) [<ffff828c801a828a>] oos_remove_write_access+0x5a/0xec
(XEN) [<ffff828c801a8465>] _sh_resync+0x149/0x20f
(XEN) [<ffff828c801a89ef>] sh_resync+0x97/0xd9
(XEN) [<ffff828c801bb496>] shadow_set_l2e+0x1fe/0x4a9
(XEN) [<ffff828c801bc309>] shadow_get_and_create_l1e+0x1b3/0x244
(XEN) [<ffff828c801be326>] sh_page_fault__guest_3+0x9ee/0x1404
(XEN) [<ffff828c801a24ff>] vmx_vmexit_handler+0x2e5/0x841
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Assertion '(x & ((1U<<26)-1)) != 0' failed at mm.c:1891
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 7:26 ` Keir Fraser
2008-08-14 8:44 ` Keir Fraser
@ 2008-08-14 9:03 ` Xu, Jiajun
2008-08-14 9:07 ` Keir Fraser
2008-08-14 15:14 ` [PATCH] Fix OOS typecounting [was: Test report for Xen-3.3.0-rc4 (#18314)] Gianluca Guida
2 siblings, 1 reply; 19+ messages in thread
From: Xu, Jiajun @ 2008-08-14 9:03 UTC (permalink / raw)
To: Keir Fraser, Li, Haicheng, xen-devel; +Cc: Gianluca Guida
On Thursday, August 14, 2008 3:27 PM Keir Fraser wrote:
> Thanks Jiajun,
>
> That's disappointing. :-( I think the second of your crashes
> is the one that
> I've been able to reproduce (but infrequently -- maybe one
> time in 100). The
> symptoms are a bit different for me since I run a debug build and
> crash earlier, well before domain destruction.
>
> There's a chance that Gianluca's new patch will fix your first
> host crash
> (although the domain crash would probably still remain).
Yes. We tried the patch, but still got a Xen crash.
#########
(XEN) sh error: sh_remove_write_access(): can't remove write access to mfn 2291b5: guest has 67108863 special-use mappings of it
(XEN) domain_crash called from common.c:2396
(XEN) Domain 1 (vcpu#1) crashed on cpu#0:
(XEN) ----[ Xen-3.3.0-rc5-pre x86_32p debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) EIP: 0158:[<fe817811>]
(XEN) EFLAGS: 00010282 CONTEXT: hvm guest
(XEN) eax: da0cd300 ebx: 00000000 ecx: 80000000 edx: 00000000
(XEN) esi: 00000000 edi: 00000000 ebp: d8213bb4 esp: d8213bb0
(XEN) cr0: 8005003b cr4: 000006b8 cr3: 023cb040 cr2: 08046fdc
(XEN) ds: 0160 es: 0160 fs: 0000 gs: 01b0 ss: 0160 cs: 0158
(XEN) Xen BUG at page_alloc.c:839
(XEN) ----[ Xen-3.3.0-rc5-pre x86_32p debug=n Not tainted ]----
(XEN) CPU: 2
(XEN) EIP: e008:[<ff10f81d>] free_domheap_pages+0x9d/0x250
(XEN) EFLAGS: 00210206 CONTEXT: hypervisor
(XEN) eax: 00000002 ebx: 00000000 ecx: f9bda8f8 edx: 00000002
(XEN) esi: 00000001 edi: ff1c4080 ebp: f9bda8f8 esp: ffbf3d5c
(XEN) cr0: 8005003b cr4: 000026f0 cr3: 00bd8d20 cr2: 082bfef0
(XEN) ds: e010 es: e010 fs: 0000 gs: 0033 ss: e010 cs: e008
(XEN) Xen stack trace from esp=ffbf3d5c:
(XEN) ff10ef33 00000015 00000000 f9bda910 f9bda8f8 00000001 ff1c5504
ff1395ac
(XEN) f9bda8f8 00000000 ffbf3e44 f9bda8f8 f9bda910 f9bda8f8 68000000
ff12fc57
(XEN) f9bda8f8 ff1c4080 ffbf3e44 ff186515 60000000 ff1c4090 ff1c4080
ff1c4090
(XEN) ff1c4080 ff1c4080 ffbf3e44 ff12ff2c ffbca080 00000200 ffbf3e44
fffffff3
(XEN) ff1c4080 00000000 ffbf3e44 ff104171 ff1c4080 00a261a4 00000000
15901a37
(XEN) 00000001 0b200494 0000000b fffffff3 fffffff3 b34f62d8 0000007b
ff1030ad
(XEN) ff1c4080 b34f62d8 00000090 ff13c005 f9880070 e0000000 00000020
ff1c4080
(XEN) ff13ad7b f95b2908 00000002 00000005 b7a10001 b7a42554 b7a4b4dc
447c7f77
(XEN) b34f630c 46257b48 b7f836a0 0000007f 00000000 b8b08ac8 08121598
447c7f77
(XEN) b7f836a0 b7bcbd74 b34f6338 447c77d3 b7bcbd74 a5dba1ee b7c291e0
0000001f
(XEN) a5dba1ee 00000000 0836b320 448738e4 b7c291e0 b5573824 b34f6368
447c77d3
(XEN) b5573824 b7c291e0 a5dba1ee 448738e4 b7c291e0 b7c291e0 43841a1c
000000fb
(XEN) 000000fb 0000005c 43841a15 ffbd4080 00000003 00000004 ffbf3fb4
ffbf3f5c
(XEN) ff14a64d ffbf3fb4 43841a16 00000001 0000f800 00000004 fed1f030
00000030
(XEN) ff1f6080 ffbd4080 ff1f6080 00000000 0000005d 00000002 ffbd8080
ffbd8080
(XEN) b587215b 0000005d 00000004 00000033 0000d3a0 b5b50665 00000004
43841a17
(XEN) 5a9d8fe8 909090ff 90909090 c3900390 ff1f9a00 ff116bbc b5b50665
0000005d
(XEN) ffbf3fb4 ff1e1003 ff1dd180 ffbd8080 0000007b 0000007b 00305000
ff19d614
(XEN) b34f62d8 52ab9700 b7fbb430 b34f63a8 00000000 00305000 b34f62d8
52ab9700
(XEN) b7fbb430 b34f63a8 00000000 00305000 00000024 000d0000 c0101487
00000061
(XEN) Xen call trace:
(XEN) [<ff10f81d>] free_domheap_pages+0x9d/0x250
(XEN) [<ff10ef33>] free_heap_pages+0xc3/0x1d0
(XEN) [<ff1c5504>] nokey+0xc/0x10
(XEN) [<ff1395ac>] put_page+0x5c/0x60
(XEN) [<ff12fc57>] relinquish_memory+0xe7/0x290
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff186515>] paging_log_dirty_teardown+0x55/0xa0
(XEN) [<ff1c4090>] __start+0x25/0x1ef
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff1c4090>] __start+0x25/0x1ef
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff12ff2c>] domain_relinquish_resources+0x12c/0x190
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff104171>] domain_kill+0x71/0x160
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff1030ad>] do_domctl+0x10d/0xc40
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff13c005>] get_page_from_l1e+0x1b5/0x480
(XEN) [<ff1c4080>] __start+0x15/0x1ef
(XEN) [<ff13ad7b>] put_page_from_l1e+0x9b/0xf0
(XEN) [<ff14a64d>] do_general_protection+0x42d/0x1310
(XEN) [<ff116bbc>] timer_softirq_action+0x10c/0x130
(XEN) [<ff19d614>] hypercall+0x94/0x9b
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Xen BUG at page_alloc.c:839
(XEN) ****************************************
#########
Best Regards
Jiajun
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 9:03 ` Xu, Jiajun
@ 2008-08-14 9:07 ` Keir Fraser
2008-08-14 15:32 ` Keir Fraser
0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2008-08-14 9:07 UTC (permalink / raw)
To: Xu, Jiajun, Li, Haicheng, xen-devel; +Cc: Gianluca Guida
On 14/8/08 10:03, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>> There's a chance that Gianluca's new patch will fix your first
>> host crash
>> (although the domain crash would probably still remain).
>
> Yes. We tried the patch and still got a Xen crash.
This could be a variant of the second crash (screwed reference counts)
though. I'll take Gianluca's patch since it probably does make things
better, but clearly we still have a nasty refcounting bug, probably in the
OOS code.
-- Keir
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 7:09 ` Xu, Jiajun
2008-08-14 7:26 ` Keir Fraser
@ 2008-08-14 10:13 ` Keir Fraser
2008-08-14 12:49 ` Xu, Jiajun
1 sibling, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2008-08-14 10:13 UTC (permalink / raw)
To: Xu, Jiajun, Li, Haicheng, xen-devel
On 14/8/08 08:09, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
> We tried c/s 18326; the two issues still exist. We found it is easier
> to reproduce these issues on a 32pae host than on a 32e host. See the
> following log:
> If we remove OOS, the two issues disappear and no other errors are found.
So, without OOS you didn't see any issues (not even domain crash)? That's
promising.
With OOS, how easily do you reproduce the host crash? It takes me 50-100
guest boots to cause a crash right now. Presumably you can repro more
quickly than that?
-- Keir
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 10:13 ` Test report for Xen-3.3.0-rc4 (#18314) Keir Fraser
@ 2008-08-14 12:49 ` Xu, Jiajun
2008-08-14 13:46 ` Xu, Jiajun
0 siblings, 1 reply; 19+ messages in thread
From: Xu, Jiajun @ 2008-08-14 12:49 UTC (permalink / raw)
To: Keir Fraser, Li, Haicheng, xen-devel
[-- Attachment #1: Type: text/plain, Size: 894 bytes --]
On Thursday, August 14, 2008 6:14 PM Keir Fraser wrote:
> On 14/8/08 08:09, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>
>> We tried c/s 18326; the two issues still exist. We found it is
>> easier to reproduce these issues on a 32pae host than on a 32e host.
>> See the following log: If we remove OOS, the two issues disappear and
>> no other errors are found.
>
> So, without OOS you didn't see any issues (not even domain
> crash)? That's
> promising.
No. I didn't see any issues without OOS.
> With OOS, how easily do you reproduce the host crash? It takes
> me 50-100
> guest boots to cause a crash right now. Presumably you can repro more
> quickly than that?
It is very easy to reproduce the crash on our machine. It takes about 2-3
attempts to hit a crash.
I have attached my config file; maybe there is some difference between our
environments.
Best Regards
Jiajun
[-- Attachment #2: config.vmxgbp36 --]
[-- Type: application/octet-stream, Size: 11866 bytes --]
# -*- mode: python; -*-
#============================================================================
# Python configuration setup for 'xm create'.
# This script sets the parameters used when a domain is created using 'xm create'.
# You use a separate script for each domain you want to create, or
# you can set the parameters for the domain on the xm command line.
#============================================================================
import os, re
arch = os.uname()[4]
if re.search('64', arch):
    arch_libdir = 'lib64'
else:
    arch_libdir = 'lib'
#----------------------------------------------------------------------------
# Kernel image file.
kernel = "/usr/lib/xen/boot/hvmloader"
# The domain build function. HVM domain uses 'hvm'.
builder='hvm'
# Initial memory allocation (in megabytes) for the new domain.
#
# WARNING: Creating a domain with insufficient memory may cause out of
# memory errors. The domain needs enough memory to boot kernel
# and modules. Allocating less than 32MBs is not recommended.
memory = 256
# Shadow pagetable memory for the domain, in MB.
# If not explicitly set, xend will pick an appropriate value.
# Should be at least 2KB per MB of domain memory, plus a few MB per vcpu.
# shadow_memory = 8
# A name for your domain. All domains must have different names.
name = "indiana-hvm"
# 128-bit UUID for the domain. The default behavior is to generate a new UUID
# on each call to 'xm create'.
#uuid = "06ed00fe-1162-4fc4-b5d8-11993ee4a8b9"
#-----------------------------------------------------------------------------
# The number of cpus guest platform has, default=1
vcpus=2
# Enable/disable HVM guest PAE, default=1 (enabled)
pae=1
# Enable/disable HVM guest ACPI, default=1 (enabled)
acpi=1
# Enable/disable HVM APIC mode, default=1 (enabled)
# Note that this option is ignored if vcpus > 1
apic=1
# List of which CPUS this domain is allowed to use, default Xen picks
#cpus = "" # leave to Xen to pick
#cpus = "0" # all vcpus run on CPU0
#cpus = "0-3,5,^1" # all vcpus run on cpus 0,2,3,5
#cpus = ["2", "3"] # VCPU0 runs on CPU2, VCPU1 runs on CPU3
# Optionally define mac and/or bridge for the network interfaces.
# Random MACs are assigned if not given.
#vif = [ 'type=ioemu, mac=00:16:3e:00:00:11, bridge=xenbr0, model=ne2k_pci' ]
# type=ioemu specify the NIC is an ioemu device not netfront
vif = [ 'type=ioemu, mac=00:16:3e:25:8b:08, bridge=xenbr0' ]
#----------------------------------------------------------------------------
# Define the disk devices you want the domain to have access to, and
# what you want them accessible as.
# Each disk entry is of the form phy:UNAME,DEV,MODE
# where UNAME is the device, DEV is the device name the domain will see,
# and MODE is r for read-only, w for read-write.
#disk = [ 'phy:hda1,hda1,r' ]
disk = [ 'tap:qcow:/share/xvs/var/indiana.img,hda,w', ',hdc:cdrom,r' ]
#----------------------------------------------------------------------------
# Configure the behaviour when a domain exits. There are three 'reasons'
# for a domain to stop: poweroff, reboot, and crash. For each of these you
# may specify:
#
# "destroy", meaning that the domain is cleaned up as normal;
# "restart", meaning that a new domain is started in place of the old
# one;
# "preserve", meaning that no clean-up is done until the domain is
# manually destroyed (using xm destroy, for example); or
# "rename-restart", meaning that the old domain is not cleaned up, but is
# renamed and a new domain started in its place.
#
# In the event a domain stops due to a crash, you have the additional options:
#
# "coredump-destroy", meaning dump the crashed domain's core and then destroy;
# "coredump-restart", meaning dump the crashed domain's core and then restart.
#
# The default is
#
# on_poweroff = 'destroy'
# on_reboot = 'restart'
# on_crash = 'restart'
#
# For backwards compatibility we also support the deprecated option restart
#
# restart = 'onreboot' means on_poweroff = 'destroy'
# on_reboot = 'restart'
# on_crash = 'destroy'
#
# restart = 'always' means on_poweroff = 'restart'
# on_reboot = 'restart'
# on_crash = 'restart'
#
# restart = 'never' means on_poweroff = 'destroy'
# on_reboot = 'destroy'
# on_crash = 'destroy'
#on_poweroff = 'destroy'
#on_reboot = 'restart'
#on_crash = 'restart'
#============================================================================
# Device Model to be used
device_model = '/usr/' + arch_libdir + '/xen/bin/qemu-dm'
#-----------------------------------------------------------------------------
# boot on floppy (a), hard disk (c), Network (n) or CD-ROM (d)
# default: hard disk, cd-rom, floppy
#boot="cda"
#-----------------------------------------------------------------------------
# write to temporary files instead of disk image files
#snapshot=1
#----------------------------------------------------------------------------
# enable SDL library for graphics, default = 0
sdl=1
#----------------------------------------------------------------------------
# enable OpenGL for texture rendering inside the SDL window, default = 1
# valid only if sdl is enabled.
opengl=1
#----------------------------------------------------------------------------
# enable VNC library for graphics, default = 1
vnc=0
#----------------------------------------------------------------------------
# address that should be listened on for the VNC server if vnc is set.
# default is to use 'vnc-listen' setting from /etc/xen/xend-config.sxp
#vnclisten="127.0.0.1"
#----------------------------------------------------------------------------
# set VNC display number, default = domid
#vncdisplay=1
#----------------------------------------------------------------------------
# try to find an unused port for the VNC server, default = 1
#vncunused=1
#----------------------------------------------------------------------------
# set password for domain's VNC console
# default depends on vncpasswd in xend-config.sxp
vncpasswd=''
#----------------------------------------------------------------------------
# no graphics, use serial port
#nographic=0
#----------------------------------------------------------------------------
# enable stdvga, default = 0 (use cirrus logic device model)
stdvga=0
#-----------------------------------------------------------------------------
# serial port re-direct to pty device, /dev/pts/n
# then xm console or minicom can connect
serial='pty'
#-----------------------------------------------------------------------------
# Qemu Monitor, default is disabled
# Use ctrl-alt-2 to connect
#monitor=1
#-----------------------------------------------------------------------------
# enable sound card support, [sb16|es1370|all|..,..], default none
#soundhw='sb16'
#-----------------------------------------------------------------------------
# set the real time clock to local time [default=0 i.e. set to utc]
#localtime=1
#-----------------------------------------------------------------------------
# set the real time clock offset in seconds [default=0 i.e. same as dom0]
#rtc_timeoffset=3600
#-----------------------------------------------------------------------------
# start in full screen
#full-screen=1
#-----------------------------------------------------------------------------
# Enable USB support (specific devices specified at runtime through the
# monitor window)
#usb=1
# Enable USB mouse support (only enable one of the following, `mouse' for
# PS/2 protocol relative mouse, `tablet' for
# absolute mouse)
#usbdevice='mouse'
#usbdevice='tablet'
#-----------------------------------------------------------------------------
# Set keyboard layout, default is en-us keyboard.
#keymap='ja'
#-----------------------------------------------------------------------------
# Configure guest CPUID responses:
#
#cpuid=[ '1:ecx=xxxxxxxxxxx00xxxxxxxxxxxxxxxxxxx,
# eax=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' ]
# - Unset the SSE4 features (CPUID.1[ECX][20-19])
# - Default behaviour for all other bits in ECX And EAX registers.
#
# Each successive character represent a lesser-significant bit:
# '1' -> force the corresponding bit to 1
# '0' -> force to 0
# 'x' -> Get a safe value (pass through and mask with the default policy)
# 'k' -> pass through the host bit value
# 's' -> as 'k' but preserve across save/restore and migration
#
# Expose to the guest multi-core cpu instead of multiple processors
# Example for intel, expose a 8-core processor :
#cpuid=['1:edx=xxx1xxxxxxxxxxxxxxxxxxxxxxxxxxxx,
# ebx=xxxxxxxx00010000xxxxxxxxxxxxxxxx',
# '4,0:eax=001111xxxxxxxxxxxxxxxxxxxxxxxxxx']
# - CPUID.1[EDX][HT] : Enable HT
# - CPUID.1[EBX] : Number of vcpus * 2
# - CPUID.4,0[EAX] : Number of vcpus * 2 - 1
#vcpus=8
#
# Example for amd, expose a 5-core processor :
# cpuid = ['1:ebx=xxxxxxxx00001010xxxxxxxxxxxxxxxx,
# edx=xxx1xxxxxxxxxxxxxxxxxxxxxxxxxxxx',
# '0x80000001:ecx=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1x',
# '0x80000008:ecx=xxxxxxxxxxxxxxxxxxxxxxxxxx001001']
# - CPUID.1[EBX] : Threads per Core * Cores per Socket (2 * #vcpus)
# - CPUID.1[EDX][HT] : Enable HT
# - CPUID.0x80000001[CmpLegacy] : Use legacy method
# - CPUID.0x80000008[ECX] : #vcpus * 2 - 1
#vcpus=5
#
# Downgrade the cpuid to make a better compatibility for migration :
# Look like a generic 686 :
# cpuid = [ '0:eax=0x3,ebx=0x0,ecx=0x0,edx=0x0',
# '1:eax=0x06b1,
# ecx=xxxxxxxxxx0000xx00xxx0000000xx0,
# edx=xx00000xxxxxxx0xxxxxxxxx0xxxxxx',
# '4:eax=0x3,ebx=0x0,ecx=0x0,edx=0x0',
# '0x80000000:eax=0x3,ebx=0x0,ecx=0x0,edx=0x0']
# with the highest leaf
# - CPUID.0[EAX] : Set the highest leaf
# - CPUID.1[EAX] : 686
# - CPUID.1[ECX] : Mask some features
# - CPUID.1[EDX] : Mask some features
# - CPUID.4 : Reply like the highest leaf, in our case CPUID.3
# - CPUID.0x80000000 : No extension we are on a Pentium III, reply like the
# highest leaf (CPUID.3).
#
# Configure host CPUID consistency checks, which must be satisfied for this
# VM to be allowed to run on this host's processor type:
#cpuid_check=[ '1:ecx=xxxxxxxxxxxxxxxxxxxxxxxxxx1xxxxx' ]
# - Host must have VMX feature flag set
#
# The format is similar to the above for 'cpuid':
# '1' -> the bit must be '1'
# '0' -> the bit must be '0'
# 'x' -> we don't care (do not check)
# 's' -> the bit must be the same as on the host that started this VM
#-----------------------------------------------------------------------------
# Configure PVSCSI devices:
#
#vscsi=[ 'PDEV, VDEV' ]
#
# PDEV gives physical SCSI device to be attached to specified guest
# domain by one of the following identifier format.
# - XX:XX:XX:XX (4-tuples with decimal notation which shows
# "host:channel:target:lun")
# - /dev/sdxx or sdx
# - /dev/stxx or stx
# - /dev/sgxx or sgx
# - result of 'scsi_id -gu -s'.
# ex. # scsi_id -gu -s /block/sdb
# 36000b5d0006a0000006a0257004c0000
#
# VDEV gives virtual SCSI device by 4-tuples (XX:XX:XX:XX) as
# which the specified guest domain recognize.
#
#vscsi = [ '/dev/sdx, 0:0:0:0' ]
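The `cpuid` comments in the config above describe a 32-character mask string in which the first character is bit 31 and each following character is the next lower bit. A minimal sketch of how such a string could be applied to a host register value follows. It is an illustration only, not xend's implementation: `apply_cpuid_mask` and the `policy` argument (a stand-in for the "default policy" that the comments mention) are hypothetical names.

```python
def apply_cpuid_mask(mask, host, policy=0xFFFFFFFF):
    """Apply a 32-character cpuid mask string to a 32-bit host value.

    '1' forces the bit to 1, '0' forces it to 0, 'x' passes the host bit
    through after masking it with 'policy' (assumed default policy),
    'k' and 's' pass the host bit through unchanged. Save/restore
    handling for 's' is not modelled here.
    """
    if len(mask) != 32:
        raise ValueError("mask must be exactly 32 characters")
    result = 0
    for index, char in enumerate(mask):
        bit = 31 - index  # first character is the most significant bit
        host_bit = (host >> bit) & 1
        if char == '1':
            value = 1
        elif char == '0':
            value = 0
        elif char == 'x':
            value = host_bit & ((policy >> bit) & 1)
        elif char in ('k', 's'):
            value = host_bit
        else:
            raise ValueError("unexpected mask character: %r" % char)
        result |= value << bit
    return result

# The first example in the config clears CPUID.1[ECX] bits 20-19 (SSE4).
sse4_mask = "x" * 11 + "00" + "x" * 19
print(hex(apply_cpuid_mask(sse4_mask, 0xFFFFFFFF)))
```

Building the mask from repeated characters, as in `sse4_mask`, makes it easier to check that the string is 32 characters long and that the forced bits sit at the intended positions.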
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 12:49 ` Xu, Jiajun
@ 2008-08-14 13:46 ` Xu, Jiajun
0 siblings, 0 replies; 19+ messages in thread
From: Xu, Jiajun @ 2008-08-14 13:46 UTC (permalink / raw)
To: Xu, Jiajun, Keir Fraser, Li, Haicheng, xen-devel
On Thursday, August 14, 2008 8:49 PM
xen-devel-bounces@lists.xensource.com wrote:
>> With OOS, how easily do you reproduce the host crash? It takes
>> me 50-100
>> guest boots to cause a crash right now. Presumably you can repro more
>> quickly than that?
>
> It is very easy to reproduce the crash on our machine. It takes about
> 2-3 attempts to hit a crash. I have attached my config file; maybe
> there is some difference between our environments.
Also, the two issues mostly happen after the guest has loaded its kernel
and begins to start system services.
We did not hit a crash at the beginning of guest creation.
Best Regards
Jiajun
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] Fix OOS typecounting [was: Test report for Xen-3.3.0-rc4 (#18314)]
2008-08-14 7:26 ` Keir Fraser
2008-08-14 8:44 ` Keir Fraser
2008-08-14 9:03 ` Xu, Jiajun
@ 2008-08-14 15:14 ` Gianluca Guida
2 siblings, 0 replies; 19+ messages in thread
From: Gianluca Guida @ 2008-08-14 15:14 UTC (permalink / raw)
To: Keir Fraser; +Cc: Xu, Jiajun, Li, Haicheng, xen-devel
[-- Attachment #1: Type: text/plain, Size: 564 bytes --]
Hello,
Keir Fraser wrote:
> We still need to decide whether to fix the second issue or disable OOS.
The attached patch should fix this issue. It was an all-my-fault
breakage of set_l1e atomicity.
> We're not decided on that just yet. If it reproed more reliably for us then
> I'd be more optimistic about fixing it. Perhaps I will switch to 32pae as so
> far I've been running 32e host.
A very easy way to reproduce this bug is to set SHADOW_OOS_FIXUPS to 1
in xen/include/asm-x86/mm.h. This reproduces the typecount corruption
very quickly.
Gianluca
[-- Attachment #2: fix-oos-typecount.patch --]
[-- Type: text/x-patch, Size: 1210 bytes --]
diff -r d3947223dfae xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c Thu Aug 14 13:46:48 2008 +0100
+++ b/xen/arch/x86/mm/shadow/multi.c Thu Aug 14 10:31:12 2008 -0400
@@ -1415,6 +1415,15 @@ static int shadow_set_l1e(struct vcpu *v
mfn_t new_gmfn = shadow_l1e_get_mfn(new_sl1e);
#endif
ASSERT(sl1e != NULL);
+
+#if SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC
+ if ( mfn_valid(new_gmfn) && mfn_oos_may_write(new_gmfn)
+ && ((shadow_l1e_get_flags(new_sl1e) & (_PAGE_RW|_PAGE_PRESENT))
+ == (_PAGE_RW|_PAGE_PRESENT)) )
+ {
+ oos_fixup_add(v, new_gmfn, sl1mfn, pgentry_ptr_to_slot(sl1e));
+ }
+#endif
old_sl1e = *sl1e;
@@ -1434,14 +1443,6 @@ static int shadow_set_l1e(struct vcpu *v
else
{
shadow_vram_get_l1e(new_sl1e, sl1e, sl1mfn, d);
-#if SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC
- if ( mfn_valid(new_gmfn) && mfn_oos_may_write(new_gmfn)
- && (shadow_l1e_get_flags(new_sl1e) & _PAGE_RW) )
- {
- oos_fixup_add(v, new_gmfn, sl1mfn, pgentry_ptr_to_slot(sl1e));
- }
-#endif
-
}
}
}
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 9:07 ` Keir Fraser
@ 2008-08-14 15:32 ` Keir Fraser
2008-08-15 1:56 ` Xu, Jiajun
0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2008-08-14 15:32 UTC (permalink / raw)
To: Xu, Jiajun, Li, Haicheng, xen-devel; +Cc: Gianluca Guida
On 14/8/08 10:07, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>>> There's a chance that Gianluca's new patch will fix your first
>>> host crash
>>> (although the domain crash would probably still remain).
>>
>> Yes. We tried the patch and still got a Xen crash.
>
> This could be a variant of the second crash (screwed reference counts)
> though. I'll take Gianluca's patch since it probably does make things
> better, but clearly we still have a nasty refcounting bug, probably in the
> OOS code.
It's believed fixed by changeset 18331. Please can you test this?
If it works okay for you then we'll make a new release candidate tomorrow
and plan to release early next week.
-- Keir
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Test report for Xen-3.3.0-rc4 (#18314)
2008-08-14 15:32 ` Keir Fraser
@ 2008-08-15 1:56 ` Xu, Jiajun
0 siblings, 0 replies; 19+ messages in thread
From: Xu, Jiajun @ 2008-08-15 1:56 UTC (permalink / raw)
To: Keir Fraser, Li, Haicheng, xen-devel; +Cc: Gianluca Guida
On Thursday, August 14, 2008 11:33 PM Keir Fraser wrote:
> It's believed fixed by changeset 18331. Please can you test this?
>
> If it works okay for you then we'll make a new release
> candidate tomorrow
> and plan to release early next week.
Yes. Both Indiana and Nevada boot fine on c/s 18331.
We did not hit a crash or see any error message on that c/s.
Best Regards
Jiajun
^ permalink raw reply [flat|nested] 19+ messages in thread
Thread overview: 19+ messages
2008-08-13 7:43 Test report for Xen-3.3.0-rc4 (#18314) Li, Haicheng
2008-08-13 8:30 ` Keir Fraser
2008-08-13 8:43 ` Keir Fraser
2008-08-13 9:32 ` Xu, Jiajun
2008-08-13 9:48 ` Keir Fraser
2008-08-13 12:25 ` Xu, Jiajun
2008-08-13 14:48 ` Keir Fraser
2008-08-13 15:18 ` Keir Fraser
2008-08-14 7:09 ` Xu, Jiajun
2008-08-14 7:26 ` Keir Fraser
2008-08-14 8:44 ` Keir Fraser
2008-08-14 9:03 ` Xu, Jiajun
2008-08-14 9:07 ` Keir Fraser
2008-08-14 15:32 ` Keir Fraser
2008-08-15 1:56 ` Xu, Jiajun
2008-08-14 15:14 ` [PATCH] Fix OOS typecounting [was: Test report for Xen-3.3.0-rc4 (#18314)] Gianluca Guida
2008-08-14 10:13 ` Test report for Xen-3.3.0-rc4 (#18314) Keir Fraser
2008-08-14 12:49 ` Xu, Jiajun
2008-08-14 13:46 ` Xu, Jiajun