* Xen crash with latest 3.3 bits
@ 2009-03-10 15:55 John Levon
  2009-03-10 16:23 ` Gianluca Guida
  0 siblings, 1 reply; 6+ messages in thread
From: John Levon @ 2009-03-10 15:55 UTC (permalink / raw)
  To: xen-devel


Some time after starting a 64-bit SMP Solaris 10 domain (HVM with PV drivers),
I get the crash below. Any ideas?

thanks
john

(xVM) grant_table.c:800:d3 Expanding dom (3) grant table from (4) to (32) frames.
(xVM) Assertion '(page->u.inuse.type_info & (7U<<29)) != (7U<<29) || (page->u.inuse.type_info & ((1U<<26)-1)) == 0 || v->domain->is_shutting_down' failed at common.c:1000
(xVM) ----[ Xen-3.3.2-rc1-pre-xvm-debug  x86_64  debug=y  Not tainted ]----
(xVM) CPU:    12
(xVM) RIP:    e008:[<ffff828c801ad88d>] shadow_promote+0xa0/0x108
(xVM) RFLAGS: 0000000000010246   CONTEXT: hypervisor
(xVM) rax: ffff8300deca6100   rbx: ffff828411d446b0   rcx: 00000000e8000001
(xVM) rdx: 00000000e0000000   rsi: 0000000000721b5e   rdi: ffff8300dec92100
(xVM) rbp: ffff8300dec2fc08   rsp: ffff8300dec2fbe8   r8:  0000000000000020
(xVM) r9:  0000000000000000   r10: ffff828c80240340   r11: ffff828c80207ac8
(xVM) r12: 0000000000000008   r13: ffff8300dec92100   r14: 0000000000721b5e
(xVM) r15: 000000000075112a   cr0: 0000000080050033   cr4: 00000000000006f0
(xVM) cr3: 0000000751252000   cr2: 0000000008046b7c
(xVM) ds: 0000   es: 0000   fs: 0000   gs: 01c3   ss: 0000   cs: e008
(xVM) Xen stack trace from rsp=ffff8300dec2fbe8:
(xVM)    0000000000000008 ffff8300dec92100 0000000000000008 ffff8140c0000200
(xVM)    ffff8300dec2fc68 ffff828c801caf95 0000000000721b5e ffff8300dec2fd08
(xVM)    ffff8300dec2fc68 ffff828c801b00f1 00000000007510fa ffff8300dec92100
(xVM)    ffff8300dec2fe08 ffff8140c0000200 0000000000000006 ffff8300dec2fd08
(xVM)    ffff8300dec2fcc8 ffff828c801cb6e9 ffff8300deca7070 ffff8300dec92100
(xVM)    0000000305a7e000 0000000000751255 00000001deca6100 ffff8300dec2fe08
(xVM)    ffff8300deca6100 ffff8300dec2fe08 ffff8300dec92100 0000000008046b7c
(xVM)    ffff8300dec2fe88 ffff828c801ce2a1 0000000000000001 0000000600000000
(xVM)    0000000000726358 ffff8300dec2ff28 0000000008046b7c ffff8300dec92100
(xVM)    ffffffffffffffff 00000001dec2fd78 ffff8300dec92100 ffff8300dec90000
(xVM)    0000000000008046 0000000000030758 ffff830000000007 ffff828c8018eabe
(xVM)    ffff8300dec2fd68 ffff828c801073a7 fffffe80000b96ac ffffffffffffffda
(xVM)    ffff8300dec2fe18 ffff828c801080f2 ffff8300dec2ff28 0000004000000040
(xVM)    ffff8300dec20001 ade9c9c3c9c0220f 6690666666ffff83 fffffffffb840838
(xVM)    029b002800000010 0000000000000fff 0400000000000000 00000000ffffffff
(xVM)    0cf3004300000000 00000000ffffffff ffff830000000000 ffff828c8018ad6d
(xVM)    ffff8300dec2fe08 ffff828c8019ac12 000000000000000f 00000000000000ff
(xVM)    0000000008046b7c 000000000dbfa027 0000000038a7e027 0000000034f5e027
(xVM)    8000000030758067 000000000071cffc 000000000071cffa 0000000000005a7e
(xVM)    0000000000721b5e 000000000046d708 0000000000000000 0000000008046b7c
(xVM) Xen call trace:
(xVM)    [<ffff828c801ad88d>] shadow_promote+0xa0/0x108
(xVM)    [<ffff828c801caf95>] sh_make_shadow+0x426/0x4a9
(xVM)    [<ffff828c801cb6e9>] shadow_get_and_create_l1e+0x198/0x279
(xVM)    [<ffff828c801ce2a1>] sh_page_fault__guest_4+0xb30/0x1624
(xVM)    [<ffff828c8019e49e>] svm_vmexit_handler+0x633/0xaec
(xVM)
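
For readers decoding the failed assertion above, here is a minimal C sketch of the condition it tests. The two masks are copied from the assertion text; the macro and function names are illustrative and are not claimed to match the Xen source. The sample value 0xe8000001 is the rcx value from the register dump, and treating it as the page's type_info word is only an assumption (it happens to be consistent with the failure: type bits all set, count of 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks taken verbatim from the assertion text; names are illustrative. */
#define TYPE_MASK      (7U << 29)        /* bits 29-31: page type field */
#define TYPE_WRITABLE  (7U << 29)        /* type value the assertion compares against */
#define COUNT_MASK     ((1U << 26) - 1)  /* bits 0-25: count field */

/* Returns true when the asserted condition holds, i.e. the page is not of
 * the tested type, or its count is zero, or the domain is shutting down. */
static bool promote_precondition_holds(uint32_t type_info, bool shutting_down)
{
    return (type_info & TYPE_MASK) != TYPE_WRITABLE
        || (type_info & COUNT_MASK) == 0
        || shutting_down;
}

/* Worked check with the hypothetical type_info value 0xe8000001. */
static void check_promote_precondition(void)
{
    uint32_t t = 0xe8000001u;

    assert(((t & TYPE_MASK) >> 29) == 7);          /* type field is all ones */
    assert((t & COUNT_MASK) == 1);                 /* count field is nonzero */
    assert(!promote_precondition_holds(t, false)); /* so the assertion fails */
    assert(promote_precondition_holds(t, true));   /* shutdown would mask it */
}
```

In other words, the assertion fires when a page that is about to be promoted still appears to carry a nonzero count under the tested type while the domain is not shutting down.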


* Re: Xen crash with latest 3.3 bits
  2009-03-10 15:55 Xen crash with latest 3.3 bits John Levon
@ 2009-03-10 16:23 ` Gianluca Guida
  2009-03-10 17:45   ` John Levon
  0 siblings, 1 reply; 6+ messages in thread
From: Gianluca Guida @ 2009-03-10 16:23 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel@lists.xensource.com



John Levon wrote:
> Some time after starting a 64-bit SMP Solaris 10 domain (HVM with PV drivers),
> I get the crash below. Any ideas?

This should be fixed by xen-unstable.hg's changeset 18806, named 
"shadow: fix race between resync and page promotion".

It should be in xen-3.3.hg, can't check right now because xenbits web 
page seems to be dead/slow.

Gianluca


* Re: Xen crash with latest 3.3 bits
  2009-03-10 16:23 ` Gianluca Guida
@ 2009-03-10 17:45   ` John Levon
  2009-03-10 18:49     ` Gianluca Guida
  0 siblings, 1 reply; 6+ messages in thread
From: John Levon @ 2009-03-10 17:45 UTC (permalink / raw)
  To: Gianluca Guida; +Cc: xen-devel@lists.xensource.com

On Tue, Mar 10, 2009 at 04:23:25PM +0000, Gianluca Guida wrote:

> >Some time after starting a 64-bit SMP Solaris 10 domain (HVM with PV
> >drivers),
> >I get the crash below. Any ideas?
> 
> This should be fixed by xen-unstable.hg's changeset 18806, named 
> "shadow: fix race between resync and page promotion".
> 
> It should be in xen-3.3.hg, can't check right now because xenbits web 
> page seems to be dead/slow.

I have the patch you're referring to, but still see the problem. It's
pretty rare and only seems to have happened with S10 SMP so far...

regards
john


* Re: Xen crash with latest 3.3 bits
  2009-03-10 17:45   ` John Levon
@ 2009-03-10 18:49     ` Gianluca Guida
  2009-03-17 12:51       ` [PATCH]: Prevent in-sync L1s to become writable (was: Re: Xen crash with latest 3.3 bits) Gianluca Guida
  0 siblings, 1 reply; 6+ messages in thread
From: Gianluca Guida @ 2009-03-10 18:49 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel@lists.xensource.com



John Levon wrote:
> On Tue, Mar 10, 2009 at 04:23:25PM +0000, Gianluca Guida wrote:
> 
>>> Some time after starting a 64-bit SMP Solaris 10 domain (HVM with PV
>>> drivers),
>>> I get the crash below. Any ideas?
>> This should be fixed by xen-unstable.hg's changeset 18806, named 
>> "shadow: fix race between resync and page promotion".
>>
>> It should be in xen-3.3.hg, can't check right now because xenbits web 
>> page seems to be dead/slow.
> 
> I have the patch you're referring to, but still see the problem. It's
> pretty rare and only seems to have happened with S10 SMP so far...

Ah, interesting. I'll look into it.

Gianluca


* [PATCH]: Prevent in-sync L1s to become writable (was: Re: Xen crash with latest 3.3 bits)
  2009-03-10 18:49     ` Gianluca Guida
@ 2009-03-17 12:51       ` Gianluca Guida
  2009-03-17 17:40         ` John Levon
  0 siblings, 1 reply; 6+ messages in thread
From: Gianluca Guida @ 2009-03-17 12:51 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel@lists.xensource.com, Christian Limpach

[-- Attachment #1: Type: text/plain, Size: 796 bytes --]

Hi,

Gianluca Guida wrote:
> 
> John Levon wrote:
>> On Tue, Mar 10, 2009 at 04:23:25PM +0000, Gianluca Guida wrote:
>>
>>>> Some time after starting a 64-bit SMP Solaris 10 domain (HVM with PV
>>>> drivers),
>>>> I get the crash below. Any ideas?
>>> This should be fixed by xen-unstable.hg's changeset 18806, named 
>>> "shadow: fix race between resync and page promotion".
>>>
>>> It should be in xen-3.3.hg, can't check right now because xenbits web 
>>> page seems to be dead/slow.
>> I have the patch you're referring to, but still see the problem. It's
>> pretty rare and only seems to have happened with S10 SMP so far...
> 
> Ah, interesting. I'll look into it.

Meanwhile, Christian Limpach was hit by the same bug and fixed it.

Versions for Xen-3.3 and unstable attached.

Gianluca

[-- Attachment #2: prevent-insync-writable-l1 --]
[-- Type: text/plain, Size: 1737 bytes --]

diff -r b249f3e979a5 xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c	Mon Mar 09 10:32:24 2009 +0000
+++ b/xen/arch/x86/mm/shadow/multi.c	Tue Mar 17 12:45:56 2009 +0000
@@ -3112,6 +3112,19 @@ static int sh_page_fault(struct vcpu *v,
     shadow_lock(d);
 
     TRACE_CLEAR_PATH_FLAGS;
+
+    /* Make sure there is enough free shadow memory to build a chain of
+     * shadow tables. (We never allocate a top-level shadow on this path,
+     * only a 32b l1, pae l1, or 64b l3+2+1. Note that while
+     * SH_type_l1_shadow isn't correct in the latter case, all page
+     * tables are the same size there.)
+     *
+     * Preallocate shadow pages *before* removing writable accesses
+     * otherwise an OOS L1 might be demoted and promoted again with
+     * writable mappings. */
+    shadow_prealloc(d,
+                    SH_type_l1_shadow,
+                    GUEST_PAGING_LEVELS < 4 ? 1 : GUEST_PAGING_LEVELS - 1);
     
     rc = gw_remove_write_accesses(v, va, &gw);
 
@@ -3144,15 +3157,6 @@ static int sh_page_fault(struct vcpu *v,
 
     shadow_audit_tables(v);
     sh_audit_gw(v, &gw);
-
-    /* Make sure there is enough free shadow memory to build a chain of
-     * shadow tables. (We never allocate a top-level shadow on this path,
-     * only a 32b l1, pae l1, or 64b l3+2+1. Note that while
-     * SH_type_l1_shadow isn't correct in the latter case, all page
-     * tables are the same size there.) */
-    shadow_prealloc(d,
-                    SH_type_l1_shadow,
-                    GUEST_PAGING_LEVELS < 4 ? 1 : GUEST_PAGING_LEVELS - 1);
 
     /* Acquire the shadow.  This must happen before we figure out the rights 
      * for the shadow entry, since we might promote a page here. */

[-- Attachment #3: prevent-insync-writable-l1-3.3 --]
[-- Type: text/plain, Size: 1766 bytes --]

diff -r 587e81dd3540 xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c	Mon Mar 02 14:19:35 2009 +0000
+++ b/xen/arch/x86/mm/shadow/multi.c	Tue Mar 17 12:31:10 2009 +0000
@@ -3257,6 +3257,19 @@ static int sh_page_fault(struct vcpu *v,
 
     shadow_lock(d);
 
+    /* Make sure there is enough free shadow memory to build a chain of
+     * shadow tables. (We never allocate a top-level shadow on this path,
+     * only a 32b l1, pae l1, or 64b l3+2+1. Note that while
+     * SH_type_l1_shadow isn't correct in the latter case, all page
+     * tables are the same size there.)
+     *
+     * Preallocate shadow pages *before* removing writable accesses
+     * otherwise an OOS L1 might be demoted and promoted again with
+     * writable mappings. */
+    shadow_prealloc(d,
+                    SH_type_l1_shadow,
+                    GUEST_PAGING_LEVELS < 4 ? 1 : GUEST_PAGING_LEVELS - 1);
+    
     rc = gw_remove_write_accesses(v, va, &gw);
 
     /* First bit set: Removed write access to a page. */
@@ -3288,15 +3301,6 @@ static int sh_page_fault(struct vcpu *v,
 
     shadow_audit_tables(v);
     sh_audit_gw(v, &gw);
-
-    /* Make sure there is enough free shadow memory to build a chain of
-     * shadow tables. (We never allocate a top-level shadow on this path,
-     * only a 32b l1, pae l1, or 64b l3+2+1. Note that while
-     * SH_type_l1_shadow isn't correct in the latter case, all page
-     * tables are the same size there.) */
-    shadow_prealloc(d,
-                    SH_type_l1_shadow,
-                    GUEST_PAGING_LEVELS < 4 ? 1 : GUEST_PAGING_LEVELS - 1);
 
     /* Acquire the shadow.  This must happen before we figure out the rights 
      * for the shadow entry, since we might promote a page here. */
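
As a reading aid for the count passed to shadow_prealloc() in both patches, the sketch below evaluates the same expression for each guest paging depth. The helper name and the idea of passing the depth as a runtime parameter are hypothetical; in the patch the expression uses the GUEST_PAGING_LEVELS constant directly. The results line up with the comment in the patch: one page for a 32b l1 or a pae l1, and three pages (l3+2+1) for a 64b guest.

```c
#include <assert.h>

/* Hypothetical helper: same expression as the third argument to
 * shadow_prealloc() in the patch, with the paging depth taken as a
 * parameter instead of the GUEST_PAGING_LEVELS constant. */
static unsigned int prealloc_page_count(unsigned int guest_paging_levels)
{
    return guest_paging_levels < 4 ? 1 : guest_paging_levels - 1;
}

/* Worked values for the three guest modes named in the patch comment. */
static void check_prealloc_page_count(void)
{
    assert(prealloc_page_count(2) == 1);  /* 32b guest: one l1 shadow */
    assert(prealloc_page_count(3) == 1);  /* pae guest: one l1 shadow */
    assert(prealloc_page_count(4) == 3);  /* 64b guest: l3 + l2 + l1 */
}
```

The patch does not change this count; it only moves the preallocation ahead of gw_remove_write_accesses().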



* Re: [PATCH]: Prevent in-sync L1s to become writable (was: Re: Xen crash with latest 3.3 bits)
  2009-03-17 12:51       ` [PATCH]: Prevent in-sync L1s to become writable (was: Re: Xen crash with latest 3.3 bits) Gianluca Guida
@ 2009-03-17 17:40         ` John Levon
  0 siblings, 0 replies; 6+ messages in thread
From: John Levon @ 2009-03-17 17:40 UTC (permalink / raw)
  To: Gianluca Guida; +Cc: xen-devel@lists.xensource.com, Christian Limpach

On Tue, Mar 17, 2009 at 12:51:19PM +0000, Gianluca Guida wrote:

> >>I have the patch you're referring to, but still see the problem. It's
> >>pretty rare and only seems to have happened with S10 SMP so far...
> >
> >Ah, interesting. I'll look into it.
> 
> Meanwhile, Christian Limpach was hit by the same bug and fixed it.

Cool. In the meantime, I'm having trouble reproducing the issue (of
course!) so I can't confirm yet that it fixes it for me.

thanks
john
