From: Keir Fraser <keir.fraser@eu.citrix.com>
To: MaoXiaoyun <tinnycloud@hotmail.com>,
	"jbeulich@novell.com" <jbeulich@novell.com>
Cc: xen devel <xen-devel@lists.xensource.com>
Subject: Re: Xen-unstable panic: FATAL PAGE FAULT
Date: Wed, 1 Sep 2010 10:58:54 +0100
Message-ID: <C8A3E270.21A27%keir.fraser@eu.citrix.com>
In-Reply-To: <BAY121-W693BCEFADA48DA1129883DA8B0@phx.gbl>

[-- Attachment #1: Type: text/plain, Size: 8420 bytes --]

Hm, well, it is a bit weird. The check in init_heap_pages() ought to prevent
merging across node boundaries. Nonetheless the code is simpler and more
obvious if we put a further merging constraint in free_heap_pages() instead.
It's also more correct, since I'm not sure that the
phys_to_nid(page_to_maddr(pg-1)) call in init_heap_pages() cannot BUG out
if pg-1 is not a RAM page and is not in a known NUMA node range.
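
The merging constraint described above can be illustrated outside Xen with a
toy buddy allocator. This is only a sketch under stated assumptions: struct
toy_page, can_merge() and toy_free() are hypothetical names, the free lists
are reduced to a per-page flag, and nothing here is the real
free_heap_pages() code. It shows only the idea that a buddy is merged when it
is free, of the same order, and on the same node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page record; NOT Xen's struct page_info. */
struct toy_page {
    bool     free;   /* true if this page heads a free block */
    unsigned order;  /* order of the free block headed here */
    unsigned node;   /* NUMA node that owns this page */
};

/* A buddy may be merged only if it is a free block head of the same
 * order AND belongs to the same node as the block being freed. */
static bool can_merge(const struct toy_page *pages, size_t nr,
                      size_t buddy, unsigned order, unsigned node)
{
    return buddy < nr &&
           pages[buddy].free &&
           pages[buddy].order == order &&
           pages[buddy].node == node;   /* the added constraint */
}

/* Free the block of 2^order pages starting at idx, merging upwards.
 * Returns the index of the resulting free block head. */
static size_t toy_free(struct toy_page *pages, size_t nr, size_t idx,
                       unsigned order, unsigned max_order)
{
    assert(idx < nr);
    unsigned node = pages[idx].node;

    while (order < max_order) {
        size_t mask = (size_t)1 << order;

        if (idx & mask) {
            /* Buddy is the predecessor block. */
            size_t buddy = idx - mask;
            if (!can_merge(pages, nr, buddy, order, node))
                break;
            pages[buddy].free = false;  /* stand-in for list removal */
            idx = buddy;
        } else {
            /* Buddy is the successor block. */
            size_t buddy = idx + mask;
            if (!can_merge(pages, nr, buddy, order, node))
                break;
            pages[buddy].free = false;  /* stand-in for list removal */
        }
        order++;
    }

    pages[idx].free = true;
    pages[idx].order = order;
    return idx;
}
```

With eight pages where pages 0-3 sit on node 0 and pages 4-7 on node 1,
freeing every page one at a time leaves two order-2 blocks (heads 0 and 4)
rather than one order-3 block, because the final merge would cross the node
boundary.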

Please give the attached patch a spin. (You should revert the previous
patch, of course).

 Thanks,
 Keir

On 01/09/2010 10:23, "MaoXiaoyun" <tinnycloud@hotmail.com> wrote:

> Well, it did crash on every startup.
>
> Below is what I got.
> ---------------------------------------------------
> root (hd0,0)     
>  Filesystem type is ext2fs, partition type 0x83
> kernel /xen-4.0.0.gz msi=1 iommu=off x2apic=off hap=0 dom0_mem=10240M dom0_max_vcpus=4 dom0_vcpus_pin console=com1,vga com1=115200,8n1 conswitch=ax noreboot
>    [Multiboot-elf, <0x100000:0x152000:0x148000>, shtab=0x39a078, entry=0x100000]
> module /vmlinuz-2.6.31.13-pvops-patch ro root=LABEL=/ hda=noprobe console=hvc0
>    [Multiboot-module @ 0x39b000, 0x3214d0 bytes]
>
>  __  __            _  _    ___   ___
>  \ \/ /___ _ __   | || |  / _ \ / _ \
>   \  // _ \ '_ \  | || |_| | | | | | |
>   /  \  __/ | | | |__   _| |_| | |_| |
>  /_/\_\___|_| |_|    |_|(_)___(_)___/
>
> (XEN) Xen version 4.0.0 (root@dev.sd.aliyun.com) (gcc version 4.1.2 20080704
> (Red Hat 4.1.2-46)) Wed Sep  1 17:13:35 CST 2010
> (XEN) Latest ChangeSet: unavailable
> (XEN) Command line: msi=1 iommu=off x2apic=off hap=0 dom0_mem=10240M
> dom0_max_vcpus=4 dom0_vcpus_pin console=com1,vga com1=115200,8n1 conswitch=ax
> noreboot
> (XEN) Video information:
> (XEN)  VGA is text mode 80x25, font 8x16
> (XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
> (XEN)  EDID info not retrieved because no DDC retrieval method detected
> (XEN) Disc information:
> (XEN)  Found 6 MBR signatures
> (XEN)  Found 6 EDD information structures
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 000000000009a800 (usable)
> (XEN)  000000000009a800 - 00000000000a0000 (reserved)
> (XEN)  00000000000e4bb0 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 00000000bf790000 (usable)
> (XEN)  00000000bf790000 - 00000000bf79e000 (ACPI data)
> (XEN)  00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
> (XEN)  00000000bf7d0000 - 00000000bf7e0000 (reserved)
> (XEN)  00000000bf7ec000 - 00000000c0000000 (reserved)
> (XEN)  00000000e0000000 - 00000000f0000000 (reserved)
> (XEN)  00000000fee00000 - 00000000fee01000 (reserved)
> (XEN)  00000000fff00000 - 0000000100000000 (reserved)
> (XEN)  0000000100000000 - 0000000640000000 (usable)
> (XEN) --------------849
> (XEN) --------------849
> (XEN) --------------849
> (XEN) ACPI: RSDP 000F9DD0, 0024 (r2 ACPIAM)
> (XEN) ACPI: XSDT BF790100, 005C (r1 112309 XSDT1113 20091123 MSFT       97)
> (XEN) ACPI: FACP BF790290, 00F4 (r4 112309 FACP1113 20091123 MSFT       97)
> (XEN) ACPI: DSDT BF7904B0, 4D6A (r2  CTSAV CTSAV122      122 INTL 20051117)
> (XEN) ACPI: FACS BF79E000, 0040
> (XEN) ACPI: APIC BF790390, 00D8 (r2 112309 APIC1113 20091123 MSFT       97)
> (XEN) ACPI: MCFG BF790470, 003C (r1 112309 OEMMCFG  20091123 MSFT       97)
> (XEN) ACPI: OEMB BF79E040, 007A (r1 112309 OEMB1113 20091123 MSFT       97)
> (XEN) ACPI: SRAT BF79A4B0, 01D0 (r1 112309 OEMSRAT         1 INTL        1)
> (XEN) ACPI: HPET BF79A680, 0038 (r1 112309 OEMHPET  20091123 MSFT       97)
> (XEN) ACPI: SSDT BF7A1A00, 0363 (r1 DpgPmm    CpuPm       12 INTL 20051117)
> (XEN) --------------847
> (XEN) ---------srat enter
> (XEN) ---------prepare enter into pfn
> (XEN) -------in pfn
> (XEN) -------hole shift returned
> (XEN) --------------849
> (XEN) System RAM: 24542MB (25131224kB)
> (XEN) Unknown interrupt (cr2=0000000000000000)
> (XEN)     00000000000000ab    0000000000000000    ffff82f600004020
> 00007d0a00000000    ffff82f600004000    0000000000000020    0000000000201000
> 0000000000000000    ffffffffffffffff    0000000000000000    0000000000000008
> 0000000000000000    00000000000001ff    00000000000001ff    0000000000000000
> ffff82c480115787    000000000000e008    0000000000010002    ffff82c48035fd18
> 0000000000000000    ffff82c48011536a    0000000000000000    0000000000000000
> 0000000000000163    0000000900000000    00000000000000ab    0000000000000201
> 0000000000000000    0000000000000100    ffff82f600004020    0000000000000eff
> 0000000000000000    ffff82c480115e60    0000000000000000    ffff82f600002020
> 0000000000001000    0000000000000004    0000000000000080    0000000000000001
> ffff82c48020be8d    ffff830000100000    0000000000000008    0000000000000000
> 0000000000000000    ffffffffffffffff    0000000000000101    ffff82c48022d8fc
> 0000000000540000    00000000005fde36    0000000000540000    0000000000100000
> 0000000100000000    0000000000000010    ffff82c48024deb4    ffff82c4802404f7
> 0000000000000000    0000000000000000    0000000000000000    0000000000000000
> 0000000000000000    ffff8300bf568ff8    ffff8300bf569ff8    000000000022a630
> 000000000022a695    0000000000087f00    0000000000000000    ffff830000087fc0
> 00000000005fde36    000000000087b6d0    0000000000d44000    0000000001000000
> 0000000000000000    ffffffffffffffff    ffff830000087f00    0000100000000000
> 0000000800000000    000000010000006e    0000000000000003    00000000000002f8
> 0000000000000000    0000000000000000    0000000000067ebc    0000000000000000
> 0000000000000000    0000000000000000    0000000000000000    ffff82c4801000b5
> 0000000000000000    0000000000000000    0000000000000000    0000000000000000
> 0000000000000000    0000000000000000    0000000000000000    0000000000000000
> 0000000000000000    0000000000000000    0000000000000000    0000000000000000
> 0000000000000000    0000000000000000    0000000000000000    0000000000000000
> 0000000000000000    0000000000000000    0000000000000000    0000000000000000
> 0000000000000000    0000000000000000    0000000000000000    0000000000000000
> 0000000000000000    0000000000000000    00000000fffff000
>  
>> Date: Wed, 1 Sep 2010 09:49:18 +0100
>> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
>> From: keir.fraser@eu.citrix.com
>> To: JBeulich@novell.com
>> CC: tinnycloud@hotmail.com; xen-devel@lists.xensource.com
>> 
>> On 01/09/2010 09:02, "Jan Beulich" <JBeulich@novell.com> wrote:
>> 
>>>> Well I agree with your logic anyway. So I don't see that this can be the
>>>> cause of MaoXiaoyun's bug. At least not directly. But then I'm stumped as
>>>> to why the page arithmetic and checks in free_heap_pages are (apparently)
>>>> resulting in a page pointer way outside the frame-table region and actually
>>>> in the directmap region.
>>> 
>>> There must be some unchecked use of PAGE_LIST_NULL, i.e.
>>> running off a list end without taking notice (0xffff8315ffffffe4
>>> exactly corresponds with that).
>> 
>> Okay, my next guess then is that we are deleting a chunk from the wrong list
>> head. I don't see any check that the adjacent chunks we are considering to
>> merge are from the same node and zone. I suppose the zone logic does just
>> work as we're dealing with 2**x aligned and sized regions. But, shouldn't
>> the merging logic in free_heap_pages be checking that the merging candidate
>> is from the same NUMA node? I see I have an ASSERTion later in the same
>> function, but it's too weak and wishful I suspect.
>> 
>> MaoXiaoyun: can you please test with the attached patch? If I'm right, you
>> will crash on one of the BUG_ON checks that I added, rather than crashing on
>> a pointer dereference. You may even crash during boot. Anyhow, what is
>> interesting is whether this patch always makes you crash on BUG_ON before
>> you would normally crash on pointer dereference. If so this is trivial to
>> fix.
>> 
>> Thanks,
>> Keir
>> 
>        
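
Jan's remark quoted above, that 0xffff8315ffffffe4 corresponds exactly with
PAGE_LIST_NULL, can be checked with a little arithmetic. The sketch below
rests on assumptions that the thread does not state: a frame-table virtual
base of 0xffff82f600000000 and a 32-byte page record (both only suggested by
the stack dump, where ffff82f600004000 and ffff82f600004020 appear), a 32-bit
all-ones list terminator, and an access at byte offset 4 within the record.
The names FRAME_TABLE_BASE, PAGE_INFO_SIZE, LIST_NULL_INDEX and field_addr()
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* All three constants are assumptions, not values confirmed in the thread. */
#define FRAME_TABLE_BASE UINT64_C(0xffff82f600000000) /* assumed frame-table base */
#define PAGE_INFO_SIZE   UINT64_C(32)                 /* assumed record size in bytes */
#define LIST_NULL_INDEX  UINT32_C(0xffffffff)         /* assumed 32-bit list terminator */

/* Address of the byte at field_offset inside the page record that a
 * 32-bit list index would select if it were treated as a valid index. */
static uint64_t field_addr(uint32_t idx, uint64_t field_offset)
{
    assert(field_offset < PAGE_INFO_SIZE);
    return FRAME_TABLE_BASE + (uint64_t)idx * PAGE_INFO_SIZE + field_offset;
}
```

Under those assumptions, field_addr(LIST_NULL_INDEX, 4) evaluates to
0xffff8315ffffffe4: the terminator scaled by the record size adds
0x1fffffffe0 to the base, and the field offset adds the final 4. That is the
pattern you would expect from following a list link past the end of a list.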


[-- Attachment #2: 00-freeheap --]
[-- Type: application/octet-stream, Size: 3043 bytes --]

diff -r ae0cd4e5cc01 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c	Wed Sep 01 10:19:14 2010 +0100
+++ b/xen/common/page_alloc.c	Wed Sep 01 10:55:50 2010 +0100
@@ -579,7 +579,8 @@
             /* Merge with predecessor block? */
             if ( !mfn_valid(page_to_mfn(pg-mask)) ||
                  !page_state_is(pg-mask, free) ||
-                 (PFN_ORDER(pg-mask) != order) )
+                 (PFN_ORDER(pg-mask) != order) ||
+                 (phys_to_nid(page_to_maddr(pg-mask)) != node) )
                 break;
             pg -= mask;
             page_list_del(pg, &heap(node, zone, order));
@@ -589,15 +590,13 @@
             /* Merge with successor block? */
             if ( !mfn_valid(page_to_mfn(pg+mask)) ||
                  !page_state_is(pg+mask, free) ||
-                 (PFN_ORDER(pg+mask) != order) )
+                 (PFN_ORDER(pg+mask) != order) ||
+                 (phys_to_nid(page_to_maddr(pg+mask)) != node) )
                 break;
             page_list_del(pg + mask, &heap(node, zone, order));
         }
 
         order++;
-
-        /* After merging, pg should remain in the same node. */
-        ASSERT(phys_to_nid(page_to_maddr(pg)) == node);
     }
 
     PFN_ORDER(pg) = order;
@@ -849,25 +848,22 @@
 static void init_heap_pages(
     struct page_info *pg, unsigned long nr_pages)
 {
-    unsigned int nid_curr, nid_prev;
     unsigned long i;
 
-    nid_prev = phys_to_nid(page_to_maddr(pg-1));
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        unsigned int nid = phys_to_nid(page_to_maddr(pg+i));
 
-    for ( i = 0; i < nr_pages; nid_prev = nid_curr, i++ )
-    {
-        nid_curr = phys_to_nid(page_to_maddr(pg+i));
-
-        if ( unlikely(!avail[nid_curr]) )
+        if ( unlikely(!avail[nid]) )
         {
             unsigned long s = page_to_mfn(pg + i);
             unsigned long e = page_to_mfn(pg + nr_pages - 1) + 1;
-            bool_t use_tail = (nid_curr == phys_to_nid(pfn_to_paddr(e - 1))) &&
+            bool_t use_tail = (nid == phys_to_nid(pfn_to_paddr(e - 1))) &&
                               !(s & ((1UL << MAX_ORDER) - 1)) &&
                               (find_first_set_bit(e) <= find_first_set_bit(s));
             unsigned long n;
 
-            n = init_node_heap(nid_curr, page_to_mfn(pg+i), nr_pages - i,
+            n = init_node_heap(nid, page_to_mfn(pg+i), nr_pages - i,
                                &use_tail);
             BUG_ON(i + n > nr_pages);
             if ( n && !use_tail )
@@ -880,16 +876,7 @@
             nr_pages -= n;
         }
 
-        /*
-         * Free pages of the same node, or if they differ, but are on a
-         * MAX_ORDER alignment boundary (which already get reserved).
-         */
-        if ( (nid_curr == nid_prev) ||
-             !(page_to_mfn(pg+i) & ((1UL << MAX_ORDER) - 1)) )
-            free_heap_pages(pg+i, 0);
-        else
-            printk("Reserving non-aligned node boundary @ mfn %#lx\n",
-                   page_to_mfn(pg+i));
+        free_heap_pages(pg+i, 0);
     }
 }
 
