From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Lalancette
Subject: [PATCH]: Allow Xen to boot/run on large memory (>64G) machines
Date: Wed, 21 Feb 2007 19:38:12 -0500
Message-ID: <45DCE5F4.6060007@redhat.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------050804030203070304090202"
Return-path:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------050804030203070304090202
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

All,
     I've been tracking down a problem where dom0 refuses to boot on very
large memory x86_64 machines.  Here's what happens:

The hypervisor starts up with 1GB in the DMA zone.  Two large allocations
come out of the DMA zone: the frame table (in init_frametable()) and the
memory for dom0 (in construct_dom0()).  With a lot of memory in the box,
most of the DMA zone gets allocated during init_frametable(); so much so,
in fact, that there is no room to make the allocation in construct_dom0(),
and dom0 fails to boot with:

(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Not enough RAM for domain 0 allocation.
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

The solution (suggested by Keir) is to allocate the frame table out of
high memory instead of the DMA zone.  The attached patch (against 3.0.3,
but the problem is the same in unstable) does this.  I tested this out on
a 96GB machine; without the patch, the machine would reboot as described
above; with the patch, I was able to boot dom0 and create a PV guest with
92GB of memory.  I only compile-tested this on ia64, but I don't see
anything in it that should cause problems there.

Note that this is not the end of the story, however.
For even larger machines, it can *still* be the case that the allocation
in construct_dom0() fails; in particular, if the order goes above 17, it
will fail in the same way.  One way to fix it would be to just allocate
that memory out of the normal zone for x86_64, as well; however, I'm not
sure if this will break anything else.  Any comments?

Signed-off-by: Chris Lalancette

--------------050804030203070304090202
Content-Type: text/x-patch; name="xen-hugemem-3.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="xen-hugemem-3.patch"

diff -urp xen.orig/arch/x86/mm.c xen/arch/x86/mm.c
--- xen.orig/arch/x86/mm.c	2007-02-21 14:45:38.000000000 -0500
+++ xen/arch/x86/mm.c	2007-02-21 16:11:34.000000000 -0500
@@ -179,7 +179,15 @@ void __init init_frametable(void)
 
     for ( i = 0; i < nr_pages; i += page_step )
     {
+#ifdef __x86_64__
+        /* for x86_64 we want to allocate the frame table from the top
+         * of memory rather than the bottom; otherwise, on large memory
+         * machines (> 64G), we exhaust DMA memory, and dom0 cannot boot
+         */
+        mfn = alloc_boot_pages_reverse(min(nr_pages - i, page_step), page_step);
+#else
         mfn = alloc_boot_pages(min(nr_pages - i, page_step), page_step);
+#endif
         if ( mfn == 0 )
             panic("Not enough memory for frame table\n");
         map_pages_to_xen(
diff -urp xen.orig/common/page_alloc.c xen/common/page_alloc.c
--- xen.orig/common/page_alloc.c	2006-10-16 16:07:17.000000000 -0400
+++ xen/common/page_alloc.c	2007-02-21 16:09:38.000000000 -0500
@@ -213,26 +213,44 @@ void init_boot_pages(paddr_t ps, paddr_t
     }
 }
 
-unsigned long alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
+static unsigned long check_and_map_page(unsigned long pg, unsigned long nr_pfns)
 {
-    unsigned long pg, i;
+    unsigned long i;
 
-    for ( pg = 0; (pg + nr_pfns) < max_page; pg += pfn_align )
+    for ( i = 0; i < nr_pfns; i++ )
+        if ( allocated_in_map(pg + i) )
+            break;
+
+    if ( i == nr_pfns )
     {
-        for ( i = 0; i < nr_pfns; i++ )
-            if ( allocated_in_map(pg + i) )
-                break;
+        map_alloc(pg, nr_pfns);
+        return pg;
+    }
 
-        if ( i == nr_pfns )
-        {
-            map_alloc(pg, nr_pfns);
+    return 0;
+}
+
+unsigned long alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
+{
+    unsigned long pg;
+
+    for ( pg = 0; (pg + nr_pfns) < max_page; pg += pfn_align )
+        if (check_and_map_page(pg, nr_pfns))
             return pg;
-        }
-    }
 
     return 0;
 }
 
+unsigned long alloc_boot_pages_reverse(unsigned long nr_pfns, unsigned long pfn_align)
+{
+    unsigned long pg;
+
+    for ( pg = (max_page - nr_pfns); pg > 0; pg -= pfn_align )
+        if (check_and_map_page(pg, nr_pfns))
+            return pg;
+
+    return 0;
+}
 
 
 /*************************
diff -urp xen.orig/include/xen/mm.h xen/include/xen/mm.h
--- xen.orig/include/xen/mm.h	2006-10-16 16:07:18.000000000 -0400
+++ xen/include/xen/mm.h	2007-02-21 15:57:46.000000000 -0500
@@ -39,6 +39,7 @@ struct page_info;
 /* Boot-time allocator. Turns into generic allocator after bootstrap. */
 paddr_t init_boot_allocator(paddr_t bitmap_start);
 void init_boot_pages(paddr_t ps, paddr_t pe);
+unsigned long alloc_boot_pages_reverse(unsigned long nr_pfns, unsigned long pfn_align);
 unsigned long alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align);
 void end_boot_allocator(void);
 

--------------050804030203070304090202
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--------------050804030203070304090202--
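
For readers who want to play with the idea outside the hypervisor, here is
a rough, standalone toy model of the bottom-up versus top-down first-fit
scan that the patch is built around.  It is only a sketch and not Xen code:
the toy_* names, the 64-frame map and the TOY_FAIL sentinel are all made up
for illustration.  It also makes two assumptions of its own that differ
from the patch: the top-down scan rounds its starting frame down to a
multiple of the alignment, and it stops explicitly before the unsigned
counter could wrap below zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not Xen code: one flag per page frame. */
#define TOY_MAX_PAGE 64UL
#define TOY_FAIL     (~0UL)   /* made-up failure sentinel */

static bool toy_map[TOY_MAX_PAGE];   /* true = frame already allocated */

/* Claim frames [pg, pg + nr) if every one of them is still free. */
static bool toy_claim(unsigned long pg, unsigned long nr)
{
    for (unsigned long i = 0; i < nr; i++)
        if (toy_map[pg + i])
            return false;
    for (unsigned long i = 0; i < nr; i++)
        toy_map[pg + i] = true;
    return true;
}

/* Bottom-up first fit: lowest aligned run of nr free frames. */
static unsigned long toy_alloc_bottom_up(unsigned long nr, unsigned long align)
{
    if (nr == 0 || align == 0 || nr > TOY_MAX_PAGE)
        return TOY_FAIL;
    for (unsigned long pg = 0; pg + nr <= TOY_MAX_PAGE; pg += align)
        if (toy_claim(pg, nr))
            return pg;
    return TOY_FAIL;
}

/* Top-down first fit: highest aligned run of nr free frames. */
static unsigned long toy_alloc_top_down(unsigned long nr, unsigned long align)
{
    if (nr == 0 || align == 0 || nr > TOY_MAX_PAGE)
        return TOY_FAIL;

    /* Assumption: start at the highest aligned frame that still fits. */
    unsigned long pg = ((TOY_MAX_PAGE - nr) / align) * align;

    for (;;) {
        if (toy_claim(pg, nr))
            return pg;
        if (pg < align)       /* stepping down again would wrap around */
            return TOY_FAIL;
        pg -= align;
    }
}
```

With an empty map, a top-down request lands at the high end of the range
and leaves the low frames for later bottom-up requests, which is the
effect the patch wants for the frame table versus the DMA zone.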