xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* Re: PATCH: Hugepage support for Domains booting with 4KB pages
@ 2011-03-21 21:01 Keshav Darak
  2011-03-21 21:31 ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Keshav Darak @ 2011-03-21 21:01 UTC (permalink / raw)
  To: xen-devel; +Cc: jeremy, keir


[-- Attachment #1.1: Type: text/plain, Size: 2227 bytes --]

have corrected few mistakes in previously attached xen patch file.
Please review it.

--- On Sun, 3/20/11, Keshav Darak <keshav_darak@yahoo.com> wrote:

From: Keshav Darak <keshav_darak@yahoo.com>
Subject: [Xen-devel] PATCH: Hugepage support for Domains booting with 4KB pages
To: xen-devel@lists.xensource.com
Cc: jeremy@goop.org, keir@xen.org
Date: Sunday, March 20, 2011, 10:34 PM

We have implemented hugepage support for guests in following manner

In
 our implementation we added a parameter hugepage_num which is specified
 in the config file of the DomU. It is the number of hugepages that the 
guest is guaranteed to receive whenever the kernel asks for hugepage by 
using its boot time parameter or reserving after booting (eg. Using echo
 XX > /proc/sys/vm/nr_hugepages). During creation of the domain we 
reserve MFN's for these hugepages and store them in the list. The 
listhead of this list is inside the domain structure with name 
"hugepage_list". When the domain is booting, at that time the memory 
seen by the kernel is allocated memory  less the amount required for hugepages. The function 
reserve_hugepage_range is called as a initcall. Before this function the
 xen_extra_mem_start points to this apparent end of the memory. In this 
function we reserve the PFN range for the hugepages which are going to 
be allocated by kernel by incrementing the xen_extra_mem_start. We 
maintain these PFNs as pages in "xen_hugepfn_list" in the kernel. 

Now
 before the kernel requests for hugepages, it makes a hypercall 
HYPERVISOR_memory_op  to get count of hugepages allocated to it and 
accordingly reserves the pfn range.
then whenever kernel requests for
 hugepages it again make hypercall HYPERVISOR_memory_op to get the 
preallocated hugepage and according makes the p2m mapping on both sides 
(xen as well as kernel side)

The approach can be better explained using the presentation attached.

--
Keshav Darak
Kaustubh Kabra
Ashwin Vasani 
Aditya Gadre



      
-----Inline Attachment Follows-----

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel



      

[-- Attachment #1.2: Type: text/html, Size: 2928 bytes --]

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: xen.patch --]
[-- Type: text/x-patch; name="xen.patch", Size: 19814 bytes --]

diff -r 4e108cf56d07 tools/libxc/xc_dom.h
--- a/tools/libxc/xc_dom.h	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/libxc/xc_dom.h	Mon Mar 21 11:29:26 2011 +0530
@@ -113,6 +113,7 @@
     domid_t guest_domid;
     int8_t vhpt_size_log2; /* for IA64 */
     int8_t superpages;
+    int hugepage_num;
     int shadow_enabled;
 
     int xen_version;
diff -r 4e108cf56d07 tools/libxc/xc_dom_core.c
--- a/tools/libxc/xc_dom_core.c	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/libxc/xc_dom_core.c	Mon Mar 21 11:29:26 2011 +0530
@@ -699,6 +699,20 @@
 
     page_shift = XC_DOM_PAGE_SHIFT(dom);
     nr_pages = mem_mb << (20 - page_shift);
+    
+    //a2k2
+    if(dom->hugepage_num && dom->superpages!=1)
+        {
+
+            nr_pages-=dom->hugepage_num*512;
+
+    }
+    if(nr_pages<=0)
+        {
+            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR, "%s: Allocated memory less than required for hugepages",
+                     __FUNCTION__);
+        return -1;
+        }
 
     DOMPRINTF("%s: mem %d MB, pages 0x%" PRIpfn " pages, %dk each",
                __FUNCTION__, mem_mb, nr_pages, 1 << (page_shift-10));
diff -r 4e108cf56d07 tools/libxc/xc_dom_x86.c
--- a/tools/libxc/xc_dom_x86.c	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/libxc/xc_dom_x86.c	Mon Mar 21 11:29:26 2011 +0530
@@ -747,9 +747,18 @@
             for ( j = 0; j < SUPERPAGE_NR_PFNS; j++, pfn++ )
                 dom->p2m_host[pfn] = mfn + j;
         }
+
     }
     else
     {
+        /*a2k2 setting up hugepages pool for domain in xen from its mem_size not allocated as free pages to domain. */
+        if(dom->hugepage_num)
+        {
+            rc = xc_domain_populate_hugemap(
+                dom->xch, dom->guest_domid, dom->hugepage_num,
+                9, 0, &dom->p2m_host[0]);
+
+        }
         /* setup initial p2m */
         for ( pfn = 0; pfn < dom->total_pages; pfn++ )
             dom->p2m_host[pfn] = pfn;
diff -r 4e108cf56d07 tools/libxc/xc_domain.c
--- a/tools/libxc/xc_domain.c	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/libxc/xc_domain.c	Mon Mar 21 11:29:26 2011 +0530
@@ -729,6 +729,34 @@
     return do_memory_op(xch, XENMEM_add_to_physmap, &xatp, sizeof(xatp));
 }
 
+int xc_domain_populate_hugemap(xc_interface *xch,
+                               uint32_t domid,
+                               unsigned long nr_extents,
+                               unsigned int extent_order,
+                               unsigned int mem_flags,
+                               xen_pfn_t *extent_start)
+{
+    int err;
+    DECLARE_HYPERCALL_BOUNCE(extent_start, nr_extents * sizeof(*extent_start), XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
+    struct xen_memory_reservation reservation = {
+        .nr_extents   = nr_extents,
+        .extent_order = extent_order,
+        .mem_flags    = mem_flags,
+        .domid        = domid
+    };
+
+    if ( xc_hypercall_bounce_pre(xch, extent_start) )
+    {
+        PERROR("Could not bounce memory for XENMEM_populate_physmap hypercall");
+        return -1;
+    }
+    set_xen_guest_handle(reservation.extent_start, extent_start);
+
+    err = do_memory_op(xch, XENMEM_populate_hugemap, &reservation, sizeof(reservation));
+
+    xc_hypercall_bounce_post(xch, extent_start);
+    return err;
+}
 int xc_domain_populate_physmap(xc_interface *xch,
                                uint32_t domid,
                                unsigned long nr_extents,
diff -r 4e108cf56d07 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/libxc/xenctrl.h	Mon Mar 21 11:29:26 2011 +0530
@@ -1006,6 +1006,13 @@
                                unsigned int mem_flags,
                                xen_pfn_t *extent_start);
 
+int xc_domain_populate_hugemap(xc_interface *xch,
+                               uint32_t domid,
+                               unsigned long nr_extents,
+                               unsigned int extent_order,
+                               unsigned int mem_flags,
+                               xen_pfn_t *extent_start);
+
 int xc_domain_populate_physmap_exact(xc_interface *xch,
                                      uint32_t domid,
                                      unsigned long nr_extents,
diff -r 4e108cf56d07 tools/python/xen/lowlevel/xc/xc.c
--- a/tools/python/xen/lowlevel/xc/xc.c	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/python/xen/lowlevel/xc/xc.c	Mon Mar 21 11:29:26 2011 +0530
@@ -455,6 +455,7 @@
     int store_evtchn, console_evtchn;
     int vhpt = 0;
     int superpages = 0;
+    int hugepage_num = 0;
     unsigned int mem_mb;
     unsigned long store_mfn = 0;
     unsigned long console_mfn = 0;
@@ -467,14 +468,14 @@
                                 "console_evtchn", "image",
                                 /* optional */
                                 "ramdisk", "cmdline", "flags",
-                                "features", "vhpt", "superpages", NULL };
-
-    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iiiis|ssisii", kwd_list,
+                                "features", "vhpt", "superpages","hugepage_num", NULL };
+
+    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iiiis|ssisiii", kwd_list,
                                       &domid, &store_evtchn, &mem_mb,
                                       &console_evtchn, &image,
                                       /* optional */
                                       &ramdisk, &cmdline, &flags,
-                                      &features, &vhpt, &superpages) )
+                                      &features, &vhpt, &superpages,&hugepage_num) )
         return NULL;
 
     xc_dom_loginit(self->xc_handle);
@@ -484,6 +485,7 @@
     /* for IA64 */
     dom->vhpt_size_log2 = vhpt;
 
+  dom->hugepage_num=hugepage_num;
     dom->superpages = superpages;
 
     if ( xc_dom_linux_build(self->xc_handle, dom, domid, mem_mb, image,
diff -r 4e108cf56d07 tools/python/xen/xend/XendConfig.py
--- a/tools/python/xen/xend/XendConfig.py	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/python/xen/xend/XendConfig.py	Mon Mar 21 11:29:26 2011 +0530
@@ -244,6 +244,7 @@
     'memory_sharing': int,
     'pool_name' : str,
     'Description': str,
+    'hugepage_num':int,
 }
 
 # List of legacy configuration keys that have no equivalent in the
@@ -423,6 +424,7 @@
             'pool_name' : 'Pool-0',
             'superpages': 0,
             'description': '',
+            'hugepage_num':0,
         }
         
         return defaults
@@ -2135,6 +2137,8 @@
             image.append(['args', self['PV_args']])
         if self.has_key('superpages'):
             image.append(['superpages', self['superpages']])
+	if self.has_key('hugepage_num'):
+            image.append(['hugepage_num', self['hugepage_num']])
 
         for key in XENAPI_PLATFORM_CFG_TYPES.keys():
             if key in self['platform']:
@@ -2179,6 +2183,9 @@
         val = sxp.child_value(image_sxp, 'superpages')
         if val is not None:
             self['superpages'] = val
+        val = sxp.child_value(image_sxp, 'hugepage_num')
+        if val is not None:
+            self['hugepage_num'] = val
         
         val = sxp.child_value(image_sxp, 'memory_sharing')
         if val is not None:
diff -r 4e108cf56d07 tools/python/xen/xend/image.py
--- a/tools/python/xen/xend/image.py	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/python/xen/xend/image.py	Mon Mar 21 11:29:26 2011 +0530
@@ -84,6 +84,7 @@
 
     ostype = None
     superpages = 0
+    hugepage_num = 0
     memory_sharing = 0
 
     def __init__(self, vm, vmConfig):
@@ -711,6 +712,7 @@
         self.vramsize = int(vmConfig['platform'].get('videoram',4)) * 1024
         self.is_stubdom = (self.kernel.find('stubdom') >= 0)
         self.superpages = int(vmConfig['superpages'])
+	self.hugepage_num = int(vmConfig['hugepage_num'])
 
     def buildDomain(self):
         store_evtchn = self.vm.getStorePort()
@@ -729,6 +731,7 @@
         log.debug("features       = %s", self.vm.getFeatures())
         log.debug("flags          = %d", self.flags)
         log.debug("superpages     = %d", self.superpages)
+	log.debug("hugepage_num   = %d", self.hugepage_num)
         if arch.type == "ia64":
             log.debug("vhpt          = %d", self.vhpt)
 
@@ -742,7 +745,8 @@
                               features       = self.vm.getFeatures(),
                               flags          = self.flags,
                               vhpt           = self.vhpt,
-                              superpages     = self.superpages)
+                              superpages     = self.superpages,
+			      hugepage_num   = self.hugepage_num)
 
     def getBitSize(self):
         return xc.getBitSize(image    = self.kernel,
diff -r 4e108cf56d07 tools/python/xen/xm/create.dtd
--- a/tools/python/xen/xm/create.dtd	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/python/xen/xm/create.dtd	Mon Mar 21 11:29:26 2011 +0530
@@ -56,6 +56,7 @@
                  actions_after_crash    %CRASH_BEHAVIOUR; #REQUIRED
                  PCI_bus                CDATA #REQUIRED
                  superpages             CDATA #REQUIRED
+		 hugepage_num		CDATA #REQUIRED
                  security_label         CDATA #IMPLIED>
 
 <!ELEMENT memory EMPTY> 
diff -r 4e108cf56d07 tools/python/xen/xm/create.py
--- a/tools/python/xen/xm/create.py	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/python/xen/xm/create.py	Mon Mar 21 11:29:26 2011 +0530
@@ -680,6 +680,11 @@
            fn=set_int, default=0,
            use="Create domain with superpages")
 
+gopts.var('hugepage_num', val='NUM',
+           fn=set_int, default=0,
+           use="Domain with hugepages support")
+
+
 def err(msg):
     """Print an error to stderr and exit.
     """
@@ -770,6 +775,9 @@
         config_image.append(['args', vals.extra])
     if vals.superpages:
         config_image.append(['superpages', vals.superpages])
+    if vals.hugepage_num:
+        config_image.append(['hugepage_num', vals.hugepage_num])
+
 
     if vals.builder == 'hvm':
         configure_hvm(config_image, vals) 
diff -r 4e108cf56d07 tools/python/xen/xm/xenapi_create.py
--- a/tools/python/xen/xm/xenapi_create.py	Mon Dec 27 08:00:09 2010 +0000
+++ b/tools/python/xen/xm/xenapi_create.py	Mon Mar 21 11:29:26 2011 +0530
@@ -285,6 +285,8 @@
                 vm.attributes["s3_integrity"].value,
             "superpages":
                 vm.attributes["superpages"].value,
+ 	    "hugepage_num":
+                vm.attributes["hugepage_num"].value,
             "memory_static_max":
                 get_child_node_attribute(vm, "memory", "static_max"),
             "memory_static_min":
@@ -697,6 +699,8 @@
             = str(get_child_by_name(config, "s3_integrity", 0))
         vm.attributes["superpages"] \
             = str(get_child_by_name(config, "superpages", 0))
+	vm.attributes["hugepage_num"] \
+            = str(get_child_by_name(config, "hugepage_num", 0))
         vm.attributes["pool_name"] \
             = str(get_child_by_name(config, "pool_name", "Pool-0"))
 
diff -r 4e108cf56d07 xen/arch/x86/setup.c
--- a/xen/arch/x86/setup.c	Mon Dec 27 08:00:09 2010 +0000
+++ b/xen/arch/x86/setup.c	Mon Mar 21 11:29:26 2011 +0530
@@ -44,6 +44,7 @@
 #include <asm/mach-generic/mach_apic.h> /* for generic_apic_probe */
 #include <asm/setup.h>
 #include <xen/cpu.h>
+
 
 extern u16 boot_edid_caps;
 extern u8 boot_edid_info[128];
@@ -60,7 +61,10 @@
 /* opt_watchdog: If true, run a watchdog NMI on each processor. */
 static bool_t __initdata opt_watchdog;
 boolean_param("watchdog", opt_watchdog);
-
+//a2k2:
+static unsigned int __initdata dom0_hugepages=0;
+integer_param("dom0_hugepages", dom0_hugepages);
+int allocate_hugepages(struct domain *,int,int);
 /* **** Linux config option: propagated to domain0. */
 /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
 /* "acpi=force":  Override the disable blacklist.                   */
@@ -1259,7 +1263,10 @@
     
     if ( !tboot_protect_mem_regions() )
         panic("Could not protect TXT memory regions\n");
+    //a2k2:
 
+    //init_hugepages_pool();
+    printk("a2k2:Dom0 hugepages are :%u\n",dom0_hugepages);
     /* Create initial domain 0. */
     dom0 = domain_create(0, DOMCRF_s3_integrity, DOM0_SSIDREF);
     if ( (dom0 == NULL) || (alloc_dom0_vcpu0() == NULL) )
@@ -1267,7 +1274,7 @@
 
     dom0->is_privileged = 1;
     dom0->target = NULL;
-
+    allocate_hugepages(dom0,dom0_hugepages,SUPERPAGE_ORDER);
     /* Grab the DOM0 command line. */
     cmdline = (char *)(mod[0].string ? __va(mod[0].string) : NULL);
     if ( (cmdline != NULL) || (kextra != NULL) )
diff -r 4e108cf56d07 xen/common/domain.c
--- a/xen/common/domain.c	Mon Dec 27 08:00:09 2010 +0000
+++ b/xen/common/domain.c	Mon Mar 21 11:29:26 2011 +0530
@@ -240,9 +240,9 @@
     spin_lock_init(&d->hypercall_deadlock_mutex);
     INIT_PAGE_LIST_HEAD(&d->page_list);
     INIT_PAGE_LIST_HEAD(&d->xenpage_list);
-
+    INIT_PAGE_LIST_HEAD(&d->hugepage_list);
     spin_lock_init(&d->node_affinity_lock);
-
+    d->hugepage_num=0;
     spin_lock_init(&d->shutdown_lock);
     d->shutdown_code = -1;
 
@@ -441,7 +441,7 @@
 int domain_kill(struct domain *d)
 {
     int rc = 0;
-
+    struct page_info* page;
     if ( d == current->domain )
         return -EINVAL;
 
@@ -451,6 +451,12 @@
     case DOMDYING_alive:
         domain_pause(d);
         d->is_dying = DOMDYING_dying;
+         while(!page_list_empty(&(d->hugepage_list)))
+            {
+          
+                page=page_list_remove_head(&(d->hugepage_list));
+                free_domheap_pages(page,SUPERPAGE_ORDER);
+            }
         spin_barrier(&d->domain_lock);
         evtchn_destroy(d);
         gnttab_release_mappings(d);
diff -r 4e108cf56d07 xen/common/memory.c
--- a/xen/common/memory.c	Mon Dec 27 08:00:09 2010 +0000
+++ b/xen/common/memory.c	Mon Mar 21 11:29:26 2011 +0530
@@ -21,6 +21,7 @@
 #include <xen/errno.h>
 #include <xen/tmem.h>
 #include <xen/tmem_xen.h>
+
 #include <asm/current.h>
 #include <asm/hardirq.h>
 #ifdef CONFIG_X86
@@ -89,8 +90,44 @@
  out:
     a->nr_done = i;
 }
+int allocate_hugepages(struct domain *d,int hugepage_num,int order){
+    int i=0;
+    struct page_info *page;
+    if(order!=SUPERPAGE_ORDER)
+    {
+        goto out_huge; 
+    }
+    for(i=0;i<hugepage_num;i++){
+        page = alloc_domheap_pages(NULL, order,0);
+        if(page==NULL){
+            printk("a2k2: couldn't allocate hugepages for the Domain %d \n",d->domain_id);
+            goto out_huge;
+        }
+        if ( d->domain_id ){
+            if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages)){
+                 if ( !opt_tmem || order != 0 || d->tot_pages != d->max_pages )
+                     gdprintk(XENLOG_INFO, "Over-allocation for domain %u: "
+                              "%u > %u\n", d->domain_id,
+                              d->tot_pages + (1 << order), d->max_pages);
+                 goto err;
+            }
 
-static void populate_physmap(struct memop_args *a)
+            if ( unlikely(d->tot_pages == 0) )
+                get_knownalive_domain(d);
+            d->tot_pages += 1 << order;
+         }
+         page_list_add(page,&(d->hugepage_list));    
+    }
+    goto out_huge;
+err:
+    free_domheap_pages(page,order);
+    out_huge:
+    d->hugepage_num+=i;
+    return i;
+}
+
+
+static void populate_physmap(struct memop_args *a,int flags)
 {
     struct page_info *page;
     unsigned long i, j;
@@ -123,7 +160,26 @@
         }
         else
         {
-            page = alloc_domheap_pages(d, a->extent_order, a->memflags);
+            if(flags){
+                //a2k2:
+                page=page_list_remove_head(&(d->hugepage_list));
+                if(page==NULL){
+                    // flags=0;
+                }
+                else
+                {
+                    if(d->domain_id)
+                        d->tot_pages-=1 << a->extent_order;
+                    if(assign_pages(d,page,a->extent_order,a->memflags)==-1){
+                        printk("a2k2: hugepage assignment to domain failed.\n");
+
+                        goto out;
+                     }
+                }
+            }
+            if(!flags)
+                page = alloc_domheap_pages(d, a->extent_order, a->memflags);
+           
             if ( unlikely(page == NULL) ) 
             {
                 if ( !opt_tmem || (a->extent_order != 0) )
@@ -511,9 +567,13 @@
 
     switch ( op )
     {
+    case XENMEM_hugepage_cnt:
     case XENMEM_increase_reservation:
     case XENMEM_decrease_reservation:
     case XENMEM_populate_physmap:
+    case XENMEM_populate_hugemap:
+    case XENMEM_populate_hugepage:
+    
         start_extent = cmd >> MEMOP_EXTENT_SHIFT;
 
         if ( copy_from_guest(&reservation, arg, 1) )
@@ -581,8 +641,17 @@
         case XENMEM_decrease_reservation:
             decrease_reservation(&args);
             break;
+        case XENMEM_populate_hugepage:
+            populate_physmap(&args,1);
+            break;
+        case XENMEM_populate_hugemap:
+            args.nr_done=allocate_hugepages(args.domain,args.nr_extents,args.extent_order);
+            break;
+        case XENMEM_hugepage_cnt:
+            args.nr_done=d->hugepage_num;
+            break;
         default: /* XENMEM_populate_physmap */
-            populate_physmap(&args);
+            populate_physmap(&args,0);
             break;
         }
 
@@ -596,7 +665,7 @@
                 op | (rc << MEMOP_EXTENT_SHIFT), arg);
 
         break;
-
+    
     case XENMEM_exchange:
         rc = memory_exchange(guest_handle_cast(arg, xen_memory_exchange_t));
         break;
diff -r 4e108cf56d07 xen/include/public/memory.h
--- a/xen/include/public/memory.h	Mon Dec 27 08:00:09 2010 +0000
+++ b/xen/include/public/memory.h	Mon Mar 21 11:29:26 2011 +0530
@@ -37,6 +37,9 @@
 #define XENMEM_increase_reservation 0
 #define XENMEM_decrease_reservation 1
 #define XENMEM_populate_physmap     6
+#define XENMEM_populate_hugepage    19
+#define XENMEM_hugepage_cnt         20
+#define XENMEM_populate_hugemap     21
 
 #if __XEN_INTERFACE_VERSION__ >= 0x00030209
 /*
diff -r 4e108cf56d07 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h	Mon Dec 27 08:00:09 2010 +0000
+++ b/xen/include/xen/mm.h	Mon Mar 21 11:29:26 2011 +0530
@@ -96,14 +96,19 @@
 #endif
 
 #define page_list_entry list_head
+#define hugepage_list_entry list_head
 
 #include <asm/mm.h>
 
 #ifndef page_list_entry
 struct page_list_head
 {
-    struct page_info *next, *tail;
+    struct page_info *next, *tail; 
 };
+/*struct hugepage_list_head
+{
+  struct hugepage_info *next,*tail;
+  };*/
 /* These must only have instances in struct page_info. */
 # define page_list_entry
 
@@ -326,5 +331,15 @@
 #define RAM_TYPE_ACPI         0x00000008
 /* TRUE if the whole page at @mfn is of the requested RAM type(s) above. */
 int page_is_ram_type(unsigned long mfn, unsigned long mem_type);
+/*
+#define hugepage_list_head list_head
 
+//a2k2
+struct hugepage_info
+{
+  mfn_t mfn;
+  hugepage_list_entry hugepage_list;
+
+
+  };*/
 #endif /* __XEN_MM_H__ */
diff -r 4e108cf56d07 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h	Mon Dec 27 08:00:09 2010 +0000
+++ b/xen/include/xen/sched.h	Mon Mar 21 11:29:26 2011 +0530
@@ -211,6 +211,9 @@
     spinlock_t       page_alloc_lock; /* protects all the following fields  */
     struct page_list_head page_list;  /* linked list, of size tot_pages     */
     struct page_list_head xenpage_list; /* linked list (size xenheap_pages) */
+    struct page_list_head hugepage_list; /*a2k2:Free hugepage list*/
+    unsigned int     hugepage_num;    /*a2k2:  */
+
     unsigned int     tot_pages;       /* number of pages currently possesed */
     unsigned int     max_pages;       /* maximum value for tot_pages        */
     atomic_t         shr_pages;       /* number of shared pages             */

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: PATCH: Hugepage support for Domains booting with 4KB pages
@ 2011-03-22 18:05 Keshav Darak
  0 siblings, 0 replies; 7+ messages in thread
From: Keshav Darak @ 2011-03-22 18:05 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel


Konrad,

Thanks for reviewing the patch.

>> We have implemented hugepage support for guests in following manner
>> 
>> In
>>  our implementation we added a parameter hugepage_num which is specified
>>  in the config file of the DomU. It is the number of hugepages that the 
>> guest is guaranteed to receive whenever the kernel asks for hugepage by 
>> using its boot time parameter or reserving after booting (eg. Using echo
>>  XX > /proc/sys/vm/nr_hugepages). During creation of the domain we 
>> reserve MFN's for these hugepages and store them in the list. The 

>There is bootup option for normal Linux kernels to set that up. Was
>that something you could use?

ya, it can be used too, to allocate the hugepages.

>> 
>> static inline int arch_prepare_hugepage(struct page *page)
>> {
>>index f46c340..00c489a 100644
>>--- a/arch/x86/mm/hugetlbpage.c
>>+++ b/arch/x86/mm/hugetlbpage.c
>>@@ -147,8 +147,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>>             pte = (pte_t *) pmd_alloc(mm, pud, addr);
>>         }
>>     }
>>-    BUG_ON(pte && !pte_none(*pte) && !pte_huge(*pte));
>>-
>>+    BUG_ON(pte && !pte_none(*pte) && !((*pte).pte & (_AT(pteval_t, 1)<<7)));

>Ugh. That is horrible.

>why can't you use 'pte_huge' ? Is it b/c of this
> * (We should never see kernel mappings with _PAGE_PSE set,
> * but we could see hugetlbfs mappings, I think.).
> */

Here actually we don't known the exact reason, but when pte_huge was used the BUG_ON was called even though the PSE bit was set, so we had to rewrite the BUG_ON testing the 7th bit.there could be better ways to do it, But we couldn't find the exact reasons why was it(pte_huge) returning 0 even when the pte was a huge_pte.    


we will try to resolve other issues with the patch as soon as possible. 

--
Keshav Darak
Kaustubh Kabra
Ashwin Vasani
Aditya Gadre




^ permalink raw reply	[flat|nested] 7+ messages in thread
* PATCH: Hugepage support for Domains booting with 4KB pages
@ 2011-03-20 22:34 Keshav Darak
  2011-03-22 16:49 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 7+ messages in thread
From: Keshav Darak @ 2011-03-20 22:34 UTC (permalink / raw)
  To: xen-devel; +Cc: jeremy, keir


[-- Attachment #1.1: Type: text/plain, Size: 1636 bytes --]

We have implemented hugepage support for guests in following manner

In
 our implementation we added a parameter hugepage_num which is specified
 in the config file of the DomU. It is the number of hugepages that the 
guest is guaranteed to receive whenever the kernel asks for hugepage by 
using its boot time parameter or reserving after booting (eg. Using echo
 XX > /proc/sys/vm/nr_hugepages). During creation of the domain we 
reserve MFN's for these hugepages and store them in the list. The 
listhead of this list is inside the domain structure with name 
"hugepage_list". When the domain is booting, at that time the memory 
seen by the kernel is allocated memory  less the amount required for hugepages. The function 
reserve_hugepage_range is called as a initcall. Before this function the
 xen_extra_mem_start points to this apparent end of the memory. In this 
function we reserve the PFN range for the hugepages which are going to 
be allocated by kernel by incrementing the xen_extra_mem_start. We 
maintain these PFNs as pages in "xen_hugepfn_list" in the kernel. 

Now
 before the kernel requests for hugepages, it makes a hypercall 
HYPERVISOR_memory_op  to get count of hugepages allocated to it and 
accordingly reserves the pfn range.
then whenever kernel requests for
 hugepages it again make hypercall HYPERVISOR_memory_op to get the 
preallocated hugepage and according makes the p2m mapping on both sides 
(xen as well as kernel side)

The approach can be better explained using the presentation attached.

--
Keshav Darak
Kaustubh Kabra
Ashwin Vasani 
Aditya Gadre



      

[-- Attachment #1.2: Type: text/html, Size: 1792 bytes --]

[-- Attachment #2: xen_patch_210311_0227.patch --]
[-- Type: application/x-download, Size: 18234 bytes --]

[-- Attachment #3: jeremy-kernel.patch --]
[-- Type: application/x-download, Size: 6731 bytes --]

[-- Attachment #4: our_hugepage_approach.ppt --]
[-- Type: application/vnd.ms-powerpoint, Size: 327168 bytes --]

[-- Attachment #5: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-03-22 18:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-21 21:01 PATCH: Hugepage support for Domains booting with 4KB pages Keshav Darak
2011-03-21 21:31 ` Keir Fraser
2011-03-22 12:36   ` Keshav Darak
2011-03-22 14:07     ` Keir Fraser
  -- strict thread matches above, loose matches on Subject: below --
2011-03-22 18:05 Keshav Darak
2011-03-20 22:34 Keshav Darak
2011-03-22 16:49 ` Konrad Rzeszutek Wilk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).