Root cause of the issue that HVM guest boots slowly with pvops dom0

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* Root cause of the issue that HVM guest boots slowly with pvops dom0
@ 2010-01-21  8:16 Yang, Xiaowei
  2010-01-21  8:44 ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Yang, Xiaowei @ 2010-01-21  8:16 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com

Symptoms
---------
Boot time of UP HVM on one DP Nehalem server (SMT-on):
          seconds
2.6.18  20
pvops   42

pvops dom0 util% is evidently higher (peak: 230% vs 99%) and resched IPI reaches
200+k per seconds in max.

Analysis
---------
Xentrace shows that the access time of IDE IOport is 10X slower than other
IOports that go through QEMU (~300K v.s. ~30K).

Ftrace of 'sched events' in dom0 tells there are frequent context switches
between 'idle' and 'events/?' (work queue execution kthread) on each idle vCPU,
and events/? runs lru_add_drain_per_cpu(). This explains where resched IPI comes
from: Kernel uses it to notify idle vCPU to do the real work.
lru_add_drain_per_cpu() is triggered by lru_add_drain_all() which is a *costly*
sync operation and won't return until each vCPU executes the work queue.
Throughout the kernel, there are 9 places calls lru_add_drain_all(): shm, memory
migration, mlock and etc. If IDE IOport access invokes one of them, that could
be reason why it's so slow.

Then ftrace of 'syscall' in dom0 reveals the assumption is true - QEMU really
calls mlock(). And it turns out that mlock() is used a lot in Xen (73 places),
to ensure that dom0 user space's buffer passed to Xen HV by hypercall is pinned
in memory. IDE IOport access may call one of them - HVMOP_set_isa_irq_level.

Kernel change log is searched backwards. In 2.6.28
(http://kernelnewbies.org/Linux_2_6_28), one major change to mlock
implementation (b291f000: mlock: mlocked pages are unevictable) puts mlocked
pages under the management of (page frame reclaiming) LRU, and
lru_add_drain_all() is a prepare operation to purge the pages in a temporary
data structures (pagevec) to an LRU list. That's why 2.6.18 dom0 doesn't have so
many resched IPI.

One hack is tried to omit mlock() before HVMOP_set_isa_irq_level in pvops
dom0, and guest boot time returns to normal - ~20s.

Solutions?
-----------
- Limiting vCPU# of dom0 is always an easiest one - you may call it workaround
rather than a solution:) It not only reduces the total # of resched IPI ( =
mlock# * (vCPU#-1)), but reduces the cost of each handler - because of spinlock. 
But the impact is still there, more or less, when vCPU# > 1.

- To remove mlock, another sharing method is needed between dom0 user space app
and Xen HV.

- More ideas?

Thanks,
xiaowei

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Root cause of the issue that HVM guest boots slowly with pvops dom0
  2010-01-21  8:16 Root cause of the issue that HVM guest boots slowly with pvops dom0 Yang, Xiaowei
@ 2010-01-21  8:44 ` Keir Fraser
  2010-01-21  9:27   ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2010-01-21  8:44 UTC (permalink / raw)
  To: Yang, Xiaowei, xen-devel@lists.xensource.com

On 21/01/2010 08:16, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:

> - Limiting vCPU# of dom0 is always an easiest one - you may call it workaround
> rather than a solution:) It not only reduces the total # of resched IPI ( =
> mlock# * (vCPU#-1)), but reduces the cost of each handler - because of
> spinlock. 
> But the impact is still there, more or less, when vCPU# > 1.
> 
> - To remove mlock, another sharing method is needed between dom0 user space
> app
> and Xen HV.

A pre-mlock()ed memory page for small (sub-page) hypercalls? Protected with
a semaphore: failure to acquire semaphore means take slow path. Have all
hypercallers in libxc launder their data buffers through a new interface
that tries to grab and copy into the pre-allocated buffer.

 -- Keir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Root cause of the issue that HVM guest boots slowly with pvops dom0
  2010-01-21  8:44 ` Keir Fraser
@ 2010-01-21  9:27   ` Keir Fraser
  2010-01-21 11:23     ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2010-01-21  9:27 UTC (permalink / raw)
  To: Yang, Xiaowei, xen-devel@lists.xensource.com

On 21/01/2010 08:44, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> - Limiting vCPU# of dom0 is always an easiest one - you may call it
>> workaround
>> rather than a solution:) It not only reduces the total # of resched IPI ( =
>> mlock# * (vCPU#-1)), but reduces the cost of each handler - because of
>> spinlock. 
>> But the impact is still there, more or less, when vCPU# > 1.
>> 
>> - To remove mlock, another sharing method is needed between dom0 user space
>> app
>> and Xen HV.
> 
> A pre-mlock()ed memory page for small (sub-page) hypercalls? Protected with
> a semaphore: failure to acquire semaphore means take slow path. Have all
> hypercallers in libxc launder their data buffers through a new interface
> that tries to grab and copy into the pre-allocated buffer.

I'll sort out a trial patch for this myself.

 Thanks,
 Keir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Root cause of the issue that HVM guest boots slowly with pvops dom0
  2010-01-21  9:27   ` Keir Fraser
@ 2010-01-21 11:23     ` Keir Fraser
  2010-01-22  8:07       ` Yang, Xiaowei
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2010-01-21 11:23 UTC (permalink / raw)
  To: Yang, Xiaowei, xen-devel@lists.xensource.com

[-- Attachment #1: Type: text/plain, Size: 519 bytes --]

On 21/01/2010 09:27, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> A pre-mlock()ed memory page for small (sub-page) hypercalls? Protected with
>> a semaphore: failure to acquire semaphore means take slow path. Have all
>> hypercallers in libxc launder their data buffers through a new interface
>> that tries to grab and copy into the pre-allocated buffer.
> 
> I'll sort out a trial patch for this myself.

How does the attached patch work for you? It ought to get you the same
speedup as your hack.

 -- Keir


[-- Attachment #2: 00-mlock --]
[-- Type: application/octet-stream, Size: 13568 bytes --]

diff -r ea02c95af387 tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xc_domain_restore.c	Thu Jan 21 11:16:13 2010 +0000
@@ -1424,7 +1424,7 @@
     ctx->p2m   = calloc(dinfo->p2m_size, sizeof(xen_pfn_t));
     pfn_type   = calloc(dinfo->p2m_size, sizeof(unsigned long));
 
-    region_mfn = xg_memalign(PAGE_SIZE, ROUNDUP(
+    region_mfn = xc_memalign(PAGE_SIZE, ROUNDUP(
                               MAX_BATCH_SIZE * sizeof(xen_pfn_t), PAGE_SHIFT));
 
     if ( (ctx->p2m == NULL) || (pfn_type == NULL) ||
diff -r ea02c95af387 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xc_domain_save.c	Thu Jan 21 11:16:13 2010 +0000
@@ -1012,9 +1012,9 @@
     sent_last_iter = dinfo->p2m_size;
 
     /* Setup to_send / to_fix and to_skip bitmaps */
-    to_send = xg_memalign(PAGE_SIZE, ROUNDUP(BITMAP_SIZE, PAGE_SHIFT)); 
+    to_send = xc_memalign(PAGE_SIZE, ROUNDUP(BITMAP_SIZE, PAGE_SHIFT)); 
     to_fix  = calloc(1, BITMAP_SIZE);
-    to_skip = xg_memalign(PAGE_SIZE, ROUNDUP(BITMAP_SIZE, PAGE_SHIFT)); 
+    to_skip = xc_memalign(PAGE_SIZE, ROUNDUP(BITMAP_SIZE, PAGE_SHIFT)); 
 
     if ( !to_send || !to_fix || !to_skip )
     {
@@ -1056,7 +1056,7 @@
 
     analysis_phase(xc_handle, dom, ctx, to_skip, 0);
 
-    pfn_type   = xg_memalign(PAGE_SIZE, ROUNDUP(
+    pfn_type   = xc_memalign(PAGE_SIZE, ROUNDUP(
                               MAX_BATCH_SIZE * sizeof(*pfn_type), PAGE_SHIFT));
     pfn_batch  = calloc(MAX_BATCH_SIZE, sizeof(*pfn_batch));
     if ( (pfn_type == NULL) || (pfn_batch == NULL) )
diff -r ea02c95af387 tools/libxc/xc_misc.c
--- a/tools/libxc/xc_misc.c	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xc_misc.c	Thu Jan 21 11:16:13 2010 +0000
@@ -175,29 +175,29 @@
     unsigned int level)
 {
     DECLARE_HYPERCALL;
-    struct xen_hvm_set_pci_intx_level arg;
+    struct xen_hvm_set_pci_intx_level _arg, *arg = &_arg;
     int rc;
 
-    hypercall.op     = __HYPERVISOR_hvm_op;
-    hypercall.arg[0] = HVMOP_set_pci_intx_level;
-    hypercall.arg[1] = (unsigned long)&arg;
-
-    arg.domid  = dom;
-    arg.domain = domain;
-    arg.bus    = bus;
-    arg.device = device;
-    arg.intx   = intx;
-    arg.level  = level;
-
-    if ( (rc = lock_pages(&arg, sizeof(arg))) != 0 )
+    if ( (rc = hcall_buf_prep((void **)&arg, sizeof(*arg))) != 0 )
     {
         PERROR("Could not lock memory");
         return rc;
     }
 
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_set_pci_intx_level;
+    hypercall.arg[1] = (unsigned long)arg;
+
+    arg->domid  = dom;
+    arg->domain = domain;
+    arg->bus    = bus;
+    arg->device = device;
+    arg->intx   = intx;
+    arg->level  = level;
+
     rc = do_xen_hypercall(xc_handle, &hypercall);
 
-    unlock_pages(&arg, sizeof(arg));
+    hcall_buf_release((void **)&arg, sizeof(*arg));
 
     return rc;
 }
@@ -208,26 +208,26 @@
     unsigned int level)
 {
     DECLARE_HYPERCALL;
-    struct xen_hvm_set_isa_irq_level arg;
+    struct xen_hvm_set_isa_irq_level _arg, *arg = &_arg;
     int rc;
 
-    hypercall.op     = __HYPERVISOR_hvm_op;
-    hypercall.arg[0] = HVMOP_set_isa_irq_level;
-    hypercall.arg[1] = (unsigned long)&arg;
-
-    arg.domid   = dom;
-    arg.isa_irq = isa_irq;
-    arg.level   = level;
-
-    if ( (rc = lock_pages(&arg, sizeof(arg))) != 0 )
+    if ( (rc = hcall_buf_prep((void **)&arg, sizeof(*arg))) != 0 )
     {
         PERROR("Could not lock memory");
         return rc;
     }
 
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_set_isa_irq_level;
+    hypercall.arg[1] = (unsigned long)arg;
+
+    arg->domid   = dom;
+    arg->isa_irq = isa_irq;
+    arg->level   = level;
+
     rc = do_xen_hypercall(xc_handle, &hypercall);
 
-    unlock_pages(&arg, sizeof(arg));
+    hcall_buf_release((void **)&arg, sizeof(*arg));
 
     return rc;
 }
diff -r ea02c95af387 tools/libxc/xc_physdev.c
--- a/tools/libxc/xc_physdev.c	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xc_physdev.c	Thu Jan 21 11:16:13 2010 +0000
@@ -36,7 +36,7 @@
     map.index = index;
     map.pirq = *pirq;
 
-    rc = do_physdev_op(xc_handle, PHYSDEVOP_map_pirq, &map);
+    rc = do_physdev_op(xc_handle, PHYSDEVOP_map_pirq, &map, sizeof(map));
 
     if ( !rc )
         *pirq = map.pirq;
@@ -68,7 +68,7 @@
     map.entry_nr = entry_nr;
     map.table_base = table_base;
 
-    rc = do_physdev_op(xc_handle, PHYSDEVOP_map_pirq, &map);
+    rc = do_physdev_op(xc_handle, PHYSDEVOP_map_pirq, &map, sizeof(map));
 
     if ( !rc )
         *pirq = map.pirq;
@@ -86,7 +86,7 @@
     unmap.domid = domid;
     unmap.pirq = pirq;
 
-    rc = do_physdev_op(xc_handle, PHYSDEVOP_unmap_pirq, &unmap);
+    rc = do_physdev_op(xc_handle, PHYSDEVOP_unmap_pirq, &unmap, sizeof(unmap));
 
     return rc;
 }
diff -r ea02c95af387 tools/libxc/xc_private.c
--- a/tools/libxc/xc_private.c	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xc_private.c	Thu Jan 21 11:16:13 2010 +0000
@@ -8,6 +8,9 @@
 #include "xc_private.h"
 #include "xg_private.h"
 #include <stdarg.h>
+#include <stdlib.h>
+#include <malloc.h>
+#include <unistd.h>
 #include <pthread.h>
 
 static pthread_key_t last_error_pkey;
@@ -126,27 +129,96 @@
     }
 }
 
+#ifdef __sun__
+
+int lock_pages(void *addr, size_t len) { return 0; }
+void unlock_pages(void *addr, size_t len) { }
+
+int hcall_buf_prep(void **addr, size_t len) { return 0; }
+void hcall_buf_release(void **addr, size_t len) { }
+
+#else /* !__sun__ */
+
 int lock_pages(void *addr, size_t len)
 {
-      int e = 0;
-#ifndef __sun__
+      int e;
       void *laddr = (void *)((unsigned long)addr & PAGE_MASK);
       size_t llen = (len + ((unsigned long)addr - (unsigned long)laddr) +
                      PAGE_SIZE - 1) & PAGE_MASK;
       e = mlock(laddr, llen);
-#endif
       return e;
 }
 
 void unlock_pages(void *addr, size_t len)
 {
-#ifndef __sun__
     void *laddr = (void *)((unsigned long)addr & PAGE_MASK);
     size_t llen = (len + ((unsigned long)addr - (unsigned long)laddr) +
                    PAGE_SIZE - 1) & PAGE_MASK;
     safe_munlock(laddr, llen);
+}
+
+static pthread_key_t hcall_buf_pkey;
+static pthread_once_t hcall_buf_pkey_once = PTHREAD_ONCE_INIT;
+struct hcall_buf {
+    void *buf;
+    void *oldbuf;
+};
+
+static void _xc_clean_hcall_buf(void *m)
+{
+    struct hcall_buf *hcall_buf = m;
+    if ( hcall_buf )
+        free(hcall_buf->buf);
+    free(hcall_buf);
+    pthread_setspecific(hcall_buf_pkey, NULL);
+}
+
+static void _xc_init_hcall_buf(void)
+{
+    pthread_key_create(&hcall_buf_pkey, _xc_clean_hcall_buf);
+}
+
+int hcall_buf_prep(void **addr, size_t len)
+{
+    struct hcall_buf *hcall_buf;
+
+    pthread_once(&hcall_buf_pkey_once, _xc_init_hcall_buf);
+    hcall_buf = pthread_getspecific(hcall_buf_pkey);
+    if ( (hcall_buf == NULL) &&
+         ((hcall_buf = calloc(1, sizeof(*hcall_buf))) != NULL) )
+        pthread_setspecific(hcall_buf_pkey, hcall_buf);
+    if ( hcall_buf->buf == NULL )
+        hcall_buf->buf = xc_memalign(PAGE_SIZE, PAGE_SIZE);
+
+    if ( (len < PAGE_SIZE) && hcall_buf && hcall_buf->buf &&
+         !hcall_buf->oldbuf )
+    {
+        memcpy(hcall_buf->buf, *addr, len);
+        hcall_buf->oldbuf = *addr;
+        *addr = hcall_buf->buf;
+        return 0;
+    }
+
+    return lock_pages(*addr, len);
+}
+
+void hcall_buf_release(void **addr, size_t len)
+{
+    struct hcall_buf *hcall_buf = pthread_getspecific(hcall_buf_pkey);
+
+    if ( hcall_buf && (hcall_buf->buf == *addr) )
+    {
+        memcpy(hcall_buf->oldbuf, *addr, len);
+        *addr = hcall_buf->oldbuf;
+        hcall_buf->oldbuf = NULL;
+    }
+    else
+    {
+        unlock_pages(*addr, len);
+    }
+}
+
 #endif
-}
 
 /* NB: arr must be locked */
 int xc_get_pfn_type_batch(int xc_handle, uint32_t dom,
@@ -169,21 +241,21 @@
     DECLARE_HYPERCALL;
     long ret = -EINVAL;
 
+    if ( hcall_buf_prep((void **)&op, nr_ops*sizeof(*op)) != 0 )
+    {
+        PERROR("Could not lock memory for Xen hypercall");
+        goto out1;
+    }
+
     hypercall.op     = __HYPERVISOR_mmuext_op;
     hypercall.arg[0] = (unsigned long)op;
     hypercall.arg[1] = (unsigned long)nr_ops;
     hypercall.arg[2] = (unsigned long)0;
     hypercall.arg[3] = (unsigned long)dom;
 
-    if ( lock_pages(op, nr_ops*sizeof(*op)) != 0 )
-    {
-        PERROR("Could not lock memory for Xen hypercall");
-        goto out1;
-    }
-
     ret = do_xen_hypercall(xc_handle, &hypercall);
 
-    unlock_pages(op, nr_ops*sizeof(*op));
+    hcall_buf_release((void **)&op, nr_ops*sizeof(*op));
 
  out1:
     return ret;
@@ -656,6 +728,22 @@
     return l ? xc_ffs32(l) : h ? xc_ffs32(h) + 32 : 0;
 }
 
+void *xc_memalign(size_t alignment, size_t size)
+{
+#if defined(_POSIX_C_SOURCE) && !defined(__sun__)
+    int ret;
+    void *ptr;
+    ret = posix_memalign(&ptr, alignment, size);
+    if (ret != 0)
+        return NULL;
+    return ptr;
+#elif defined(__NetBSD__) || defined(__OpenBSD__)
+    return valloc(size);
+#else
+    return memalign(alignment, size);
+#endif
+}
+
 /*
  * Local variables:
  * mode: C
diff -r ea02c95af387 tools/libxc/xc_private.h
--- a/tools/libxc/xc_private.h	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xc_private.h	Thu Jan 21 11:16:13 2010 +0000
@@ -78,9 +78,14 @@
 #define PERROR(_m, _a...) xc_set_error(XC_INTERNAL_ERROR, _m " (%d = %s)", \
                                        ## _a , errno, safe_strerror(errno))
 
+void *xc_memalign(size_t alignment, size_t size);
+
 int lock_pages(void *addr, size_t len);
 void unlock_pages(void *addr, size_t len);
 
+int hcall_buf_prep(void **addr, size_t len);
+void hcall_buf_release(void **addr, size_t len);
+
 static inline void safe_munlock(const void *addr, size_t len)
 {
     int saved_errno = errno;
@@ -101,21 +106,22 @@
     return do_xen_hypercall(xc_handle, &hypercall);
 }
 
-static inline int do_physdev_op(int xc_handle, int cmd, void *op)
+static inline int do_physdev_op(int xc_handle, int cmd, void *op, size_t len)
 {
     int ret = -1;
 
     DECLARE_HYPERCALL;
-    hypercall.op = __HYPERVISOR_physdev_op;
-    hypercall.arg[0] = (unsigned long) cmd;
-    hypercall.arg[1] = (unsigned long) op;
 
-    if ( lock_pages(op, sizeof(*op)) != 0 )
+    if ( hcall_buf_prep(&op, len) != 0 )
     {
         PERROR("Could not lock memory for Xen hypercall");
         goto out1;
     }
 
+    hypercall.op = __HYPERVISOR_physdev_op;
+    hypercall.arg[0] = (unsigned long) cmd;
+    hypercall.arg[1] = (unsigned long) op;
+
     if ( (ret = do_xen_hypercall(xc_handle, &hypercall)) < 0 )
     {
         if ( errno == EACCES )
@@ -123,7 +129,7 @@
                     " rebuild the user-space tool set?\n");
     }
 
-    unlock_pages(op, sizeof(*op));
+    hcall_buf_release(&op, len);
 
 out1:
     return ret;
@@ -134,17 +140,17 @@
     int ret = -1;
     DECLARE_HYPERCALL;
 
+    if ( hcall_buf_prep((void **)&domctl, sizeof(*domctl)) != 0 )
+    {
+        PERROR("Could not lock memory for Xen hypercall");
+        goto out1;
+    }
+
     domctl->interface_version = XEN_DOMCTL_INTERFACE_VERSION;
 
     hypercall.op     = __HYPERVISOR_domctl;
     hypercall.arg[0] = (unsigned long)domctl;
 
-    if ( lock_pages(domctl, sizeof(*domctl)) != 0 )
-    {
-        PERROR("Could not lock memory for Xen hypercall");
-        goto out1;
-    }
-
     if ( (ret = do_xen_hypercall(xc_handle, &hypercall)) < 0 )
     {
         if ( errno == EACCES )
@@ -152,7 +158,7 @@
                     " rebuild the user-space tool set?\n");
     }
 
-    unlock_pages(domctl, sizeof(*domctl));
+    hcall_buf_release((void **)&domctl, sizeof(*domctl));
 
  out1:
     return ret;
@@ -163,17 +169,17 @@
     int ret = -1;
     DECLARE_HYPERCALL;
 
+    if ( hcall_buf_prep((void **)&sysctl, sizeof(*sysctl)) != 0 )
+    {
+        PERROR("Could not lock memory for Xen hypercall");
+        goto out1;
+    }
+
     sysctl->interface_version = XEN_SYSCTL_INTERFACE_VERSION;
 
     hypercall.op     = __HYPERVISOR_sysctl;
     hypercall.arg[0] = (unsigned long)sysctl;
 
-    if ( lock_pages(sysctl, sizeof(*sysctl)) != 0 )
-    {
-        PERROR("Could not lock memory for Xen hypercall");
-        goto out1;
-    }
-
     if ( (ret = do_xen_hypercall(xc_handle, &hypercall)) < 0 )
     {
         if ( errno == EACCES )
@@ -181,7 +187,7 @@
                     " rebuild the user-space tool set?\n");
     }
 
-    unlock_pages(sysctl, sizeof(*sysctl));
+    hcall_buf_release((void **)&sysctl, sizeof(*sysctl));
 
  out1:
     return ret;
diff -r ea02c95af387 tools/libxc/xg_private.c
--- a/tools/libxc/xg_private.c	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xg_private.c	Thu Jan 21 11:16:13 2010 +0000
@@ -183,22 +183,6 @@
     return -1;
 }
 
-void *xg_memalign(size_t alignment, size_t size)
-{
-#if defined(_POSIX_C_SOURCE) && !defined(__sun__)
-    int ret;
-    void *ptr;
-    ret = posix_memalign(&ptr, alignment, size);
-    if (ret != 0)
-        return NULL;
-    return ptr;
-#elif defined(__NetBSD__) || defined(__OpenBSD__)
-    return valloc(size);
-#else
-    return memalign(alignment, size);
-#endif
-}
-
 /*
  * Local variables:
  * mode: C
diff -r ea02c95af387 tools/libxc/xg_private.h
--- a/tools/libxc/xg_private.h	Thu Jan 21 09:13:46 2010 +0000
+++ b/tools/libxc/xg_private.h	Thu Jan 21 11:16:13 2010 +0000
@@ -177,6 +177,4 @@
 int pin_table(int xc_handle, unsigned int type, unsigned long mfn,
               domid_t dom);
 
-void *xg_memalign(size_t alignment, size_t size);
-
 #endif /* XG_PRIVATE_H */

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Root cause of the issue that HVM guest boots slowly with pvops dom0
  2010-01-21 11:23     ` Keir Fraser
@ 2010-01-22  8:07       ` Yang, Xiaowei
  2010-01-22  8:31         ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Yang, Xiaowei @ 2010-01-22  8:07 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

Keir Fraser wrote:
> On 21/01/2010 09:27, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> 
>>> A pre-mlock()ed memory page for small (sub-page) hypercalls? Protected with
>>> a semaphore: failure to acquire semaphore means take slow path. Have all
>>> hypercallers in libxc launder their data buffers through a new interface
>>> that tries to grab and copy into the pre-allocated buffer.
>> I'll sort out a trial patch for this myself.
> 
> How does the attached patch work for you? It ought to get you the same
> speedup as your hack.

The speed should be almost the same, regardless of twice memcpy.

Some comments to your trial patch:
1.
diff -r 6b61ef936e69 tools/libxc/xc_private.c
--- a/tools/libxc/xc_private.c	Fri Jan 22 14:50:30 2010 +0800
+++ b/tools/libxc/xc_private.c	Fri Jan 22 15:32:48 2010 +0800
@@ -188,7 +188,10 @@
           ((hcall_buf = calloc(1, sizeof(*hcall_buf))) != NULL) )
          pthread_setspecific(hcall_buf_pkey, hcall_buf);
      if ( hcall_buf->buf == NULL )
+    {
          hcall_buf->buf = xc_memalign(PAGE_SIZE, PAGE_SIZE);
+        lock_pages(hcall_buf->buf, PAGE_SIZE);
+    }

      if ( (len < PAGE_SIZE) && hcall_buf && hcall_buf->buf &&
           !hcall_buf->oldbuf )


2. _xc_clean_hcall_buf needs a more careful NULL pointer check.

3. It does modification to 5 out of 73 hypercalls invoking mlock. Other problem 
hypercalls could turn out to be the bottleneck later?:)

Thanks,
xiaowei

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Root cause of the issue that HVM guest boots slowly with pvops dom0
  2010-01-22  8:07       ` Yang, Xiaowei
@ 2010-01-22  8:31         ` Keir Fraser
  2010-01-22  8:48           ` Yang, Xiaowei
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2010-01-22  8:31 UTC (permalink / raw)
  To: Yang, Xiaowei; +Cc: xen-devel@lists.xensource.com

On 22/01/2010 08:07, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:

>> How does the attached patch work for you? It ought to get you the same
>> speedup as your hack.
> 
> The speed should be almost the same, regardless of twice memcpy.

Did you actually try it out and confirm that?

> Some comments to your trial patch:
> 1.
> diff -r 6b61ef936e69 tools/libxc/xc_private.c
> --- a/tools/libxc/xc_private.c Fri Jan 22 14:50:30 2010 +0800
> +++ b/tools/libxc/xc_private.c Fri Jan 22 15:32:48 2010 +0800

Yes, missed that all-important bit!

> 2. _xc_clean_hcall_buf needs a more careful NULL pointer check.

Not really: free() accepts NULL. But I suppose it would be clearer to put
the free(hcall_buf) inside the if(hcall_buf) block.

> 3. It does modification to 5 out of 73 hypercalls invoking mlock. Other
> problem 
> hypercalls could turn out to be the bottleneck later?:)

The point of a new interface was to be able to do the callers incrementally.
A bit of care is needed on each one, and most are not and probably never
will be bottlenecks.

 -- Keir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Root cause of the issue that HVM guest boots slowly with pvops dom0
  2010-01-22  8:31         ` Keir Fraser
@ 2010-01-22  8:48           ` Yang, Xiaowei
  0 siblings, 0 replies; 7+ messages in thread
From: Yang, Xiaowei @ 2010-01-22  8:48 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

Keir Fraser wrote:
> On 22/01/2010 08:07, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:
> 
>>> How does the attached patch work for you? It ought to get you the same
>>> speedup as your hack.
>> The speed should be almost the same, regardless of twice memcpy.
> 
> Did you actually try it out and confirm that?

Yes, I tried it out. And there are no obvious speed difference comparing your 
patch (my comment 1 included) and the hack.

> 
>> Some comments to your trial patch:
>> 1.
>> diff -r 6b61ef936e69 tools/libxc/xc_private.c
>> --- a/tools/libxc/xc_private.c Fri Jan 22 14:50:30 2010 +0800
>> +++ b/tools/libxc/xc_private.c Fri Jan 22 15:32:48 2010 +0800
> 
> Yes, missed that all-important bit!
> 
>> 2. _xc_clean_hcall_buf needs a more careful NULL pointer check.
> 
> Not really: free() accepts NULL. But I suppose it would be clearer to put
> the free(hcall_buf) inside the if(hcall_buf) block.
> 
>> 3. It does modification to 5 out of 73 hypercalls invoking mlock. Other
>> problem 
>> hypercalls could turn out to be the bottleneck later?:)
> 
> The point of a new interface was to be able to do the callers incrementally.
> A bit of care is needed on each one, and most are not and probably never
> will be bottlenecks.

Agree. Anyway when we meet other pvops performance issue later, let's go back 
and have a check at this aspect.

Thanks,
xiaowei

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-01-22  8:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-21  8:16 Root cause of the issue that HVM guest boots slowly with pvops dom0 Yang, Xiaowei
2010-01-21  8:44 ` Keir Fraser
2010-01-21  9:27   ` Keir Fraser
2010-01-21 11:23     ` Keir Fraser
2010-01-22  8:07       ` Yang, Xiaowei
2010-01-22  8:31         ` Keir Fraser
2010-01-22  8:48           ` Yang, Xiaowei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).