[RFC Patch v4 0/9] Some bugfix patches

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* [RFC Patch v4 0/9] Some bugfix patches
@ 2014-09-22  5:59 Wen Congyang
  2014-09-22  5:59 ` [RFC Patch v4 1/9] copy the correct page to memory Wen Congyang
                   ` (9 more replies)
  0 siblings, 10 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Yang Hongyang, Lai Jiangshan

These bugs are found when we implement COLO, or rebase
COLO to upstream xen. They are independent patches, so
post them in separate series.

The codes are also hosted on github:
https://github.com/wencongyang/xen/tree/bugfix-s1

blktap related bugfix patches are waiting for shriram's review...

Hong Tao (1):
  copy the correct page to memory

Wen Congyang (8):
  csum the correct page
  pass correct file to qemu if we use blktap2
  read nictype from xenstore
  check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling
    ioctl()
  don't zero out ioreq page and handle the pended I/O after resuming
  correct xc_domain_save()'s return value
  store correct format into tapdisk-params/params
  update libxl__device_disk_from_xs_be() to support blktap device

 tools/libxc/xc_domain_restore.c | 30 +++++++++++++++-------
 tools/libxc/xc_domain_save.c    | 51 ++++++++++++++++++++++--------------
 tools/libxc/xc_linux_osdep.c    | 15 +++++++++++
 tools/libxl/libxl.c             | 57 ++++++++++++++++++++++++++++++++++++-----
 tools/libxl/libxl_blktap2.c     | 12 ++++++++-
 tools/libxl/libxl_dm.c          | 11 ++++++--
 xen/arch/x86/hvm/hvm.c          | 28 ++++++++++++++++++++
 7 files changed, 167 insertions(+), 37 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [RFC Patch v4 1/9] copy the correct page to memory
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22 14:38   ` Ian Campbell
  2014-09-22  5:59 ` [RFC Patch v4 2/9] csum the correct page Wen Congyang
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Hong Tao, Yang Hongyang, Lai Jiangshan

From: Hong Tao <bobby.hong@huawei.com>

apply_batch() only handles MAX_BATCH_SIZE pages at one time. If
there is some bogus/unmapped/allocate-only/broken page, we will
skip it. So when we call apply_batch() again, the first page's
index is curbatch - invalid_pages. invalid_pages stores the number
of bogus/unmapped/allocate-only/broken pages we have found.

In many cases, invalid_pages is 0, so we don't catch this error.

Signed-off-by: Hong Tao <bobby.hong@huawei.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/xc_domain_restore.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index b9a56d5..bec716c 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -1106,7 +1106,7 @@ static int pagebuf_get(xc_interface *xch, struct restore_ctx *ctx,
 static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
                        xen_pfn_t* region_mfn, unsigned long* pfn_type, int pae_extended_cr3,
                        struct xc_mmu* mmu,
-                       pagebuf_t* pagebuf, int curbatch)
+                       pagebuf_t* pagebuf, int curbatch, int *invalid_pages)
 {
     int i, j, curpage, nr_mfns;
     int k, scount;
@@ -1121,6 +1121,12 @@ static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
     struct domain_info_context *dinfo = &ctx->dinfo;
     int* pfn_err = NULL;
     int rc = -1;
+    int local_invalid_pages = 0;
+    /* We have handled curbatch pages before this batch, and there are
+     * *invalid_pages pages that are not in pagebuf->pages. So the first
+     * page for this page is (curbatch - *invalid_pages) page.
+     */
+    int first_page = curbatch - *invalid_pages;
 
     unsigned long mfn, pfn, pagetype;
 
@@ -1293,10 +1299,13 @@ static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
         pfn      = pagebuf->pfn_types[i + curbatch] & ~XEN_DOMCTL_PFINFO_LTAB_MASK;
         pagetype = pagebuf->pfn_types[i + curbatch] &  XEN_DOMCTL_PFINFO_LTAB_MASK;
 
-        if ( pagetype == XEN_DOMCTL_PFINFO_XTAB 
+        if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
              || pagetype == XEN_DOMCTL_PFINFO_XALLOC)
+        {
+            local_invalid_pages++;
             /* a bogus/unmapped/allocate-only page: skip it */
             continue;
+        }
 
         if ( pagetype == XEN_DOMCTL_PFINFO_BROKEN )
         {
@@ -1306,6 +1315,8 @@ static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
                       "dom=%d, pfn=%lx\n", dom, pfn);
                 goto err_mapped;
             }
+
+            local_invalid_pages++;
             continue;
         }
 
@@ -1344,7 +1355,7 @@ static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
             }
         }
         else
-            memcpy(page, pagebuf->pages + (curpage + curbatch) * PAGE_SIZE,
+            memcpy(page, pagebuf->pages + (first_page + curpage) * PAGE_SIZE,
                    PAGE_SIZE);
 
         pagetype &= XEN_DOMCTL_PFINFO_LTABTYPE_MASK;
@@ -1418,6 +1429,7 @@ static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
     } /* end of 'batch' for loop */
 
     rc = nraces;
+    *invalid_pages += local_invalid_pages;
 
   err_mapped:
     munmap(region_base, j*PAGE_SIZE);
@@ -1621,7 +1633,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
  loadpages:
     for ( ; ; )
     {
-        int j, curbatch;
+        int j, curbatch, invalid_pages;
 
         xc_report_progress_step(xch, n, dinfo->p2m_size);
 
@@ -1665,11 +1677,13 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
 
         /* break pagebuf into batches */
         curbatch = 0;
+        invalid_pages = 0;
         while ( curbatch < j ) {
             int brc;
 
             brc = apply_batch(xch, dom, ctx, region_mfn, pfn_type,
-                              pae_extended_cr3, mmu, &pagebuf, curbatch);
+                              pae_extended_cr3, mmu, &pagebuf, curbatch,
+                              &invalid_pages);
             if ( brc < 0 )
                 goto out;
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 2/9] csum the correct page
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
  2014-09-22  5:59 ` [RFC Patch v4 1/9] copy the correct page to memory Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22  5:59 ` [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2 Wen Congyang
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Yang Hongyang, Lai Jiangshan

In verify mode, we map the guest memory, and the guest page is
region_base + i * PAGE_SIZE. So we should csum page (region_base
+ i * PAGE_SIZE), not (region_base + (i+curbatch) * PAGE_SIZE)

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/xc_domain_restore.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index bec716c..fb4ddfc 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -1405,7 +1405,7 @@ static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
 
                 DPRINTF("************** pfn=%lx type=%lx gotcs=%08lx "
                         "actualcs=%08lx\n", pfn, pagebuf->pfn_types[pfn],
-                        csum_page(region_base + (i + curbatch)*PAGE_SIZE),
+                        csum_page(region_base + i * PAGE_SIZE),
                         csum_page(buf));
 
                 for ( v = 0; v < 4; v++ )
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
  2014-09-22  5:59 ` [RFC Patch v4 1/9] copy the correct page to memory Wen Congyang
  2014-09-22  5:59 ` [RFC Patch v4 2/9] csum the correct page Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-12-11 16:45   ` George Dunlap
  2014-09-22  5:59 ` [RFC Patch v4 4/9] read nictype from xenstore Wen Congyang
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Shriram Rajagopalan, Yang Hongyang, Lai Jiangshan

If we use blktap2, the correct file should be blktap device
not the pdev_path.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_dm.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 103cbca..fbc82fd 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -718,6 +718,7 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
                 libxl__device_disk_dev_number(disks[i].vdev, &disk, &part);
             const char *format = qemu_disk_format_string(disks[i].format);
             char *drive;
+            const char *pdev_path;
 
             if (dev_number == -1) {
                 LIBXL__LOG(ctx, LIBXL__LOG_WARNING, "unable to determine"
@@ -747,6 +748,12 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
                     continue;
                 }
 
+                if (disks[i].backend == LIBXL_DISK_BACKEND_TAP)
+                    pdev_path = libxl__blktap_devpath(gc, disks[i].pdev_path,
+                                                      disks[i].format);
+                else
+                    pdev_path = disks[i].pdev_path;
+
                 /*
                  * Explicit sd disks are passed through as is.
                  *
@@ -756,11 +763,11 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
                 if (strncmp(disks[i].vdev, "sd", 2) == 0)
                     drive = libxl__sprintf
                         (gc, "file=%s,if=scsi,bus=0,unit=%d,format=%s,cache=writeback",
-                         disks[i].pdev_path, disk, format);
+                         pdev_path, disk, format);
                 else if (disk < 4)
                     drive = libxl__sprintf
                         (gc, "file=%s,if=ide,index=%d,media=disk,format=%s,cache=writeback",
-                         disks[i].pdev_path, disk, format);
+                         pdev_path, disk, format);
                 else
                     continue; /* Do not emulate this disk */
             }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 4/9] read nictype from xenstore
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
                   ` (2 preceding siblings ...)
  2014-09-22  5:59 ` [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2 Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22  5:59 ` [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl() Wen Congyang
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Yang Hongyang, Lai Jiangshan

We need to use nictype to get default vifname.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f796da8..77672d0 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -3348,7 +3348,13 @@ static int libxl__device_nic_from_xs_be(libxl__gc *gc,
     nic->script = READ_BACKEND(NOGC, "script");
 
     /* vif_ioemu nics use the same xenstore entries as vif interfaces */
-    nic->nictype = LIBXL_NIC_TYPE_VIF;
+    tmp = READ_BACKEND(gc, "type");
+    if (tmp) {
+        rc = libxl_nic_type_from_string(tmp, &nic->nictype);
+        if (rc) goto out;
+    } else {
+        nic->nictype = LIBXL_NIC_TYPE_VIF;
+    }
     nic->model = NULL; /* XXX Only for TYPE_IOEMU */
     nic->ifname = NULL; /* XXX Only for TYPE_IOEMU */
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl()
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
                   ` (3 preceding siblings ...)
  2014-09-22  5:59 ` [RFC Patch v4 4/9] read nictype from xenstore Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22 14:26   ` Ian Campbell
  2014-09-22  5:59 ` [RFC Patch v4 6/9] don't zero out ioreq page and handle the pended I/O after resuming Wen Congyang
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Yang Hongyang, Lai Jiangshan

If mfn is invalid, ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, ..) also returns 0,
and set error information in error bits(bits28-31). So if the user input
a large valid mfn, we cannot reliably distinguish between a large MFN and
an error. So we should check the input mfn before calling ioctl().
The user can input more than one mfn, and part of them are ~0UL. In this
case, the user expects we can map the memory for all valid mfn. So we
cannot just return NULL if some mfn is not supported.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 tools/libxc/xc_linux_osdep.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
index a19e4b6..d11bcee 100644
--- a/tools/libxc/xc_linux_osdep.c
+++ b/tools/libxc/xc_linux_osdep.c
@@ -321,6 +321,18 @@ static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h
         }
 
         memcpy(pfn, arr, num * sizeof(*arr));
+        for ( i = 0; i < num; i++ )
+        {
+            /*
+             * IOCTL_PRIVCMD_MMAPBATCH doesn't support the mfn which
+             * error bits are set
+             */
+            if ( pfn[i] & PRIVCMD_MMAPBATCH_MFN_ERROR )
+            {
+                pfn[i] = ~0UL;
+                err[i] = -EINVAL;
+            }
+        }
 
         ioctlx.num = num;
         ioctlx.dom = dom;
@@ -333,6 +345,9 @@ static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h
 
         for ( i = 0; i < num; ++i )
         {
+            if ( pfn[i] == ~0UL )
+                continue;
+
             switch ( pfn[i] ^ arr[i] )
             {
             case 0:
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 6/9] don't zero out ioreq page and handle the pended I/O after resuming
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
                   ` (4 preceding siblings ...)
  2014-09-22  5:59 ` [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl() Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22 11:22   ` Jan Beulich
  2014-09-22  5:59 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Wen Congyang
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Paul Durrant, Yang Hongyang, Lai Jiangshan

I check ioreq->state after suspending the guest, and it can be
STATE_IOREQ_NONE, STATE_IOREQ_READY or STATE_IORESP_READY.
1. If it is STATE_IOREQ_NONE, nothing need to be done.
2. If it is STATE_IOREQ_RESP_READY, we will get the response
   in hvm_wait_for_io() before resuming.
3. If it is STATE_IOREQ_READY, we need to kick the event channel,
   and qemu can handle the request.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
---
 tools/libxc/xc_domain_restore.c |  4 +---
 xen/arch/x86/hvm/hvm.c          | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index fb4ddfc..efaf3c2 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -2310,9 +2310,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
     }
 
     /* These comms pages need to be zeroed at the start of day */
-    if ( xc_clear_domain_page(xch, dom, tailbuf.u.hvm.magicpfns[0]) ||
-         xc_clear_domain_page(xch, dom, tailbuf.u.hvm.magicpfns[1]) ||
-         xc_clear_domain_page(xch, dom, tailbuf.u.hvm.magicpfns[2]) )
+    if ( xc_clear_domain_page(xch, dom, tailbuf.u.hvm.magicpfns[2]) )
     {
         PERROR("error zeroing magic pages");
         goto out;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bb45593..c4e8687 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -393,6 +393,33 @@ bool_t hvm_io_pending(struct vcpu *v)
     return 0;
 }
 
+static void hvm_reinject_pended_io(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct hvm_ioreq_server *s;
+
+    list_for_each_entry ( s,
+                          &d->arch.hvm_domain.ioreq_server.list,
+                          list_entry )
+    {
+        struct hvm_ioreq_vcpu *sv;
+
+        list_for_each_entry ( sv,
+                              &s->ioreq_vcpu_list,
+                              list_entry )
+        {
+            if ( sv->vcpu == v )
+            {
+                evtchn_port_t port = sv->ioreq_evtchn;
+                ioreq_t * p = get_ioreq(s, v);
+
+                if ( p->state == STATE_IOREQ_READY )
+                    notify_via_xen_event_channel(d, port);
+            }
+        }
+    }
+}
+
 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
     /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
@@ -1977,6 +2004,7 @@ static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
     /* Auxiliary processors should be woken immediately. */
     v->is_initialised = 1;
     clear_bit(_VPF_down, &v->pause_flags);
+    hvm_reinject_pended_io(v);
     vcpu_wake(v);
 
     return 0;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
                   ` (5 preceding siblings ...)
  2014-09-22  5:59 ` [RFC Patch v4 6/9] don't zero out ioreq page and handle the pended I/O after resuming Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22  7:30   ` Olaf Hering
  2014-09-22  5:59 ` [RFC Patch v4 8/9] store correct format into tapdisk-params/params Wen Congyang
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Yang Hongyang, Lai Jiangshan

If suspend_and_state() fails, the errno may be 0. But
the caller assume that the errno is not 0. So we should
set errno before suspend_and_state() returns.

If the callback checkpoint() fails, it means that remus
failed. But xc_domain_save() returns 0.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 tools/libxc/xc_domain_save.c | 51 +++++++++++++++++++++++++++-----------------
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 254fdb3..6e14112 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -361,16 +361,20 @@ static int suspend_and_state(int (*suspend)(void*), void* data,
                              xc_interface *xch, int io_fd, int dom,
                              xc_dominfo_t *info)
 {
+    errno = 0;
     if ( !(*suspend)(data) )
     {
         ERROR("Suspend request failed");
+        errno = errno ? : -1;
         return -1;
     }
 
+    errno = 0;
     if ( (xc_domain_getinfo(xch, dom, 1, info) != 1) ||
          !info->shutdown || (info->shutdown_reason != SHUTDOWN_suspend) )
     {
         ERROR("Domain not in suspended state");
+        errno = errno ? : -1;
         return -1;
     }
 
@@ -2113,30 +2117,39 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     compressing = (flags & XCFLAGS_CHECKPOINT_COMPRESS);
 
     /* checkpoint_cb can spend arbitrarily long in between rounds */
-    if (!rc && callbacks->checkpoint &&
-        callbacks->checkpoint(callbacks->data) > 0)
+    if ( !rc && callbacks->checkpoint )
     {
-        /* reset stats timer */
-        print_stats(xch, dom, 0, &time_stats, &shadow_stats, 0);
-
-        /* last_iter = 1; */
-        if ( suspend_and_state(callbacks->suspend, callbacks->data, xch,
-                               io_fd, dom, &info) )
+        errno = 0;
+        if ( callbacks->checkpoint(callbacks->data) > 0 )
         {
-            ERROR("Domain appears not to have suspended");
-            goto out;
-        }
-        DPRINTF("SUSPEND shinfo %08lx\n", info.shared_info_frame);
-        print_stats(xch, dom, 0, &time_stats, &shadow_stats, 1);
+            /* reset stats timer */
+            print_stats(xch, dom, 0, &time_stats, &shadow_stats, 0);
 
-        if ( xc_shadow_control(xch, dom,
-                               XEN_DOMCTL_SHADOW_OP_CLEAN, HYPERCALL_BUFFER(to_send),
-                               dinfo->p2m_size, NULL, 0, &shadow_stats) != dinfo->p2m_size )
+            /* last_iter = 1; */
+            if ( suspend_and_state(callbacks->suspend, callbacks->data, xch,
+                                   io_fd, dom, &info) )
+            {
+                ERROR("Domain appears not to have suspended");
+                goto out;
+            }
+            DPRINTF("SUSPEND shinfo %08lx\n", info.shared_info_frame);
+            print_stats(xch, dom, 0, &time_stats, &shadow_stats, 1);
+
+            if ( xc_shadow_control(xch, dom,
+                                   XEN_DOMCTL_SHADOW_OP_CLEAN,
+                                   HYPERCALL_BUFFER(to_send),
+                                   dinfo->p2m_size, NULL, 0,
+                                   &shadow_stats) != dinfo->p2m_size )
+            {
+                PERROR("Error flushing shadow PT");
+            }
+
+            goto copypages;
+        }
+        else
         {
-            PERROR("Error flushing shadow PT");
+            rc = errno ? : -1;
         }
-
-        goto copypages;
     }
 
     if ( tmem_saved != 0 && live )
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 8/9] store correct format into tapdisk-params/params
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
                   ` (6 preceding siblings ...)
  2014-09-22  5:59 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22 14:37   ` Ian Campbell
  2014-09-22  5:59 ` [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device Wen Congyang
  2014-09-22 12:44 ` [RFC Patch v4 0/9] Some bugfix patches Ian Campbell
  9 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Yang Hongyang, Lai Jiangshan

If the format is raw, we store aio into tapdisk-params/params.
And we cannot use the API libxl_disk_format_from_string()
to get the format from the string.

The API libxl__device_disk_string_of_format() is just
for blktap2, which needs to pass aio instead of raw to
tapdisk2. So use libxl_disk_format_to_string() to
instead of libxl__device_disk_string_of_format().

Also update libxl__device_destroy_tapdisk() due to
tapdisk-params changed.

Note: the content of tapdisk-params/params has been changed...

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c         |  7 ++++---
 tools/libxl/libxl_blktap2.c | 12 +++++++++++-
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 77672d0..aa9345b 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2340,7 +2340,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
                 }
                 flexarray_append(back, "tapdisk-params");
                 flexarray_append(back, libxl__sprintf(gc, "%s:%s",
-                    libxl__device_disk_string_of_format(disk->format),
+                    libxl_disk_format_to_string(disk->format),
                     disk->pdev_path));
 
                 /* tap backends with scripts are rejected by
@@ -2352,7 +2352,8 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
             case LIBXL_DISK_BACKEND_QDISK:
                 flexarray_append(back, "params");
                 flexarray_append(back, libxl__sprintf(gc, "%s:%s",
-                              libxl__device_disk_string_of_format(disk->format), disk->pdev_path));
+                              libxl_disk_format_to_string(disk->format),
+                              disk->pdev_path));
                 assert(device->backend_kind == LIBXL__DEVICE_KIND_QDISK);
                 break;
             default:
@@ -2744,7 +2745,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
     if (disk->format != LIBXL_DISK_FORMAT_EMPTY)
         flexarray_append_pair(insert, "params",
                         GCSPRINTF("%s:%s",
-                            libxl__device_disk_string_of_format(disk->format),
+                            libxl_disk_format_to_string(disk->format),
                             disk->pdev_path));
     else
         flexarray_append_pair(insert, "params", "");
diff --git a/tools/libxl/libxl_blktap2.c b/tools/libxl/libxl_blktap2.c
index 2053403..9b7a5fc 100644
--- a/tools/libxl/libxl_blktap2.c
+++ b/tools/libxl/libxl_blktap2.c
@@ -54,8 +54,9 @@ char *libxl__blktap_devpath(libxl__gc *gc,
 int libxl__device_destroy_tapdisk(libxl__gc *gc, const char *params)
 {
     char *type, *disk;
-    int err;
+    int err, rc;
     tap_list_t tap;
+    libxl_disk_format format;
 
     type = libxl__strdup(gc, params);
 
@@ -67,6 +68,15 @@ int libxl__device_destroy_tapdisk(libxl__gc *gc, const char *params)
 
     *disk++ = '\0';
 
+    /* type may be raw */
+    rc = libxl_disk_format_from_string(type, &format);
+    if (rc < 0) {
+        LOG(ERROR, "invalid disk type %s", type);
+        return ERROR_FAIL;
+    }
+
+    type = libxl__device_disk_string_of_format(format);
+
     err = tap_ctl_find(type, disk, &tap);
     if (err < 0) {
         /* returns -errno */
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
                   ` (7 preceding siblings ...)
  2014-09-22  5:59 ` [RFC Patch v4 8/9] store correct format into tapdisk-params/params Wen Congyang
@ 2014-09-22  5:59 ` Wen Congyang
  2014-09-22 15:08   ` Ian Campbell
  2014-09-22 12:44 ` [RFC Patch v4 0/9] Some bugfix patches Ian Campbell
  9 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  5:59 UTC (permalink / raw)
  To: xen devel
  Cc: Ian Campbell, Wen Congyang, Ian Jackson, Jiang Yunhong,
	Dong Eddie, Shriram Rajagopalan, Yang Hongyang, Lai Jiangshan

If the disk backend is blktap device, we store "format:pdev_path"
in tapdisk-params, and store "phy" in type. So use tapdisk-params
to set libxl_device_disk instead of params and type.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
---
 tools/libxl/libxl.c | 42 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index aa9345b..8c241aa 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2468,6 +2468,45 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
         goto cleanup;
     }
 
+    disk->format = LIBXL_DISK_FORMAT_UNKNOWN;
+
+    /* "tapdisk-params" is only for tapdisk */
+    tmp = xs_read(ctx->xsh, XBT_NULL,
+                  libxl__sprintf(gc, "%s/tapdisk-params", be_path), &len);
+    if (tmp) {
+        char *pdev_path;
+        /* tmp is "format:pdev_path" */
+        pdev_path = strchr(tmp, ':');
+        if (!pdev_path) {
+            LOG(ERROR, "corrupted tapdisk-params: \"%s\"\n", tmp);
+            free(tmp);
+            goto cleanup;
+        }
+        disk->pdev_path = strdup(pdev_path + 1);
+        *pdev_path = '\0';
+        rc = libxl_disk_format_from_string(tmp, &disk->format);
+        if (rc) {
+            LOG(ERROR, "unknown disk format: %s\n", tmp);
+            free(tmp);
+            goto cleanup;
+        }
+        if (disk->format != LIBXL_DISK_FORMAT_VHD &&
+            disk->format != LIBXL_DISK_FORMAT_RAW) {
+            LOG(ERROR, "unsupported tapdisk format: %s\n", tmp);
+            free(tmp);
+            goto cleanup;
+        }
+        free(tmp);
+
+        /*
+         * The backend is tapdisk, so we store tapdev in params, and
+         * phy in type(see device_disk_add())
+         */
+        disk->backend = LIBXL_DISK_BACKEND_TAP;
+
+        goto skip_type;
+    }
+
     /* "params" may not be present; but everything else must be. */
     tmp = xs_read(ctx->xsh, XBT_NULL,
                   libxl__sprintf(gc, "%s/params", be_path), &len);
@@ -2487,6 +2526,7 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
     }
     libxl_string_to_backend(ctx, tmp, &(disk->backend));
 
+skip_type:
     disk->vdev = xs_read(ctx->xsh, XBT_NULL,
                          libxl__sprintf(gc, "%s/dev", be_path), &len);
     if (!disk->vdev) {
@@ -2520,8 +2560,6 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
     }
     disk->is_cdrom = !strcmp(tmp, "cdrom");
 
-    disk->format = LIBXL_DISK_FORMAT_UNKNOWN;
-
     return 0;
 cleanup:
     libxl_device_disk_dispose(disk);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  5:59 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Wen Congyang
@ 2014-09-22  7:30   ` Olaf Hering
  2014-09-22  7:34     ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Olaf Hering @ 2014-09-22  7:30 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang, Lai Jiangshan

On Mon, Sep 22, Wen Congyang wrote:

>      if ( !(*suspend)(data) )
>      {
>          ERROR("Suspend request failed");
> +        errno = errno ? : -1;

-1 is not a valid value for errno, it should be ESOMETHING.

Olaf

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  7:30   ` Olaf Hering
@ 2014-09-22  7:34     ` Wen Congyang
  2014-09-22  7:46       ` Olaf Hering
  0 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  7:34 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang, Lai Jiangshan

On 09/22/2014 03:30 PM, Olaf Hering wrote:
> On Mon, Sep 22, Wen Congyang wrote:
> 
>>      if ( !(*suspend)(data) )
>>      {
>>          ERROR("Suspend request failed");
>> +        errno = errno ? : -1;
> 
> -1 is not a valid value for errno, it should be ESOMETHING.

But, we don't know what's wrong, so I don't know which ESOMETHING is
OK here...

Thanks
Wen Congyang

> 
> Olaf
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  7:34     ` Wen Congyang
@ 2014-09-22  7:46       ` Olaf Hering
  2014-09-22  8:06         ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Olaf Hering @ 2014-09-22  7:46 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang, Lai Jiangshan

On Mon, Sep 22, Wen Congyang wrote:

> On 09/22/2014 03:30 PM, Olaf Hering wrote:
> > On Mon, Sep 22, Wen Congyang wrote:
> > 
> >>      if ( !(*suspend)(data) )
> >>      {
> >>          ERROR("Suspend request failed");
> >> +        errno = errno ? : -1;
> > 
> > -1 is not a valid value for errno, it should be ESOMETHING.
> 
> But, we don't know what's wrong, so I don't know which ESOMETHING is
> OK here...

The called function has to set errno. If it doesnt do that today, fix
the callbacks.

Olaf

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  7:46       ` Olaf Hering
@ 2014-09-22  8:06         ` Wen Congyang
  2014-09-22  8:13           ` Olaf Hering
  2014-09-22 12:42           ` Ian Campbell
  0 siblings, 2 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  8:06 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang, Lai Jiangshan

On 09/22/2014 03:46 PM, Olaf Hering wrote:
> On Mon, Sep 22, Wen Congyang wrote:
> 
>> On 09/22/2014 03:30 PM, Olaf Hering wrote:
>>> On Mon, Sep 22, Wen Congyang wrote:
>>>
>>>>      if ( !(*suspend)(data) )
>>>>      {
>>>>          ERROR("Suspend request failed");
>>>> +        errno = errno ? : -1;
>>>
>>> -1 is not a valid value for errno, it should be ESOMETHING.
>>
>> But, we don't know what's wrong, so I don't know which ESOMETHING is
>> OK here...
> 
> The called function has to set errno. If it doesnt do that today, fix
> the callbacks.

The callbacks may be implemented in libxl, which is another process.
xc_domain_save() is a public API, and the caller may forget to set
the errno...

So we cannot assume that the callbacks has set the errno

Thanks
Wen Congyang

> 
> Olaf
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  8:06         ` Wen Congyang
@ 2014-09-22  8:13           ` Olaf Hering
  2014-09-22  8:24             ` Wen Congyang
  2014-09-22  8:41             ` Wen Congyang
  2014-09-22 12:42           ` Ian Campbell
  1 sibling, 2 replies; 55+ messages in thread
From: Olaf Hering @ 2014-09-22  8:13 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang, Lai Jiangshan

On Mon, Sep 22, Wen Congyang wrote:

> The callbacks may be implemented in libxl, which is another process.
> xc_domain_save() is a public API, and the caller may forget to set
> the errno...
> 
> So we cannot assume that the callbacks has set the errno

Then the caller is using xc_domain_save incorrectly.
Looks like there is no API documentation, other than the source code.

Olaf

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  8:13           ` Olaf Hering
@ 2014-09-22  8:24             ` Wen Congyang
  2014-09-22  8:41             ` Wen Congyang
  1 sibling, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  8:24 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang, Lai Jiangshan

On 09/22/2014 04:13 PM, Olaf Hering wrote:
> On Mon, Sep 22, Wen Congyang wrote:
> 
>> The callbacks may be implemented in libxl, which is another process.
>> xc_domain_save() is a public API, and the caller may forget to set
>> the errno...
>>
>> So we cannot assume that the callbacks has set the errno
> 
> Then the caller is using xc_domain_save incorrectly.
> Looks like there is no API documentation, other than the source code.

Yes, we should allow that the caller uses xc_domain_save() incorrectly, and
xc_domain_save() should return 1 not 0 in this case.

Thanks
Wen Congyang

> 
> Olaf
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  8:13           ` Olaf Hering
  2014-09-22  8:24             ` Wen Congyang
@ 2014-09-22  8:41             ` Wen Congyang
  1 sibling, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-22  8:41 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang, Lai Jiangshan

On 09/22/2014 04:13 PM, Olaf Hering wrote:
> On Mon, Sep 22, Wen Congyang wrote:
> 
>> The callbacks may be implemented in libxl, which is another process.
>> xc_domain_save() is a public API, and the caller may forget to set
>> the errno...
>>
>> So we cannot assume that the callbacks has set the errno
> 
> Then the caller is using xc_domain_save incorrectly.
> Looks like there is no API documentation, other than the source code.

What about correcting the errno before xc_domain_save() returns:
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 6e14112..de8974a 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -2188,10 +2188,14 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     outbuf_free(&ob_pagebuf);
 
     errno = rc;
+    rc = !!errno;
+    if (errno == -1)
+        /* We cannot get the correct errno */
+        errno = 0;
 exit:
     DPRINTF("Save exit of domid %u with errno=%d\n", dom, errno);
 
-    return !!errno;
+    return rc;
 }
 
 /*


> 
> Olaf
> .
> 

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 6/9] don't zero out ioreq page and handle the pended I/O after resuming
  2014-09-22  5:59 ` [RFC Patch v4 6/9] don't zero out ioreq page and handle the pended I/O after resuming Wen Congyang
@ 2014-09-22 11:22   ` Jan Beulich
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Beulich @ 2014-09-22 11:22 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Ian Campbell, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Paul Durrant, Yang Hongyang, Lai Jiangshan

>>> On 22.09.14 at 07:59, <wency@cn.fujitsu.com> wrote:
> I check ioreq->state after suspending the guest, and it can be
> STATE_IOREQ_NONE, STATE_IOREQ_READY or STATE_IORESP_READY.
> 1. If it is STATE_IOREQ_NONE, nothing need to be done.
> 2. If it is STATE_IOREQ_RESP_READY, we will get the response
>    in hvm_wait_for_io() before resuming.
> 3. If it is STATE_IOREQ_READY, we need to kick the event channel,
>    and qemu can handle the request.

This is contrary to what you wrote earlier in a reply to David Vrabel
asking "Under what conditions can there be pending I/O?  A domain
suspend should not complete until any I/O accesses are complete":

"IIRC, I print ioreq.state after suspending vm, and ioreq.state can be
STATE_IOREQ_NONE, STATE_IOREQ_READY or STATE_IORESP_READY. If the
state is STATE_IOREQ_READY, we should kick vcpu event channel on resume
after migration.

But I test it again today, the state is always STATE_IOREQ_NONE...

If the state is always STATE_IOREQ_NONE, no need to touch hypervisor/Qemu
to handle pending I/O req(state is STATE_IOREQ_READY)."

and will hence at least need clarification.

Jan

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22  8:06         ` Wen Congyang
  2014-09-22  8:13           ` Olaf Hering
@ 2014-09-22 12:42           ` Ian Campbell
  2014-09-22 13:03             ` Ian Jackson
  1 sibling, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 12:42 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Olaf Hering, Lai Jiangshan, Ian Jackson, Jiang Yunhong,
	Dong Eddie, xen devel, Yang Hongyang

On Mon, 2014-09-22 at 16:06 +0800, Wen Congyang wrote:
> On 09/22/2014 03:46 PM, Olaf Hering wrote:
> > On Mon, Sep 22, Wen Congyang wrote:
> > 
> >> On 09/22/2014 03:30 PM, Olaf Hering wrote:
> >>> On Mon, Sep 22, Wen Congyang wrote:
> >>>
> >>>>      if ( !(*suspend)(data) )
> >>>>      {
> >>>>          ERROR("Suspend request failed");
> >>>> +        errno = errno ? : -1;
> >>>
> >>> -1 is not a valid value for errno, it should be ESOMETHING.
> >>
> >> But, we don't know what's wrong, so I don't know which ESOMETHING is
> >> OK here...
> > 
> > The called function has to set errno. If it doesnt do that today, fix
> > the callbacks.
> 
> The callbacks may be implemented in libxl, which is another process.

libxc doesn't know that, if it is important then it seems like the
failure + errno ought to be marshalled across the IPC link.

It may be that this can be easily handled in
libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
think?

> xc_domain_save() is a public API, and the caller may forget to set
> the errno...
> 
> So we cannot assume that the callbacks has set the errno

I think we are entitled to assume that they must (iff returning -1)

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 0/9] Some bugfix patches
  2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
                   ` (8 preceding siblings ...)
  2014-09-22  5:59 ` [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device Wen Congyang
@ 2014-09-22 12:44 ` Ian Campbell
  2014-09-22 12:56   ` Wen Congyang
  9 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 12:44 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
> These bugs are found when we implement COLO, or rebase
> COLO to upstream xen. They are independent patches, so
> post them in separate series.
> 
> The codes are also hosted on github:
> https://github.com/wencongyang/xen/tree/bugfix-s1
> 
> blktap related bugfix patches are waiting for shriram's review...

Are you targeting any of these at the 4.5 release? Since they are marked
RFC I am assuming not.

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 0/9] Some bugfix patches
  2014-09-22 12:44 ` [RFC Patch v4 0/9] Some bugfix patches Ian Campbell
@ 2014-09-22 12:56   ` Wen Congyang
  0 siblings, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-22 12:56 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 09/22/2014 08:44 PM, Ian Campbell wrote:
> On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
>> These bugs are found when we implement COLO, or rebase
>> COLO to upstream xen. They are independent patches, so
>> post them in separate series.
>>
>> The codes are also hosted on github:
>> https://github.com/wencongyang/xen/tree/bugfix-s1
>>
>> blktap related bugfix patches are waiting for shriram's review...
> 
> Are you targeting any of these at the 4.5 release? Since they are marked
> RFC I am assuming not.

Sorry, I forgot to remove RFC...
Most of these patches are migration related bugfix, so it's better to pushed
them to 4.5

Thanks
Wen Congyang

> 
> Ian.
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22 12:42           ` Ian Campbell
@ 2014-09-22 13:03             ` Ian Jackson
  2014-09-22 13:13               ` Ian Campbell
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Jackson @ 2014-09-22 13:03 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Olaf Hering, Lai Jiangshan, Wen Congyang, Jiang Yunhong,
	Dong Eddie, xen devel, Yang Hongyang

Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
> libxc doesn't know that, if it is important then it seems like the
> failure + errno ought to be marshalled across the IPC link.

Yes, but ...

> It may be that this can be easily handled in
> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
> think?

... while that would be possbile, we have another option.

We could say that the callbacks return errno values.  That would
simplify the API and avoid having the IPC involve accesses to global
variables (ie, things not in the functions' parameter lists).

If we do that then it becomes the responsibility of xc_domain_save to
either change its own API to return errno, or to save the callback's
return value in errno.

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22 13:03             ` Ian Jackson
@ 2014-09-22 13:13               ` Ian Campbell
  2014-09-22 13:57                 ` [PATCH RFC 0/3] " Ian Jackson
  2014-09-22 14:06                 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Andrew Cooper
  0 siblings, 2 replies; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 13:13 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Wen Congyang, Jiang Yunhong,
	Dong Eddie, xen devel, Yang Hongyang

On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
> > libxc doesn't know that, if it is important then it seems like the
> > failure + errno ought to be marshalled across the IPC link.
> 
> Yes, but ...
> 
> > It may be that this can be easily handled in
> > libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
> > think?
> 
> ... while that would be possbile, we have another option.
> 
> We could say that the callbacks return errno values.  That would
> simplify the API and avoid having the IPC involve accesses to global
> variables (ie, things not in the functions' parameter lists).
> 
> If we do that then it becomes the responsibility of xc_domain_save to
> either change its own API to return errno, or to save the callback's
> return value in errno.

Hrm. libxc is already a complete mess wrt error returning/handling
because some proportion of the code incorrectly does/assumes this sort
of thing is happening (because people were confused about the syscall
returns from the kernel vs. process context). Having a place in libxc
where this is now done on purpose seems a bit like setting the rope on
fire to me...

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH RFC 0/3] Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22 13:13               ` Ian Campbell
@ 2014-09-22 13:57                 ` Ian Jackson
  2014-09-22 13:58                   ` Ian Campbell
                                     ` (3 more replies)
  2014-09-22 14:06                 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Andrew Cooper
  1 sibling, 4 replies; 55+ messages in thread
From: Ian Jackson @ 2014-09-22 13:57 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Olaf Hering, Lai Jiangshan, Wen Congyang, Jiang Yunhong,
	Dong Eddie, xen devel, Yang Hongyang

Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
> Hrm. libxc is already a complete mess wrt error returning/handling
> because some proportion of the code incorrectly does/assumes this sort
> of thing is happening (because people were confused about the syscall
> returns from the kernel vs. process context). Having a place in libxc
> where this is now done on purpose seems a bit like setting the rope on
> fire to me...

Well, here is an alternative.  You're probably going to hate it.
It DOES NOT COMPILE so (obviously) I have not tested it.

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH RFC 0/3] Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22 13:57                 ` [PATCH RFC 0/3] " Ian Jackson
@ 2014-09-22 13:58                   ` Ian Campbell
  2014-09-22 14:00                     ` Ian Campbell
  2014-09-22 13:58                   ` [PATCH 1/3] libxl: save helper: remove redundant declaration Ian Jackson
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 13:58 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Wen Congyang, Jiang Yunhong,
	Dong Eddie, xen devel, Yang Hongyang

On Mon, 2014-09-22 at 14:57 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
> > Hrm. libxc is already a complete mess wrt error returning/handling
> > because some proportion of the code incorrectly does/assumes this sort
> > of thing is happening (because people were confused about the syscall
> > returns from the kernel vs. process context). Having a place in libxc
> > where this is now done on purpose seems a bit like setting the rope on
> > fire to me...
> 
> Well, here is an alternative.  You're probably going to hate it.
> It DOES NOT COMPILE so (obviously) I have not tested it.

ENOATTACHMENT...


> 
> Ian.
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 1/3] libxl: save helper: remove redundant declaration
  2014-09-22 13:57                 ` [PATCH RFC 0/3] " Ian Jackson
  2014-09-22 13:58                   ` Ian Campbell
@ 2014-09-22 13:58                   ` Ian Jackson
  2014-09-22 13:58                   ` [PATCH 2/3] libxl: save helper: transport errno with all callback return values Ian Jackson
  2014-09-22 13:58                   ` [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value Ian Jackson
  3 siblings, 0 replies; 55+ messages in thread
From: Ian Jackson @ 2014-09-22 13:58 UTC (permalink / raw)
  To: xen-devel
  Cc: olaf, Ian.Campbell, wency, Ian Jackson, yunhong.jiang, eddie.dong,
	yanghy, laijs

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_save_msgs_gen.pl |    1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index 6b4b65e..ddd13e5 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -329,7 +329,6 @@ END_ALWAYS
     } else {
         $f_more_sr->("        r = $c_make_callback;\n".
                      "        $sendreply(r, user);\n");
-	f_decl($sendreply, 'callout', 'void', '(int r, void *user)');
     }
     if ($flags =~ m/x/) {
         my $c_v = "(1u<<$msgnum)";
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 2/3] libxl: save helper: transport errno with all callback return values
  2014-09-22 13:57                 ` [PATCH RFC 0/3] " Ian Jackson
  2014-09-22 13:58                   ` Ian Campbell
  2014-09-22 13:58                   ` [PATCH 1/3] libxl: save helper: remove redundant declaration Ian Jackson
@ 2014-09-22 13:58                   ` Ian Jackson
  2014-09-22 13:58                   ` [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value Ian Jackson
  3 siblings, 0 replies; 55+ messages in thread
From: Ian Jackson @ 2014-09-22 13:58 UTC (permalink / raw)
  To: xen-devel
  Cc: olaf, Ian.Campbell, wency, Ian Jackson, yunhong.jiang, eddie.dong,
	yanghy, laijs

Whenever we plumb the return value from a callback, plumb through
errno as well.

libxl__xc_domain_saverestore_async_callback_done now assumes that its
caller has set errno sensibly, although this is probably not true :-/.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_save_callout.c   |   11 +++++++----
 tools/libxl/libxl_save_msgs_gen.pl |   10 +++++++---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 1c9f806..88b406b 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -138,10 +138,11 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
 
 
 void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
-                           libxl__save_helper_state *shs, int return_value)
+                           libxl__save_helper_state *shs,
+                           int return_value)
 {
     shs->egc = egc;
-    libxl__srm_callout_sendreply(return_value, shs);
+    libxl__srm_callout_sendreply(return_value, errno, shs);
     shs->egc = 0;
 }
 
@@ -356,15 +357,16 @@ libxl__srm_callout_get_callbacks_restore(void *user)
     return &shs->callbacks.restore.a;
 }
 
-void libxl__srm_callout_sendreply(int r, void *user)
+void libxl__srm_callout_sendreply(int r, int savederrno, void *user)
 {
     libxl__save_helper_state *shs = user;
     libxl__egc *egc = shs->egc;
     STATE_AO_GC(shs->ao);
     int errnoval;
+    int data[2] = { r, savederrno };
 
     errnoval = libxl_write_exactly(CTX, libxl__carefd_fd(shs->pipes[0]),
-                                   &r, sizeof(r), shs->stdin_what,
+                                   data, sizeof(data), shs->stdin_what,
                                    "callback return value");
     if (errnoval)
         helper_failed(egc, shs, ERROR_FAIL);
@@ -397,5 +399,6 @@ int libxl__srm_callout_callback_complete(int retval, int errnoval,
     shs->retval = retval;
     shs->errnoval = errnoval;
     libxl__ev_fd_deregister(gc, &shs->readable);
+    errno = 0;
     return 0;
 }
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index ddd13e5..3d89de9 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -66,6 +66,7 @@ foreach my $ah (qw(callout helper)) {
 #include <string.h>
 #include <stdint.h>
 #include <limits.h>
+#include <errno.h>
 END_BOTH
 
 #include "libxl_internal.h"
@@ -103,7 +104,7 @@ our $getcallbacks = "${libxl}_callout_get_callbacks";
 our $enumcallbacks = "${libxl}_callout_enumcallbacks";
 sub cbtype ($) { "${libxl}_".$_[0]."_autogen_callbacks"; };
 
-f_decl($sendreply, 'callout', 'void', "(int r, void *user)");
+f_decl($sendreply, 'callout', 'void', "(int r, int savederrno, void *user)");
 
 our $helper = "helper";
 our $encode = "${helper}_stub";
@@ -328,7 +329,7 @@ END_ALWAYS
 	$f_more_sr->("        $c_make_callback;\n");
     } else {
         $f_more_sr->("        r = $c_make_callback;\n".
-                     "        $sendreply(r, user);\n");
+                     "        $sendreply(r, errno, user);\n");
     }
     if ($flags =~ m/x/) {
         my $c_v = "(1u<<$msgnum)";
@@ -354,9 +355,12 @@ END_ALWAYS
 	f_more("${encode}_$name",
                (<<END_ALWAYS.($debug ? <<END_DEBUG : '').<<END_ALWAYS));
     int r = ${helper}_getreply(user);
+    int savederrno = ${helper}_getreply(user);
 END_ALWAYS
-    fprintf(stderr,"libxl-save-helper: $name got reply %d\\n",r);
+    fprintf(stderr,"libxl-save-helper: $name got reply %d (errno: %s)\\n",
+            r,strerror(savederrno));
 END_DEBUG
+    errno = savederrno;
     return r;
 END_ALWAYS
     }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value
  2014-09-22 13:57                 ` [PATCH RFC 0/3] " Ian Jackson
                                     ` (2 preceding siblings ...)
  2014-09-22 13:58                   ` [PATCH 2/3] libxl: save helper: transport errno with all callback return values Ian Jackson
@ 2014-09-22 13:58                   ` Ian Jackson
  2014-09-22 14:07                     ` Ian Campbell
  3 siblings, 1 reply; 55+ messages in thread
From: Ian Jackson @ 2014-09-22 13:58 UTC (permalink / raw)
  To: xen-devel
  Cc: olaf, Ian.Campbell, wency, Ian Jackson, yunhong.jiang, eddie.dong,
	yanghy, laijs

libxl__xc_domain_saverestore_async_callback_done now no longer assumes
that the caller has set errno.  Instead, it takes a new parameter
containing the errno value.

For one of the call sites it is obvious what value should be passed.

All the others have been left untouched.  As a result the code now
DOES NOT COMPILE.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_dom.c          |    2 +-
 tools/libxl/libxl_internal.h     |    3 ++-
 tools/libxl/libxl_save_callout.c |    4 ++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index c944804..f418566 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -952,7 +952,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
 
     rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
     if (!rc) {
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0,0);
     } else {
         LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 04c9378..891ea7a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2804,7 +2804,8 @@ _hidden void libxl__xc_domain_save_done(libxl__egc*, void *dss_void,
  * Such functions' actual callback function return void in libxl
  * When they are ready to indicate completion, they call this. */
 void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
-                           libxl__save_helper_state *shs, int return_value);
+                           libxl__save_helper_state *shs,
+                           int return_value, int errnoval);
 
 
 _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 88b406b..b7cbfe8 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -139,10 +139,10 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
 
 void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
                            libxl__save_helper_state *shs,
-                           int return_value)
+                           int return_value, int errnoval)
 {
     shs->egc = egc;
-    libxl__srm_callout_sendreply(return_value, errno, shs);
+    libxl__srm_callout_sendreply(return_value, errnoval, shs);
     shs->egc = 0;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH RFC 0/3] Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22 13:58                   ` Ian Campbell
@ 2014-09-22 14:00                     ` Ian Campbell
  0 siblings, 0 replies; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 14:00 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Wen Congyang, Jiang Yunhong,
	Dong Eddie, xen devel, Yang Hongyang

On Mon, 2014-09-22 at 14:58 +0100, Ian Campbell wrote:
> On Mon, 2014-09-22 at 14:57 +0100, Ian Jackson wrote:
> > Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
> > > Hrm. libxc is already a complete mess wrt error returning/handling
> > > because some proportion of the code incorrectly does/assumes this sort
> > > of thing is happening (because people were confused about the syscall
> > > returns from the kernel vs. process context). Having a place in libxc
> > > where this is now done on purpose seems a bit like setting the rope on
> > > fire to me...
> > 
> > Well, here is an alternative.  You're probably going to hate it.
> > It DOES NOT COMPILE so (obviously) I have not tested it.
> 
> ENOATTACHMENT...

Nevermiond, there was a series inbound...
> 
> 
> > 
> > Ian.
> > 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22 13:13               ` Ian Campbell
  2014-09-22 13:57                 ` [PATCH RFC 0/3] " Ian Jackson
@ 2014-09-22 14:06                 ` Andrew Cooper
  2014-09-23  2:14                   ` Wen Congyang
  1 sibling, 1 reply; 55+ messages in thread
From: Andrew Cooper @ 2014-09-22 14:06 UTC (permalink / raw)
  To: Ian Campbell, Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Wen Congyang, Jiang Yunhong,
	Dong Eddie, xen devel, Yang Hongyang

On 22/09/14 14:13, Ian Campbell wrote:
> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>> libxc doesn't know that, if it is important then it seems like the
>>> failure + errno ought to be marshalled across the IPC link.
>> Yes, but ...
>>
>>> It may be that this can be easily handled in
>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>> think?
>> ... while that would be possbile, we have another option.
>>
>> We could say that the callbacks return errno values.  That would
>> simplify the API and avoid having the IPC involve accesses to global
>> variables (ie, things not in the functions' parameter lists).
>>
>> If we do that then it becomes the responsibility of xc_domain_save to
>> either change its own API to return errno, or to save the callback's
>> return value in errno.
> Hrm. libxc is already a complete mess wrt error returning/handling
> because some proportion of the code incorrectly does/assumes this sort
> of thing is happening (because people were confused about the syscall
> returns from the kernel vs. process context). Having a place in libxc
> where this is now done on purpose seems a bit like setting the rope on
> fire to me...
>
> Ian.
>

libxc is an absolute mess, but this is far from the only codepath (even
in xc_domain_save()) which ends up like this.

The *only* safe assumption is that ==0 is success and !=0 is failure for
xc_domain_save(), and errno may or may not be relevant, whether rc is -1
or not.

For the suspend callback itself, I have encountered 3 different error
schemes which stem from a complete lack of expectations set out beside
the function pointer definition in xenguest.h, which is why I had to
copy the old "!=0" check in migration v2.  Even intree in the past, -1
and 1 have been used for errors from this function pointer.

It is my opinion that it is not worth changing any of the error handling
until someone does all of libxc and makes it all consistent.  The risk
for introducing subtle bugs into in (and out-of-) tree callers is just
too high, and this change alone does not fix xc_domain_save() to be
consistent.

~Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value
  2014-09-22 13:58                   ` [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value Ian Jackson
@ 2014-09-22 14:07                     ` Ian Campbell
  2014-09-22 14:08                       ` Ian Jackson
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 14:07 UTC (permalink / raw)
  To: Ian Jackson
  Cc: olaf, xen-devel, laijs, wency, yunhong.jiang, eddie.dong, yanghy

On Mon, 2014-09-22 at 14:58 +0100, Ian Jackson wrote:
> libxl__xc_domain_saverestore_async_callback_done now no longer assumes
> that the caller has set errno.  Instead, it takes a new parameter
> containing the errno value.
> 
> For one of the call sites it is obvious what value should be passed.
> 
> All the others have been left untouched.  As a result the code now
> DOES NOT COMPILE.

I suppose the intention here is that Wen takes this as a basis and makes
it return the appropriate error codes to suit remus's usage?

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value
  2014-09-22 14:07                     ` Ian Campbell
@ 2014-09-22 14:08                       ` Ian Jackson
  0 siblings, 0 replies; 55+ messages in thread
From: Ian Jackson @ 2014-09-22 14:08 UTC (permalink / raw)
  To: Ian Campbell
  Cc: olaf, xen-devel, laijs, wency, yunhong.jiang, eddie.dong, yanghy

Ian Campbell writes ("Re: [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value"):
> On Mon, 2014-09-22 at 14:58 +0100, Ian Jackson wrote:
> > For one of the call sites it is obvious what value should be passed.
> > 
> > All the others have been left untouched.  As a result the code now
> > DOES NOT COMPILE.
> 
> I suppose the intention here is that Wen takes this as a basis and makes
> it return the appropriate error codes to suit remus's usage?

Yes.  I'm afraid it's far from clear to me what is correct and there
are about a dozen call sites (many of which are trivial and are
themselves called from somewhere else...)

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl()
  2014-09-22  5:59 ` [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl() Wen Congyang
@ 2014-09-22 14:26   ` Ian Campbell
  2014-09-23  1:40     ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 14:26 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
> If mfn is invalid, ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, ..) also returns 0,
> and set error information in error bits(bits28-31). So if the user input
> a large valid mfn, we cannot reliably distinguish between a large MFN and
> an error. So we should check the input mfn before calling ioctl().
> The user can input more than one mfn, and part of them are ~0UL. In this
> case, the user expects we can map the memory for all valid mfn. So we
> cannot just return NULL if some mfn is not supported.

I don't follow this last bit. linux_privcmd_map_foreign_bulk already
returns NULL and maps nothing in some error cases (e.g. mmap failure),
what is the problem also doing that here?

The way you have it here we need to worry about what the behaviour of
Xen/privcmd is on pfn==~0UL, and whether we can rely on it. Far better
to just declare such attempts to be fundamentally broken on the part of
the caller and return.

Ian.

> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>  tools/libxc/xc_linux_osdep.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
> index a19e4b6..d11bcee 100644
> --- a/tools/libxc/xc_linux_osdep.c
> +++ b/tools/libxc/xc_linux_osdep.c
> @@ -321,6 +321,18 @@ static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h
>          }
>  
>          memcpy(pfn, arr, num * sizeof(*arr));
> +        for ( i = 0; i < num; i++ )
> +        {
> +            /*
> +             * IOCTL_PRIVCMD_MMAPBATCH doesn't support the mfn which
> +             * error bits are set
> +             */
> +            if ( pfn[i] & PRIVCMD_MMAPBATCH_MFN_ERROR )
> +            {
> +                pfn[i] = ~0UL;
> +                err[i] = -EINVAL;
> +            }
> +        }
>  
>          ioctlx.num = num;
>          ioctlx.dom = dom;
> @@ -333,6 +345,9 @@ static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h
>  
>          for ( i = 0; i < num; ++i )
>          {
> +            if ( pfn[i] == ~0UL )
> +                continue;
> +
>              switch ( pfn[i] ^ arr[i] )
>              {
>              case 0:

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 8/9] store correct format into tapdisk-params/params
  2014-09-22  5:59 ` [RFC Patch v4 8/9] store correct format into tapdisk-params/params Wen Congyang
@ 2014-09-22 14:37   ` Ian Campbell
  2014-09-23  2:07     ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 14:37 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Wei Liu, Lai Jiangshan, Dong Eddie, Jiang Yunhong, Ian Jackson,
	xen devel, Yang Hongyang

On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:

Please prefix the subject "tools: libxl: "

> If the format is raw, we store aio into tapdisk-params/params.

I got confused and thought "tapdisl-params/params" was a path. Could you
spell it out the first time as "into either the tapdisk-params or params
xenstore node" please.

> And we cannot use the API libxl_disk_format_from_string()
> to get the format from the string.

... can you spell out why/where this is important please.

> The API libxl__device_disk_string_of_format() is just
> for blktap2, which needs to pass aio instead of raw to
> tapdisk2.

Should we move it to libxl_blktap2.c then?

>  So use libxl_disk_format_to_string() to
> instead of libxl__device_disk_string_of_format().
> 
> Also update libxl__device_destroy_tapdisk() due to
> tapdisk-params changed.

s/changed/change/.

> Note: the content of tapdisk-params/params has been changed...

AFAICT the actual change is that "aio:/path/to/a/thing" might become
either "raw:/path/to/a/thing" or "empty:/path/to/a/thing", or perhaps
even "unknown:/path/to/a/thing" (or more likely "aio:"->"empty:").

Did you test blktap2 and qemu to be sure that they handle "raw" in
particular correctly?

> @@ -67,6 +68,15 @@ int libxl__device_destroy_tapdisk(libxl__gc *gc, const char *params)
>  
>      *disk++ = '\0';
>  
> +    /* type may be raw */
> +    rc = libxl_disk_format_from_string(type, &format);
> +    if (rc < 0) {
> +        LOG(ERROR, "invalid disk type %s", type);
> +        return ERROR_FAIL;

Please propagate rc.

> +    }
> +
> +    type = libxl__device_disk_string_of_format(format);
> +
>      err = tap_ctl_find(type, disk, &tap);
>      if (err < 0) {
>          /* returns -errno */

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 1/9] copy the correct page to memory
  2014-09-22  5:59 ` [RFC Patch v4 1/9] copy the correct page to memory Wen Congyang
@ 2014-09-22 14:38   ` Ian Campbell
  0 siblings, 0 replies; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 14:38 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Hong Tao, Yang Hongyang

On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
> From: Hong Tao <bobby.hong@huawei.com>
> 
> apply_batch() only handles MAX_BATCH_SIZE pages at one time. If
> there is some bogus/unmapped/allocate-only/broken page, we will
> skip it. So when we call apply_batch() again, the first page's
> index is curbatch - invalid_pages. invalid_pages stores the number
> of bogus/unmapped/allocate-only/broken pages we have found.
> 
> In many cases, invalid_pages is 0, so we don't catch this error.
> 
> Signed-off-by: Hong Tao <bobby.hong@huawei.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Applied this and "csum the correct page". In both cases I prefixed the
subject "tools: libxc: restore:".

I also applied "pass correct file to qemu if we use blktap2" and "read
nictype from xenstore" prefixing both "tools: libxl: ".

The rest had some comment from me and others, apart from #9 which I'm
about to go and look at now.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device
  2014-09-22  5:59 ` [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device Wen Congyang
@ 2014-09-22 15:08   ` Ian Campbell
  2014-09-23  3:07     ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-22 15:08 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Shriram Rajagopalan, Yang Hongyang

On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
> If the disk backend is blktap device, we store "format:pdev_path"
> in tapdisk-params, and store "phy" in type. So use tapdisk-params
> to set libxl_device_disk instead of params and type.

This seems to be the same format as params -- can you not just
dynamically select the key name and share the parsing code?

> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> ---
>  tools/libxl/libxl.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index aa9345b..8c241aa 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -2468,6 +2468,45 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
>          goto cleanup;
>      }
>  
> +    disk->format = LIBXL_DISK_FORMAT_UNKNOWN;
> +
> +    /* "tapdisk-params" is only for tapdisk */
> +    tmp = xs_read(ctx->xsh, XBT_NULL,
> +                  libxl__sprintf(gc, "%s/tapdisk-params", be_path), &len);

libxl__xs_read will gc the result, which will make the error handling
easier (you strdup tmp so I think the reasons for using xs_read don't
apply). Also libxl__sprintf can be shortended with GCSPRINTF.

> +        if (disk->format != LIBXL_DISK_FORMAT_VHD &&
> +            disk->format != LIBXL_DISK_FORMAT_RAW) {
> +            LOG(ERROR, "unsupported tapdisk format: %s\n", tmp);

Hrm, if we get here then is there any reason to reject this? obviously
something wrote it etc.

This function is supposed to read the current state and return it, I'm
not sure it should be rejecting what it finds TBH, and it would simplify
things not to do so.

> +            free(tmp);
> +            goto cleanup;
> +        }
> +        free(tmp);
> +
> +        /*
> +         * The backend is tapdisk, so we store tapdev in params, and
> +         * phy in type(see device_disk_add())
> +         */
> +        disk->backend = LIBXL_DISK_BACKEND_TAP;
> +
> +        goto skip_type;

I think you want to put some all all of the following into an else
clause, not simulate that affect with a goto.

> @@ -2520,8 +2560,6 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
>      }
>      disk->is_cdrom = !strcmp(tmp, "cdrom");
>  
> -    disk->format = LIBXL_DISK_FORMAT_UNKNOWN;

Why does this need to move? Also I think UNKNOWN is the fault from
libxl_device_disk_init, so it's probably unnecessary.

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl()
  2014-09-22 14:26   ` Ian Campbell
@ 2014-09-23  1:40     ` Wen Congyang
  2014-09-23 10:08       ` Ian Campbell
  0 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-23  1:40 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 09/22/2014 10:26 PM, Ian Campbell wrote:
> On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
>> If mfn is invalid, ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, ..) also returns 0,
>> and set error information in error bits(bits28-31). So if the user input
>> a large valid mfn, we cannot reliably distinguish between a large MFN and
>> an error. So we should check the input mfn before calling ioctl().
>> The user can input more than one mfn, and part of them are ~0UL. In this
>> case, the user expects we can map the memory for all valid mfn. So we
>> cannot just return NULL if some mfn is not supported.
> 
> I don't follow this last bit. linux_privcmd_map_foreign_bulk already
> returns NULL and maps nothing in some error cases (e.g. mmap failure),
> what is the problem also doing that here?

Yes, if mmap fails, linux_privcmd_map_foreign_bulk() returns NULL.
But some mfn is invalid, ioctl() returns 0, and then linux_privcmd_map_foreign_bulk()
doesn't return NULL.

For example:
page0, page1, ... page m, page m+1, ... page n
mfn 0, mfn 1, ... mfn m,  mfn m+1, ... mfn n

If only mfn m is invalid, the user can access page 0, page 1, page m+1.
The user can know which page can't be accessed by the array err[].

If some mfn is valid, but it is large, and IOCTL_PRIVCMD_MMAPBATCH doesn't
support it. The user doesn't know the page can't be accessed, and will
cause page fault(the user program may segment fault) when the user accesses the page.

> 
> The way you have it here we need to worry about what the behaviour of
> Xen/privcmd is on pfn==~0UL, and whether we can rely on it. Far better
> to just declare such attempts to be fundamentally broken on the part of
> the caller and return.

In the function apply_batch():
        /* setup region_mfn[] for batch map, if necessary.
         * For HVM guests, this interface takes PFNs, not MFNs */
        if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
             || pagetype == XEN_DOMCTL_PFINFO_XALLOC )
            region_mfn[i] = ~0UL; /* map will fail but we don't care */
        else
            region_mfn[i] = ctx->hvm ? pfn : ctx->p2m[pfn];
It is why I choose ~0UL. I don't know how to check if the mfn is valid,
and we should allow the caller passes ~0UL, otherwise, it will break
migration.

Thanks
Wen Congyang

> 
> Ian.
> 
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>  tools/libxc/xc_linux_osdep.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
>> index a19e4b6..d11bcee 100644
>> --- a/tools/libxc/xc_linux_osdep.c
>> +++ b/tools/libxc/xc_linux_osdep.c
>> @@ -321,6 +321,18 @@ static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h
>>          }
>>  
>>          memcpy(pfn, arr, num * sizeof(*arr));
>> +        for ( i = 0; i < num; i++ )
>> +        {
>> +            /*
>> +             * IOCTL_PRIVCMD_MMAPBATCH doesn't support the mfn which
>> +             * error bits are set
>> +             */
>> +            if ( pfn[i] & PRIVCMD_MMAPBATCH_MFN_ERROR )
>> +            {
>> +                pfn[i] = ~0UL;
>> +                err[i] = -EINVAL;
>> +            }
>> +        }
>>  
>>          ioctlx.num = num;
>>          ioctlx.dom = dom;
>> @@ -333,6 +345,9 @@ static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h
>>  
>>          for ( i = 0; i < num; ++i )
>>          {
>> +            if ( pfn[i] == ~0UL )
>> +                continue;
>> +
>>              switch ( pfn[i] ^ arr[i] )
>>              {
>>              case 0:
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 8/9] store correct format into tapdisk-params/params
  2014-09-22 14:37   ` Ian Campbell
@ 2014-09-23  2:07     ` Wen Congyang
  2014-09-23 10:11       ` Ian Campbell
  0 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-23  2:07 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Lai Jiangshan, Dong Eddie, Jiang Yunhong, Ian Jackson,
	xen devel, Yang Hongyang

On 09/22/2014 10:37 PM, Ian Campbell wrote:
> On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
> 
> Please prefix the subject "tools: libxl: "

OK

> 
>> If the format is raw, we store aio into tapdisk-params/params.
> 
> I got confused and thought "tapdisl-params/params" was a path. Could you
> spell it out the first time as "into either the tapdisk-params or params
> xenstore node" please.

OK

> 
>> And we cannot use the API libxl_disk_format_from_string()
>> to get the format from the string.
> 
> ... can you spell out why/where this is important please.

The API libxl_device_disk_list() will return all disks. But the
type is wrong if it is a tapdisk device.

> 
>> The API libxl__device_disk_string_of_format() is just
>> for blktap2, which needs to pass aio instead of raw to
>> tapdisk2.
> 
> Should we move it to libxl_blktap2.c then?

I agree with it.

> 
>>  So use libxl_disk_format_to_string() to
>> instead of libxl__device_disk_string_of_format().
>>
>> Also update libxl__device_destroy_tapdisk() due to
>> tapdisk-params changed.
> 
> s/changed/change/.

OK

> 
>> Note: the content of tapdisk-params/params has been changed...
> 
> AFAICT the actual change is that "aio:/path/to/a/thing" might become
> either "raw:/path/to/a/thing" or "empty:/path/to/a/thing", or perhaps
> even "unknown:/path/to/a/thing" (or more likely "aio:"->"empty:").

Yes

> 
> Did you test blktap2 and qemu to be sure that they handle "raw" in
> particular correctly?

Sorry, I don't understand what do you want to test.

> 
>> @@ -67,6 +68,15 @@ int libxl__device_destroy_tapdisk(libxl__gc *gc, const char *params)
>>  
>>      *disk++ = '\0';
>>  
>> +    /* type may be raw */
>> +    rc = libxl_disk_format_from_string(type, &format);
>> +    if (rc < 0) {
>> +        LOG(ERROR, "invalid disk type %s", type);
>> +        return ERROR_FAIL;
> 
> Please propagate rc.

OK

Thanks
Wen Congyang

> 
>> +    }
>> +
>> +    type = libxl__device_disk_string_of_format(format);
>> +
>>      err = tap_ctl_find(type, disk, &tap);
>>      if (err < 0) {
>>          /* returns -errno */
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-22 14:06                 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Andrew Cooper
@ 2014-09-23  2:14                   ` Wen Congyang
  2014-09-23  9:00                     ` Andrew Cooper
  0 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-23  2:14 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell, Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 09/22/2014 10:06 PM, Andrew Cooper wrote:
> On 22/09/14 14:13, Ian Campbell wrote:
>> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>>> libxc doesn't know that, if it is important then it seems like the
>>>> failure + errno ought to be marshalled across the IPC link.
>>> Yes, but ...
>>>
>>>> It may be that this can be easily handled in
>>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>>> think?
>>> ... while that would be possbile, we have another option.
>>>
>>> We could say that the callbacks return errno values.  That would
>>> simplify the API and avoid having the IPC involve accesses to global
>>> variables (ie, things not in the functions' parameter lists).
>>>
>>> If we do that then it becomes the responsibility of xc_domain_save to
>>> either change its own API to return errno, or to save the callback's
>>> return value in errno.
>> Hrm. libxc is already a complete mess wrt error returning/handling
>> because some proportion of the code incorrectly does/assumes this sort
>> of thing is happening (because people were confused about the syscall
>> returns from the kernel vs. process context). Having a place in libxc
>> where this is now done on purpose seems a bit like setting the rope on
>> fire to me...
>>
>> Ian.
>>
> 
> libxc is an absolute mess, but this is far from the only codepath (even
> in xc_domain_save()) which ends up like this.
> 
> The *only* safe assumption is that ==0 is success and !=0 is failure for
> xc_domain_save(), and errno may or may not be relevant, whether rc is -1
> or not.

Do you mean: errno is undefined even if rc is -1?

Thanks
Wen Congyang

> 
> For the suspend callback itself, I have encountered 3 different error
> schemes which stem from a complete lack of expectations set out beside
> the function pointer definition in xenguest.h, which is why I had to
> copy the old "!=0" check in migration v2.  Even intree in the past, -1
> and 1 have been used for errors from this function pointer.
> 
> 
> It is my opinion that it is not worth changing any of the error handling
> until someone does all of libxc and makes it all consistent.  The risk
> for introducing subtle bugs into in (and out-of-) tree callers is just
> too high, and this change alone does not fix xc_domain_save() to be
> consistent.
> 
> ~Andrew
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device
  2014-09-22 15:08   ` Ian Campbell
@ 2014-09-23  3:07     ` Wen Congyang
  2014-09-23 10:12       ` Ian Campbell
  0 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-23  3:07 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Shriram Rajagopalan, Yang Hongyang

On 09/22/2014 11:08 PM, Ian Campbell wrote:
> On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
>> If the disk backend is blktap device, we store "format:pdev_path"
>> in tapdisk-params, and store "phy" in type. So use tapdisk-params
>> to set libxl_device_disk instead of params and type.
> 
> This seems to be the same format as params -- can you not just
> dynamically select the key name and share the parsing code?

OK. I will try it.

> 
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
>> ---
>>  tools/libxl/libxl.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 40 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index aa9345b..8c241aa 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -2468,6 +2468,45 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
>>          goto cleanup;
>>      }
>>  
>> +    disk->format = LIBXL_DISK_FORMAT_UNKNOWN;
>> +
>> +    /* "tapdisk-params" is only for tapdisk */
>> +    tmp = xs_read(ctx->xsh, XBT_NULL,
>> +                  libxl__sprintf(gc, "%s/tapdisk-params", be_path), &len);
> 
> libxl__xs_read will gc the result, which will make the error handling
> easier (you strdup tmp so I think the reasons for using xs_read don't
> apply). Also libxl__sprintf can be shortended with GCSPRINTF.

Hmm, I agree with it, and I think we should also the similar codes.

> 
>> +        if (disk->format != LIBXL_DISK_FORMAT_VHD &&
>> +            disk->format != LIBXL_DISK_FORMAT_RAW) {
>> +            LOG(ERROR, "unsupported tapdisk format: %s\n", tmp);
> 
> Hrm, if we get here then is there any reason to reject this? obviously
> something wrote it etc.

Hmm, we can get here only when the xenstore node is corrupted.

> 
> This function is supposed to read the current state and return it, I'm
> not sure it should be rejecting what it finds TBH, and it would simplify
> things not to do so.

The xenstore node is corrupted, what should we do? Ignore this disk or return
error?

> 
>> +            free(tmp);
>> +            goto cleanup;
>> +        }
>> +        free(tmp);
>> +
>> +        /*
>> +         * The backend is tapdisk, so we store tapdev in params, and
>> +         * phy in type(see device_disk_add())
>> +         */
>> +        disk->backend = LIBXL_DISK_BACKEND_TAP;
>> +
>> +        goto skip_type;
> 
> I think you want to put some all all of the following into an else
> clause, not simulate that affect with a goto.

OK

> 
>> @@ -2520,8 +2560,6 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
>>      }
>>      disk->is_cdrom = !strcmp(tmp, "cdrom");
>>  
>> -    disk->format = LIBXL_DISK_FORMAT_UNKNOWN;
> 
> Why does this need to move? Also I think UNKNOWN is the fault from
> libxl_device_disk_init, so it's probably unnecessary.

Yes.

Thanks
Wen Congyang

> 
> Ian.
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-23  2:14                   ` Wen Congyang
@ 2014-09-23  9:00                     ` Andrew Cooper
  2014-09-23  9:10                       ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Andrew Cooper @ 2014-09-23  9:00 UTC (permalink / raw)
  To: Wen Congyang, Ian Campbell, Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 23/09/14 03:14, Wen Congyang wrote:
> On 09/22/2014 10:06 PM, Andrew Cooper wrote:
>> On 22/09/14 14:13, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>>>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>>>> libxc doesn't know that, if it is important then it seems like the
>>>>> failure + errno ought to be marshalled across the IPC link.
>>>> Yes, but ...
>>>>
>>>>> It may be that this can be easily handled in
>>>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>>>> think?
>>>> ... while that would be possbile, we have another option.
>>>>
>>>> We could say that the callbacks return errno values.  That would
>>>> simplify the API and avoid having the IPC involve accesses to global
>>>> variables (ie, things not in the functions' parameter lists).
>>>>
>>>> If we do that then it becomes the responsibility of xc_domain_save to
>>>> either change its own API to return errno, or to save the callback's
>>>> return value in errno.
>>> Hrm. libxc is already a complete mess wrt error returning/handling
>>> because some proportion of the code incorrectly does/assumes this sort
>>> of thing is happening (because people were confused about the syscall
>>> returns from the kernel vs. process context). Having a place in libxc
>>> where this is now done on purpose seems a bit like setting the rope on
>>> fire to me...
>>>
>>> Ian.
>>>
>> libxc is an absolute mess, but this is far from the only codepath (even
>> in xc_domain_save()) which ends up like this.
>>
>> The *only* safe assumption is that ==0 is success and !=0 is failure for
>> xc_domain_save(), and errno may or may not be relevant, whether rc is -1
>> or not.
> Do you mean: errno is undefined even if rc is -1?
>
> Thanks
> Wen Congyang

Correct, last time I checked.  The error handling in libxc is in dire
need of fixing from scratch.

~Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-23  9:00                     ` Andrew Cooper
@ 2014-09-23  9:10                       ` Wen Congyang
  2014-09-23  9:25                         ` Andrew Cooper
  0 siblings, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-23  9:10 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell, Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 09/23/2014 05:00 PM, Andrew Cooper wrote:
> On 23/09/14 03:14, Wen Congyang wrote:
>> On 09/22/2014 10:06 PM, Andrew Cooper wrote:
>>> On 22/09/14 14:13, Ian Campbell wrote:
>>>> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>>>>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>>>>> libxc doesn't know that, if it is important then it seems like the
>>>>>> failure + errno ought to be marshalled across the IPC link.
>>>>> Yes, but ...
>>>>>
>>>>>> It may be that this can be easily handled in
>>>>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>>>>> think?
>>>>> ... while that would be possbile, we have another option.
>>>>>
>>>>> We could say that the callbacks return errno values.  That would
>>>>> simplify the API and avoid having the IPC involve accesses to global
>>>>> variables (ie, things not in the functions' parameter lists).
>>>>>
>>>>> If we do that then it becomes the responsibility of xc_domain_save to
>>>>> either change its own API to return errno, or to save the callback's
>>>>> return value in errno.
>>>> Hrm. libxc is already a complete mess wrt error returning/handling
>>>> because some proportion of the code incorrectly does/assumes this sort
>>>> of thing is happening (because people were confused about the syscall
>>>> returns from the kernel vs. process context). Having a place in libxc
>>>> where this is now done on purpose seems a bit like setting the rope on
>>>> fire to me...
>>>>
>>>> Ian.
>>>>
>>> libxc is an absolute mess, but this is far from the only codepath (even
>>> in xc_domain_save()) which ends up like this.
>>>
>>> The *only* safe assumption is that ==0 is success and !=0 is failure for
>>> xc_domain_save(), and errno may or may not be relevant, whether rc is -1
>>> or not.
>> Do you mean: errno is undefined even if rc is -1?
>>
>> Thanks
>> Wen Congyang
> 
> Correct, last time I checked.  The error handling in libxc is in dire
> need of fixing from scratch.

For v2 version migration, we can do it like this. But xc_domain_save()
is a public API, and is used some years. So I am not sure if I can
change its behavior...

Thanks
Wen Congyang

> 
> ~Andrew
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-23  9:10                       ` Wen Congyang
@ 2014-09-23  9:25                         ` Andrew Cooper
  2014-09-23  9:29                           ` Wen Congyang
  2014-09-23  9:52                           ` Wen Congyang
  0 siblings, 2 replies; 55+ messages in thread
From: Andrew Cooper @ 2014-09-23  9:25 UTC (permalink / raw)
  To: Wen Congyang, Ian Campbell, Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 23/09/14 10:10, Wen Congyang wrote:
> On 09/23/2014 05:00 PM, Andrew Cooper wrote:
>> On 23/09/14 03:14, Wen Congyang wrote:
>>> On 09/22/2014 10:06 PM, Andrew Cooper wrote:
>>>> On 22/09/14 14:13, Ian Campbell wrote:
>>>>> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>>>>>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>>>>>> libxc doesn't know that, if it is important then it seems like the
>>>>>>> failure + errno ought to be marshalled across the IPC link.
>>>>>> Yes, but ...
>>>>>>
>>>>>>> It may be that this can be easily handled in
>>>>>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>>>>>> think?
>>>>>> ... while that would be possbile, we have another option.
>>>>>>
>>>>>> We could say that the callbacks return errno values.  That would
>>>>>> simplify the API and avoid having the IPC involve accesses to global
>>>>>> variables (ie, things not in the functions' parameter lists).
>>>>>>
>>>>>> If we do that then it becomes the responsibility of xc_domain_save to
>>>>>> either change its own API to return errno, or to save the callback's
>>>>>> return value in errno.
>>>>> Hrm. libxc is already a complete mess wrt error returning/handling
>>>>> because some proportion of the code incorrectly does/assumes this sort
>>>>> of thing is happening (because people were confused about the syscall
>>>>> returns from the kernel vs. process context). Having a place in libxc
>>>>> where this is now done on purpose seems a bit like setting the rope on
>>>>> fire to me...
>>>>>
>>>>> Ian.
>>>>>
>>>> libxc is an absolute mess, but this is far from the only codepath (even
>>>> in xc_domain_save()) which ends up like this.
>>>>
>>>> The *only* safe assumption is that ==0 is success and !=0 is failure for
>>>> xc_domain_save(), and errno may or may not be relevant, whether rc is -1
>>>> or not.
>>> Do you mean: errno is undefined even if rc is -1?
>>>
>>> Thanks
>>> Wen Congyang
>> Correct, last time I checked.  The error handling in libxc is in dire
>> need of fixing from scratch.
> For v2 version migration, we can do it like this. But xc_domain_save()
> is a public API, and is used some years. So I am not sure if I can
> change its behavior...
>
> Thanks
> Wen Congyang

xc_domain_safe() is free to be changed.  libxc is an unstable API, and
has been changed for every Xen release I recall.

The migrationv2 code is just as guilty of using -1 and an undefined
errno, although it does promise that there will be a relevant error in
the log.  It stems from the same problems around the callbacks where
there is insufficient information passed, but also from the likes of
{read,write}_exact() which uses -1/0 for EOF.  I considered changing it
to -1/EBADF and still not decided which is the lessor of two evils.  (As
far as I am aware, the legacy code has the same problem.)

~Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-23  9:25                         ` Andrew Cooper
@ 2014-09-23  9:29                           ` Wen Congyang
  2014-09-23 10:09                             ` Ian Campbell
  2014-09-23  9:52                           ` Wen Congyang
  1 sibling, 1 reply; 55+ messages in thread
From: Wen Congyang @ 2014-09-23  9:29 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell, Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 09/23/2014 05:25 PM, Andrew Cooper wrote:
> On 23/09/14 10:10, Wen Congyang wrote:
>> On 09/23/2014 05:00 PM, Andrew Cooper wrote:
>>> On 23/09/14 03:14, Wen Congyang wrote:
>>>> On 09/22/2014 10:06 PM, Andrew Cooper wrote:
>>>>> On 22/09/14 14:13, Ian Campbell wrote:
>>>>>> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>>>>>>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>>>>>>> libxc doesn't know that, if it is important then it seems like the
>>>>>>>> failure + errno ought to be marshalled across the IPC link.
>>>>>>> Yes, but ...
>>>>>>>
>>>>>>>> It may be that this can be easily handled in
>>>>>>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>>>>>>> think?
>>>>>>> ... while that would be possbile, we have another option.
>>>>>>>
>>>>>>> We could say that the callbacks return errno values.  That would
>>>>>>> simplify the API and avoid having the IPC involve accesses to global
>>>>>>> variables (ie, things not in the functions' parameter lists).
>>>>>>>
>>>>>>> If we do that then it becomes the responsibility of xc_domain_save to
>>>>>>> either change its own API to return errno, or to save the callback's
>>>>>>> return value in errno.
>>>>>> Hrm. libxc is already a complete mess wrt error returning/handling
>>>>>> because some proportion of the code incorrectly does/assumes this sort
>>>>>> of thing is happening (because people were confused about the syscall
>>>>>> returns from the kernel vs. process context). Having a place in libxc
>>>>>> where this is now done on purpose seems a bit like setting the rope on
>>>>>> fire to me...
>>>>>>
>>>>>> Ian.
>>>>>>
>>>>> libxc is an absolute mess, but this is far from the only codepath (even
>>>>> in xc_domain_save()) which ends up like this.
>>>>>
>>>>> The *only* safe assumption is that ==0 is success and !=0 is failure for
>>>>> xc_domain_save(), and errno may or may not be relevant, whether rc is -1
>>>>> or not.
>>>> Do you mean: errno is undefined even if rc is -1?
>>>>
>>>> Thanks
>>>> Wen Congyang
>>> Correct, last time I checked.  The error handling in libxc is in dire
>>> need of fixing from scratch.
>> For v2 version migration, we can do it like this. But xc_domain_save()
>> is a public API, and is used some years. So I am not sure if I can
>> change its behavior...
>>
>> Thanks
>> Wen Congyang
> 
> xc_domain_safe() is free to be changed.  libxc is an unstable API, and
> has been changed for every Xen release I recall.

So, if the user wants to a stable API, he should use libxl. Is it right?

Thanks
Wen Congyang

> 
> The migrationv2 code is just as guilty of using -1 and an undefined
> errno, although it does promise that there will be a relevant error in
> the log.  It stems from the same problems around the callbacks where
> there is insufficient information passed, but also from the likes of
> {read,write}_exact() which uses -1/0 for EOF.  I considered changing it
> to -1/EBADF and still not decided which is the lessor of two evils.  (As
> far as I am aware, the legacy code has the same problem.)
> 
> ~Andrew
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-23  9:25                         ` Andrew Cooper
  2014-09-23  9:29                           ` Wen Congyang
@ 2014-09-23  9:52                           ` Wen Congyang
  1 sibling, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-23  9:52 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell, Ian Jackson
  Cc: Olaf Hering, Lai Jiangshan, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 09/23/2014 05:25 PM, Andrew Cooper wrote:
> On 23/09/14 10:10, Wen Congyang wrote:
>> On 09/23/2014 05:00 PM, Andrew Cooper wrote:
>>> On 23/09/14 03:14, Wen Congyang wrote:
>>>> On 09/22/2014 10:06 PM, Andrew Cooper wrote:
>>>>> On 22/09/14 14:13, Ian Campbell wrote:
>>>>>> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>>>>>>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>>>>>>> libxc doesn't know that, if it is important then it seems like the
>>>>>>>> failure + errno ought to be marshalled across the IPC link.
>>>>>>> Yes, but ...
>>>>>>>
>>>>>>>> It may be that this can be easily handled in
>>>>>>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>>>>>>> think?
>>>>>>> ... while that would be possbile, we have another option.
>>>>>>>
>>>>>>> We could say that the callbacks return errno values.  That would
>>>>>>> simplify the API and avoid having the IPC involve accesses to global
>>>>>>> variables (ie, things not in the functions' parameter lists).
>>>>>>>
>>>>>>> If we do that then it becomes the responsibility of xc_domain_save to
>>>>>>> either change its own API to return errno, or to save the callback's
>>>>>>> return value in errno.
>>>>>> Hrm. libxc is already a complete mess wrt error returning/handling
>>>>>> because some proportion of the code incorrectly does/assumes this sort
>>>>>> of thing is happening (because people were confused about the syscall
>>>>>> returns from the kernel vs. process context). Having a place in libxc
>>>>>> where this is now done on purpose seems a bit like setting the rope on
>>>>>> fire to me...
>>>>>>
>>>>>> Ian.
>>>>>>
>>>>> libxc is an absolute mess, but this is far from the only codepath (even
>>>>> in xc_domain_save()) which ends up like this.
>>>>>
>>>>> The *only* safe assumption is that ==0 is success and !=0 is failure for
>>>>> xc_domain_save(), and errno may or may not be relevant, whether rc is -1
>>>>> or not.
>>>> Do you mean: errno is undefined even if rc is -1?
>>>>
>>>> Thanks
>>>> Wen Congyang
>>> Correct, last time I checked.  The error handling in libxc is in dire
>>> need of fixing from scratch.
>> For v2 version migration, we can do it like this. But xc_domain_save()
>> is a public API, and is used some years. So I am not sure if I can
>> change its behavior...
>>
>> Thanks
>> Wen Congyang
> 
> xc_domain_safe() is free to be changed.  libxc is an unstable API, and
> has been changed for every Xen release I recall.

What about this patch?

>From 61f9470134c988e1ff3ae50eaa2e252d20379588 Mon Sep 17 00:00:00 2001
From: Wen Congyang <wency@cn.fujitsu.com>
Date: Tue, 23 Sep 2014 17:45:54 +0800
Subject: [PATCH] correct xc_domain_save()'s return value

If suspend_and_state() fails, the errno may be 0. But we assume
that the errno is not 0. So remove assert().

If the callback checkpoint() fails, it means that remus
failed. But xc_domain_save() returns 0.

This patch changes xc_domain_save()'s behavior, and the errno is
undefined even if xc_domain_save() returns 1.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 tools/libxc/xc_domain_save.c | 59 +++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 254fdb3..d96fd24 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -807,7 +807,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     xc_dominfo_t info;
     DECLARE_DOMCTL;
 
-    int rc, frc, i, j, last_iter = 0, iter = 0;
+    int rc = 1, frc, i, j, last_iter = 0, iter = 0;
     int live  = (flags & XCFLAGS_LIVE);
     int debug = (flags & XCFLAGS_DEBUG);
     int superpages = !!hvm;
@@ -2073,8 +2073,12 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     goto out_rc;
 
  out:
-    rc = errno;
-    assert(rc);
+    /*
+     * When we come here, some error happened. But the errno may be undefined.
+     * For example, the callback suspend() fails, and we cannot get correct
+     * errno, because it may be implemented in libxl.
+     */
+    rc = 1;
  out_rc:
     completed = 1;
 
@@ -2113,30 +2117,34 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     compressing = (flags & XCFLAGS_CHECKPOINT_COMPRESS);
 
     /* checkpoint_cb can spend arbitrarily long in between rounds */
-    if (!rc && callbacks->checkpoint &&
-        callbacks->checkpoint(callbacks->data) > 0)
+    if ( !rc && callbacks->checkpoint )
     {
-        /* reset stats timer */
-        print_stats(xch, dom, 0, &time_stats, &shadow_stats, 0);
-
-        /* last_iter = 1; */
-        if ( suspend_and_state(callbacks->suspend, callbacks->data, xch,
-                               io_fd, dom, &info) )
+        if ( callbacks->checkpoint(callbacks->data) > 0 )
         {
-            ERROR("Domain appears not to have suspended");
-            goto out;
-        }
-        DPRINTF("SUSPEND shinfo %08lx\n", info.shared_info_frame);
-        print_stats(xch, dom, 0, &time_stats, &shadow_stats, 1);
+            /* reset stats timer */
+            print_stats(xch, dom, 0, &time_stats, &shadow_stats, 0);
 
-        if ( xc_shadow_control(xch, dom,
-                               XEN_DOMCTL_SHADOW_OP_CLEAN, HYPERCALL_BUFFER(to_send),
-                               dinfo->p2m_size, NULL, 0, &shadow_stats) != dinfo->p2m_size )
-        {
-            PERROR("Error flushing shadow PT");
-        }
+            /* last_iter = 1; */
+            if ( suspend_and_state(callbacks->suspend, callbacks->data, xch,
+                                   io_fd, dom, &info) )
+            {
+                ERROR("Domain appears not to have suspended");
+                goto out;
+            }
+            DPRINTF("SUSPEND shinfo %08lx\n", info.shared_info_frame);
+            print_stats(xch, dom, 0, &time_stats, &shadow_stats, 1);
 
-        goto copypages;
+            if ( xc_shadow_control(xch, dom,
+                                   XEN_DOMCTL_SHADOW_OP_CLEAN, HYPERCALL_BUFFER(to_send),
+                                   dinfo->p2m_size, NULL, 0, &shadow_stats) != dinfo->p2m_size )
+            {
+                PERROR("Error flushing shadow PT");
+            }
+
+            goto copypages;
+        }
+        else
+            rc = 1;
     }
 
     if ( tmem_saved != 0 && live )
@@ -2174,11 +2182,10 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     free(hvm_buf);
     outbuf_free(&ob_pagebuf);
 
-    errno = rc;
 exit:
-    DPRINTF("Save exit of domid %u with errno=%d\n", dom, errno);
+    DPRINTF("Save exit of domid %u with rc=%d\n", dom, rc);
 
-    return !!errno;
+    return rc;
 }
 
 /*
-- 

> 
> The migrationv2 code is just as guilty of using -1 and an undefined
> errno, although it does promise that there will be a relevant error in
> the log.  It stems from the same problems around the callbacks where
> there is insufficient information passed, but also from the likes of
> {read,write}_exact() which uses -1/0 for EOF.  I considered changing it
> to -1/EBADF and still not decided which is the lessor of two evils.  (As
> far as I am aware, the legacy code has the same problem.)
> 
> ~Andrew
> 
> .
> 

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl()
  2014-09-23  1:40     ` Wen Congyang
@ 2014-09-23 10:08       ` Ian Campbell
  2014-09-24  1:36         ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-23 10:08 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On Tue, 2014-09-23 at 09:40 +0800, Wen Congyang wrote:
> On 09/22/2014 10:26 PM, Ian Campbell wrote:
> > On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
> >> If mfn is invalid, ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, ..) also returns 0,
> >> and set error information in error bits(bits28-31). So if the user input
> >> a large valid mfn, we cannot reliably distinguish between a large MFN and
> >> an error. So we should check the input mfn before calling ioctl().
> >> The user can input more than one mfn, and part of them are ~0UL. In this
> >> case, the user expects we can map the memory for all valid mfn. So we
> >> cannot just return NULL if some mfn is not supported.
> > 
> > I don't follow this last bit. linux_privcmd_map_foreign_bulk already
> > returns NULL and maps nothing in some error cases (e.g. mmap failure),
> > what is the problem also doing that here?
> 
> Yes, if mmap fails, linux_privcmd_map_foreign_bulk() returns NULL.
> But some mfn is invalid, ioctl() returns 0, and then linux_privcmd_map_foreign_bulk()
> doesn't return NULL.

The caller of the mapping function must already handle a NULL return by
erroring out.

> For example:
> page0, page1, ... page m, page m+1, ... page n
> mfn 0, mfn 1, ... mfn m,  mfn m+1, ... mfn n
> 
> If only mfn m is invalid, the user can access page 0, page 1, page m+1.
> The user can know which page can't be accessed by the array err[].

Not for these problematic MFNs they can't, because err cannot be
trusted.

> If some mfn is valid, but it is large, and IOCTL_PRIVCMD_MMAPBATCH doesn't
> support it. The user doesn't know the page can't be accessed, and will
> cause page fault(the user program may segment fault) when the user accesses the page.

If the mfn is valid but is large enough to have the top nibble (the
"error nibble") set then there is no way to do proper error reporting,
because the mfn and the error code will get intermixed, so you can't
even tell success from failure.

> It is why I choose ~0UL. I don't know how to check if the mfn is valid,
> and we should allow the caller passes ~0UL, otherwise, it will break
> migration.

You don't need to know if it is valid, just that the error-nibble isn't
set. I'd be OK with special casing ~0UL as a way for the caller to
indicate that they don't want a mapping at a given index (~0UL is
already called INVALID_MFN elsewhere) and that they are prepared to deal
with that case explicitly.

That's a different case from the caller supplying an MFN which they
think is good but which is actually bad because the error-nibble has set
bits in it. The problem with that sort of bad MFN is that the caller
doesn't know they are bad and will expect the mapping to succeed, either
now or on a subsequent iteration, but these MFNs can *never* be mapped
(using this interface). This is IMHO a fairly catastrophic failure and
it should be treated as such, by returning NULL and requiring the caller
to deal with it as it would any other NULL return (by failing
immediately).

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
  2014-09-23  9:29                           ` Wen Congyang
@ 2014-09-23 10:09                             ` Ian Campbell
  0 siblings, 0 replies; 55+ messages in thread
From: Ian Campbell @ 2014-09-23 10:09 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Olaf Hering, Lai Jiangshan, Andrew Cooper, Jiang Yunhong,
	Ian Jackson, xen devel, Dong Eddie, Yang Hongyang

On Tue, 2014-09-23 at 17:29 +0800, Wen Congyang wrote:
> So, if the user wants to a stable API, he should use libxl. Is it right?

Precisely.

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 8/9] store correct format into tapdisk-params/params
  2014-09-23  2:07     ` Wen Congyang
@ 2014-09-23 10:11       ` Ian Campbell
  2014-09-23 10:32         ` Wen Congyang
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-09-23 10:11 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Wei Liu, Lai Jiangshan, Dong Eddie, Jiang Yunhong, Ian Jackson,
	xen devel, Yang Hongyang

On Tue, 2014-09-23 at 10:07 +0800, Wen Congyang wrote:
> On 09/22/2014 10:37 PM, Ian Campbell wrote:
> > On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
> > 
> >> And we cannot use the API libxl_disk_format_from_string()
> >> to get the format from the string.
> > 
> > ... can you spell out why/where this is important please.
> 
> The API libxl_device_disk_list() will return all disks. But the
> type is wrong if it is a tapdisk device.

Thanks, please add this to the commit message.

> > 
> > Did you test blktap2 and qemu to be sure that they handle "raw" in
> > particular correctly?
> 
> Sorry, I don't understand what do you want to test.

That both blktap2 and qemu still correctly handle raw disk images after
your changes, since AIUI the xenstore contents will have changed for
them.

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device
  2014-09-23  3:07     ` Wen Congyang
@ 2014-09-23 10:12       ` Ian Campbell
  0 siblings, 0 replies; 55+ messages in thread
From: Ian Campbell @ 2014-09-23 10:12 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Shriram Rajagopalan, Yang Hongyang

On Tue, 2014-09-23 at 11:07 +0800, Wen Congyang wrote:
> >> +        if (disk->format != LIBXL_DISK_FORMAT_VHD &&
> >> +            disk->format != LIBXL_DISK_FORMAT_RAW) {
> >> +            LOG(ERROR, "unsupported tapdisk format: %s\n", tmp);
> > 
> > Hrm, if we get here then is there any reason to reject this? obviously
> > something wrote it etc.
> 
> Hmm, we can get here only when the xenstore node is corrupted.
> 
> > 
> > This function is supposed to read the current state and return it, I'm
> > not sure it should be rejecting what it finds TBH, and it would simplify
> > things not to do so.
> 
> The xenstore node is corrupted, what should we do? Ignore this disk or return
> error?

I think if it contains a valid format string you should just continue
using that value, rather than trying to second guess what other code may
or may not have written.

If it contains some completely corrupt value then you should error out,
which IIRC you do already in the error handling of the from_string call)

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 8/9] store correct format into tapdisk-params/params
  2014-09-23 10:11       ` Ian Campbell
@ 2014-09-23 10:32         ` Wen Congyang
  0 siblings, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-23 10:32 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Lai Jiangshan, Dong Eddie, Jiang Yunhong, Ian Jackson,
	xen devel, Yang Hongyang

On 09/23/2014 06:11 PM, Ian Campbell wrote:
> On Tue, 2014-09-23 at 10:07 +0800, Wen Congyang wrote:
>> On 09/22/2014 10:37 PM, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
>>>
>>>> And we cannot use the API libxl_disk_format_from_string()
>>>> to get the format from the string.
>>>
>>> ... can you spell out why/where this is important please.
>>
>> The API libxl_device_disk_list() will return all disks. But the
>> type is wrong if it is a tapdisk device.
> 
> Thanks, please add this to the commit message.

OK

> 
>>>
>>> Did you test blktap2 and qemu to be sure that they handle "raw" in
>>> particular correctly?
>>
>> Sorry, I don't understand what do you want to test.
> 
> That both blktap2 and qemu still correctly handle raw disk images after
> your changes, since AIUI the xenstore contents will have changed for
> them.

OK, I will check it

Thanks
Wen Congyang

> 
> Ian.
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl()
  2014-09-23 10:08       ` Ian Campbell
@ 2014-09-24  1:36         ` Wen Congyang
  0 siblings, 0 replies; 55+ messages in thread
From: Wen Congyang @ 2014-09-24  1:36 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Lai Jiangshan, Ian Jackson, Jiang Yunhong, Dong Eddie, xen devel,
	Yang Hongyang

On 09/23/2014 06:08 PM, Ian Campbell wrote:
> On Tue, 2014-09-23 at 09:40 +0800, Wen Congyang wrote:
>> On 09/22/2014 10:26 PM, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 13:59 +0800, Wen Congyang wrote:
>>>> If mfn is invalid, ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, ..) also returns 0,
>>>> and set error information in error bits(bits28-31). So if the user input
>>>> a large valid mfn, we cannot reliably distinguish between a large MFN and
>>>> an error. So we should check the input mfn before calling ioctl().
>>>> The user can input more than one mfn, and part of them are ~0UL. In this
>>>> case, the user expects we can map the memory for all valid mfn. So we
>>>> cannot just return NULL if some mfn is not supported.
>>>
>>> I don't follow this last bit. linux_privcmd_map_foreign_bulk already
>>> returns NULL and maps nothing in some error cases (e.g. mmap failure),
>>> what is the problem also doing that here?
>>
>> Yes, if mmap fails, linux_privcmd_map_foreign_bulk() returns NULL.
>> But some mfn is invalid, ioctl() returns 0, and then linux_privcmd_map_foreign_bulk()
>> doesn't return NULL.
> 
> The caller of the mapping function must already handle a NULL return by
> erroring out.
> 
>> For example:
>> page0, page1, ... page m, page m+1, ... page n
>> mfn 0, mfn 1, ... mfn m,  mfn m+1, ... mfn n
>>
>> If only mfn m is invalid, the user can access page 0, page 1, page m+1.
>> The user can know which page can't be accessed by the array err[].
> 
> Not for these problematic MFNs they can't, because err cannot be
> trusted.
> 
>> If some mfn is valid, but it is large, and IOCTL_PRIVCMD_MMAPBATCH doesn't
>> support it. The user doesn't know the page can't be accessed, and will
>> cause page fault(the user program may segment fault) when the user accesses the page.
> 
> If the mfn is valid but is large enough to have the top nibble (the
> "error nibble") set then there is no way to do proper error reporting,
> because the mfn and the error code will get intermixed, so you can't
> even tell success from failure.
> 
> 
>> It is why I choose ~0UL. I don't know how to check if the mfn is valid,
>> and we should allow the caller passes ~0UL, otherwise, it will break
>> migration.
> 
> You don't need to know if it is valid, just that the error-nibble isn't
> set. I'd be OK with special casing ~0UL as a way for the caller to
> indicate that they don't want a mapping at a given index (~0UL is
> already called INVALID_MFN elsewhere) and that they are prepared to deal
> with that case explicitly.
> 
> That's a different case from the caller supplying an MFN which they
> think is good but which is actually bad because the error-nibble has set
> bits in it. The problem with that sort of bad MFN is that the caller
> doesn't know they are bad and will expect the mapping to succeed, either
> now or on a subsequent iteration, but these MFNs can *never* be mapped
> (using this interface). This is IMHO a fairly catastrophic failure and
> it should be treated as such, by returning NULL and requiring the caller
> to deal with it as it would any other NULL return (by failing
> immediately).

OK. I will update this patch soon

Thanks
Wen Congyang

> 
> Ian.
> 
> .
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2
  2014-09-22  5:59 ` [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2 Wen Congyang
@ 2014-12-11 16:45   ` George Dunlap
  2014-12-13 17:06     ` Wei Liu
  0 siblings, 1 reply; 55+ messages in thread
From: George Dunlap @ 2014-12-11 16:45 UTC (permalink / raw)
  To: Wen Congyang; +Cc: Ian Jackson, Ian Campbell, Jan Beulich, xen devel

On Mon, Sep 22, 2014 at 6:59 AM, Wen Congyang <wency@cn.fujitsu.com> wrote:
> If we use blktap2, the correct file should be blktap device
> not the pdev_path.
>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

If I'm reading this correctly, this is actually a fairly serious bug
in libxl -- it means that when using backend=tap with HVM domains,
qemu will actually *bypass entirely* the tapdisk process and access
the underlying file directly.

This should probably be backported, potentially as far back as 4.2.

 -George

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2
  2014-12-11 16:45   ` George Dunlap
@ 2014-12-13 17:06     ` Wei Liu
  2014-12-15 10:18       ` Ian Campbell
  0 siblings, 1 reply; 55+ messages in thread
From: Wei Liu @ 2014-12-13 17:06 UTC (permalink / raw)
  To: George Dunlap
  Cc: wei.liu2, Ian Campbell, Wen Congyang, Ian Jackson, xen devel,
	Jan Beulich

On Thu, Dec 11, 2014 at 04:45:09PM +0000, George Dunlap wrote:
> On Mon, Sep 22, 2014 at 6:59 AM, Wen Congyang <wency@cn.fujitsu.com> wrote:
> > If we use blktap2, the correct file should be blktap device
> > not the pdev_path.
> >
> > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> If I'm reading this correctly, this is actually a fairly serious bug
> in libxl -- it means that when using backend=tap with HVM domains,
> qemu will actually *bypass entirely* the tapdisk process and access
> the underlying file directly.
> 

Is it because qemu will corrupt the underlying file so it's very
serious? 

Wei.

> This should probably be backported, potentially as far back as 4.2.
> 
>  -George
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2
  2014-12-13 17:06     ` Wei Liu
@ 2014-12-15 10:18       ` Ian Campbell
  2014-12-15 11:10         ` George Dunlap
  0 siblings, 1 reply; 55+ messages in thread
From: Ian Campbell @ 2014-12-15 10:18 UTC (permalink / raw)
  To: Wei Liu; +Cc: George Dunlap, Ian Jackson, Jan Beulich, Wen Congyang, xen devel

On Sat, 2014-12-13 at 17:06 +0000, Wei Liu wrote:
> On Thu, Dec 11, 2014 at 04:45:09PM +0000, George Dunlap wrote:
> > On Mon, Sep 22, 2014 at 6:59 AM, Wen Congyang <wency@cn.fujitsu.com> wrote:
> > > If we use blktap2, the correct file should be blktap device
> > > not the pdev_path.
> > >
> > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> > > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> > 
> > If I'm reading this correctly, this is actually a fairly serious bug
> > in libxl -- it means that when using backend=tap with HVM domains,
> > qemu will actually *bypass entirely* the tapdisk process and access
> > the underlying file directly.
> > 
> 
> Is it because qemu will corrupt the underlying file so it's very
> serious? 

I think it ends up being reasonably safe due to the unplug stuff, in
general only one of the two paths should be active at any given time and
there is appropriate flushing/syncing on the unplug etc.

In fact, I'm not 100% sure this wasn't a deliberate design decision at
one point to avoid the overhead of doing both qdisk and tapdisk. (I'm
not going to argue that this still makes sense though).

If qemu is corrupting files then that's a different matter, which would
indeed be serious, and which this change might paper over/avoid.

Ian.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2
  2014-12-15 10:18       ` Ian Campbell
@ 2014-12-15 11:10         ` George Dunlap
  0 siblings, 0 replies; 55+ messages in thread
From: George Dunlap @ 2014-12-15 11:10 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu; +Cc: Ian Jackson, Jan Beulich, Wen Congyang, xen devel

On 12/15/2014 10:18 AM, Ian Campbell wrote:
> On Sat, 2014-12-13 at 17:06 +0000, Wei Liu wrote:
>> On Thu, Dec 11, 2014 at 04:45:09PM +0000, George Dunlap wrote:
>>> On Mon, Sep 22, 2014 at 6:59 AM, Wen Congyang <wency@cn.fujitsu.com> wrote:
>>>> If we use blktap2, the correct file should be blktap device
>>>> not the pdev_path.
>>>>
>>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>>> Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>
>>>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>>>
>>> If I'm reading this correctly, this is actually a fairly serious bug
>>> in libxl -- it means that when using backend=tap with HVM domains,
>>> qemu will actually *bypass entirely* the tapdisk process and access
>>> the underlying file directly.
>>>
>>
>> Is it because qemu will corrupt the underlying file so it's very
>> serious? 
> 
> I think it ends up being reasonably safe due to the unplug stuff, in
> general only one of the two paths should be active at any given time and
> there is appropriate flushing/syncing on the unplug etc.
> 
> In fact, I'm not 100% sure this wasn't a deliberate design decision at
> one point to avoid the overhead of doing both qdisk and tapdisk. (I'm
> not going to argue that this still makes sense though).
> 
> If qemu is corrupting files then that's a different matter, which would
> indeed be serious, and which this change might paper over/avoid.

Yes, I think when I wrote that I had forgotten that even with just PV /
emulated there's usually a duplicate datapath that we need to sort out,
and that we already sort that out with the device unplug protocol.

Since half the point of tapdisk was to be able to do things like
snapshotting / other fancy tricks as colo does, not just implement
specific image formats, then if it was deliberate it was kind of dumb.
I'd rather just assume that it was overlooked. :-)

In any case, as it's already fixed in 4.5, I don't think the exact
degree of seriousness matters too much: it's above the threshold for
"should be backported" IMHO, and well below the threshold of "needs the
security response process".  So it should just be backported to 4.4,
4.3, and 4.2  (or some subset thereof, if we're designating stable trees).

 -George

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2014-12-15 11:10 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-22  5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
2014-09-22  5:59 ` [RFC Patch v4 1/9] copy the correct page to memory Wen Congyang
2014-09-22 14:38   ` Ian Campbell
2014-09-22  5:59 ` [RFC Patch v4 2/9] csum the correct page Wen Congyang
2014-09-22  5:59 ` [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2 Wen Congyang
2014-12-11 16:45   ` George Dunlap
2014-12-13 17:06     ` Wei Liu
2014-12-15 10:18       ` Ian Campbell
2014-12-15 11:10         ` George Dunlap
2014-09-22  5:59 ` [RFC Patch v4 4/9] read nictype from xenstore Wen Congyang
2014-09-22  5:59 ` [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl() Wen Congyang
2014-09-22 14:26   ` Ian Campbell
2014-09-23  1:40     ` Wen Congyang
2014-09-23 10:08       ` Ian Campbell
2014-09-24  1:36         ` Wen Congyang
2014-09-22  5:59 ` [RFC Patch v4 6/9] don't zero out ioreq page and handle the pended I/O after resuming Wen Congyang
2014-09-22 11:22   ` Jan Beulich
2014-09-22  5:59 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Wen Congyang
2014-09-22  7:30   ` Olaf Hering
2014-09-22  7:34     ` Wen Congyang
2014-09-22  7:46       ` Olaf Hering
2014-09-22  8:06         ` Wen Congyang
2014-09-22  8:13           ` Olaf Hering
2014-09-22  8:24             ` Wen Congyang
2014-09-22  8:41             ` Wen Congyang
2014-09-22 12:42           ` Ian Campbell
2014-09-22 13:03             ` Ian Jackson
2014-09-22 13:13               ` Ian Campbell
2014-09-22 13:57                 ` [PATCH RFC 0/3] " Ian Jackson
2014-09-22 13:58                   ` Ian Campbell
2014-09-22 14:00                     ` Ian Campbell
2014-09-22 13:58                   ` [PATCH 1/3] libxl: save helper: remove redundant declaration Ian Jackson
2014-09-22 13:58                   ` [PATCH 2/3] libxl: save helper: transport errno with all callback return values Ian Jackson
2014-09-22 13:58                   ` [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value Ian Jackson
2014-09-22 14:07                     ` Ian Campbell
2014-09-22 14:08                       ` Ian Jackson
2014-09-22 14:06                 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Andrew Cooper
2014-09-23  2:14                   ` Wen Congyang
2014-09-23  9:00                     ` Andrew Cooper
2014-09-23  9:10                       ` Wen Congyang
2014-09-23  9:25                         ` Andrew Cooper
2014-09-23  9:29                           ` Wen Congyang
2014-09-23 10:09                             ` Ian Campbell
2014-09-23  9:52                           ` Wen Congyang
2014-09-22  5:59 ` [RFC Patch v4 8/9] store correct format into tapdisk-params/params Wen Congyang
2014-09-22 14:37   ` Ian Campbell
2014-09-23  2:07     ` Wen Congyang
2014-09-23 10:11       ` Ian Campbell
2014-09-23 10:32         ` Wen Congyang
2014-09-22  5:59 ` [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device Wen Congyang
2014-09-22 15:08   ` Ian Campbell
2014-09-23  3:07     ` Wen Congyang
2014-09-23 10:12       ` Ian Campbell
2014-09-22 12:44 ` [RFC Patch v4 0/9] Some bugfix patches Ian Campbell
2014-09-22 12:56   ` Wen Congyang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).