From: Wen Congyang <wency@cn.fujitsu.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Olaf Hering <olaf@aepfle.de>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Jiang Yunhong <yunhong.jiang@intel.com>,
Dong Eddie <eddie.dong@intel.com>,
xen devel <xen-devel@lists.xen.org>,
Yang Hongyang <yanghy@cn.fujitsu.com>
Subject: Re: [RFC Patch v4 7/9] correct xc_domain_save()'s return value
Date: Tue, 23 Sep 2014 17:52:38 +0800 [thread overview]
Message-ID: <542142E6.9050408@cn.fujitsu.com> (raw)
In-Reply-To: <54213C71.6090909@citrix.com>
On 09/23/2014 05:25 PM, Andrew Cooper wrote:
> On 23/09/14 10:10, Wen Congyang wrote:
>> On 09/23/2014 05:00 PM, Andrew Cooper wrote:
>>> On 23/09/14 03:14, Wen Congyang wrote:
>>>> On 09/22/2014 10:06 PM, Andrew Cooper wrote:
>>>>> On 22/09/14 14:13, Ian Campbell wrote:
>>>>>> On Mon, 2014-09-22 at 14:03 +0100, Ian Jackson wrote:
>>>>>>> Ian Campbell writes ("Re: [Xen-devel] [RFC Patch v4 7/9] correct xc_domain_save()'s return value"):
>>>>>>>> libxc doesn't know that, if it is important then it seems like the
>>>>>>>> failure + errno ought to be marshalled across the IPC link.
>>>>>>> Yes, but ...
>>>>>>>
>>>>>>>> It may be that this can be easily handled in
>>>>>>>> libxl__srm_callout_sendreply + helper_getreply. Ian J -- what do you
>>>>>>>> think?
>>>>>>> ... while that would be possbile, we have another option.
>>>>>>>
>>>>>>> We could say that the callbacks return errno values. That would
>>>>>>> simplify the API and avoid having the IPC involve accesses to global
>>>>>>> variables (ie, things not in the functions' parameter lists).
>>>>>>>
>>>>>>> If we do that then it becomes the responsibility of xc_domain_save to
>>>>>>> either change its own API to return errno, or to save the callback's
>>>>>>> return value in errno.
>>>>>> Hrm. libxc is already a complete mess wrt error returning/handling
>>>>>> because some proportion of the code incorrectly does/assumes this sort
>>>>>> of thing is happening (because people were confused about the syscall
>>>>>> returns from the kernel vs. process context). Having a place in libxc
>>>>>> where this is now done on purpose seems a bit like setting the rope on
>>>>>> fire to me...
>>>>>>
>>>>>> Ian.
>>>>>>
>>>>> libxc is an absolute mess, but this is far from the only codepath (even
>>>>> in xc_domain_save()) which ends up like this.
>>>>>
>>>>> The *only* safe assumption is that ==0 is success and !=0 is failure for
>>>>> xc_domain_save(), and errno may or may not be relevant, whether rc is -1
>>>>> or not.
>>>> Do you mean: errno is undefined even if rc is -1?
>>>>
>>>> Thanks
>>>> Wen Congyang
>>> Correct, last time I checked. The error handling in libxc is in dire
>>> need of fixing from scratch.
>> For v2 version migration, we can do it like this. But xc_domain_save()
>> is a public API, and is used some years. So I am not sure if I can
>> change its behavior...
>>
>> Thanks
>> Wen Congyang
>
> xc_domain_safe() is free to be changed. libxc is an unstable API, and
> has been changed for every Xen release I recall.
What about this patch?
>From 61f9470134c988e1ff3ae50eaa2e252d20379588 Mon Sep 17 00:00:00 2001
From: Wen Congyang <wency@cn.fujitsu.com>
Date: Tue, 23 Sep 2014 17:45:54 +0800
Subject: [PATCH] correct xc_domain_save()'s return value
If suspend_and_state() fails, the errno may be 0. But we assume
that the errno is not 0. So remove assert().
If the callback checkpoint() fails, it means that remus
failed. But xc_domain_save() returns 0.
This patch changes xc_domain_save()'s behavior, and the errno is
undefined even if xc_domain_save() returns 1.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
tools/libxc/xc_domain_save.c | 59 +++++++++++++++++++++++++-------------------
1 file changed, 33 insertions(+), 26 deletions(-)
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 254fdb3..d96fd24 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -807,7 +807,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
xc_dominfo_t info;
DECLARE_DOMCTL;
- int rc, frc, i, j, last_iter = 0, iter = 0;
+ int rc = 1, frc, i, j, last_iter = 0, iter = 0;
int live = (flags & XCFLAGS_LIVE);
int debug = (flags & XCFLAGS_DEBUG);
int superpages = !!hvm;
@@ -2073,8 +2073,12 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
goto out_rc;
out:
- rc = errno;
- assert(rc);
+ /*
+ * When we come here, some error happened. But the errno may be undefined.
+ * For example, the callback suspend() fails, and we cannot get correct
+ * errno, because it may be implemented in libxl.
+ */
+ rc = 1;
out_rc:
completed = 1;
@@ -2113,30 +2117,34 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
compressing = (flags & XCFLAGS_CHECKPOINT_COMPRESS);
/* checkpoint_cb can spend arbitrarily long in between rounds */
- if (!rc && callbacks->checkpoint &&
- callbacks->checkpoint(callbacks->data) > 0)
+ if ( !rc && callbacks->checkpoint )
{
- /* reset stats timer */
- print_stats(xch, dom, 0, &time_stats, &shadow_stats, 0);
-
- /* last_iter = 1; */
- if ( suspend_and_state(callbacks->suspend, callbacks->data, xch,
- io_fd, dom, &info) )
+ if ( callbacks->checkpoint(callbacks->data) > 0 )
{
- ERROR("Domain appears not to have suspended");
- goto out;
- }
- DPRINTF("SUSPEND shinfo %08lx\n", info.shared_info_frame);
- print_stats(xch, dom, 0, &time_stats, &shadow_stats, 1);
+ /* reset stats timer */
+ print_stats(xch, dom, 0, &time_stats, &shadow_stats, 0);
- if ( xc_shadow_control(xch, dom,
- XEN_DOMCTL_SHADOW_OP_CLEAN, HYPERCALL_BUFFER(to_send),
- dinfo->p2m_size, NULL, 0, &shadow_stats) != dinfo->p2m_size )
- {
- PERROR("Error flushing shadow PT");
- }
+ /* last_iter = 1; */
+ if ( suspend_and_state(callbacks->suspend, callbacks->data, xch,
+ io_fd, dom, &info) )
+ {
+ ERROR("Domain appears not to have suspended");
+ goto out;
+ }
+ DPRINTF("SUSPEND shinfo %08lx\n", info.shared_info_frame);
+ print_stats(xch, dom, 0, &time_stats, &shadow_stats, 1);
- goto copypages;
+ if ( xc_shadow_control(xch, dom,
+ XEN_DOMCTL_SHADOW_OP_CLEAN, HYPERCALL_BUFFER(to_send),
+ dinfo->p2m_size, NULL, 0, &shadow_stats) != dinfo->p2m_size )
+ {
+ PERROR("Error flushing shadow PT");
+ }
+
+ goto copypages;
+ }
+ else
+ rc = 1;
}
if ( tmem_saved != 0 && live )
@@ -2174,11 +2182,10 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
free(hvm_buf);
outbuf_free(&ob_pagebuf);
- errno = rc;
exit:
- DPRINTF("Save exit of domid %u with errno=%d\n", dom, errno);
+ DPRINTF("Save exit of domid %u with rc=%d\n", dom, rc);
- return !!errno;
+ return rc;
}
/*
--
>
> The migrationv2 code is just as guilty of using -1 and an undefined
> errno, although it does promise that there will be a relevant error in
> the log. It stems from the same problems around the callbacks where
> there is insufficient information passed, but also from the likes of
> {read,write}_exact() which uses -1/0 for EOF. I considered changing it
> to -1/EBADF and still not decided which is the lessor of two evils. (As
> far as I am aware, the legacy code has the same problem.)
>
> ~Andrew
>
> .
>
next prev parent reply other threads:[~2014-09-23 9:52 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-22 5:59 [RFC Patch v4 0/9] Some bugfix patches Wen Congyang
2014-09-22 5:59 ` [RFC Patch v4 1/9] copy the correct page to memory Wen Congyang
2014-09-22 14:38 ` Ian Campbell
2014-09-22 5:59 ` [RFC Patch v4 2/9] csum the correct page Wen Congyang
2014-09-22 5:59 ` [RFC Patch v4 3/9] pass correct file to qemu if we use blktap2 Wen Congyang
2014-12-11 16:45 ` George Dunlap
2014-12-13 17:06 ` Wei Liu
2014-12-15 10:18 ` Ian Campbell
2014-12-15 11:10 ` George Dunlap
2014-09-22 5:59 ` [RFC Patch v4 4/9] read nictype from xenstore Wen Congyang
2014-09-22 5:59 ` [RFC Patch v4 5/9] check if mfn is supported by IOCTL_PRIVCMD_MMAPBATCH before calling ioctl() Wen Congyang
2014-09-22 14:26 ` Ian Campbell
2014-09-23 1:40 ` Wen Congyang
2014-09-23 10:08 ` Ian Campbell
2014-09-24 1:36 ` Wen Congyang
2014-09-22 5:59 ` [RFC Patch v4 6/9] don't zero out ioreq page and handle the pended I/O after resuming Wen Congyang
2014-09-22 11:22 ` Jan Beulich
2014-09-22 5:59 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Wen Congyang
2014-09-22 7:30 ` Olaf Hering
2014-09-22 7:34 ` Wen Congyang
2014-09-22 7:46 ` Olaf Hering
2014-09-22 8:06 ` Wen Congyang
2014-09-22 8:13 ` Olaf Hering
2014-09-22 8:24 ` Wen Congyang
2014-09-22 8:41 ` Wen Congyang
2014-09-22 12:42 ` Ian Campbell
2014-09-22 13:03 ` Ian Jackson
2014-09-22 13:13 ` Ian Campbell
2014-09-22 13:57 ` [PATCH RFC 0/3] " Ian Jackson
2014-09-22 13:58 ` Ian Campbell
2014-09-22 14:00 ` Ian Campbell
2014-09-22 13:58 ` [PATCH 1/3] libxl: save helper: remove redundant declaration Ian Jackson
2014-09-22 13:58 ` [PATCH 2/3] libxl: save helper: transport errno with all callback return values Ian Jackson
2014-09-22 13:58 ` [PATCH 3/3] libxl: save/restore callbacks: enforce a useful errno value Ian Jackson
2014-09-22 14:07 ` Ian Campbell
2014-09-22 14:08 ` Ian Jackson
2014-09-22 14:06 ` [RFC Patch v4 7/9] correct xc_domain_save()'s return value Andrew Cooper
2014-09-23 2:14 ` Wen Congyang
2014-09-23 9:00 ` Andrew Cooper
2014-09-23 9:10 ` Wen Congyang
2014-09-23 9:25 ` Andrew Cooper
2014-09-23 9:29 ` Wen Congyang
2014-09-23 10:09 ` Ian Campbell
2014-09-23 9:52 ` Wen Congyang [this message]
2014-09-22 5:59 ` [RFC Patch v4 8/9] store correct format into tapdisk-params/params Wen Congyang
2014-09-22 14:37 ` Ian Campbell
2014-09-23 2:07 ` Wen Congyang
2014-09-23 10:11 ` Ian Campbell
2014-09-23 10:32 ` Wen Congyang
2014-09-22 5:59 ` [RFC Patch v4 9/9] update libxl__device_disk_from_xs_be() to support blktap device Wen Congyang
2014-09-22 15:08 ` Ian Campbell
2014-09-23 3:07 ` Wen Congyang
2014-09-23 10:12 ` Ian Campbell
2014-09-22 12:44 ` [RFC Patch v4 0/9] Some bugfix patches Ian Campbell
2014-09-22 12:56 ` Wen Congyang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=542142E6.9050408@cn.fujitsu.com \
--to=wency@cn.fujitsu.com \
--cc=Ian.Campbell@citrix.com \
--cc=Ian.Jackson@eu.citrix.com \
--cc=andrew.cooper3@citrix.com \
--cc=eddie.dong@intel.com \
--cc=laijs@cn.fujitsu.com \
--cc=olaf@aepfle.de \
--cc=xen-devel@lists.xen.org \
--cc=yanghy@cn.fujitsu.com \
--cc=yunhong.jiang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.