From: Keir Fraser <keir.fraser@eu.citrix.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
'James Song' <jsong@novell.com>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: when timer go back in dom0 save and restore or migrate, PV domain hung
Date: Wed, 26 Nov 2008 14:11:16 +0000
Message-ID: <C5530984.296E3%keir.fraser@eu.citrix.com>
In-Reply-To: <0A882F4D99BBF6449D58E61AAFD7EDD601E23B98@pdsmsx502.ccr.corp.intel.com>
The problem hasn't been fully explained, but I can say that PV guests expect
system time to jump across s/r and deal with that. For example, Linux
doesn't use Xen system time internally, but uses its progress to
periodically update jiffies, which does not warp across s/r.
We have had problems corrupting wc_sec/wc_nsec in xc_domain_restore.c, but
that was fixed some time ago.
-- Keir
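As a rough illustration of the jiffies point above, the sketch below models a guest that advances a tick counter from the progress of system time rather than from its absolute value. It is a simplified model, not the actual Linux code: `tick_state`, `account_ticks`, `resync_after_restore` and the 10 ms tick length are all names and values assumed for the example.

```c
#include <stdint.h>

#define NS_PER_TICK 10000000ULL  /* assumed 100 Hz tick, for illustration only */

/* Hypothetical guest-side state; not a real kernel structure. */
struct tick_state {
    uint64_t processed_ns;  /* system time already converted into ticks */
    uint64_t jiffies;       /* tick counter; must never go backwards */
};

/* Convert elapsed system time into whole ticks; returns the ticks added. */
static uint64_t account_ticks(struct tick_state *s, uint64_t now_ns)
{
    uint64_t added = 0;

    /* If system time is behind what was already processed, nothing advances. */
    if (now_ns < s->processed_ns)
        return 0;

    while (now_ns - s->processed_ns >= NS_PER_TICK) {
        s->processed_ns += NS_PER_TICK;
        s->jiffies++;
        added++;
    }
    return added;
}

/* After save/restore, restart accounting from the new host's system time. */
static void resync_after_restore(struct tick_state *s, uint64_t now_ns)
{
    s->processed_ns = now_ns;
}
```

In this model, a restore onto a host whose system time is lower leaves `account_ticks` returning 0 until system time catches up with `processed_ns`, unless `resync_after_restore` is called first. That is one way a guest could appear hung; whether it matches the reported hang is not established in this thread.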
On 26/11/08 14:00, "Tian, Kevin" <kevin.tian@intel.com> wrote:
> This is not an s/r- or lm-specific issue. For example, system time can be
> changed even while a PV guest is running. Your patch only hacks the restore
> point once, and wc_sec can still be changed later when system time is changed
> on the fly again.
>
> IIRC, a PV guest can catch up with a wall clock change in the timer interrupt,
> and time_resume will sync the internally processed system time with the new
> system time after restore. But I'm not sure whether that's enough. Actually,
> the more interesting issue is the uptime difference. For example, a timer whose
> expiration was calculated from the previous system time may wait almost
> indefinitely if the uptimes of the two boxes differ a lot. But I think such an
> issue should have been considered already, e.g. with some user tool
> assistance. I think Keir can comment better here.
>
> BTW, do you happen to know what exactly dom0 hangs on? Is it in some busy
> loop trying to catch up on time, or in a long delay waiting for some critical
> timer to expire?
>
> Thanks,
> Kevin
>
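The uptime concern in Kevin's message can be made concrete with a little arithmetic. The sketch below assumes system time counts nanoseconds since the host booted and that a guest timer stores an absolute deadline in that time base; `ns_until` and `rebase_deadline` are hypothetical helpers written for this example, not functions from Xen or Linux.

```c
#include <stdint.h>

/* Nanoseconds until an absolute deadline; 0 if it has already passed. */
static uint64_t ns_until(uint64_t deadline_ns, uint64_t now_ns)
{
    return deadline_ns > now_ns ? deadline_ns - now_ns : 0;
}

/*
 * Rebase a deadline from the old host's system time onto the new host's,
 * preserving the interval that was still remaining at save time.
 */
static uint64_t rebase_deadline(uint64_t deadline_ns,
                                uint64_t old_now_ns,
                                uint64_t new_now_ns)
{
    return new_now_ns + ns_until(deadline_ns, old_now_ns);
}
```

With assumed figures of 30 days of uptime on the source host (2,592,000 s), 1 hour on the target (3,600 s), and a timer due 10 ms after the save, the stale deadline is about 2,588,400 s (roughly 29.96 days) away on the target host. After rebasing, it is 10 ms away again.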
>>
>>
>>
>> From: xen-devel-bounces@lists.xensource.com
>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of James Song
>> Sent: Tuesday, November 25, 2008 4:02 PM
>> To: xen-devel@lists.xensource.com
>> Subject: [Xen-devel] when timer go back in dom0 save and restore or migrate,
>> PV domain hung
>>
>>
>> Hi,
>> I find that a PV domain hangs when we take these steps:
>> 1. save the PV domain
>> 2. set the system time of the PV domain back
>> 3. restore the PV domain
>> or
>> 1. migrate a PV domain from Machine A to Machine B
>> 2. the system time of Machine B is behind that of Machine A.
>> The problem is that wc_sec changes when the system time is changed in dom0,
>> or when restoring on a machine whose system time is behind, but when
>> restoring, Xen doesn't restore the wc_sec of shared_info from xenstore and
>> uses the native one instead. So the guest OS will hang.
>> This patch addresses the issue.
>>
>> Thanks
>> -- Song Wei
>>
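For context on the fields discussed here: wc_sec and wc_nsec in shared_info are guarded by wc_version, which the hypervisor bumps before and after each update (the version_update_begin()/version_update_end() calls visible in the patch below). A minimal sketch of that protocol follows, using a stand-in `struct wallclock` and hypothetical `wallclock_publish`/`wallclock_read` helpers rather than the real Xen or guest code, and C11 fences in place of the kernel barrier macros.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-in for the wallclock fields of shared_info. */
struct wallclock {
    volatile uint32_t wc_version;  /* odd while an update is in progress */
    volatile uint32_t wc_sec;
    volatile uint32_t wc_nsec;
};

/* Writer side, in the spirit of version_update_begin()/version_update_end(). */
static void wallclock_publish(struct wallclock *w, uint32_t sec, uint32_t nsec)
{
    w->wc_version++;                            /* now odd: readers must retry */
    atomic_thread_fence(memory_order_seq_cst);
    w->wc_sec = sec;
    w->wc_nsec = nsec;
    atomic_thread_fence(memory_order_seq_cst);
    w->wc_version++;                            /* even again: values are stable */
}

/* Reader side: retry until an even, unchanged version brackets the read. */
static void wallclock_read(const struct wallclock *w,
                           uint32_t *sec, uint32_t *nsec)
{
    uint32_t v;

    do {
        v = w->wc_version;
        atomic_thread_fence(memory_order_seq_cst);
        *sec = w->wc_sec;
        *nsec = w->wc_nsec;
        atomic_thread_fence(memory_order_seq_cst);
    } while ((v & 1) || v != w->wc_version);
}
```

The point of the version counter is that a reader never acts on a half-written sec/nsec pair; it does not, by itself, protect the guest from the published value moving backwards.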
>> diff -r a5ed0dbc829f tools/libxc/xc_domain_restore.c
>> --- a/tools/libxc/xc_domain_restore.c Tue Nov 18 14:34:14 2008 +0800
>> +++ b/tools/libxc/xc_domain_restore.c Fri Nov 21 17:34:15 2008 +0800
>> @@ -328,6 +328,16 @@
>>
>> /* For info only */
>> nr_pfns = 0;
>> + //jsong@novell.com, james song
>> + memset(&domctl, 0, sizeof(domctl));
>> + domctl.domain = dom;
>> + domctl.cmd = XEN_DOMCTL_restoredomain;
>> + frc = do_domctl(xc_handle, &domctl);
>> + if ( frc != 0 )
>> + {
>> + ERROR("Unable to set flag of restore.");
>> + goto out;
>> + }
>>
>> if ( read_exact(io_fd, &p2m_size, sizeof(unsigned long)) )
>> {
>> @@ -1120,6 +1130,8 @@
>>
>> /* restore saved vcpu_info and arch specific info */
>> MEMCPY_FIELD(new_shared_info, old_shared_info, vcpu_info);
>> + MEMCPY_FIELD(new_shared_info, old_shared_info, wc_nsec);
>> + MEMCPY_FIELD(new_shared_info, old_shared_info, wc_sec);
>> MEMCPY_FIELD(new_shared_info, old_shared_info, arch);
>>
>> /* clear any pending events and the selector */
>> diff -r a5ed0dbc829f xen/arch/x86/time.c
>> --- a/xen/arch/x86/time.c Tue Nov 18 14:34:14 2008 +0800
>> +++ b/xen/arch/x86/time.c Fri Nov 21 17:34:15 2008 +0800
>> @@ -689,7 +689,6 @@
>> wmb();
>> (*version)++;
>> }
>> -
>> void update_vcpu_system_time(struct vcpu *v)
>> {
>> struct cpu_time *t;
>> @@ -703,7 +702,6 @@
>>
>> if ( u->tsc_timestamp == t->local_tsc_stamp )
>> return;
>> -
>> version_update_begin(&u->version);
>>
>> u->tsc_timestamp = t->local_tsc_stamp;
>> @@ -713,14 +711,19 @@
>>
>> version_update_end(&u->version);
>> }
>> -
>> void update_domain_wallclock_time(struct domain *d)
>> {
>> spin_lock(&wc_lock);
>> + if(d->after_restore )
>> + {
>> + d->after_restore = 0;
>> + goto out; //jsong@novell.com
>> + }
>> version_update_begin(&shared_info(d, wc_version));
>> shared_info(d, wc_sec) = wc_sec + d->time_offset_seconds;
>> shared_info(d, wc_nsec) = wc_nsec;
>> version_update_end(&shared_info(d, wc_version));
>> +out:
>> spin_unlock(&wc_lock);
>> }
>>
>> @@ -751,7 +754,6 @@
>> u64 x;
>> u32 y, _wc_sec, _wc_nsec;
>> struct domain *d;
>> -
>> x = (secs * 1000000000ULL) + (u64)nsecs - system_time_base;
>> y = do_div(x, 1000000000);
>>
>> @@ -1050,7 +1052,6 @@
>> struct tm wallclock_time(void)
>> {
>> uint64_t seconds;
>> -
>> if ( !wc_sec )
>> return (struct tm) { 0 };
>>
>> diff -r a5ed0dbc829f xen/common/domctl.c
>> --- a/xen/common/domctl.c Tue Nov 18 14:34:14 2008 +0800
>> +++ b/xen/common/domctl.c Fri Nov 21 17:34:15 2008 +0800
>> @@ -24,7 +24,6 @@
>> #include <asm/current.h>
>> #include <public/domctl.h>
>> #include <xsm/xsm.h>
>> -
>> extern long arch_do_domctl(
>> struct xen_domctl *op, XEN_GUEST_HANDLE(xen_domctl_t) u_domctl);
>>
>> @@ -315,6 +314,16 @@
>> ret = 0;
>> }
>> break;
>> + case XEN_DOMCTL_restoredomain:
>> + {
>> + struct domain *d;
>> + if ( (d = rcu_lock_domain_by_id(op->domain)) == NULL )
>> + break;
>> +
>> + d->after_restore = 1;
>> + rcu_unlock_domain(d);
>> + break;
>> + }
>>
>> case XEN_DOMCTL_createdomain:
>> {
>> diff -r a5ed0dbc829f xen/include/public/domctl.h
>> --- a/xen/include/public/domctl.h Tue Nov 18 14:34:14 2008 +0800
>> +++ b/xen/include/public/domctl.h Fri Nov 21 17:34:15 2008 +0800
>> @@ -61,6 +61,7 @@
>> #define XEN_DOMCTL_destroydomain 2
>> #define XEN_DOMCTL_pausedomain 3
>> #define XEN_DOMCTL_unpausedomain 4
>> +#define XEN_DOMCTL_restoredomain 51
>> #define XEN_DOMCTL_resumedomain 27
>>
>> #define XEN_DOMCTL_getdomaininfo 5
>> diff -r a5ed0dbc829f xen/include/xen/sched.h
>> --- a/xen/include/xen/sched.h Tue Nov 18 14:34:14 2008 +0800
>> +++ b/xen/include/xen/sched.h Fri Nov 21 17:34:15 2008 +0800
>> @@ -231,6 +231,7 @@
>> * cause a deadlock. Acquirers don't spin waiting; they preempt.
>> */
>> spinlock_t hypercall_deadlock_mutex;
>> + int after_restore; //jsong@novell.com
>> };
>>
>> struct domain_setup_info
>> -----------------------------------------------------------------------------
>> ----------------
>> Thanks
>> --Song wei
>>
>
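The effect of the after_restore flag in the quoted patch, and the objection that it only protects the restore point once, can be modelled in a few lines. This is a simplified sketch, not the hypervisor code: `domain_model` and `update_wallclock_model` are hypothetical names, and locking, wc_nsec, the version counter and time_offset_seconds are left out.

```c
#include <stdint.h>

/* Hypothetical, cut-down view of the per-domain state touched by the patch. */
struct domain_model {
    uint32_t wc_sec;         /* wallclock seconds the guest sees */
    int      after_restore;  /* one-shot flag set by the restore domctl */
};

/* Model of update_domain_wallclock_time() with the patch applied. */
static void update_wallclock_model(struct domain_model *d, uint32_t host_wc_sec)
{
    if (d->after_restore) {
        d->after_restore = 0;   /* skip exactly one update, then behave as before */
        return;
    }
    d->wc_sec = host_wc_sec;    /* overwrite with the host's current value */
}
```

In this model, the first update after restore leaves the saved wc_sec in place, but the next one replaces it with the host's value, so a later change of host time still reaches the guest.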