From: Keir Fraser <keir.fraser@eu.citrix.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	'James Song' <jsong@novell.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: when timer go back in dom0 save and restore or migrate, PV domain hung
Date: Wed, 26 Nov 2008 14:26:28 +0000
Message-ID: <C5530D14.296EA%keir.fraser@eu.citrix.com>
In-Reply-To: <0A882F4D99BBF6449D58E61AAFD7EDD601E23B9A@pdsmsx502.ccr.corp.intel.com>



hrtimers add wall_to_monotonic to xtime to get a timesource that doesn't (or
shouldn't!) warp.
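
A minimal sketch of that idea, assuming a simplified timespec type and a
hypothetical set_wall_clock() helper (neither is the kernel's actual code):
whenever the wall clock is stepped, the offset is adjusted by the opposite
amount, so xtime + wall_to_monotonic does not move.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Simplified stand-in for the kernel's struct timespec (assumption). */
struct ts {
    int64_t sec;
    long    nsec;   /* kept in [0, NSEC_PER_SEC) */
};

/* Normalise so nsec stays in range; sec may be negative. */
static struct ts ts_norm(int64_t sec, int64_t nsec)
{
    while (nsec >= NSEC_PER_SEC) { nsec -= NSEC_PER_SEC; sec++; }
    while (nsec < 0)             { nsec += NSEC_PER_SEC; sec--; }
    return (struct ts){ sec, (long)nsec };
}

static struct ts xtime;              /* wall-clock time */
static struct ts wall_to_monotonic;  /* offset: monotonic minus wall */

/* Monotonic time is the sum of the two values. */
static struct ts get_monotonic(void)
{
    return ts_norm(xtime.sec + wall_to_monotonic.sec,
                   (int64_t)xtime.nsec + wall_to_monotonic.nsec);
}

/* Hypothetical helper: step the wall clock and compensate the offset. */
static void set_wall_clock(struct ts new_wall)
{
    struct ts delta = ts_norm(new_wall.sec - xtime.sec,
                              (int64_t)new_wall.nsec - xtime.nsec);

    /* Subtract the step from the offset so the sum is unchanged. */
    wall_to_monotonic = ts_norm(wall_to_monotonic.sec - delta.sec,
                                (int64_t)wall_to_monotonic.nsec - delta.nsec);
    xtime = new_wall;
    assert(xtime.nsec >= 0 && xtime.nsec < NSEC_PER_SEC);
}
```

Reading get_monotonic() before and after a backwards set_wall_clock() call
gives the same value, which is the property a timer base needs here.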

 -- Keir

On 26/11/08 14:20, "Tian, Kevin" <kevin.tian@intel.com> wrote:

> How about hrtimers? One mode is CLOCK_REALTIME, which uses getnstimeofday for
> its expiration. Once system time is changed, either on the local machine or on
> a new one, that expiration can't be adjusted. But I'm not sure whether it
> still makes sense to try hrtimers in a guest.
>  
> Thanks
> Kevin
> 
>>
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Sent: Wednesday, November 26, 2008 10:11 PM
>> To: Tian, Kevin; 'James Song'; xen-devel@lists.xensource.com
>> Subject: Re: [Xen-devel] when timer go back in dom0 save and restore or
>> migrate, PV domain hung
>>
>> The problem hasn't been fully explained, but I can say that PV guests expect
>> system time to jump across s/r and deal with that. For example, Linux
>> doesn't use Xen system time internally, but uses its progress to
>> periodically update jiffies, which does not warp across s/r.
>>
>> We have had problems corrupting wc_sec/wc_nsec in xc_domain_restore.c, but
>> that was fixed some time ago.
>> 
>>  -- Keir
>> 
>> On 26/11/08 14:00, "Tian, Kevin" <kevin.tian@intel.com> wrote:
>>
>>> This is not an s/r- or lm-specific issue. For example, system time can be
>>> changed even while a PV guest is running. Your patch only hacks the restore
>>> point once, and wc_sec can still change later when system time is changed
>>> on the fly again.
>>>
>>> IIRC, a PV guest can catch up with a wall clock change in the timer
>>> interrupt, and time_resume will sync the internally processed system time
>>> with the new system time after restore. But I'm not sure whether that's
>>> enough. Actually, the more interesting part is the uptime difference. For
>>> example, a timer whose expiration was calculated from the previous system
>>> time may wait nearly forever if uptime varies a lot between the two boxes.
>>> But I think such an issue should have been considered already, e.g. with
>>> some user tool assistance. I think Keir can comment better here.
>>>
>>> BTW, do you happen to know what exactly dom0 hangs on? Some busy loop to
>>> catch up time, or a long delay to some critical timer expiration?
>>> 
>>> Thanks,
>>> Kevin
>>> 
>>>  
>>>>
>>>> From: xen-devel-bounces@lists.xensource.com
>>>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of James Song
>>>> Sent: Tuesday, November 25, 2008 4:02 PM
>>>> To: xen-devel@lists.xensource.com
>>>> Subject: [Xen-devel] when timer go back in dom0 save and restore or
>>>> migrate, PV domain hung
>>>>
>>>> Hi,
>>>>    I find that a PV domain hangs when we take these steps:
>>>>          1. save the PV domain
>>>>          2. change the system time of the PV domain back
>>>>          3. restore the PV domain
>>>>         or
>>>>          1. migrate a PV domain from Machine A to Machine B
>>>>          2. the system time of Machine B is behind that of Machine A.
>>>>    The problem is that wc_sec changes when the system time is changed in
>>>> dom0, or when restoring on a machine whose system time is behind, but when
>>>> restoring, Xen doesn't restore the wc_sec of shared_info from xenstore and
>>>> uses the native one instead. So the guest OS hangs.
>>>> This patch addresses this issue.
>>>> 
>>>>  Thanks
>>>>  -- Song  Wei
>>>> 
>>>> diff -r a5ed0dbc829f tools/libxc/xc_domain_restore.c
>>>> --- a/tools/libxc/xc_domain_restore.c    Tue Nov 18 14:34:14 2008 +0800
>>>> +++ b/tools/libxc/xc_domain_restore.c    Fri Nov 21 17:34:15 2008 +0800
>>>> @@ -328,6 +328,16 @@
>>>>  
>>>>      /* For info only */
>>>>      nr_pfns = 0;
>>>> +    //jsong@novell.com, james song
>>>> +    memset(&domctl, 0, sizeof(domctl));
>>>> +    domctl.domain = dom;
>>>> +    domctl.cmd    = XEN_DOMCTL_restoredomain;
>>>> +    frc = do_domctl(xc_handle, &domctl);
>>>> +    if ( frc != 0 )
>>>> +    {
>>>> +        ERROR("Unable to set flag of restore.");
>>>> +        goto out;
>>>> +    }
>>>>  
>>>>      if ( read_exact(io_fd, &p2m_size, sizeof(unsigned long)) )
>>>>      {
>>>> @@ -1120,6 +1130,8 @@
>>>>  
>>>>      /* restore saved vcpu_info and arch specific info */
>>>>      MEMCPY_FIELD(new_shared_info, old_shared_info, vcpu_info);
>>>> +    MEMCPY_FIELD(new_shared_info, old_shared_info, wc_nsec);
>>>> +    MEMCPY_FIELD(new_shared_info, old_shared_info, wc_sec);
>>>>      MEMCPY_FIELD(new_shared_info, old_shared_info, arch);
>>>>  
>>>>      /* clear any pending events and the selector */
>>>> diff -r a5ed0dbc829f xen/arch/x86/time.c
>>>> --- a/xen/arch/x86/time.c    Tue Nov 18 14:34:14 2008 +0800
>>>> +++ b/xen/arch/x86/time.c    Fri Nov 21 17:34:15 2008 +0800
>>>> @@ -689,7 +689,6 @@
>>>>      wmb();
>>>>      (*version)++;
>>>>  }
>>>> -
>>>>  void update_vcpu_system_time(struct vcpu *v)
>>>>  {
>>>>      struct cpu_time        *t;
>>>> @@ -703,7 +702,6 @@
>>>>  
>>>>      if ( u->tsc_timestamp == t->local_tsc_stamp )
>>>>          return;
>>>> -
>>>>      version_update_begin(&u->version);
>>>>  
>>>>      u->tsc_timestamp      = t->local_tsc_stamp;
>>>> @@ -713,14 +711,19 @@
>>>>  
>>>>      version_update_end(&u->version);
>>>>  }
>>>> -
>>>>  void update_domain_wallclock_time(struct domain *d)
>>>>  {
>>>>      spin_lock(&wc_lock);
>>>> +    if(d->after_restore )
>>>> +    {
>>>> +        d->after_restore = 0;
>>>> +        goto out;  //jsong@novell.com
>>>> +    }
>>>>      version_update_begin(&shared_info(d, wc_version));
>>>>      shared_info(d, wc_sec)  = wc_sec + d->time_offset_seconds;
>>>>      shared_info(d, wc_nsec) = wc_nsec;
>>>>      version_update_end(&shared_info(d, wc_version));
>>>> +out:
>>>>      spin_unlock(&wc_lock);
>>>>  }
>>>>  
>>>> @@ -751,7 +754,6 @@
>>>>      u64 x;
>>>>      u32 y, _wc_sec, _wc_nsec;
>>>>      struct domain *d;
>>>> -
>>>>      x = (secs * 1000000000ULL) + (u64)nsecs - system_time_base;
>>>>      y = do_div(x, 1000000000);
>>>>  
>>>> @@ -1050,7 +1052,6 @@
>>>>  struct tm wallclock_time(void)
>>>>  {
>>>>      uint64_t seconds;
>>>> -
>>>>      if ( !wc_sec )
>>>>          return (struct tm) { 0 };
>>>>  
>>>> diff -r a5ed0dbc829f xen/common/domctl.c
>>>> --- a/xen/common/domctl.c    Tue Nov 18 14:34:14 2008 +0800
>>>> +++ b/xen/common/domctl.c    Fri Nov 21 17:34:15 2008 +0800
>>>> @@ -24,7 +24,6 @@
>>>>  #include <asm/current.h>
>>>>  #include <public/domctl.h>
>>>>  #include <xsm/xsm.h>
>>>> -
>>>>  extern long arch_do_domctl(
>>>>      struct xen_domctl *op, XEN_GUEST_HANDLE(xen_domctl_t) u_domctl);
>>>>  
>>>> @@ -315,6 +314,16 @@
>>>>          ret = 0;
>>>>      }
>>>>      break;
>>>> +    case XEN_DOMCTL_restoredomain:
>>>> +    {
>>>> +        struct domain *d;
>>>> +        if ( (d = rcu_lock_domain_by_id(op->domain)) == NULL )
>>>> +            break;
>>>> +
>>>> +        d->after_restore = 1;
>>>> +        rcu_unlock_domain(d);
>>>> +        break;
>>>> +    }
>>>>  
>>>>      case XEN_DOMCTL_createdomain:
>>>>      {
>>>> diff -r a5ed0dbc829f xen/include/public/domctl.h
>>>> --- a/xen/include/public/domctl.h    Tue Nov 18 14:34:14 2008 +0800
>>>> +++ b/xen/include/public/domctl.h    Fri Nov 21 17:34:15 2008 +0800
>>>> @@ -61,6 +61,7 @@
>>>>  #define XEN_DOMCTL_destroydomain      2
>>>>  #define XEN_DOMCTL_pausedomain        3
>>>>  #define XEN_DOMCTL_unpausedomain      4
>>>> +#define XEN_DOMCTL_restoredomain     51
>>>>  #define XEN_DOMCTL_resumedomain      27
>>>>  
>>>>  #define XEN_DOMCTL_getdomaininfo      5
>>>> diff -r a5ed0dbc829f xen/include/xen/sched.h
>>>> --- a/xen/include/xen/sched.h    Tue Nov 18 14:34:14 2008 +0800
>>>> +++ b/xen/include/xen/sched.h    Fri Nov 21 17:34:15 2008 +0800
>>>> @@ -231,6 +231,7 @@
>>>>       * cause a deadlock.  Acquirers don't spin waiting; they preempt.
>>>>       */
>>>>      spinlock_t hypercall_deadlock_mutex;
>>>> +    int after_restore;   //jsong@novell.com
>>>>  };
>>>>  
>>>>  struct domain_setup_info
>>>> ---------------------------------------------------------------------------------------------
>>>>  Thanks
>>>> --Song wei
>>>> 
>>> 
>> 
> 
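
For reference, the quoted patch writes wc_sec/wc_nsec between
version_update_begin() and version_update_end(). A rough sketch of how a
reader can take a consistent snapshot under that versioning scheme follows.
The struct is a hypothetical, cut-down stand-in for the shared_info fields,
and the C11 fences stand in for Xen's wmb()/rmb(); this is not the actual
guest code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, cut-down view of the wallclock fields in shared_info. */
struct wallclock {
    uint32_t wc_version;  /* odd while an update is in progress */
    uint32_t wc_sec;
    uint32_t wc_nsec;
};

/* Writer side: bump version, write fields, bump version again. */
static void wallclock_write(volatile struct wallclock *wc,
                            uint32_t sec, uint32_t nsec)
{
    wc->wc_version++;                           /* now odd */
    atomic_thread_fence(memory_order_release);  /* stands in for wmb() */
    wc->wc_sec  = sec;
    wc->wc_nsec = nsec;
    atomic_thread_fence(memory_order_release);  /* stands in for wmb() */
    wc->wc_version++;                           /* even again */
}

/* Reader side: retry until one even version brackets both field reads. */
static void wallclock_read(const volatile struct wallclock *wc,
                           uint32_t *sec, uint32_t *nsec)
{
    uint32_t v;

    do {
        v = wc->wc_version;
        atomic_thread_fence(memory_order_acquire);  /* stands in for rmb() */
        *sec  = wc->wc_sec;
        *nsec = wc->wc_nsec;
        atomic_thread_fence(memory_order_acquire);
    } while ((v & 1) || v != wc->wc_version);
}
```

The reader never sees a half-updated pair, but it still sees whatever value
was last written, so a wallclock that steps backwards is observed as such.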




