xen-devel.lists.xenproject.org archive mirror
* [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
@ 2011-04-13 16:25 Tim M
  2011-04-13 18:34 ` Keir Fraser
  0 siblings, 1 reply; 6+ messages in thread
From: Tim M @ 2011-04-13 16:25 UTC (permalink / raw)
  To: xen-devel

It was suggested that I submit this to xen-devel in addition to the bug report.


I am having the exact problem described in bug 1282. The Red Hat 5 errata for
Xen 3 describes the problem nicely, so I will quote it:

xen calculates its running time by adding the hypervisor's up-time to the
hypervisor's boot-time record. In live migrations of para-virtualized
guests, however, the guest would over-write the new hypervisor's boot-time
record with the boot-time of the previous hypervisor. This caused
time-dependent processes on the guests to fail 
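
As a rough illustration of the arithmetic the errata describes, the sketch
below adds a boot-time record to an up-time and shows how far the result is
off when the old host's boot-time record is combined with the new host's
up-time. The helper name wallclock and all of the numbers are assumptions
made up for the example; they are not taken from Xen or from this report.

```shell
# Hypothetical helper: running time = boot-time record + up-time (seconds).
wallclock() {
    echo $(( $1 + $2 ))
}

# Illustrative numbers only.
old_boot=1302700000   # boot-time record of the source hypervisor
new_boot=1302700360   # target hypervisor booted 360 s later
new_uptime=1000       # up-time of the target hypervisor at migration

correct=$(wallclock "$new_boot" "$new_uptime")   # new record + new up-time
stale=$(wallclock "$old_boot" "$new_uptime")     # old record + new up-time
echo "correct=$correct stale=$stale error=$(( correct - stale ))s"
```

With a stale boot-time record the computed time is off by exactly the
difference between the two hosts' boot times.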

This bug was apparently fixed in 3.1.1
(http://xenbits.xen.org/hg/xen-unstable.hg/rev/359707941ae8) but I am having
the issue now with Xen 4.0.1 on Debian Squeeze.

Did something change with the migrate/restore process so the previous fix no
longer applies?

Thanks in advance for any help


* Re: [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
  2011-04-13 16:25 [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record Tim M
@ 2011-04-13 18:34 ` Keir Fraser
  2011-04-13 19:31   ` Tim M
  0 siblings, 1 reply; 6+ messages in thread
From: Keir Fraser @ 2011-04-13 18:34 UTC (permalink / raw)
  To: Tim M, xen-devel

On 13/04/2011 17:25, "Tim M" <bugs@linuxrehab.com> wrote:

> It was suggested that I submit this to xen-devel in addition to the bug
> report.
> 
> 
> I am having the exact problem described in bug 1282. The Red Hat 5 errata for
> Xen 3 describes the problem nicely, so I will quote it:
> 
> xen calculates its running time by adding the hypervisor's up-time to the
> hypervisor's boot-time record. In live migrations of para-virtualized
> guests, however, the guest would over-write the new hypervisor's boot-time
> record with the boot-time of the previous hypervisor. This caused
> time-dependent processes on the guests to fail
> 
> This bug was apparently fixed in 3.1.1
> (http://xenbits.xen.org/hg/xen-unstable.hg/rev/359707941ae8) but I am having
> the issue now with Xen 4.0.1 on Debian Squeeze.
> 
> Did something change with the migrate/restore process so the previous fix no
> longer applies?

The fix is still there, albeit in a modified form since the restore code has
changed quite a bit since Xen 3. Can you reliably repro this, with any PV
guest?

 -- Keir

> Thanks in advance for any help


* Re: [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
  2011-04-13 18:34 ` Keir Fraser
@ 2011-04-13 19:31   ` Tim M
  2011-04-13 20:00     ` Keir Fraser
  0 siblings, 1 reply; 6+ messages in thread
From: Tim M @ 2011-04-13 19:31 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Wed, Apr 13, 2011 at 07:34:01PM +0100, Keir Fraser wrote:
> The fix is still there, albeit in a modified form since the restore code has
> changed quite a bit since Xen 3. Can you reliably repro this, with any PV
> guest?
> 
>  -- Keir

This is 100% reproducible. Every time I migrate from a host with a longer
uptime than the target host, the VM's clock freezes for as long as the uptime
difference. Migrating from a host with a shorter uptime to a host with a
longer one causes no problem.

As a demonstration, I have two dom0 hosts that have roughly 6 minutes
difference in uptime. I started a guest VM on the dom0 host with the longer
uptime, then SSH'd to the guest and ran this command:

while true ; do date ; sleep 5 ; done

Next I initiated a live migration and this is what the output of the command
looks like (with commentary added):

Wed Apr 13 12:07:48 PDT 2011
Wed Apr 13 12:07:53 PDT 2011
Wed Apr 13 12:07:58 PDT 2011
Wed Apr 13 12:08:03 PDT 2011  [ migration happens here and SSH to the guest "freezes" for about 6 min ]
Wed Apr 13 12:13:55 PDT 2011
Wed Apr 13 12:14:00 PDT 2011
Wed Apr 13 12:14:05 PDT 2011


I have only tried Ubuntu 10.04.2 guests running the 2.6.32 and 2.6.35 server
kernel packages but, as mentioned, this happens every time.
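
The manual check above can be automated along these lines. This is only a
sketch: the function names stalled and watch_clock and the 2 second
tolerance are assumptions, and it relies on date +%s, which GNU and BSD date
support. It compares consecutive samples and reports when a sleep overran,
as in the output above, where the samples at 12:08:03 and 12:13:55 are 352
seconds apart instead of 5.

```shell
# Succeeds (exit status 0) when two consecutive samples are further apart
# than the sleep interval plus a tolerance, i.e. when the sleep overran.
stalled() {
    prev=$1 now=$2 interval=$3 tolerance=$4
    [ $(( now - prev )) -gt $(( interval + tolerance )) ]
}

# Sample the clock every $1 seconds (default 5) and report any stall.
watch_clock() {
    interval=${1:-5}
    prev=$(date +%s)
    while true; do
        sleep "$interval"
        now=$(date +%s)
        if stalled "$prev" "$now" "$interval" 2; then
            echo "stall: $(( now - prev ))s between samples (expected ~${interval}s)"
        fi
        prev=$now
    done
}

# Run inside the guest before starting the migration:
#   watch_clock 5
```

Only the function definitions are executed when the file is sourced; the
loop starts when watch_clock is called explicitly.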


* Re: [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
  2011-04-13 19:31   ` Tim M
@ 2011-04-13 20:00     ` Keir Fraser
  2011-04-14  6:40       ` Ian Campbell
  0 siblings, 1 reply; 6+ messages in thread
From: Keir Fraser @ 2011-04-13 20:00 UTC (permalink / raw)
  To: Tim M; +Cc: xen-devel

On 13/04/2011 20:31, "Tim M" <bugs@linuxrehab.com> wrote:

> Next I initiated a live migration and this is what the output of the command
> looks like (with commentary added):
> 
> Wed Apr 13 12:07:48 PDT 2011
> Wed Apr 13 12:07:53 PDT 2011
> Wed Apr 13 12:07:58 PDT 2011
> Wed Apr 13 12:08:03 PDT 2011  [ migration happens here and SSH to the guest
> "freezes" for about 6 min ]
> Wed Apr 13 12:13:55 PDT 2011
> Wed Apr 13 12:14:00 PDT 2011
> Wed Apr 13 12:14:05 PDT 2011

The wallclock looks correct after the migration, however, and the wallclock
is what the patch you referred to was fixing.

> I have only tried Ubuntu 10.04.2 guests running the 2.6.32 and 2.6.35 server
> kernel packages but, as mentioned, this happens every time.

I think this is a domU kernel bug, see a similar report here for example:
http://lists.xensource.com/archives/html/xen-devel/2010-10/msg00057.html
So it could be a common symptom in a range of Debian/Ubuntu kernels.

 -- Keir


* Re: [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
  2011-04-13 20:00     ` Keir Fraser
@ 2011-04-14  6:40       ` Ian Campbell
  2011-04-14 20:44         ` Tim M
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Campbell @ 2011-04-14  6:40 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Tim M, xen-devel@lists.xensource.com

On Wed, 2011-04-13 at 21:00 +0100, Keir Fraser wrote:
> On 13/04/2011 20:31, "Tim M" <bugs@linuxrehab.com> wrote:

> > I have only tried Ubuntu 10.04.2 guests running the 2.6.32 and 2.6.35 server
> > kernel packages but, as mentioned, this happens every time.
> 
> I think this is a domU kernel bug, see a similar report here for example:
> http://lists.xensource.com/archives/html/xen-devel/2010-10/msg00057.html
> So it could be a common symptom in a range of Debian/Ubuntu kernels.

This issue was introduced in one of the upstream longterm 2.6.32.y
kernels (by 1345126c761f in v2.6.32.16, I think). It was fixed upstream
in e7a3481c0246 "x86/pvclock: Zero last_value on resume" from
v2.6.37-rc6, which was added to the longterm 2.6.32.y branch as
595b62a8acfb in v2.6.32.30. It appears to have also gone into longterm
v2.6.35.12 as ac9a0f1a28f5.
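
One way to picture why the guest stalls for the uptime difference is the
simplified model below. It assumes, going by the title of the fix, that the
guest keeps a last_value and never returns a clock reading lower than it.
This is a sketch of that idea and not the kernel code; clock_read, reading
and the numbers are illustrative assumptions.

```shell
last_value=0

# Monotonic clamp: never return less than the largest value seen so far.
# The result is stored in $reading instead of being echoed, so the update
# to last_value is not lost in a subshell.
clock_read() {
    if [ "$1" -gt "$last_value" ]; then
        last_value=$1
    fi
    reading=$last_value
}

clock_read 1000; before=$reading    # source host reports system time 1000 s
clock_read 640;  stuck=$reading     # target host booted 360 s later
last_value=0                        # effect of zeroing last_value on resume
clock_read 640;  resumed=$reading
echo "before=$before stuck=$stuck resumed=$resumed"
```

In this model the reading stays pinned at the old value until the target
host's time catches up, which would match a freeze as long as the uptime
difference, and it only appears when the source host has the longer uptime.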

This issue is fixed in the 2.6.32-31 package currently in Debian stable.

I can't speak for Ubuntu. You should contact them.

Ian.


* Re: [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
  2011-04-14  6:40       ` Ian Campbell
@ 2011-04-14 20:44         ` Tim M
  0 siblings, 0 replies; 6+ messages in thread
From: Tim M @ 2011-04-14 20:44 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Keir Fraser, xen-devel@lists.xensource.com

On Thu, Apr 14, 2011 at 07:40:08AM +0100, Ian Campbell wrote:
> This issue is fixed in the 2.6.32-31 package currently in Debian stable.
 
I installed a DomU guest with Debian and the suggested kernel, and live
migrations now work properly. Thanks for all the help!

