xen-devel.lists.xenproject.org archive mirror
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	xen-devel@lists.xen.org, David Vrabel <david.vrabel@citrix.com>,
	Alex Bligh <alex@alex.org.uk>,
	Anthony PERARD <anthony.perard@citrix.com>,
	Diana Crisan <dcrisan@flexiant.com>
Subject: Re: HVM Migration of domU on Qemu-upstream DM causes stuck system clock with ACPI
Date: Mon, 3 Jun 2013 10:37:49 +0200
Message-ID: <51AC55DD.7000507@citrix.com>
In-Reply-To: <51A8BD48.6060104@citrix.com>

On 31/05/13 17:10, Roger Pau Monné wrote:
> On 31/05/13 15:07, George Dunlap wrote:
>> On 31/05/13 13:40, Ian Campbell wrote:
>>> On Fri, 2013-05-31 at 12:57 +0100, Alex Bligh wrote:
>>>> --On 31 May 2013 12:49:18 +0100 George Dunlap <george.dunlap@eu.citrix.com> wrote:
>>>>
>>>>> No -- Linux is asking, "Can you give me an alarm in 5ns?"  And Xen is
>>>>> saying, "No".  So Linux is saying, "OK, how about 5us?  10us?  20us?"
>>>>> By the time it reaches 4ms, Linux has had enough, and says, "If this
>>>>> timer is so bad that it can't give me an event within 4ms it just
>>>>> won't use timers at all, thank you very much."
>>>>>
>>>>> The problem appears to be that Linux thinks it's asking for something
>>>>> in the future, but is actually asking for something in the past.  It
>>>>> must look at its watch just before the final domain pause, and then
>>>>> asks for the time just after the migration resumes on the other side.
>>>>> So it doesn't realize that 10ms (or something) has already passed, and
>>>>> that it's actually asking for a timer in the past.  The Xen timer
>>>>> driver in Linux specifically asks Xen for times set in the past to
>>>>> return an error.  Xen is returning an error because the time is in the
>>>>> past, Linux thinks it's getting an error because the time is too close
>>>>> in the future and tries asking a little further away.
>>>>>
>>>>> Unfortunately I think this is something which needs to be fixed on the
>>>>> Linux side; I don't really see how we can work around it in Xen.
>>>> I don't think fixing it only on the Linux side is a great idea, not
>>>> least as it makes any current Linux image not live migrateable
>>>> reliably.  That's pretty horrible.
>>> Ultimately though a guest bug is a guest bug, we don't really want to be
>>> filling the hypervisor with lots of quirky exceptions to interfaces in
>>> order to work around them, otherwise where does it end?
>>>
>>> A kernel side fix can be pushed to the distros fairly aggressively (it's
>>> mostly just a case of getting an upstream stable backport then filing
>>> bugs with the main ones, we've done it before) and for users upgrading
>>> the kernel via the distros is really not so hard and mostly reuses the
>>> process they must have in place for guest kernel security updates and
>>> other important kernel bugs anyway.
>>
>> In any case, it seems I was wrong -- Linux does "look at its watch"
>> every time it asks.
>>
>> The generic timer interface is "set me a timer N nanoseconds in the
>> future"; the Xen timer implementation executes
>> pvclock_clocksource_read() and adds the delta.  So it may well actually
>> be a bug in Xen.
>>
>> Stand by for further investigation...
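
For reference, the retry behaviour George described earlier in the thread
can be sketched roughly as below. This is only a model of that description,
not the Linux clockevents code: set_timer(), program_with_backoff(), the
doubling step and the 4ms give-up limit are all assumptions taken from the
wording above, and the real kernel may grow the delta differently.

```c
#include <assert.h>
#include <stdint.h>

#define GIVE_UP_NS 4000000ULL  /* 4ms: the limit mentioned in the thread */

/* Hypothetical stand-in for the hypervisor call: it rejects any deadline
 * that is not strictly ahead of the hypervisor's own clock. */
static int set_timer(uint64_t deadline_ns, uint64_t hv_now_ns)
{
    return deadline_ns > hv_now_ns ? 0 : -1;
}

/* Model of the guest side: ask for "now + delta" and, on rejection, retry
 * with a larger delta. Returns the accepted delta in ns, or 0 if the
 * guest gives up once the delta would exceed GIVE_UP_NS. */
static uint64_t program_with_backoff(uint64_t guest_now_ns,
                                     uint64_t hv_now_ns,
                                     uint64_t first_delta_ns)
{
    uint64_t delta;

    /* Doubling zero would never make progress. */
    assert(first_delta_ns > 0);

    for (delta = first_delta_ns; delta <= GIVE_UP_NS; delta *= 2) {
        if (set_timer(guest_now_ns + delta, hv_now_ns) == 0)
            return delta;
    }
    return 0;  /* gave up: the device looks unusable to the guest */
}
```

With both clocks at 1ms the first 5us request is accepted. If the
hypervisor clock is 10ms ahead of what the guest believes, every request
up to the 4ms limit lands in the hypervisor's past and the guest gives up.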

I investigated further over the weekend. Although I'm not familiar with
the timer code in Xen, I think the problem is a mismatch between two code
paths. In __update_vcpu_system_time, when Xen detects that the guest is
using a vtsc, it adds offsets to the time passed to the guest. In
VCPUOP_set_singleshot_timer, however, Xen compares the time passed in by
the guest against NOW(), which is just the Xen uptime, without taking
any offsets into account.
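
To make the suspected mismatch concrete, here is a small stand-alone model.
It is not Xen code: struct time_model, guest_visible_time() and
set_singleshot_raw() are hypothetical names, and the sign and size of the
offset are chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not Xen's actual data structures. */
struct time_model {
    uint64_t xen_now_ns;       /* what NOW() would return: host uptime  */
    int64_t  guest_offset_ns;  /* offset added to the time the guest sees */
};

/* Time as presented to the guest, with the offset applied. */
static uint64_t guest_visible_time(const struct time_model *m)
{
    int64_t t = (int64_t)m->xen_now_ns + m->guest_offset_ns;

    assert(t >= 0);  /* the model assumes guest time never goes negative */
    return (uint64_t)t;
}

/* The check as described above: the guest's deadline is compared with the
 * raw uptime, ignoring the offset. Returns 0 if accepted, -1 if the
 * deadline is considered to be in the past. */
static int set_singleshot_raw(const struct time_model *m,
                              uint64_t guest_deadline_ns)
{
    return guest_deadline_ns > m->xen_now_ns ? 0 : -1;
}
```

For example, with an uptime of 20s and an offset of -10s, the guest sees
10s. A request for "guest time + 1ms" is then about 10s behind NOW() and is
rejected, even though from the guest's point of view it is in the future.
With a zero offset the same request is accepted.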

This only happens after migration because Xen automatically switches to
vtsc when it detects that the guest has been migrated. I'm currently
setting up a Linux PVHVM guest on shared storage to do some testing. One
possible workaround is to add tsc_mode="native_paravirt" to the PVHVM
config file; another is to fix VCPUOP_set_singleshot_timer so that it
takes the vtsc offsets into account and correctly translates the time
passed in by the guest.
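
The second option could look roughly like the sketch below. Again this is
only a model under assumptions, not a patch: guest_to_xen_time() and
set_singleshot_translated() are hypothetical helpers, and the real offsets
would have to come from wherever Xen keeps the vtsc state.

```c
#include <assert.h>
#include <stdint.h>

/* Undo the offset that was added when time was handed to the guest, so
 * that the deadline is expressed in the same terms as NOW(). */
static uint64_t guest_to_xen_time(uint64_t guest_ns, int64_t guest_offset_ns)
{
    int64_t t = (int64_t)guest_ns - guest_offset_ns;

    assert(t >= 0);  /* the model assumes the result fits Xen's clock */
    return (uint64_t)t;
}

/* Same check as before, but on the translated deadline. Returns 0 if the
 * deadline is accepted, -1 if it really is in the past. */
static int set_singleshot_translated(uint64_t xen_now_ns,
                                     int64_t guest_offset_ns,
                                     uint64_t guest_deadline_ns)
{
    uint64_t deadline = guest_to_xen_time(guest_deadline_ns, guest_offset_ns);

    return deadline > xen_now_ns ? 0 : -1;
}
```

With an uptime of 20s and an offset of -10s, a guest deadline of
10s + 1ms translates to 20s + 1ms and is accepted, while a guest deadline
just before 10s is still correctly rejected as being in the past.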

