public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: bugzilla-daemon@kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 217423] TSC synchronization issue in VM restore
Date: Fri, 23 Jun 2023 14:41:02 +0000	[thread overview]
Message-ID: <bug-217423-28872-SViA10PlQq@https.bugzilla.kernel.org/> (raw)
In-Reply-To: <bug-217423-28872@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=217423

--- Comment #4 from robert.hoo.linux@gmail.com ---
On 5/18/2023 8:48 PM, bugzilla-daemon@kernel.org wrote:
[...]
(Sorry for late response)
>> Can elaborate more on this hrtimer issue/code path?
> 
> Below are the steps in detail, I traced them via bpftrace, to simplify the
> analysis, the preemption timer on host is disabled, guest is running with
> TSC timer deadline mode.
> 
> TSC changes before save VM:
> 1 create VM/VCPU, guest TSC start from 0 (VCPU initial value)
>    host_tsc0 = 0 + offset0
> 2 pause VM after guest start finished (about 200ms)
>    host_tsc1 = guest_tsc1 + offset0
>    guest_tsc1_deadline = guest_tsc1 + expire1
> 3 save VM state
>    save guest_tsc1 by reading MSR_IA32_TSC
>    save guest_tsc1_deadline by reading MSR_IA32_TSC_DEADLINE
> 
> TSC changes in restore VM (to simplify the analysis, step 4
> and step 5 ignore the host TSC changes in restore process):
> 4 create VM/VCPU, guest TSC start from 0 (VCPU initial value)
>    host_tsc3 = 0 + offset1
> 5 restore VM state
>    set MSR_IA32_TSC by guest_tsc1
>    set MSR_IA32_TSC_DEADLINE by guest_tsc1_deadline
> 6 start VM
>    VCPU_RUN
> 
> In step 5 setting MSR_IA32_TSC, because the guest_tsc1 is within 1 second,
> KVM will take this update as TSC synchronize, then skip update offset1.
> This means the guest TSC is still at 0 (initialize value).

IIUC, here no matter synchronizing = true or false, offset will always be 
updated, i.e. __kvm_synchronize_tsc() will be called. But the offset value will 
differ.

I guess your environment is tsc_stable, then offset = kvm->arch.cur_tsc_offset, 
which is 0. That is to say, the elapsed time isn't counted in by the heuristics 
method in current code, that's the culprit.

static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
{
        ...
        offset = kvm_compute_l1_tsc_offset(vcpu, data);
        ...

        /*
         * For a reliable TSC, we can match TSC offsets, and for an unstable
         * TSC, we add elapsed time in this computation.  We could let the
         * compensation code attempt to catch up if we fall behind, but
         * it's better to try to match offsets from the beginning.
          */
        if (synchronizing &&
            vcpu->arch.virtual_tsc_khz == kvm->arch.last_tsc_khz) {
                if (!kvm_check_tsc_unstable()) {
                        offset = kvm->arch.cur_tsc_offset;
                } else {
                        u64 delta = nsec_to_cycles(vcpu, elapsed);
                        data += delta;
                        offset = kvm_compute_l1_tsc_offset(vcpu, data);
                }
                matched = true;
        }

        __kvm_synchronize_tsc(vcpu, offset, data, ns, matched);
        raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
}


>> An alternative, I think, is to bypass this directly write IA32_MSR_TSC way
>> to set/sync TSC offsets, but follow new approach introduced in your VMM by
>>
>> commit 828ca89628bfcb1b8f27535025f69dd00eb55207
>> Author: Oliver Upton <oliver.upton@linux.dev>
>> Date:   Thu Sep 16 18:15:38 2021 +0000
>>
>>       KVM: x86: Expose TSC offset controls to userspace
>>
>> ...
>>
>> Documentation/virt/kvm/devices/vcpu.rst:
>>
>> 4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET
>>
>> :Parameters: 64-bit unsigned TSC offset
>>
>> ...
>>
>> Specifies the guest's TSC offset relative to the host's TSC. The guest's
>> TSC is then derived by the following equation:
>>
>>     guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET
>>
>> The following describes a possible algorithm to use for this purpose
>> ...
> 
> "TSC counts the time during which the VM was paused.", This new feature works
> for live migration. But if we save/restore VM with snapshot, the TSC should
> be
> paused either?
> 
Not sure what's host's TSC situation when host is, say, suspended/hibernated.
VM 
Save/Restore can refer to that.
But, the key point of this new approach is to use OFFSET rather than direct TSC 
value, this is like x86 TSC_ADJUST was introduced, and is preferred.
Via this new interface,
"... Ensure that the KVM_CLOCK_REALTIME flag is set in the provided structure.
KVM will advance the VM's kvmclock to account for elapsed time since recording 
the clock values.", therefore I think it can solve your problem, rather than 
modify the ancient and heuristics code at high risk.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

      parent reply	other threads:[~2023-06-23 14:41 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-09 14:01 [Bug 217423] New: TSC synchronization issue in VM restore bugzilla-daemon
2023-05-09 16:27 ` [Bug 217423] " bugzilla-daemon
2023-05-17  6:56 ` [Bug 217423] New: " Robert Hoo
2023-05-17  6:56 ` [Bug 217423] " bugzilla-daemon
2023-05-18 12:48 ` bugzilla-daemon
2023-06-23 14:40   ` Robert Hoo
2023-06-23 14:41 ` bugzilla-daemon [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-217423-28872-SViA10PlQq@https.bugzilla.kernel.org/ \
    --to=bugzilla-daemon@kernel.org \
    --cc=kvm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox