Re: KVM: x86: update masterclock when kvmclock_offset is calculated

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Marcelo Tosatti <mtosatti@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm-devel <kvm@vger.kernel.org>, Gleb Natapov <gleb@redhat.com>
Subject: Re: KVM: x86: update masterclock when kvmclock_offset is calculated
Date: Tue, 3 Sep 2013 00:03:51 -0300	[thread overview]
Message-ID: <20130903030351.GA3167@amt.cnet> (raw)
In-Reply-To: <521DEF00.2030200@redhat.com>

On Wed, Aug 28, 2013 at 02:37:20PM +0200, Paolo Bonzini wrote:
> Il 28/08/2013 04:52, Marcelo Tosatti ha scritto:
> > On Thu, Aug 22, 2013 at 07:05:20PM +0200, Paolo Bonzini wrote:
> >> Il 20/08/2013 20:20, Marcelo Tosatti ha scritto:
> >>>
> >>> The offset to add to the hosts monotonic time, kvmclock_offset, is
> >>> calculated against the monotonic time at KVM_SET_CLOCK ioctl time.
> >>>
> >>> Request a master clock update at this time, to reduce a potentially
> >>> unbounded difference between the values of the masterclock and
> >>> the clock value used to calculate kvmclock_offset.
> >>>
> >>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >>>
> >>> Index: linux-2.6-kvmclock-fixes/arch/x86/kvm/x86.c
> >>> ===================================================================
> >>> --- linux-2.6-kvmclock-fixes.orig/arch/x86/kvm/x86.c
> >>> +++ linux-2.6-kvmclock-fixes/arch/x86/kvm/x86.c
> >>> @@ -3806,6 +3806,7 @@ long kvm_arch_vm_ioctl(struct file *filp
> >>>  		delta = user_ns.clock - now_ns;
> >>>  		local_irq_enable();
> >>>  		kvm->arch.kvmclock_offset = delta;
> >>> +		kvm_gen_update_masterclock(kvm);
> >>>  		break;
> >>>  	}
> >>>  	case KVM_GET_CLOCK: {
> >>
> >> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> >>
> >> While reviewing this patch, which BTW looks good, I noticed the handling
> >> of KVM_REQ_MCLOCK_INPROGRESS, the dummy request that is never processed
> >> and is only used to block guest entry.
> >>
> >> It seems to me that this bit is not necessary.  After
> >> KVM_REQ_CLOCK_UPDATE is issued, no guest entries will happen because
> >> kvm_guest_time_update will try to take the pvclock_gtod_sync_lock,
> >> currently taken by kvm_gen_update_masterclock.
> > 
> > Not entirely clear, to cancel guest entry the bit is necessary:
> > 
> >         if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
> >             || need_resched() || signal_pending(current)) {
> >                 vcpu->mode = OUTSIDE_GUEST_MODE;
> >                 smp_wmb();
> >                 local_irq_enable();
> >                 preempt_enable();
> >                 r = 1;
> >                 goto cancel_injection;
> >         }
> 
> Yes, KVM_REQ_CLOCK_UPDATE is enough to cancel guest entries too.  See
> code below.
> 
> >> Thus, you do not need the dummy request.  You can simply issue
> >> KVM_REQ_CLOCK_UPDATE before calling pvclock_update_vm_gtod_copy (with
> >> the side effect of exiting VCPUs).  VCPUs will stall in
> >> kvm_guest_time_update until pvclock_gtod_sync_lock is released by
> >> kvm_gen_update_masterclock.  What do you think?
> > 
> > Not sure its safe. Can you describe the safety of your proposal in more
> > detail ?
> 
> Here is the code I was thinking of:
> 
> 	spin_lock(&ka->pvclock_gtod_sync_lock);
> 	make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
> 
> 	/*
> 	 * No guest entries from this point: VCPUs will be spinning
> 	 * on pvclock_gtod_sync_lock in kvm_guest_time_update.
> 	 */
> 	pvclock_update_vm_gtod_copy(kvm);
> 
> 	/*
> 	 * Let kvm_guest_time_update continue: entering the guest
> 	 * is now allowed too.
> 	 */
> 	spin_unlock(&ka->pvclock_gtod_sync_lock);
> 
> KVM_REQ_CLOCK_UPDATE is used to cancel guest entry and execute
> kvm_guest_time_update.  But kvm_guest_time_update will spin on
> pvclock_gtod_sync_lock until pvclock_update_vm_gtod_copy exits and
> kvm_gen_update_masterclock releases the spinlock.

Not safe because there are places which set KVM_REQ_CLOCK_UPDATE without
kicking target vcpu out of guest mode. Unless you use a modified
make_all_cpus_request.

The point of REQ_MCLOCK_INPROGRESS request is to guarantee that the
following is not possible:

- 2 vcpus in guest mode with per-vcpu kvmclock areas with
different {system_timestamp, tsc_offset} values.

To achieve that:

- Kick all vcpus out of guest mode (via a request bit that can't be
  cleared).
- Update the {system_timestamp, tsc_offset} values.
- Clear the request bit.

> >> On top of this, optionally the spinlock could become an rw_semaphore
> >> so that clock updates for different VCPUs will not be serialized.  The
> >> effect is probably not visible, though.
> > 
> > Still not clear of the benefits, but this area certainly welcomes
> > performance improvements (the global kick is one thing we discussed 
> > and that should be improved).
> 
> This unfortunately does not eliminate the global kick, so there is
> likely no performance improvement yet.  It simplifies the logic a bit
> though.
> 
> The change I suggested above is to make pvclock_gtod_sync_lock an rwsem
> or, probably better, a seqlock. 
> VCPUs reading ka->use_master_clock,
> ka->master_cycle_now, ka->master_kernel_ns can then run concurrently,
> with no locked operations in kvm_guest_time_update.

Good point.

next prev parent reply	other threads:[~2013-09-03  3:04 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-20 18:20 KVM: x86: update masterclock when kvmclock_offset is calculated Marcelo Tosatti
2013-08-22 17:05 ` Paolo Bonzini
2013-08-23 10:00   ` Paolo Bonzini
2013-08-28  2:55     ` KVM: x86: update masterclock when kvmclock_offset is calculated (v2) Marcelo Tosatti
2013-08-28 14:47       ` Gleb Natapov
2013-08-28  2:55     ` KVM: x86: update masterclock when kvmclock_offset is calculated Marcelo Tosatti
2013-08-28  2:52   ` Marcelo Tosatti
2013-08-28 12:37     ` Paolo Bonzini
2013-09-03  3:03       ` Marcelo Tosatti [this message]
2013-09-03 10:42         ` Paolo Bonzini
2013-09-03 13:56           ` Marcelo Tosatti
2013-09-03 14:20             ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130903030351.GA3167@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).