From mboxrd@z Thu Jan 1 00:00:00 1970 From: Paolo Bonzini Subject: Re: KVM: x86: update masterclock when kvmclock_offset is calculated Date: Wed, 28 Aug 2013 14:37:20 +0200 Message-ID: <521DEF00.2030200@redhat.com> References: <20130820182012.GA814@amt.cnet> <521644D0.9050902@redhat.com> <20130828025243.GA13653@amt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: kvm-devel , Gleb Natapov To: Marcelo Tosatti Return-path: Received: from mail-qc0-f180.google.com ([209.85.216.180]:60058 "EHLO mail-qc0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752891Ab3H1MhY (ORCPT ); Wed, 28 Aug 2013 08:37:24 -0400 Received: by mail-qc0-f180.google.com with SMTP id l13so2926911qcy.11 for ; Wed, 28 Aug 2013 05:37:23 -0700 (PDT) In-Reply-To: <20130828025243.GA13653@amt.cnet> Sender: kvm-owner@vger.kernel.org List-ID: Il 28/08/2013 04:52, Marcelo Tosatti ha scritto: > On Thu, Aug 22, 2013 at 07:05:20PM +0200, Paolo Bonzini wrote: >> Il 20/08/2013 20:20, Marcelo Tosatti ha scritto: >>> >>> The offset to add to the hosts monotonic time, kvmclock_offset, is >>> calculated against the monotonic time at KVM_SET_CLOCK ioctl time. >>> >>> Request a master clock update at this time, to reduce a potentially >>> unbounded difference between the values of the masterclock and >>> the clock value used to calculate kvmclock_offset. >>> >>> Signed-off-by: Marcelo Tosatti >>> >>> Index: linux-2.6-kvmclock-fixes/arch/x86/kvm/x86.c >>> =================================================================== >>> --- linux-2.6-kvmclock-fixes.orig/arch/x86/kvm/x86.c >>> +++ linux-2.6-kvmclock-fixes/arch/x86/kvm/x86.c >>> @@ -3806,6 +3806,7 @@ long kvm_arch_vm_ioctl(struct file *filp >>> delta = user_ns.clock - now_ns; >>> local_irq_enable(); >>> kvm->arch.kvmclock_offset = delta; >>> + kvm_gen_update_masterclock(kvm); >>> break; >>> } >>> case KVM_GET_CLOCK: { >> >> Reviewed-by: Paolo Bonzini >> >> While reviewing this patch, which BTW looks good, I noticed the handling >> of KVM_REQ_MCLOCK_INPROGRESS, the dummy request that is never processed >> and is only used to block guest entry. >> >> It seems to me that this bit is not necessary. After >> KVM_REQ_CLOCK_UPDATE is issued, no guest entries will happen because >> kvm_guest_time_update will try to take the pvclock_gtod_sync_lock, >> currently taken by kvm_gen_update_masterclock. > > Not entirely clear, to cancel guest entry the bit is necessary: > > if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests > || need_resched() || signal_pending(current)) { > vcpu->mode = OUTSIDE_GUEST_MODE; > smp_wmb(); > local_irq_enable(); > preempt_enable(); > r = 1; > goto cancel_injection; > } Yes, KVM_REQ_CLOCK_UPDATE is enough to cancel guest entries too. See code below. >> Thus, you do not need the dummy request. You can simply issue >> KVM_REQ_CLOCK_UPDATE before calling pvclock_update_vm_gtod_copy (with >> the side effect of exiting VCPUs). VCPUs will stall in >> kvm_guest_time_update until pvclock_gtod_sync_lock is released by >> kvm_gen_update_masterclock. What do you think? > > Not sure its safe. Can you describe the safety of your proposal in more > detail ? Here is the code I was thinking of: spin_lock(&ka->pvclock_gtod_sync_lock); make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE); /* * No guest entries from this point: VCPUs will be spinning * on pvclock_gtod_sync_lock in kvm_guest_time_update. */ pvclock_update_vm_gtod_copy(kvm); /* * Let kvm_guest_time_update continue: entering the guest * is now allowed too. */ spin_unlock(&ka->pvclock_gtod_sync_lock); KVM_REQ_CLOCK_UPDATE is used to cancel guest entry and execute kvm_guest_time_update. But kvm_guest_time_update will spin on pvclock_gtod_sync_lock until pvclock_update_vm_gtod_copy exits and kvm_gen_update_masterclock releases the spinlock. >> On top of this, optionally the spinlock could become an rw_semaphore >> so that clock updates for different VCPUs will not be serialized. The >> effect is probably not visible, though. > > Still not clear of the benefits, but this area certainly welcomes > performance improvements (the global kick is one thing we discussed > and that should be improved). This unfortunately does not eliminate the global kick, so there is likely no performance improvement yet. It simplifies the logic a bit though. The change I suggested above is to make pvclock_gtod_sync_lock an rwsem or, probably better, a seqlock. VCPUs reading ka->use_master_clock, ka->master_cycle_now, ka->master_kernel_ns can then run concurrently, with no locked operations in kvm_guest_time_update. Paolo