From mboxrd@z Thu Jan 1 00:00:00 1970 From: Pankaj Gupta Subject: Re: [PATCH v3] KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK Date: Tue, 18 Apr 2017 01:06:25 -0400 (EDT) Message-ID: <1155909922.16498605.1492491985388.JavaMail.zimbra@redhat.com> References: <20170412172313.GA26589@amt.cnet> <624265555.15138041.1492063596237.JavaMail.zimbra@redhat.com> <20170413184907.GA11691@amt.cnet> <565553136.16248642.1492427123503.JavaMail.zimbra@redhat.com> <20170417155136.GA21968@amt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: kvm-devel , Paolo Bonzini , Radim =?utf-8?B?S3LEjW3DocWZ?= To: Marcelo Tosatti Return-path: Received: from mx1.redhat.com ([209.132.183.28]:35788 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751263AbdDRFG2 (ORCPT ); Tue, 18 Apr 2017 01:06:28 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 05E184DD7C for ; Tue, 18 Apr 2017 05:06:28 +0000 (UTC) In-Reply-To: <20170417155136.GA21968@amt.cnet> Sender: kvm-owner@vger.kernel.org List-ID: > > The disablement of interrupts at KVM_SET_CLOCK/KVM_GET_CLOCK > attempts to disable software suspend from causing "non atomic behaviour" of > the operation: > > Add a helper function to compute the kernel time and convert nanoseconds > back to CPU specific cycles. Note that these must not be called in > preemptible > context, as that would mean the kernel could enter software suspend > state, > which would cause non-atomic operation. > > However, assume the kernel can enter software suspend at the following 2 > points: > > ktime_get_ts(&ts); > 1. > hypothetical_ktime_get_ts(&ts) > monotonic_to_bootbased(&ts); > 2. > > monotonic_to_bootbased() should be correct relative to a ktime_get_ts(&ts) > performed after point 1 (that is after resuming from software suspend), > hypothetical_ktime_get_ts() > > Therefore it is also correct for the ktime_get_ts(&ts) before point 1, > which is > > ktime_get_ts(&ts) = hypothetical_ktime_get_ts(&ts) + > time-to-execute-suspend-code > > Note CLOCK_MONOTONIC does not count during suspension. > > So remove the irq disablement, which causes the following warning on > -RT kernels: > > With this reasoning, and the -RT bug that the irq disablement causes > (because spin_lock is now a sleeping lock), remove the IRQ protection as it > causes: > > [ 1064.668109] in_atomic(): 0, irqs_disabled(): 1, pid: 15296, name:m > [ 1064.668110] INFO: lockdep is turned off. > [ 1064.668110] irq event stamp: 0 > [ 1064.668112] hardirqs last enabled at (0): [< (null)>] ) > [ 1064.668116] hardirqs last disabled at (0): [] c0 > [ 1064.668118] softirqs last enabled at (0): [] c0 > [ 1064.668118] softirqs last disabled at (0): [< (null)>] ) > [ 1064.668121] CPU: 13 PID: 15296 Comm: qemu-kvm Not tainted 3.10.0-1 > [ 1064.668121] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 5 > [ 1064.668123] ffff8c1796b88000 00000000afe7344c ffff8c179abf3c68 f3 > [ 1064.668125] ffff8c179abf3c90 ffffffff930ccb3d ffff8c1b992b3610 f0 > [ 1064.668126] 00007ffc1a26fbc0 ffff8c179abf3cb0 ffffffff9375f694 f0 > [ 1064.668126] Call Trace: > [ 1064.668132] [] dump_stack+0x19/0x1b > [ 1064.668135] [] __might_sleep+0x12d/0x1f0 > [ 1064.668138] [] rt_spin_lock+0x24/0x60 > [ 1064.668155] [] __get_kvmclock_ns+0x36/0x110 [k] > [ 1064.668159] [] ? futex_wait_queue_me+0x103/0x10 > [ 1064.668171] [] kvm_arch_vm_ioctl+0xa2/0xd70 [k] > [ 1064.668173] [] ? futex_wait+0x1ac/0x2a0 > > v2: notice get_kvmclock_ns with the same problem (Pankaj). > v3: remove useless helper function (Pankaj). > > Signed-off-by: Marcelo Tosatti > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 1faf620..deacbe1 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -1775,7 +1775,7 @@ static void kvm_gen_update_masterclock(struct kvm *kvm) > #endif > } > > -static u64 __get_kvmclock_ns(struct kvm *kvm) > +u64 get_kvmclock_ns(struct kvm *kvm) > { > struct kvm_arch *ka = &kvm->arch; > struct pvclock_vcpu_time_info hv_clock; > @@ -1796,18 +1796,6 @@ static u64 __get_kvmclock_ns(struct kvm *kvm) > return __pvclock_read_cycles(&hv_clock, rdtsc()); > } > > -u64 get_kvmclock_ns(struct kvm *kvm) > -{ > - unsigned long flags; > - s64 ns; > - > - local_irq_save(flags); > - ns = __get_kvmclock_ns(kvm); > - local_irq_restore(flags); > - > - return ns; > -} > - > static void kvm_setup_pvclock_page(struct kvm_vcpu *v) > { > struct kvm_vcpu_arch *vcpu = &v->arch; > @@ -4196,10 +4184,8 @@ long kvm_arch_vm_ioctl(struct file *filp, > goto out; > > r = 0; > - local_irq_disable(); > - now_ns = __get_kvmclock_ns(kvm); > + now_ns = get_kvmclock_ns(kvm); > kvm->arch.kvmclock_offset += user_ns.clock - now_ns; > - local_irq_enable(); > kvm_gen_update_masterclock(kvm); > break; > } > @@ -4207,11 +4193,9 @@ long kvm_arch_vm_ioctl(struct file *filp, > struct kvm_clock_data user_ns; > u64 now_ns; > > - local_irq_disable(); > - now_ns = __get_kvmclock_ns(kvm); > + now_ns = get_kvmclock_ns(kvm); > user_ns.clock = now_ns; > user_ns.flags = kvm->arch.use_master_clock ? KVM_CLOCK_TSC_STABLE : 0; > - local_irq_enable(); > memset(&user_ns.pad, 0, sizeof(user_ns.pad)); > > r = -EFAULT; Over all patch looks good to me. I think explanation provided by Marcelo is valid. Reviewed-by: Pankaj Gupta >