From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933796AbdBQKOt (ORCPT ); Fri, 17 Feb 2017 05:14:49 -0500
Received: from mx1.redhat.com ([209.132.183.28]:37028 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933685AbdBQKOp (ORCPT ); Fri, 17 Feb 2017 05:14:45 -0500
From: Vitaly Kuznetsov
To: Thomas Gleixner
Cc: Andy Lutomirski, "K. Y. Srinivasan", X86 ML, Ingo Molnar,
	"H. Peter Anvin", Haiyang Zhang, Stephen Hemminger, Dexuan Cui,
	"linux-kernel@vger.kernel.org", devel@linuxdriverproject.org,
	Linux Virtualization
Subject: Re: [PATCH v2 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
References: <20170214124408.25931-1-vkuznets@redhat.com>
	<87a89o22ml.fsf@vitty.brq.redhat.com>
	<87y3x7zh6h.fsf@vitty.brq.redhat.com>
Date: Fri, 17 Feb 2017 11:14:42 +0100
In-Reply-To: (Thomas Gleixner's message of "Thu, 16 Feb 2017 18:50:58 +0100 (CET)")
Message-ID: <87tw7txgx9.fsf@vitty.brq.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.27]); Fri, 17 Feb 2017 10:14:45 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Thomas Gleixner writes:

> On Wed, 15 Feb 2017, Vitaly Kuznetsov wrote:
>> Actually, we already have an implementation of TSC page update in KVM
>> (see arch/x86/kvm/hyperv.c, kvm_hv_setup_tsc_page()) and the update does
>> the following:
>>
>> 0) stash seq into seq_prev
>> 1) seq = 0 making all reads from the page invalid
>> 2) smp_wmb()
>> 3) update tsc_scale, tsc_offset
>> 4) smp_wmb()
>> 5) set seq = seq_prev + 1
>
> I hope they handle the case where seq_prev overflows and becomes 0 :)
>
>> As far as I understand this helps with situations you described above as
>> guest will notice either invalid value of 0 or seq change. In case the
>> implementation in real Hyper-V is the same we're safe with compile
>> barriers only.
>
> On x86 that's correct. smp_rmb() resolves to barrier(), but you certainly
> need the smp_wmb() on the writer side.
>
> Now looking at the above your reader side code is bogus:
>
> +	while (1) {
> +		sequence = tsc_pg->tsc_sequence;
> +		if (!sequence)
> +			break;
>
> Why would you break out of the loop when seq is 0? The 0 is just telling
> you that there is an update in progress.

Not only. As far as I understand (and I *think* K. Y. pointed this out),
when a VM is migrating to another host the TSC page clocksource is
disabled for an extended period of time, so we're better off reading from
the MSR than looping here. For the vDSO this means falling back to a
normal syscall.

>
> The Linux seqcount writer side is:
>
>     seq++;
>     smp_wmb();
>
>     update...
>
>     smp_wmb();
>     seq++;
>
> and it's defined that an odd sequence count, i.e. bit 0 set means update in
> progress. Which is nice, because you don't have to treat 0 special on the
> writer side and you don't need extra storage to stash seq away :)
>
> So the reader side does:
>
>     do {
>         while (1) {
>             s = READ_ONCE(seq);
>             if (!(s & 0x01))
>                 break;
>             cpu_relax();
>         }
>         smp_rmb();
>
>         read data ...
>
>         smp_rmb();
>     } while (s != seq)
>
> So for that hyperv thing you want:
>
>     do {
>         while (1) {
>             s = READ_ONCE(seq);
>             if (s)
>                 break;
>             cpu_relax();
>         }
>         smp_rmb();
>
>         read data ...
>
>         smp_rmb();
>     } while (s != seq)
>
> Thanks,
>
> 	tglx

--
  Vitaly
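
For illustration only, here is a minimal user-space sketch of the scheme discussed above, written with C11 atomics rather than the kernel's smp_wmb()/smp_rmb(). It is not the KVM or Hyper-V code. The struct, its field names and both function names are hypothetical. The writer follows steps 0) to 5) quoted above and skips 0 when the counter wraps. The reader treats seq == 0 as "page invalid, fall back" (MSR read, or a syscall from the vDSO) instead of spinning, and retries only when seq changes under it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the TSC page; names are illustrative only. */
struct tsc_page_sketch {
	_Atomic uint32_t seq;    /* 0 means "page invalid" */
	_Atomic uint64_t scale;
	_Atomic int64_t offset;
};

/* Writer: steps 0) to 5) from the thread, never publishing 0 as a valid seq. */
static void tsc_page_update(struct tsc_page_sketch *pg,
			    uint64_t scale, int64_t offset)
{
	assert(pg);

	/* 0) stash the current sequence */
	uint32_t prev = atomic_load_explicit(&pg->seq, memory_order_relaxed);

	/* 1) invalidate the page, 2) order that before the data writes */
	atomic_store_explicit(&pg->seq, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	/* 3) update the payload */
	atomic_store_explicit(&pg->scale, scale, memory_order_relaxed);
	atomic_store_explicit(&pg->offset, offset, memory_order_relaxed);

	/* 4) order the data writes before the new sequence becomes visible */
	atomic_thread_fence(memory_order_release);

	/* 5) seq = prev + 1, skipping 0 on wraparound */
	uint32_t next = prev + 1;
	if (next == 0)
		next = 1;
	atomic_store_explicit(&pg->seq, next, memory_order_relaxed);
}

/*
 * Reader: returns false when the page is invalid so the caller can fall
 * back to a slower path; otherwise returns a consistent scale/offset pair.
 */
static bool tsc_page_read(struct tsc_page_sketch *pg,
			  uint64_t *scale, int64_t *offset)
{
	for (;;) {
		uint32_t s = atomic_load_explicit(&pg->seq, memory_order_acquire);
		if (s == 0)
			return false;

		uint64_t sc = atomic_load_explicit(&pg->scale, memory_order_relaxed);
		int64_t off = atomic_load_explicit(&pg->offset, memory_order_relaxed);

		/* Order the data reads before re-checking the sequence. */
		atomic_thread_fence(memory_order_acquire);

		if (s == atomic_load_explicit(&pg->seq, memory_order_relaxed)) {
			*scale = sc;
			*offset = off;
			return true;
		}
		/* seq changed under us: retry */
	}
}
```

Whether returning early on 0 or spinning with cpu_relax() is the right choice depends on how long the page can stay invalid, which is exactly the point being argued in this thread.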