Subject: Re: [PATCH 2/2] x86, kvm: use kvmclock to compute TSC deadline value
To: Radim Krčmář
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, dmatlack@google.com, luto@kernel.org, peterhornyack@google.com, x86@kernel.org
From: Paolo Bonzini
Date: Fri, 16 Sep 2016 17:06:30 +0200

On 16/09/2016 16:59, Radim Krčmář wrote:
> KVM_MSR_DEADLINE would be an interface in kvmclock nanosecond values and
> MSR_IA32_TSCDEADLINE in TSC values.  KVM_MSR_DEADLINE would follow
> similar rules as MSR_IA32_TSCDEADLINE -- the interrupt fires when
> kvmclock reaches the value, you read what you write, and 0 disarms it.
>
> If the TSC deadline timer was enabled, then the guest could write to
> both MSR_IA32_TSCDEADLINE and KVM_MSR_DEADLINE, but only one could be
> armed at any time (non-zero write to one will set the other to 0).
>
> The dual interface would allow unconditional addition of the PV feature
> without regressing users that currently use MSR_IA32_TSCDEADLINE and
> adapted their stack to handle KVM's TSC shortcomings ...

So far so good.  My question is: what happens if you write to
KVM_MSR_DEADLINE and read from MSR_IA32_TSCDEADLINE, or vice versa?  The
possibilities are:

a) you read a 0

b) you read the value converted to the other unit

c) you read another value such as -1

(a) and (c) are the simplest of course.  (c) may make sense when writing
to MSR_IA32_TSCDEADLINE and reading from KVM_MSR_DEADLINE, since we can
decide which values are valid or not; -1 is technically a valid TSC
deadline.

I'm not sure about whether to allow (b).  In the end KVM is going to
convert a nsec deadline to a TSC value internally, and vice versa.  On
the other hand, if we do, userspace needs to figure out (on migration)
whether the guest set up a TSC or a nanosecond deadline.

>> this lets userspace decide whether to set a nsec-based
>> deadline or a TSC-based deadline after migration.
>
> Hm, isn't switching to TSC-based deadline after migration pointless?

Yes, but I didn't mean that.  I meant preserving which MSR was written
to arm the timer, and redoing the same on the destination.

>>>> This still wouldn't handle old hosts of course.
>>>
>>> The question is whether we want to carry around 150 LOC because of old
>>> hosts.  I'd just fix Linux to avoid deadline TSC without invariant TSC.
>>> :)
>>
>> Yes, that would automatically blacklist it on KVM.  You'd also need to
>> update the recent optimization to the TSC deadline timer, to also work
>> on other APIC timer modes or at least in your new PV mode.
>
> All modes shouldn't be much harder than just the PV mode.

The PV mode would still be a bit easier since it's still the TSC
deadline timer, just with a nicer interface that is not based on the TSC.
Depends on how you code it though, I guess.

Paolo
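
For illustration only, here is a minimal C sketch of the dual-MSR
semantics discussed above, assuming option (a): reading the MSR that did
not arm the timer returns 0.  Nothing here is KVM code.  The names
struct deadline_timer, deadline_write(), deadline_read(),
deadline_save() and deadline_restore() are hypothetical, and
KVM_MSR_DEADLINE itself is only a proposal in this thread.  The sketch
also assumes that a zero write to the MSR that is not armed is a no-op,
which the thread does not specify.

```c
#include <stdint.h>

/* Hypothetical model: which MSR (if any) armed the timer. */
enum deadline_msr {
    DEADLINE_NONE,  /* timer is disarmed */
    DEADLINE_TSC,   /* armed via MSR_IA32_TSCDEADLINE (TSC units) */
    DEADLINE_NSEC   /* armed via the proposed KVM_MSR_DEADLINE (kvmclock ns) */
};

struct deadline_timer {
    enum deadline_msr armed_by; /* MSR that armed the timer */
    uint64_t value;             /* deadline in that MSR's unit */
};

/*
 * A non-zero write arms the timer through the written MSR, which
 * implicitly sets the other MSR to 0.  Writing 0 to the armed MSR
 * disarms the timer.  Assumption: writing 0 to the other MSR does
 * nothing.
 */
static void deadline_write(struct deadline_timer *t,
                           enum deadline_msr msr, uint64_t v)
{
    if (v != 0) {
        t->armed_by = msr;
        t->value = v;
    } else if (t->armed_by == msr) {
        t->armed_by = DEADLINE_NONE;
        t->value = 0;
    }
}

/* Option (a): you read what you wrote, and 0 from the other MSR. */
static uint64_t deadline_read(const struct deadline_timer *t,
                              enum deadline_msr msr)
{
    return t->armed_by == msr ? t->value : 0;
}

/* What userspace would see when reading both MSRs before migration. */
struct deadline_state {
    uint64_t tsc;
    uint64_t nsec;
};

static struct deadline_state deadline_save(const struct deadline_timer *t)
{
    struct deadline_state s = {
        deadline_read(t, DEADLINE_TSC),
        deadline_read(t, DEADLINE_NSEC)
    };
    return s;
}

/* Redo the same write on the destination: only the non-zero MSR is set. */
static void deadline_restore(struct deadline_timer *t,
                             struct deadline_state s)
{
    t->armed_by = DEADLINE_NONE;
    t->value = 0;
    if (s.tsc != 0)
        deadline_write(t, DEADLINE_TSC, s.tsc);
    if (s.nsec != 0)
        deadline_write(t, DEADLINE_NSEC, s.nsec);
}
```

Under option (a), at most one of the two saved values is non-zero, so
in this model the destination can tell which MSR armed the timer
without any extra state.  With option (b) both reads would return a
value, which is the ambiguity on migration mentioned above.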