From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: [patch 09/18] KVM: x86: introduce facility to support vsyscall pvclock, via MSR
Date: Mon, 29 Oct 2012 16:40:15 -0200
Message-ID: <20121029184015.GA30422@amt.cnet>
References: <20121024131340.742340256@redhat.com> <20121024131621.707068244@redhat.com> <508E9697.2000003@parallels.com> <508EC089.5030409@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org, johnstul@us.ibm.com, zamsden@gmail.com, gleb@redhat.com, avi@redhat.com, pbonzini@redhat.com
To: Jeremy Fitzhardinge , Glauber Costa
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:13529 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758851Ab2J2SlG (ORCPT ); Mon, 29 Oct 2012 14:41:06 -0400
Content-Disposition: inline
In-Reply-To: <508EC089.5030409@goop.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Mon, Oct 29, 2012 at 10:44:41AM -0700, Jeremy Fitzhardinge wrote:
> On 10/29/2012 07:45 AM, Glauber Costa wrote:
> > On 10/24/2012 05:13 PM, Marcelo Tosatti wrote:
> >> Allow a guest to register a second location for the VCPU time info
> >>
> >> structure for each vcpu (as described by MSR_KVM_SYSTEM_TIME_NEW).
> >> This is intended to allow the guest kernel to map this information
> >> into a usermode accessible page, so that usermode can efficiently
> >> calculate system time from the TSC without having to make a syscall.
> >>
> >> Signed-off-by: Marcelo Tosatti
> > Can you please be a bit more specific about why we need this? Why does
> > the host need to provide us with two pages with the exact same data? Why
> > can't just do it with mapping tricks in the guest?
>
> In Xen the pvclock structure is embedded within a pile of other stuff
> that shouldn't be mapped into guest memory, so providing for a second
> location allows it to be placed whereever is convenient for the guest.
> That's a restriction of the Xen ABI, but I don't know if it affects KVM.
>
>     J

It is possible to share the data for KVM in theory, but:

- It is a small amount of memory.
- It requires aligning to page size (the in-kernel percpu array
  is currently cacheline aligned).
- It is possible to modify flags separately for userspace/kernelspace,
  if desired.

This justifies the duplication IMO (code is simple and clean).
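
To make the three points above concrete, here is a rough sketch, not the code from the patch. The record layout follows the pvclock ABI structure; everything else is an assumption made for illustration: the 64-byte cache line, the 4 KiB page size, and the hypothetical names pvti_slot, user_copy_bytes() and user_view().

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SIZE  64    /* assumption: typical x86 cache line */
#define PAGE_SIZE_BYTES 4096  /* assumption: 4 KiB pages */

/* Per-vCPU time record, laid out as in the pvclock ABI (32 bytes). */
struct pvclock_vcpu_time_info {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    uint8_t  flags;
    uint8_t  pad[2];
} __attribute__((__packed__));

/* Hypothetical slot type: one record per cache line, as a
 * cacheline-aligned per-cpu array would place it. */
struct pvti_slot {
    alignas(CACHELINE_SIZE) struct pvclock_vcpu_time_info pvti;
};

/* Round n up to the next multiple of align. */
static inline size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Hypothetical: bytes needed for a user-mappable copy. The region is
 * padded to whole pages so no unrelated kernel data shares a page
 * that gets mapped into userspace. */
static inline size_t user_copy_bytes(size_t nr_vcpus)
{
    return round_up(nr_vcpus * sizeof(struct pvti_slot), PAGE_SIZE_BYTES);
}

/* Hypothetical: the user-visible copy keeps only the flag bits in
 * user_mask, so its flags can differ from the kernel's copy. */
static inline struct pvclock_vcpu_time_info
user_view(struct pvclock_vcpu_time_info k, uint8_t user_mask)
{
    k.flags &= user_mask;
    return k;
}
```

Under these assumptions, a guest with up to 64 vcpus needs a single extra page for the second copy, which is the sense in which the duplication is cheap.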