From: Avi Kivity
Subject: Re: KVM: x86: move vapic page handling out of fast path
Date: Mon, 23 Jun 2008 05:29:34 +0300
Message-ID: <485F0A8E.3030605@qumranet.com>
References: <20080619174347.GA9236@dmt.cnet> <485DDC5B.1020707@qumranet.com> <20080622170558.GA6587@dmt.cnet>
In-Reply-To: <20080622170558.GA6587@dmt.cnet>
To: Marcelo Tosatti
Cc: kvm-devel

Marcelo Tosatti wrote:
> On Sun, Jun 22, 2008 at 08:00:11AM +0300, Avi Kivity wrote:
>
>> Marcelo Tosatti wrote:
>>
>>> I fail to see the point of handling the vapic page grab and ref
>>> counting in __vcpu_run's heavyweight enter/exit path.
>>>
>> It's to avoid pinning the page indefinitely.
>>
>>> So move it to kvm_lapic_set_vapic_addr / kvm_free_lapic time.
>>>
>>> Other than the obvious improvement for the non-FlexPriority case,
>>> this kills a down_read/up_read pair in heavy exits and reduces code
>>> size.
>>>
>> With mmu notifiers we can do this and still not pin the page:
>>
>> fast path:
>>
>>     if (vapic_addr && !vapic_page)
>>         enter_vapic();
>>
>> mmu notifier:
>>
>>     if (gpa == vapic_addr)
>>         exit_vapic();
>>
>> So let's wait with this until mmu notifiers are merged.
>
> But what is the point, or advantage, of having _any_ vapic page
> handling in __vcpu_run?
>
> The reference for it is grabbed at kvm_lapic_set_vapic_addr() (which
> does not take any spinlock, so it's safe to swap in the page) and
> released at guest exit.
>
> The page can't be swapped out since its reference count is elevated
> indefinitely.
>
> So, what do you have against this patch?
We need to move away from reference counts; they make kvm brittle. The
patch improves the current state of things (since pages are pinned
indefinitely anyway now) but takes the wrong direction for the future.

Note that kvmclock has the same issue, so we might as well share the
solution:

    struct kvm_fast_guest_page {
            gfn_t gfn;
            struct page *page;
            spinlock_t lock;
            struct list_head link;
    };

The mmu notifier callbacks can scan this list and null out any pages
that match the gfn being cleared, but in the normal case, ->page can be
accessed directly.

-- 
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.