From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [patch 06/11] KVM: introduce kvm->srcu and convert
	kvm_set_memory_region to SRCU update
Date: Wed, 23 Dec 2009 12:19:41 -0200
Message-ID: <20091223141941.GA14329@amt.cnet>
References: <20091223113833.742662117@redhat.com> <20091223114438.751996250@redhat.com> <4B320DF7.3070005@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:60178 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751982AbZLWOUy (ORCPT <rfc822;kvm@vger.kernel.org>);
	Wed, 23 Dec 2009 09:20:54 -0500
Received: from int-mx05.intmail.prod.int.phx2.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.18])
	by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id nBNEKr92014225
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
	for <kvm@vger.kernel.org>; Wed, 23 Dec 2009 09:20:54 -0500
Content-Disposition: inline
In-Reply-To: <4B320DF7.3070005@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Wed, Dec 23, 2009 at 02:32:55PM +0200, Avi Kivity wrote:
> On 12/23/2009 01:38 PM, Marcelo Tosatti wrote:
>> Use two steps for memslot deletion: mark the slot invalid (which stops
>> instantiation of new shadow pages for that slot, but allows destruction),
>> then instantiate the new empty slot.
>>
>> Also simplifies kvm_handle_hva locking.
>>
>>
>>   	r = kvm_arch_prepare_memory_region(kvm,&new, old, user_alloc);
>>   	if (r)
>>   		goto out_free;
>>    
>
> r == 0

Huh? kvm_arch_prepare_memory_region returns a suitable error code on
failure, 0 on success.

>>
>> -	spin_lock(&kvm->mmu_lock);
>> -	if (mem->slot>= kvm->memslots->nmemslots)
>> -		kvm->memslots->nmemslots = mem->slot + 1;
>> +#ifdef CONFIG_DMAR
>> +	/* map the pages in iommu page table */
>> +	if (npages)
>> +		r = kvm_iommu_map_pages(kvm,&new);
>> +		if (r)
>> +			goto out_free;
>>    
>
> Bad indentation.
>
> (still r == 0)

kvm_iommu_map_pages returns 0 on success, error code otherwise? The
error code of kvm_iommu_map_pages was returned before the patchset, BTW.

>> +#endif
>>
>> -	*memslot = new;
>> -	spin_unlock(&kvm->mmu_lock);
>> +	slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
>> +	if (!slots)
>> +		goto out_free;
>>    
>
> goto out_free with r == 0.
>
> Best to assign r = -ENOMEM each time, to avoid these headaches.

Fixed.

>> +int memslot_id(struct kvm *kvm, gfn_t gfn)
>>    
>
> Should be either static or kvm_memslot_id().  But the source is already  
> like that, so leave it.

OK.

>> @@ -807,23 +808,18 @@ static int kvm_handle_hva(struct kvm *kv
>>   			  int (*handler)(struct kvm *kvm, unsigned long *rmapp,
>>   					 unsigned long data))
>>   {
>> -	int i, j;
>> +	int i, j, idx;
>>   	int retval = 0;
>> -	struct kvm_memslots *slots = kvm->memslots;
>> +	struct kvm_memslots *slots;
>> +
>> +	idx = srcu_read_lock(&kvm->srcu);
>>    
>
> Maybe a better place is in the mmu notifiers, so the code is shared  
> (once other archs start to use mmu notifiers).

Done.

>> @@ -3020,16 +3017,20 @@ nomem:
>>    */
>>   unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>>   {
>> -	int i;
>> +	int i, idx;
>>   	unsigned int nr_mmu_pages;
>>   	unsigned int  nr_pages = 0;
>> +	struct kvm_memslots *slots;
>>
>> -	for (i = 0; i<  kvm->memslots->nmemslots; i++)
>> -		nr_pages += kvm->memslots->memslots[i].npages;
>> +	idx = srcu_read_lock(&kvm->srcu);
>> +	slots = rcu_dereference(kvm->memslots);
>> +	for (i = 0; i<  slots->nmemslots; i++)
>> +		nr_pages += slots->memslots[i].npages;
>>
>>   	nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
>>   	nr_mmu_pages = max(nr_mmu_pages,
>>   			(unsigned int) KVM_MIN_ALLOC_MMU_PAGES);
>> +	srcu_read_unlock(&kvm->srcu, idx);
>>
>>    
>
> Again, would like to move the srcu_locking to an outer scope.

This one was for documentation purposes only, since
kvm_mmu_calculate_mmu_pages is only called from commit_memory_region
(which mutually excludes itself).

I can remove it if you'd prefer.

>> @@ -1503,10 +1504,18 @@ static void enter_pmode(struct kvm_vcpu
>>   static gva_t rmode_tss_base(struct kvm *kvm)
>>   {
>>   	if (!kvm->arch.tss_addr) {
>> -		gfn_t base_gfn = kvm->memslots->memslots[0].base_gfn +
>> -				 kvm->memslots->memslots[0].npages - 3;
>> +		struct kvm_memslots *slots;
>> +		gfn_t base_gfn;
>> +		int idx;
>> +
>> +		idx = srcu_read_lock(&kvm->srcu);
>> +		slots = rcu_dereference(kvm->memslots);
>> + 		base_gfn = slots->memslots[0].base_gfn +
>> +				 slots->memslots[0].npages - 3;
>> +		srcu_read_unlock(&kvm->srcu, idx);
>>   		return base_gfn<<  PAGE_SHIFT;
>>   	}
>> +
>>   	return kvm->arch.tss_addr;
>>   }
>>    
>
> Shouldn't we already hold the srcu_lock as part of normal guest exit  
> (similar to down_read() we do today)?

Yes, but:

kvm_arch_vcpu_setup -> vmx_vcpu_reset -> vmx_set_cr0 -> enter_pmode

So its called outside guest context path (memslot info is used outside
guest context in this case).