public inbox for kvm@vger.kernel.org
* [PATCH] kvm memslot read-locking with mmu_lock
@ 2008-01-21 12:37 Andrea Arcangeli
       [not found] ` <20080121123710.GF6970-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Andrea Arcangeli @ 2008-01-21 12:37 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

This adds locking to the memslots so they can be looked up with only
the mmu_lock. Entries with memslot->userspace_addr == 0 have to be
ignored because they're not fully inserted yet.

Signed-off-by: Andrea Arcangeli <andrea-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8a90403..35a2ee0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3219,14 +3249,20 @@ int kvm_arch_set_memory_region(struct kvm *kvm,
 	 */
 	if (!user_alloc) {
 		if (npages && !old.rmap) {
-			memslot->userspace_addr = do_mmap(NULL, 0,
-						     npages * PAGE_SIZE,
-						     PROT_READ | PROT_WRITE,
-						     MAP_SHARED | MAP_ANONYMOUS,
-						     0);
-
-			if (IS_ERR((void *)memslot->userspace_addr))
-				return PTR_ERR((void *)memslot->userspace_addr);
+			unsigned long userspace_addr;
+			
+			userspace_addr = do_mmap(NULL, 0,
+						 npages * PAGE_SIZE,
+						 PROT_READ | PROT_WRITE,
+						 MAP_SHARED | MAP_ANONYMOUS,
+						 0);
+			if (IS_ERR((void *)userspace_addr))
+				return PTR_ERR((void *)userspace_addr);
+
+			/* set userspace_addr atomically for kvm_hva_to_rmapp */
+			spin_lock(&kvm->mmu_lock);
+			memslot->userspace_addr = userspace_addr;
+			spin_unlock(&kvm->mmu_lock);
 		} else {
 			if (!old.user_alloc && old.rmap) {
 				int ret;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4295623..a67e38f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -298,7 +299,15 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		memset(new.rmap, 0, npages * sizeof(*new.rmap));
 
 		new.user_alloc = user_alloc;
-		new.userspace_addr = mem->userspace_addr;
+		/*
+		 * kvm_hva_to_rmapp() serializes with the mmu_lock and to be
+		 * safe it has to ignore memslots with !user_alloc &&
+		 * !userspace_addr.
+		 */
+		if (user_alloc)
+			new.userspace_addr = mem->userspace_addr;
+		else
+			new.userspace_addr = 0;
 	}
 
 	/* Allocate page dirty bitmap if needed */
@@ -311,14 +320,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		memset(new.dirty_bitmap, 0, dirty_bytes);
 	}
 
+	spin_lock(&kvm->mmu_lock);
 	if (mem->slot >= kvm->nmemslots)
 		kvm->nmemslots = mem->slot + 1;
 
 	*memslot = new;
+	spin_unlock(&kvm->mmu_lock);
 
 	r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
 	if (r) {
+		spin_lock(&kvm->mmu_lock);
 		*memslot = old;
+		spin_unlock(&kvm->mmu_lock);
 		goto out_free;
 	}
 

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm memslot read-locking with mmu_lock
       [not found] ` <20080121123710.GF6970-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
@ 2008-01-22 13:47   ` Avi Kivity
       [not found]     ` <4795F3F0.90403-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2008-01-22 13:47 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Andrea Arcangeli wrote:
> This adds locking to the memslots so they can be looked up with only
> the mmu_lock. Entries with memslot->userspace_addr == 0 have to be
> ignored because they're not fully inserted yet.
>
>   
What is the motivation for this?  Calls from mmu notifiers that don't 
have mmap_sem held?


>  
>  	/* Allocate page dirty bitmap if needed */
> @@ -311,14 +320,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  		memset(new.dirty_bitmap, 0, dirty_bytes);
>  	}
>  
> +	spin_lock(&kvm->mmu_lock);
>  	if (mem->slot >= kvm->nmemslots)
>  		kvm->nmemslots = mem->slot + 1;
>  
>  	*memslot = new;
> +	spin_unlock(&kvm->mmu_lock);
>  
>  	r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
>  	if (r) {
> +		spin_lock(&kvm->mmu_lock);
>  		*memslot = old;
> +		spin_unlock(&kvm->mmu_lock);
>  		goto out_free;
>  	}
>  
>   

This is arch independent code, I'm surprised mmu_lock is visible here?

What are the new lookup rules?  We don't hold mmu_lock everywhere we 
look up a gfn, do we?


-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm memslot read-locking with mmu_lock
       [not found]     ` <4795F3F0.90403-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2008-01-22 14:32       ` Andrea Arcangeli
       [not found]         ` <20080122143210.GC7331-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Andrea Arcangeli @ 2008-01-22 14:32 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Tue, Jan 22, 2008 at 03:47:28PM +0200, Avi Kivity wrote:
> Andrea Arcangeli wrote:
>> This adds locking to the memslots so they can be looked up with only
>> the mmu_lock. Entries with memslot->userspace_addr == 0 have to be
>> ignored because they're not fully inserted yet.
>>
>>   
> What is the motivation for this?  Calls from mmu notifiers that don't have 
> mmap_sem held?

Exactly.

>
>>   	/* Allocate page dirty bitmap if needed */
>> @@ -311,14 +320,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
>>  		memset(new.dirty_bitmap, 0, dirty_bytes);
>>  	}
>>  +	spin_lock(&kvm->mmu_lock);
>>  	if (mem->slot >= kvm->nmemslots)
>>  		kvm->nmemslots = mem->slot + 1;
>>   	*memslot = new;
>> +	spin_unlock(&kvm->mmu_lock);
>>   	r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
>>  	if (r) {
>> +		spin_lock(&kvm->mmu_lock);
>>  		*memslot = old;
>> +		spin_unlock(&kvm->mmu_lock);
>>  		goto out_free;
>>  	}
>>    
>
> This is arch independent code, I'm surprised mmu_lock is visible here?

The mmu_lock is arch independent as far as I can tell. Pretty much
like the mm->page_table_lock is also independent. All archs will have
some form of shadow pagetables in software or hardware, and mmu_lock
is the lock to take to serialize the pagetable updates and it also
allows to walk the memslots in readonly mode.

> What are the new lookup rules?  We don't hold mmu_lock everywhere we look 
> up a gfn, do we?

It's safe to loop over the memslots while holding only the mmu_lock,
as long as the ones with userspace_addr == 0 are skipped. The memslot
contents can't change while the mmu_lock is held. The mmu_lock also
serializes the rmap structures inside the memslot and the spte
updates. So by just taking the mmu_lock it's trivial to do "search
memslot", "translate the hva to its relative rmapp", "find all sptes
relative to the hva and overwrite them with nonpresent-fault".
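The memslot search and hva-to-rmapp translation described above can be
sketched in plain userspace C. This is a toy model, not the real
virt/kvm/kvm_main.c code: the struct layouts and the `hva_to_rmapp`
helper below are hypothetical stand-ins (only the `userspace_addr`,
`npages` and `rmap` field names come from the patch), and the mmu_lock
is assumed to be held by the caller rather than modeled.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL
#define KVM_MEM_SLOTS_MAX 8

/*
 * Minimal userspace stand-ins for the kvm structures.  Field names
 * follow the patch, but the layout here is hypothetical and much
 * simpler than the real code.
 */
struct kvm_memory_slot {
	unsigned long userspace_addr;	/* 0 => not fully inserted: skip */
	unsigned long npages;
	unsigned long *rmap;		/* one rmap head per page */
};

struct kvm {
	/* kvm->mmu_lock omitted: assume the caller already holds it */
	struct kvm_memory_slot memslots[KVM_MEM_SLOTS_MAX];
	int nmemslots;
};

/*
 * The first two steps of the sequence: search the memslot that covers
 * hva, then translate the hva to its rmap entry ("rmapp").  Skipping
 * slots whose userspace_addr is still 0 is what makes the
 * mmu_lock-only lookup safe against half-inserted slots.
 */
static unsigned long *hva_to_rmapp(struct kvm *kvm, unsigned long hva)
{
	int i;

	for (i = 0; i < kvm->nmemslots; i++) {
		struct kvm_memory_slot *slot = &kvm->memslots[i];
		unsigned long base = slot->userspace_addr;

		if (!base)	/* not fully inserted yet, ignore */
			continue;
		if (hva >= base && hva < base + slot->npages * PAGE_SIZE)
			return &slot->rmap[(hva - base) / PAGE_SIZE];
	}
	return NULL;	/* hva not backed by any completed memslot */
}
```

A slot that has been allocated but not yet published (userspace_addr
still 0, as in the window the patch closes) is simply invisible to this
walk, which is the whole point of the scheme.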

If the mmu notifiers had been registered inside the vma, things would
look very different in this area, and it might have been possible to
embed the mmu notifier inside the memslot itself, avoiding the
"search memslot" op.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm memslot read-locking with mmu_lock
       [not found]         ` <20080122143210.GC7331-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
@ 2008-01-22 14:38           ` Avi Kivity
       [not found]             ` <4795FFF9.8010400-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2008-01-22 14:38 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Andrea Arcangeli wrote:
>
>> This is arch independent code, I'm surprised mmu_lock is visible here?
>>     
>
> The mmu_lock is arch independent as far as I can tell. Pretty much
> like the mm->page_table_lock is also independent. All archs will have
> some form of shadow pagetables in software or hardware, and mmu_lock
> is the lock to take to serialize the pagetable updates and it also
> allows to walk the memslots in readonly mode.
>
>   

Well, s390 has everything in hardware, but I suppose they can just 
ignore the lock.

>> What are the new lookup rules?  We don't hold mmu_lock everywhere we look 
>> up a gfn, do we?
>>     
>
> It's safe to loop over the memslots while holding only the mmu_lock,
> as long as the ones with userspace_addr == 0 are skipped. The memslot
> contents can't change while the mmu_lock is held. The mmu_lock also
> serializes the rmap structures inside the memslot and the spte
> updates. So by just taking the mmu_lock it's trivial to do "search
> memslot", "translate the hva to its relative rmapp", "find all sptes
> relative to the hva and overwrite them with nonpresent-fault".
>
>   

But we lookup memslots in parallel in the guest walker and similar 
places, relying on mmap_sem being taken for read.

Maybe we need a rwlock instead, and drop this overloaded usage of mmap_sem.

> If the mmu notifiers had been registered inside the vma, things
> would look very different in this area, and it might have been
> possible to embed the mmu notifier inside the memslot itself,
> avoiding the "search memslot" op.
>   

Nothing guarantees a 1:1 mapping between memslots and vma.  You can have 
a vma backing multiple memslots, or a memslot spanning multiple vmas.

-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm memslot read-locking with mmu_lock
       [not found]             ` <4795FFF9.8010400-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2008-01-22 14:50               ` Andrea Arcangeli
       [not found]                 ` <20080122145043.GF7331-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Andrea Arcangeli @ 2008-01-22 14:50 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Tue, Jan 22, 2008 at 04:38:49PM +0200, Avi Kivity wrote:
> Andrea Arcangeli wrote:
>>
>>> This is arch independent code, I'm surprised mmu_lock is visible here?
>>>     
>>
>> The mmu_lock is arch independent as far as I can tell. Pretty much
>> like the mm->page_table_lock is also independent. All archs will have
>> some form of shadow pagetables in software or hardware, and mmu_lock
>> is the lock to take to serialize the pagetable updates and it also
>> allows to walk the memslots in readonly mode.
>>
>>   
>
> Well, s390 has everything in hardware, but I suppose they can just ignore 
> the lock.

If they don't take it in their lowlevel code that's enough, I think.
It'll still be useful to look up memslots like on every other arch.

> But we lookup memslots in parallel in the guest walker and similar places, 
> relying on mmap_sem being taken for read.

That's ok. The vmas could also be looked up either with mmap_sem in
read mode or with only the page_table_lock taken (these days it
doesn't work anymore with ptlocks). The mmu_lock is only mandatory for
modifications of the memslots; the lookups don't require it as long as
they hold mmap_sem. The modifications have to take both mmap_sem in
write mode and the mmu_lock. Lookups require either mmap_sem in read
mode or the mmu_lock. The lookups with mmap_sem in read mode won't
ever see a userspace_addr == 0; the mmu_lock-only lookups have to skip
over the userspace_addr == 0 entries instead.

> Maybe we need a rwlock instead, and drop this overloaded usage of mmap_sem.

Adding another lock just for the memslots should be feasible. However
it can't be a semaphore or it won't work for the mmu notifiers. A
rwlock would be ok, but then all memslot lookups would have to be
updated to check for userspace_addr != 0 like kvm_hva_to_rmapp does.

> Nothing guarantees a 1:1 mapping between memslots and vma.  You can have a 
> vma backing multiple memslots, or a memslot spanning multiple vmas.

Yes, that's why it would add complications to have it in the vma: one
would need to add a vma split to guarantee the 1:1 mapping. Then it'd
be a bit faster too (though the memslots can easily be rbtree-indexed).


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm memslot read-locking with mmu_lock
       [not found]                 ` <20080122145043.GF7331-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
@ 2008-01-23  8:15                   ` Carsten Otte
  0 siblings, 0 replies; 6+ messages in thread
From: Carsten Otte @ 2008-01-23  8:15 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Avi Kivity

Andrea Arcangeli wrote:
> On Tue, Jan 22, 2008 at 04:38:49PM +0200, Avi Kivity wrote:
>> Andrea Arcangeli wrote:
>>>> This is arch independent code, I'm surprised mmu_lock is visible here?
>>>>     
>>> The mmu_lock is arch independent as far as I can tell. Pretty much
>>> like the mm->page_table_lock is also independent. All archs will have
>>> some form of shadow pagetables in software or hardware, and mmu_lock
>>> is the lock to take to serialize the pagetable updates and it also
>>> allows to walk the memslots in readonly mode.
>>>
>>>   
>> Well, s390 has everything in hardware, but I suppose they can just ignore 
>> the lock.
> 
> If they don't take it in their lowlevel code that's enough, I think.
> It'll still be useful to look up memslots like on every other arch.
The lock won't be a performance issue for s390. We don't need to look 
them up frequently afaics.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-01-23  8:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-21 12:37 [PATCH] kvm memslot read-locking with mmu_lock Andrea Arcangeli
     [not found] ` <20080121123710.GF6970-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
2008-01-22 13:47   ` Avi Kivity
     [not found]     ` <4795F3F0.90403-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-22 14:32       ` Andrea Arcangeli
     [not found]         ` <20080122143210.GC7331-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
2008-01-22 14:38           ` Avi Kivity
     [not found]             ` <4795FFF9.8010400-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-22 14:50               ` Andrea Arcangeli
     [not found]                 ` <20080122145043.GF7331-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>
2008-01-23  8:15                   ` Carsten Otte
