* [PATCH] kvm memslot read-locking with mmu_lock
@ 2008-01-21 12:37 Andrea Arcangeli
From: Andrea Arcangeli @ 2008-01-21 12:37 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
This adds locking to the memslots so they can be looked up with only
the mmu_lock held. Entries with memslot->userspace_addr == 0 have to
be ignored because they're not fully inserted yet.
Signed-off-by: Andrea Arcangeli <andrea-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8a90403..35a2ee0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3219,14 +3249,20 @@ int kvm_arch_set_memory_region(struct kvm *kvm,
*/
if (!user_alloc) {
if (npages && !old.rmap) {
- memslot->userspace_addr = do_mmap(NULL, 0,
- npages * PAGE_SIZE,
- PROT_READ | PROT_WRITE,
- MAP_SHARED | MAP_ANONYMOUS,
- 0);
-
- if (IS_ERR((void *)memslot->userspace_addr))
- return PTR_ERR((void *)memslot->userspace_addr);
+ unsigned long userspace_addr;
+
+ userspace_addr = do_mmap(NULL, 0,
+ npages * PAGE_SIZE,
+ PROT_READ | PROT_WRITE,
+ MAP_SHARED | MAP_ANONYMOUS,
+ 0);
+ if (IS_ERR((void *)userspace_addr))
+ return PTR_ERR((void *)userspace_addr);
+
+ /* set userspace_addr atomically for kvm_hva_to_rmapp */
+ spin_lock(&kvm->mmu_lock);
+ memslot->userspace_addr = userspace_addr;
+ spin_unlock(&kvm->mmu_lock);
} else {
if (!old.user_alloc && old.rmap) {
int ret;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4295623..a67e38f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -298,7 +299,15 @@ int __kvm_set_memory_region(struct kvm *kvm,
memset(new.rmap, 0, npages * sizeof(*new.rmap));
new.user_alloc = user_alloc;
- new.userspace_addr = mem->userspace_addr;
+ /*
+ * hva_to_rmmap() serializes with the mmu_lock and to be
+ * safe it has to ignore memslots with !user_alloc &&
+ * !userspace_addr.
+ */
+ if (user_alloc)
+ new.userspace_addr = mem->userspace_addr;
+ else
+ new.userspace_addr = 0;
}
/* Allocate page dirty bitmap if needed */
@@ -311,14 +320,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
memset(new.dirty_bitmap, 0, dirty_bytes);
}
+ spin_lock(&kvm->mmu_lock);
if (mem->slot >= kvm->nmemslots)
kvm->nmemslots = mem->slot + 1;
*memslot = new;
+ spin_unlock(&kvm->mmu_lock);
r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
if (r) {
+ spin_lock(&kvm->mmu_lock);
*memslot = old;
+ spin_unlock(&kvm->mmu_lock);
goto out_free;
}
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
* Re: [PATCH] kvm memslot read-locking with mmu_lock
@ 2008-01-22 13:47 Avi Kivity
From: Avi Kivity @ 2008-01-22 13:47 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Andrea Arcangeli wrote:
> This adds locking to the memslots so they can be looked up with only
> the mmu_lock held. Entries with memslot->userspace_addr == 0 have to
> be ignored because they're not fully inserted yet.
>
>
What is the motivation for this? Calls from mmu notifiers that don't
have mmap_sem held?
>
> /* Allocate page dirty bitmap if needed */
> @@ -311,14 +320,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
> memset(new.dirty_bitmap, 0, dirty_bytes);
> }
>
> + spin_lock(&kvm->mmu_lock);
> if (mem->slot >= kvm->nmemslots)
> kvm->nmemslots = mem->slot + 1;
>
> *memslot = new;
> + spin_unlock(&kvm->mmu_lock);
>
> r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
> if (r) {
> + spin_lock(&kvm->mmu_lock);
> *memslot = old;
> + spin_unlock(&kvm->mmu_lock);
> goto out_free;
> }
>
>
This is arch independent code, I'm surprised mmu_lock is visible here?
What are the new lookup rules? We don't hold mmu_lock everywhere we
look up a gfn, do we?
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] kvm memslot read-locking with mmu_lock
@ 2008-01-22 14:32 Andrea Arcangeli
From: Andrea Arcangeli @ 2008-01-22 14:32 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Tue, Jan 22, 2008 at 03:47:28PM +0200, Avi Kivity wrote:
> Andrea Arcangeli wrote:
>> This adds locking to the memslots so they can be looked up with only
>> the mmu_lock held. Entries with memslot->userspace_addr == 0 have to
>> be ignored because they're not fully inserted yet.
>>
>>
> What is the motivation for this? Calls from mmu notifiers that don't have
> mmap_sem held?
Exactly.
>
>> /* Allocate page dirty bitmap if needed */
>> @@ -311,14 +320,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
>> memset(new.dirty_bitmap, 0, dirty_bytes);
>> }
>> + spin_lock(&kvm->mmu_lock);
>> if (mem->slot >= kvm->nmemslots)
>> kvm->nmemslots = mem->slot + 1;
>> *memslot = new;
>> + spin_unlock(&kvm->mmu_lock);
>> r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
>> if (r) {
>> + spin_lock(&kvm->mmu_lock);
>> *memslot = old;
>> + spin_unlock(&kvm->mmu_lock);
>> goto out_free;
>> }
>>
>
> This is arch independent code, I'm surprised mmu_lock is visible here?
The mmu_lock is arch independent as far as I can tell. Pretty much
like the mm->page_table_lock is also independent. All archs will have
some form of shadow pagetables in software or hardware, and mmu_lock
is the lock to take to serialize the pagetable updates and it also
allows to walk the memslots in readonly mode.
> What are the new lookup rules? We don't hold mmu_lock everywhere we look
> up a gfn, do we?
It's safe to loop over the memslots by just skipping the ones with
userspace_addr == 0 while only holding the mmu_lock. The memslots contents
can't change by holding the mmu_lock. The mmu_lock also serializes the
rmap structures inside the memslot and the spte updates. So by just
taking the mmu_lock it's trivial to do "search memslot", "translate
the hva to its relative rmapp", "find all sptes relative to the hva
and overwrite them with nonpresent-fault".
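The lookup half of that sequence can be sketched in plain C. This is a
toy model, not the real KVM code: the struct layout, the function name
and the missing lock acquisition are simplified assumptions; the only
point is the "skip userspace_addr == 0" rule and the hva-to-rmapp
translation.

```c
/* Toy sketch of the mmu_lock-only memslot lookup described above.
 * Not the real KVM structures: layout and names are invented. */
#include <stddef.h>

#define PAGE_SHIFT 12

struct memslot {
	unsigned long userspace_addr;	/* 0 while insertion is in flight */
	unsigned long npages;
	unsigned long *rmap;		/* one rmap head per page */
};

/* Caller is assumed to hold kvm->mmu_lock (elided in this sketch). */
static unsigned long *hva_to_rmapp(struct memslot *slots, int nslots,
				   unsigned long hva)
{
	for (int i = 0; i < nslots; i++) {
		struct memslot *s = &slots[i];
		unsigned long start = s->userspace_addr;
		unsigned long end = start + (s->npages << PAGE_SHIFT);

		if (!start)		/* not fully inserted yet: skip */
			continue;
		if (hva >= start && hva < end)
			return &s->rmap[(hva - start) >> PAGE_SHIFT];
	}
	return NULL;
}
```

With the rmapp in hand, the real code would then walk the sptes hanging
off it and clear them; that part depends on the arch-specific rmap
encoding and is left out here.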
If the mmu_notifiers would have been registered inside the vma things
would look very different in this area and it might have been possible
to embed the mmu-notifier inside the memslot itself, to avoid the
"search memslot" op.
* Re: [PATCH] kvm memslot read-locking with mmu_lock
@ 2008-01-22 14:38 Avi Kivity
From: Avi Kivity @ 2008-01-22 14:38 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Andrea Arcangeli wrote:
>
>> This is arch independent code, I'm surprised mmu_lock is visible here?
>>
>
> The mmu_lock is arch independent as far as I can tell. Pretty much
> like the mm->page_table_lock is also independent. All archs will have
> some form of shadow pagetables in software or hardware, and mmu_lock
> is the lock to take to serialize the pagetable updates and it also
> allows to walk the memslots in readonly mode.
>
>
Well, s390 has everything in hardware, but I suppose they can just
ignore the lock.
>> What are the new lookup rules? We don't hold mmu_lock everywhere we look
>> up a gfn, do we?
>>
>
> It's safe to loop over the memslots by just skipping the ones with
> userspace_addr == 0 while only holding the mmu_lock. The memslots contents
> can't change by holding the mmu_lock. The mmu_lock also serializes the
> rmap structures inside the memslot and the spte updates. So by just
> taking the mmu_lock it's trivial to do "search memslot", "translate
> the hva to its relative rmapp", "find all sptes relative to the hva
> and overwrite them with nonpresent-fault".
>
>
But we look up memslots in parallel in the guest walker and similar
places, relying on mmap_sem being taken for read.
Maybe we need a rwlock instead, and drop this overloaded usage of mmap_sem.
> If the mmu_notifiers would have been registered inside the vma things
> would look very different in this area and it might have been possible
> to embed the mmu-notifier inside the memslot itself, to avoid the
> "search memslot" op.
>
Nothing guarantees a 1:1 mapping between memslots and vma. You can have
a vma backing multiple memslots, or a memslot spanning multiple vmas.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] kvm memslot read-locking with mmu_lock
@ 2008-01-22 14:50 Andrea Arcangeli
From: Andrea Arcangeli @ 2008-01-22 14:50 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Tue, Jan 22, 2008 at 04:38:49PM +0200, Avi Kivity wrote:
> Andrea Arcangeli wrote:
>>
>>> This is arch independent code, I'm surprised mmu_lock is visible here?
>>>
>>
>> The mmu_lock is arch independent as far as I can tell. Pretty much
>> like the mm->page_table_lock is also independent. All archs will have
>> some form of shadow pagetables in software or hardware, and mmu_lock
>> is the lock to take to serialize the pagetable updates and it also
>> allows to walk the memslots in readonly mode.
>>
>>
>
> Well, s390 has everything in hardware, but I suppose they can just ignore
> the lock.
If they don't take it in their low-level code, that's enough I
think. It'll still be useful for looking up memslots like on every
other arch.
> But we look up memslots in parallel in the guest walker and similar places,
> relying on mmap_sem being taken for read.
That's ok. The vmas could also be looked up either with mmap_sem in
read mode or with only the page_table_lock taken (these days that
doesn't work anymore with ptlocks). The mmu_lock is only taken for
modifications of the memslots; the lookups don't require it. The
modifications have to take both mmap_sem in write mode and the
mmu_lock. Lookups require either mmap_sem in read mode or the
mmu_lock. Lookups with mmap_sem in read mode won't ever see a
userspace_addr == 0; lookups under the mmu_lock have to skip over the
userspace_addr == 0 entries instead.
> Maybe we need a rwlock instead, and drop this overloaded usage of mmap_sem.
Adding another lock just for the memslots should be feasible. However
it can't be a semaphore or it won't work for the mmu-notifiers. A
rwlock would be ok, but then all memslot lookups will have to be
updated to check for userspace_addr != 0 like kvm_hva_to_rmapp does.
> Nothing guarantees a 1:1 mapping between memslots and vma. You can have a
> vma backing multiple memslots, or a memslot spanning multiple vmas.
Yes, that's why it would add complications to have it in the vma: one
would need to add a vma split to guarantee the 1:1 mapping. Then it'd
be a bit faster too (though the memslots can easily be rbtree-indexed).
* Re: [PATCH] kvm memslot read-locking with mmu_lock
@ 2008-01-23 8:15 Carsten Otte
From: Carsten Otte @ 2008-01-23 8:15 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Avi Kivity
Andrea Arcangeli wrote:
> On Tue, Jan 22, 2008 at 04:38:49PM +0200, Avi Kivity wrote:
>> Andrea Arcangeli wrote:
>>>> This is arch independent code, I'm surprised mmu_lock is visible here?
>>>>
>>> The mmu_lock is arch independent as far as I can tell. Pretty much
>>> like the mm->page_table_lock is also independent. All archs will have
>>> some form of shadow pagetables in software or hardware, and mmu_lock
>>> is the lock to take to serialize the pagetable updates and it also
>>> allows to walk the memslots in readonly mode.
>>>
>>>
>> Well, s390 has everything in hardware, but I suppose they can just ignore
>> the lock.
>
> If they don't take it in their lowlevel it's enough I think. It'll
> still be useful to lookup memslots like in every other arch.
The lock won't be a performance issue for s390. We don't need to look
them up frequently afaics.