From: Marcelo Tosatti <mtosatti@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Avi Kivity <avi@redhat.com>, KVM list <kvm@vger.kernel.org>
Subject: Re: unit tests and get_user_pages_ptes_fast()
Date: Mon, 4 Oct 2010 20:59:53 -0300 [thread overview]
Message-ID: <20101004235953.GA1474@amt.cnet> (raw)
In-Reply-To: <20101004134052.GQ26357@random.random>
On Mon, Oct 04, 2010 at 03:40:52PM +0200, Andrea Arcangeli wrote:
> Hi Avi,
>
> On Mon, Oct 04, 2010 at 11:35:28AM +0200, Avi Kivity wrote:
> > During the last kvm forum, I described a unit test framework that can
> > help test the kvm APIs. Briefly, it starts a process in host userspace,
> > which sets up a memory slot mapping gpa 0:3G to hva 0:3G. It then sets
> > up guest registers for unpaged protected mode (or paged protected mode
> > with 1:1 mapping). The effect is that we have a 1:1 gva->hva
> > translation, and can use KVM_RUN to run host code in guest mode.
> >
> > There is a snag however. kvm calls get_user_pages_fast(.write = 1), and
> > the host process maps its code pages read-only.
> >
> > The way I'd like to work around this is to map read-only accesses to
> > read-only pages as read-only. This also prevents ksm cow pages from
> > being broken by read accesses. It can also be used to get the page size
> > for transparent huge pages (and later hugetlbfs too).
> >
> > So, for a read fault we do:
> >
> > pte_t pte;
> > get_user_pages_ptes_fast(..., page, &pte, 1, .write = 0)
> > ...
> > if (pte_transhuge(pte)) // or however it's called
> > ...
> > ...
> > if (pte_write(pte))
> > map writeable
> > else
> > map readonly
> >
> > Any snags? or alternatives?
>
> I think we tried to do this write=0 before, it provides benefits for
> swapins from swapcache too (after the page is unmapped but the
> swapcache is present and uptodate and clean). If I recall correctly
> the only obstacle was the fact the out of sync code wants to map the
> spte writable if it can, but if we use write=0 in gup, it won't know
> if it can. If it'd wait a real write access from guest, we'd risk
> creating a spte wrprotect fault for nothing (it should only write a
> write access from the guest if the linux VM solves the page fault with
> a readonly pte, even the minor-read-fault from exclusive swapcache
> will not set the dirty bit but it will still set the pte_write on the
> pte so the out of sync code should take advantage of that information
> in the host pte). In the past you suggested some TRY_WRITE operation
> instead of only read/write. We just need to pass the write bit of the
> pte somehow to the gup caller.
Yep, the drawback is the unnecessary write fault. What i have here is:
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -827,7 +827,7 @@ unsigned long gfn_to_hva(struct kvm *kvm
}
EXPORT_SYMBOL_GPL(gfn_to_hva);
-pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
+pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn, int *writable)
{
struct page *page[1];
unsigned long addr;
@@ -842,8 +842,16 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t
return page_to_pfn(bad_page);
}
+ *writable = 1;
npages = get_user_pages_fast(addr, 1, 1, page);
+ /* attempt to map read-only */
+ if (unlikely(npages != 1)) {
+ npages = get_user_pages_fast(addr, 1, 0, page);
+ if (npages == 1)
+ *writable = 0;
+ }
+
if (unlikely(npages != 1)) {
struct vm_area_struct *vma;
Can rebase and resend, if you'd like.
> Full agreement that once we use write=0 for read faults your problem
> with the debugging userland running in guest mode will have better
> chance to work too :).
>
> Andrea
next prev parent reply other threads:[~2010-10-05 0:01 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-04 9:35 unit tests and get_user_pages_ptes_fast() Avi Kivity
2010-10-04 13:40 ` Andrea Arcangeli
2010-10-04 23:59 ` Marcelo Tosatti [this message]
2010-10-05 7:36 ` Avi Kivity
2010-10-05 9:22 ` Marcelo Tosatti
2010-10-05 14:15 ` Andrea Arcangeli
2010-10-05 14:25 ` Avi Kivity
2010-10-05 14:32 ` Andrea Arcangeli
2010-10-05 14:36 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20101004235953.GA1474@amt.cnet \
--to=mtosatti@redhat.com \
--cc=aarcange@redhat.com \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.