From: Paul Mackerras <paulus@samba.org>
To: Alexander Graf <agraf@suse.de>
Cc: "kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH 4/5] KVM: PPC: Book3S HV: Don't give the guest RW access to RO pages
Date: Sat, 24 Nov 2012 20:32:37 +1100
Message-ID: <20121124093237.GF23537@bloggs.ozlabs.ibm.com>
In-Reply-To: <E3BDC278-EA1D-4B28-BA91-A2676BFA7232@suse.de>

On Sat, Nov 24, 2012 at 10:05:37AM +0100, Alexander Graf wrote:
>
>
> On 23.11.2012, at 23:13, Paul Mackerras <paulus@samba.org> wrote:
>
> > On Fri, Nov 23, 2012 at 04:47:45PM +0100, Alexander Graf wrote:
> >>
> >> On 22.11.2012, at 10:28, Paul Mackerras wrote:
> >>
> >>> Currently, if the guest does an H_PROTECT hcall requesting that the
> >>> permissions on a HPT entry be changed to allow writing, we make the
> >>> requested change even if the page is marked read-only in the host
> >>> Linux page tables. This is a problem since it would for instance
> >>> allow a guest to modify a page that KSM has decided can be shared
> >>> between multiple guests.
> >>>
> >>> To fix this, if the new permissions for the page allow writing, we need
> >>> to look up the memslot for the page, work out the host virtual address,
> >>> and look up the Linux page tables to get the PTE for the page. If that
> >>> PTE is read-only, we reduce the HPTE permissions to read-only.
> >>
> >> How does KSM handle this usually? If you reduce the permissions to R/O, how do you ever get a R/W page from a deduplicated one?
> >
> > The scenario goes something like this:
> >
> > 1. Guest creates an HPTE with RO permissions.
> > 2. KSM decides the page is identical to another page and changes the
> > HPTE to point to a shared copy. Permissions are still RO.
> > 3. Guest decides it wants write access to the page and does an
> > H_PROTECT hcall to change the permissions on the HPTE to RW.
> >
> > The bug is that we actually make the requested change in step 3.
> > Instead we should leave it at RO, then when the guest tries to write
> > to the page, we take a hypervisor page fault, copy the page and give
> > the guest write access to its own copy of the page.
> >
> > So what this patch does is add code to H_PROTECT so that if the guest
> > is requesting RW access, we check the Linux PTE to see if the
> > underlying guest page is RO, and if so reduce the permissions in the
> > HPTE to RO.
>
> But this will be guest visible, because now H_PROTECT doesn't actually mark the page R/W in the HTAB, right?

No - the guest view of the HPTE has R/W permissions. The guest view
of the HPTE is made up of doubleword 0 from the real HPT plus
rev->guest_rpte for doubleword 1 (where rev is the entry in the revmap
array, kvm->arch.revmap, for the HPTE). The guest view can be
different from the host/hardware view, which is in the real HPT. For
instance, the guest view of a HPTE might be valid but the host view
might be invalid because the underlying real page has been paged out -
in that case we use a software bit which we call HPTE_V_ABSENT to
remind ourselves that there is something valid there from the guest's
point of view. Or the guest view can be R/W but the host view is RO,
as in the case where KSM has merged the page.
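
To make the guest-view/host-view split concrete, here is a rough
standalone sketch. It is not the kernel code: the struct names, the
guest_view() helper and the bit values are illustrative assumptions,
and presenting an ABSENT entry to the guest as valid is my reading of
the description above rather than a copy of the real read path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only; the real definitions are in the kernel headers. */
#define HPTE_V_VALID   0x01ULL
#define HPTE_V_ABSENT  0x20ULL  /* software bit: valid for the guest, not for the host */

/* One entry of the real (host/hardware) HPT: two doublewords. */
struct hpte {
    uint64_t v;  /* doubleword 0 */
    uint64_t r;  /* doubleword 1 */
};

/* Hypothetical stand-in for an entry of the revmap array. */
struct revmap_entry {
    uint64_t guest_rpte;  /* doubleword 1 as the guest last set it */
};

/*
 * Build the guest's view of an HPTE: doubleword 0 comes from the real
 * HPT, doubleword 1 comes from the revmap entry.
 */
static struct hpte guest_view(const struct hpte *real,
                              const struct revmap_entry *rev)
{
    struct hpte g;

    assert(real != NULL && rev != NULL);
    g.v = real->v;
    if (g.v & HPTE_V_ABSENT) {
        /* The host has taken the page away, but the guest still sees it. */
        g.v &= ~HPTE_V_ABSENT;
        g.v |= HPTE_V_VALID;
    }
    g.r = rev->guest_rpte;  /* host-side permission reductions stay hidden */
    return g;
}
```

The point of the sketch is only that the second doubleword the guest
reads back never comes from the real HPT, so reducing permissions
there is not visible to the guest.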

> So the flow with this patch is:
>
> - guest page permission fault

This comes through the host (kvmppc_hpte_hv_fault()) which looks at
the guest view of the HPTE, sees that it has RO permissions, and sends
the page fault to the guest.

> - guest does H_PROTECT to mark page r/w
> - H_PROTECT doesn't do anything
> - guest returns from permission handler, triggers write fault

This comes once again to kvmppc_hpte_hv_fault(), which sees that the
guest view of the HPTE has R/W permissions now, and sends the page
fault to kvmppc_book3s_hv_page_fault(), which requests write access to
the page, possibly triggering copy-on-write or whatever, and updates
the real HPTE to have R/W permissions and possibly point to a new page
of memory.
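
The two fault cases above boil down to a decision on the guest view
of the HPTE. A toy model of that decision, under the assumption that
permissions can be reduced to a single "writable" flag per view (the
type and function names here are made up for illustration and do not
exist in the kernel):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fault_target {
    FAULT_TO_GUEST,        /* reflect the fault to the guest's own handler */
    FAULT_TO_HOST_HANDLER  /* hand over to the host page-fault path */
};

/* Toy model of one HPTE: just the write permission in each view. */
struct toy_hpte_views {
    bool guest_rw;  /* what the guest asked for (guest view) */
    bool host_rw;   /* what the real HPT currently allows (host view) */
};

/* Decide where a write fault on this HPTE goes, based on the guest view. */
static enum fault_target route_write_fault(const struct toy_hpte_views *h)
{
    assert(h != NULL);
    if (!h->guest_rw)
        return FAULT_TO_GUEST;  /* guest itself mapped the page RO */
    return FAULT_TO_HOST_HANDLER;  /* guest view is RW, host view lags behind */
}

/*
 * What the host path does, in this model: after obtaining write access
 * to the page (copy-on-write or similar, not modelled here), bring the
 * real HPTE up to the permissions the guest asked for.
 */
static void host_fix_up(struct toy_hpte_views *h)
{
    assert(h != NULL);
    h->host_rw = h->guest_rw;
}
```

In this model H_PROTECT only flips guest_rw; host_rw changes later,
when the write fault is handled on the host side.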
>
> 2 questions here:
>
> How does the host know that the page is actually r/w?

I assume you mean RO? It looks up the memslot for the guest physical
address (which it gets from rev->guest_rpte), uses that to work out
the host virtual address (i.e. the address in qemu's address space),
looks up the Linux PTE in qemu's Linux page tables, and looks at the
_PAGE_RW bit there.
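
As a simplified, standalone sketch of that lookup chain (not the
kernel implementation): the memslot layout, the 4K page size, the
TOY_PAGE_RW bit and the toy_lookup() callback standing in for the
walk of qemu's page tables are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12      /* assumes 4K pages */
#define TOY_PAGE_RW    0x2ULL  /* stand-in for _PAGE_RW */

struct toy_memslot {
    uint64_t base_gfn;        /* first guest frame number in the slot */
    uint64_t npages;          /* number of pages in the slot */
    uint64_t userspace_addr;  /* start of the slot in qemu's address space */
};

/* Hypothetical callback standing in for a walk of qemu's Linux page tables. */
typedef uint64_t (*toy_pte_lookup_fn)(uint64_t hva);

/* Toy page table: everything is writable except one page (think KSM-shared). */
static uint64_t toy_lookup(uint64_t hva)
{
    return hva == 0x7f0000001000ULL ? 0 : TOY_PAGE_RW;
}

static const struct toy_memslot *find_memslot(const struct toy_memslot *slots,
                                              size_t n, uint64_t gfn)
{
    for (size_t i = 0; i < n; i++)
        if (gfn >= slots[i].base_gfn &&
            gfn - slots[i].base_gfn < slots[i].npages)
            return &slots[i];
    return NULL;
}

/* Guest physical address -> memslot -> host virtual address -> PTE -> RW bit. */
static bool host_page_writable(const struct toy_memslot *slots, size_t n,
                               uint64_t gpa, toy_pte_lookup_fn lookup)
{
    assert(lookup != NULL);
    uint64_t gfn = gpa >> TOY_PAGE_SHIFT;
    const struct toy_memslot *slot = find_memslot(slots, n, gfn);

    if (slot == NULL)
        return false;  /* no memslot: treat as not writable */
    uint64_t hva = slot->userspace_addr +
                   ((gfn - slot->base_gfn) << TOY_PAGE_SHIFT);
    return (lookup(hva) & TOY_PAGE_RW) != 0;
}
```

If host_page_writable() returned false for a page the guest wants RW,
the H_PROTECT path would write RO permissions into the real HPTE and
leave the guest view alone.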

> How does this work on 970? I thought page faults always go straight to the guest there.

They do, which is why PPC970 can't do any of this. On PPC970 we have
kvm->arch.using_mmu_notifiers == 0, and that makes the code pin every
page of guest memory that is mapped by a guest HPTE (with a Linux
guest, that means every page, because of the linear mapping). On
POWER7 we have kvm->arch.using_mmu_notifiers == 1, which enables
host paging and deduplication of guest memory.

Paul.