Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

From: Gleb Natapov <gleb@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org, glin@suse.de,
	agraf@suse.de, brogers@suse.de, afaerber@suse.de,
	lnussel@suse.de, edk2-devel@lists.sf.net, stable@vger.kernel.org
Subject: Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots
Date: Mon, 2 Sep 2013 13:11:15 +0300
Message-ID: <20130902101114.GR22899@redhat.com>
In-Reply-To: <522462D6.8090806@linux.vnet.ibm.com>

On Mon, Sep 02, 2013 at 06:05:10PM +0800, Xiao Guangrong wrote:
> On 09/02/2013 05:49 PM, Gleb Natapov wrote:
> > On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
> >> On 09/01/2013 05:17 PM, Gleb Natapov wrote:
> >>> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
> >>>> Page tables in a read-only memory slot will currently cause a triple
> >>>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
> >>>>
> >>>> OVMF uses such a page table; however, real hardware seems to be fine with
> >>>> that as long as the accessed/dirty bits are set.  Save whether the slot
> >>>> is readonly, and later check it when updating the accessed and dirty bits.
> >>>>
> >>> The fix looks OK to me, but some comments below.
> >>>
> >>>> Cc: stable@vger.kernel.org
> >>>> Cc: gleb@redhat.com
> >>>> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> >>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >>>> ---
> >>>> 	CCing to stable@ since the regression was introduced with
> >>>> 	support for readonly memory slots.
> >>>>
> >>>>  arch/x86/kvm/paging_tmpl.h |  7 ++++++-
> >>>>  include/linux/kvm_host.h   |  1 +
> >>>>  virt/kvm/kvm_main.c        | 14 +++++++++-----
> >>>>  3 files changed, 16 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> >>>> index 0433301..dadc5c0 100644
> >>>> --- a/arch/x86/kvm/paging_tmpl.h
> >>>> +++ b/arch/x86/kvm/paging_tmpl.h
> >>>> @@ -99,6 +99,7 @@ struct guest_walker {
> >>>>  	pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
> >>>>  	gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
> >>>>  	pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
> >>>> +	bool pte_writable[PT_MAX_FULL_LEVELS];
> >>>>  	unsigned pt_access;
> >>>>  	unsigned pte_access;
> >>>>  	gfn_t gfn;
> >>>> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu,
> >>>>  		if (pte == orig_pte)
> >>>>  			continue;
> >>>>  
> >>>> +		if (unlikely(!walker->pte_writable[level - 1]))
> >>>> +			return -EACCES;
> >>>> +
> >>>>  		ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte);
> >>>>  		if (ret)
> >>>>  			return ret;
> >>>> @@ -309,7 +313,8 @@ retry_walk:
> >>>>  			goto error;
> >>>>  		real_gfn = gpa_to_gfn(real_gfn);
> >>>>  
> >>>> -		host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
> >>>> +		host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
> >>>> +					    &walker->pte_writable[walker->level - 1]);
> >>> The use of gfn_to_hva_read is misleading. The code can still write to
> >>> the gfn. Let's rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
> >>> to gfn_to_hva_write().
> >>
> >> Yes, I agree.
> >>
> >>>
> >>> This makes me wonder: are there other places where gfn_to_hva() was
> >>> used, but gfn_to_hva_prot() should have been?
> >>>  - kvm_host_page_size() looks incorrect. We currently never use huge
> >>>    pages to map read-only memory slots.
> >>
> >> It only checks whether the gfn has been mapped, so I think we can use
> >> gfn_to_hva_read() instead; the real permission will be checked when we
> >> translate the gfn to a pfn.
> >>
> > Yes, all the cases I listed should be changed to use a function that
> > looks at both regular and RO slots.
> > 
> >>>  - kvm_handle_bad_page() also looks incorrect and may cause an incorrect
> >>>    address to be reported to userspace.
> >>
> >> I am not sure about this point. kvm_handle_bad_page() is called when
> >> translating the target gfn to a pfn fails, so the emulator can then
> >> detect the error on the target gfn properly, no? Or did I misunderstand you?
> >>
> > I am talking about the following code:
> > 
> >         if (pfn == KVM_PFN_ERR_HWPOISON) {
> >                 kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
> >                 return 0;
> >         }
> > 
> > pfn will be KVM_PFN_ERR_HWPOISON if gfn is backed by faulty memory. We
> > need to report the linear address of the faulty memory to userspace here,
> > but if gfn is in a RO slot, gfn_to_hva() will not return the correct
> > address.
> 
> Got it, thanks for your explanation.
> 
> BTW, if you and Paolo are busy with other things, I am happy to fix these issues. :)
I am busy with reviews mostly :). If you are not too busy with lockless
write protection, then that is fine with me. Let's wait for Paolo's input
on the proposed API though.
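
To make the proposed naming concrete, here is a rough user-space sketch of
the intended split. It is only a toy model, not kernel code: every toy_/TOY_
name, the slot structure, and the single error value are assumptions made up
for illustration, and the real helpers take a struct kvm and have different
signatures. The "prot" lookup resolves both regular and RO slots and reports
whether the page may be written, while the "write" lookup fails on RO slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy user-space model; every name prefixed toy_/TOY_ is hypothetical. */
typedef uint64_t toy_gfn_t;

#define TOY_PAGE_SHIFT 12
#define TOY_BAD_HVA    UINT64_MAX	/* stand-in for an error address */

struct toy_memslot {
	toy_gfn_t base_gfn;		/* first guest frame covered by the slot */
	uint64_t  npages;		/* number of guest frames in the slot */
	uint64_t  userspace_addr;	/* host virtual address of the first frame */
	bool      readonly;		/* writes through this slot are not allowed */
};

/* Return the slot containing gfn, or NULL if no slot covers it. */
static const struct toy_memslot *
toy_find_slot(const struct toy_memslot *slots, size_t nslots, toy_gfn_t gfn)
{
	for (size_t i = 0; i < nslots; i++) {
		const struct toy_memslot *s = &slots[i];

		if (gfn >= s->base_gfn && gfn - s->base_gfn < s->npages)
			return s;
	}
	return NULL;
}

/*
 * "prot" lookup: succeeds for regular and read-only slots alike and
 * reports through *writable whether the caller may write to the page.
 */
static uint64_t
toy_gfn_to_hva_prot(const struct toy_memslot *slots, size_t nslots,
		    toy_gfn_t gfn, bool *writable)
{
	const struct toy_memslot *s = toy_find_slot(slots, nslots, gfn);

	if (!s)
		return TOY_BAD_HVA;
	if (writable)
		*writable = !s->readonly;
	return s->userspace_addr + ((gfn - s->base_gfn) << TOY_PAGE_SHIFT);
}

/* "write" lookup: additionally fails when the slot is read-only. */
static uint64_t
toy_gfn_to_hva_write(const struct toy_memslot *slots, size_t nslots,
		     toy_gfn_t gfn)
{
	bool writable = false;
	uint64_t hva = toy_gfn_to_hva_prot(slots, nslots, gfn, &writable);

	return (hva != TOY_BAD_HVA && writable) ? hva : TOY_BAD_HVA;
}
```

In this model, a caller that only needs the address (the page walker, or the
hwpoison report above) would use the "prot" variant and check the writable
flag before any write, while a caller that must write would use the "write"
variant and treat a failure on a RO slot as an error.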

--
			Gleb.
