From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
To: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, glin@suse.de,
agraf@suse.de, brogers@suse.de, afaerber@suse.de,
lnussel@suse.de, edk2-devel@lists.sf.net, stable@vger.kernel.org
Subject: Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots
Date: Mon, 02 Sep 2013 18:05:10 +0800
Message-ID: <522462D6.8090806@linux.vnet.ibm.com>
In-Reply-To: <20130902094907.GP22899@redhat.com>
On 09/02/2013 05:49 PM, Gleb Natapov wrote:
> On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
>> On 09/01/2013 05:17 PM, Gleb Natapov wrote:
>>> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
>>>> Page tables in a read-only memory slot will currently cause a triple
>>>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
>>>>
>>>> OVMF uses such a page table; however, real hardware seems to be fine with
>>>> that as long as the accessed/dirty bits are set. Save whether the slot
>>>> is readonly, and later check it when updating the accessed and dirty bits.
>>>>
>>> The fix looks OK to me, but some comment below.
>>>
>>>> Cc: stable@vger.kernel.org
>>>> Cc: gleb@redhat.com
>>>> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>> ---
>>>> CCing to stable@ since the regression was introduced with
>>>> support for readonly memory slots.
>>>>
>>>> arch/x86/kvm/paging_tmpl.h | 7 ++++++-
>>>> include/linux/kvm_host.h | 1 +
>>>> virt/kvm/kvm_main.c | 14 +++++++++-----
>>>> 3 files changed, 16 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
>>>> index 0433301..dadc5c0 100644
>>>> --- a/arch/x86/kvm/paging_tmpl.h
>>>> +++ b/arch/x86/kvm/paging_tmpl.h
>>>> @@ -99,6 +99,7 @@ struct guest_walker {
>>>> pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
>>>> gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
>>>> pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
>>>> + bool pte_writable[PT_MAX_FULL_LEVELS];
>>>> unsigned pt_access;
>>>> unsigned pte_access;
>>>> gfn_t gfn;
>>>> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu,
>>>> if (pte == orig_pte)
>>>> continue;
>>>>
>>>> + if (unlikely(!walker->pte_writable[level - 1]))
>>>> + return -EACCES;
>>>> +
>>>> ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte);
>>>> if (ret)
>>>> return ret;
>>>> @@ -309,7 +313,8 @@ retry_walk:
>>>> goto error;
>>>> real_gfn = gpa_to_gfn(real_gfn);
>>>>
>>>> - host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
>>>> + host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
>>>> + &walker->pte_writable[walker->level - 1]);
>>> The use of gfn_to_hva_read is misleading. The code can still write to
>>> the gfn. Let's rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
>>> to gfn_to_hva_write().
>>
>> Yes, I agree.
>>
>>>
>>> This makes me think: are there other places where gfn_to_hva() was
>>> used, but gfn_to_hva_prot() should have been?
>>> - kvm_host_page_size() looks incorrect. We never use huge pages to map
>>> read-only memory slots currently.
>>
>> It only checks whether the gfn has been mapped, so I think we can use
>> gfn_to_hva_read() instead; the real permission will be checked when we
>> translate the gfn to a pfn.
>>
> Yes, all the cases I listed should be changed to use a function that looks
> at both regular and RO slots.
>
>>> - kvm_handle_bad_page() also looks incorrect and may cause an incorrect
>>> address to be reported to userspace.
>>
>> I do not follow this point. kvm_handle_bad_page() is called when translating
>> the target gfn to a pfn has failed, so the emulator can then detect the error
>> on the target gfn properly, no? Or did I misunderstand your meaning?
>>
> I am talking about the following code:
>
> 	if (pfn == KVM_PFN_ERR_HWPOISON) {
> 		kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
> 		return 0;
> 	}
>
> pfn will be KVM_PFN_ERR_HWPOISON if gfn is backed by faulty memory. We need
> to report the linear address of the faulty memory to userspace here,
> but if gfn is in a RO slot, gfn_to_hva() will not return the correct
> address.
Got it, thanks for your explanation.
BTW, if you and Paolo are busy with other things, I am happy to fix these issues. :)
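
To make the naming discussion above concrete, here is a minimal user-space sketch in C of the two lookup flavours: one that fails on read-only slots, and one that resolves any slot and reports whether it is writable. This is a toy model and not kernel code. Every name in it (toy_slot, toy_gfn_to_hva_prot, toy_gfn_to_hva_write, toy_update_accessed, TOY_BAD_HVA, the bit position used for TOY_PTE_ACCESSED) is hypothetical and chosen only for illustration; the real helpers take a struct kvm and have different signatures.

```c
/* Toy user-space model of the distinction discussed in this thread.
 * All names are hypothetical; this is not the kernel implementation. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12
#define TOY_BAD_HVA      ((uintptr_t)-1)
#define TOY_PTE_ACCESSED (1ull << 5)   /* assumed bit position for the sketch */

struct toy_slot {
	uint64_t  base_gfn;        /* first guest frame covered by the slot */
	uint64_t  npages;          /* number of pages in the slot */
	uintptr_t userspace_addr;  /* host virtual address of the first page */
	bool      readonly;        /* slot was registered read-only */
};

static const struct toy_slot *toy_find_slot(const struct toy_slot *slots,
					    size_t n, uint64_t gfn)
{
	for (size_t i = 0; i < n; i++)
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return &slots[i];
	return NULL;
}

/* "prot" flavour: resolves regular and read-only slots alike and
 * reports through *writable whether the slot may be written. */
static uintptr_t toy_gfn_to_hva_prot(const struct toy_slot *slots, size_t n,
				     uint64_t gfn, bool *writable)
{
	const struct toy_slot *s = toy_find_slot(slots, n, gfn);

	if (!s)
		return TOY_BAD_HVA;
	if (writable)
		*writable = !s->readonly;
	return s->userspace_addr +
	       (uintptr_t)((gfn - s->base_gfn) << TOY_PAGE_SHIFT);
}

/* "write" flavour: same lookup, but a read-only slot is an error. */
static uintptr_t toy_gfn_to_hva_write(const struct toy_slot *slots, size_t n,
				      uint64_t gfn)
{
	bool writable = false;
	uintptr_t hva = toy_gfn_to_hva_prot(slots, n, gfn, &writable);

	if (hva != TOY_BAD_HVA && !writable)
		return TOY_BAD_HVA;
	return hva;
}

/* Mirrors the idea of the patch: only refuse when the accessed bit
 * actually has to be set and the backing slot is not writable. */
static int toy_update_accessed(uint64_t *pte, bool writable)
{
	if (*pte & TOY_PTE_ACCESSED)
		return 0;            /* nothing to write, read-only is fine */
	if (!writable)
		return -EACCES;
	*pte |= TOY_PTE_ACCESSED;
	return 0;
}
```

In this model, a page table that lives in a read-only slot can still be walked through the "prot" lookup as long as its accessed bit is already set, which matches the behaviour the patch description attributes to real hardware. A caller that needs an address for reporting only, such as the hwpoison path quoted above, would likewise want the "prot" flavour so that read-only slots still resolve.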