* KVM: MMU: drop read-only large sptes when creating lower level sptes @ 2014-02-24 16:59 Marcelo Tosatti 2014-02-25 3:30 ` Xiao Guangrong 2014-02-25 8:23 ` Paolo Bonzini 0 siblings, 2 replies; 6+ messages in thread From: Marcelo Tosatti @ 2014-02-24 16:59 UTC (permalink / raw) To: kvm-devel; +Cc: Paolo Bonzini, Xiao Guangrong Read-only large sptes can be created due to read-only faults as follows: - QEMU pagetable entry that maps guest memory is read-only due to COW. - Guest read faults such memory, COW is not broken, because it is a read-only fault. - Enable dirty logging, large spte not nuked because it is read-only. - Write-fault on such memory causes guest to loop endlessly (which must go down to level 1 because dirty logging is enabled). Fix by dropping large spte when necessary. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index e50425d..9b53135 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -2672,6 +2672,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write, break; } + drop_large_spte(vcpu, iterator.sptep); if (!is_shadow_present_pte(*iterator.sptep)) { u64 base_addr = iterator.addr; ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: KVM: MMU: drop read-only large sptes when creating lower level sptes 2014-02-24 16:59 KVM: MMU: drop read-only large sptes when creating lower level sptes Marcelo Tosatti @ 2014-02-25 3:30 ` Xiao Guangrong 2014-02-25 13:13 ` Marcelo Tosatti 2014-02-25 8:23 ` Paolo Bonzini 1 sibling, 1 reply; 6+ messages in thread From: Xiao Guangrong @ 2014-02-25 3:30 UTC (permalink / raw) To: Marcelo Tosatti, kvm-devel; +Cc: Paolo Bonzini On 02/25/2014 12:59 AM, Marcelo Tosatti wrote: > > Read-only large sptes can be created due to read-only faults as > follows: > > - QEMU pagetable entry that maps guest memory is read-only > due to COW. > - Guest read faults such memory, COW is not broken, because > it is a read-only fault. > - Enable dirty logging, large spte not nuked because it is read-only. > - Write-fault on such memory causes guest to loop endlessly > (which must go down to level 1 because dirty logging is enabled). Hi Marcelo, It surprised me that the large-readonly mapping was not dropped by mmu-notifer as this is write fault on readonly mapping in Qemu. Hmm... i missed something? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KVM: MMU: drop read-only large sptes when creating lower level sptes 2014-02-25 3:30 ` Xiao Guangrong @ 2014-02-25 13:13 ` Marcelo Tosatti 2014-02-25 15:11 ` Xiao Guangrong 0 siblings, 1 reply; 6+ messages in thread From: Marcelo Tosatti @ 2014-02-25 13:13 UTC (permalink / raw) To: Xiao Guangrong; +Cc: kvm-devel, Paolo Bonzini On Tue, Feb 25, 2014 at 11:30:37AM +0800, Xiao Guangrong wrote: > On 02/25/2014 12:59 AM, Marcelo Tosatti wrote: > > > > Read-only large sptes can be created due to read-only faults as > > follows: > > > > - QEMU pagetable entry that maps guest memory is read-only > > due to COW. > > - Guest read faults such memory, COW is not broken, because > > it is a read-only fault. > > - Enable dirty logging, large spte not nuked because it is read-only. > > - Write-fault on such memory causes guest to loop endlessly > > (which must go down to level 1 because dirty logging is enabled). > > Hi Marcelo, > > It surprised me that the large-readonly mapping was not dropped > by mmu-notifer as this is write fault on readonly mapping in Qemu. > Hmm... i missed something? You mean COW was not broken by gup? (that is the problem, so a read-only large spte is created). Don't see why mmu-notifier should invalidate the spte? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KVM: MMU: drop read-only large sptes when creating lower level sptes 2014-02-25 13:13 ` Marcelo Tosatti @ 2014-02-25 15:11 ` Xiao Guangrong 2014-02-25 22:52 ` Marcelo Tosatti 0 siblings, 1 reply; 6+ messages in thread From: Xiao Guangrong @ 2014-02-25 15:11 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: kvm-devel, Paolo Bonzini On Feb 25, 2014, at 9:13 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote: > On Tue, Feb 25, 2014 at 11:30:37AM +0800, Xiao Guangrong wrote: >> On 02/25/2014 12:59 AM, Marcelo Tosatti wrote: >>> >>> Read-only large sptes can be created due to read-only faults as >>> follows: >>> >>> - QEMU pagetable entry that maps guest memory is read-only >>> due to COW. >>> - Guest read faults such memory, COW is not broken, because >>> it is a read-only fault. >>> - Enable dirty logging, large spte not nuked because it is read-only. >>> - Write-fault on such memory causes guest to loop endlessly >>> (which must go down to level 1 because dirty logging is enabled). >> >> Hi Marcelo, >> >> It surprised me that the large-readonly mapping was not dropped >> by mmu-notifer as this is write fault on readonly mapping in Qemu. >> Hmm... i missed something? > > You mean COW was not broken by gup? (that is the problem, so a > read-only large spte is created). > I mean the final step that “Write-fault on such memory causes guest” should break the readonly QEMU pagetable entry and change it to writable, at that time, mmu-notifier should be called since the page’s permission has been changed. After read the code more carefully, i figure the reason out that mmu-notifier is not called when the page can be reused. I am thinking whether it is reasonable: - anyway, the permission bits on the pre are changed, why not call mmu-notifier->change_pte()? - kvm and other users of mum-notifier need to check the permission changing very carefully, that is really subtle, like the case fixed here. - if we do proper prefetch when it happen, the addition page-fault can be avoided, that is good for the performance. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KVM: MMU: drop read-only large sptes when creating lower level sptes 2014-02-25 15:11 ` Xiao Guangrong @ 2014-02-25 22:52 ` Marcelo Tosatti 0 siblings, 0 replies; 6+ messages in thread From: Marcelo Tosatti @ 2014-02-25 22:52 UTC (permalink / raw) To: Xiao Guangrong; +Cc: kvm-devel, Paolo Bonzini On Tue, Feb 25, 2014 at 11:11:19PM +0800, Xiao Guangrong wrote: > > On Feb 25, 2014, at 9:13 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote: > > > On Tue, Feb 25, 2014 at 11:30:37AM +0800, Xiao Guangrong wrote: > >> On 02/25/2014 12:59 AM, Marcelo Tosatti wrote: > >>> > >>> Read-only large sptes can be created due to read-only faults as > >>> follows: > >>> > >>> - QEMU pagetable entry that maps guest memory is read-only > >>> due to COW. > >>> - Guest read faults such memory, COW is not broken, because > >>> it is a read-only fault. > >>> - Enable dirty logging, large spte not nuked because it is read-only. > >>> - Write-fault on such memory causes guest to loop endlessly > >>> (which must go down to level 1 because dirty logging is enabled). > >> > >> Hi Marcelo, > >> > >> It surprised me that the large-readonly mapping was not dropped > >> by mmu-notifer as this is write fault on readonly mapping in Qemu. > >> Hmm... i missed something? > > > > You mean COW was not broken by gup? (that is the problem, so a > > read-only large spte is created). > > > > I mean the final step that “Write-fault on such memory causes guest” should > break the readonly QEMU pagetable entry and change it to writable, at that > time, mmu-notifier should be called since the page’s permission has been > changed. > > After read the code more carefully, i figure the reason out that mmu-notifier > is not called when the page can be reused. Well a transition from read-only host-pte to read-write host-pte does not require the read-only spte to be upgraded immediatelly, right? Permission is upgraded lazily. > I am thinking whether it is > reasonable: > - anyway, the permission bits on the pre are changed, why not call > mmu-notifier->change_pte()? It could, but lazy updates are better i suppose. > - kvm and other users of mum-notifier need to check the permission > changing very carefully, that is really subtle, like the case fixed here. Yes. > - if we do proper prefetch when it happen, the addition page-fault can > be avoided, that is good for the performance. Agree. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KVM: MMU: drop read-only large sptes when creating lower level sptes 2014-02-24 16:59 KVM: MMU: drop read-only large sptes when creating lower level sptes Marcelo Tosatti 2014-02-25 3:30 ` Xiao Guangrong @ 2014-02-25 8:23 ` Paolo Bonzini 1 sibling, 0 replies; 6+ messages in thread From: Paolo Bonzini @ 2014-02-25 8:23 UTC (permalink / raw) To: Marcelo Tosatti, kvm-devel; +Cc: Xiao Guangrong Il 24/02/2014 17:59, Marcelo Tosatti ha scritto: > > Read-only large sptes can be created due to read-only faults as > follows: > > - QEMU pagetable entry that maps guest memory is read-only > due to COW. > - Guest read faults such memory, COW is not broken, because > it is a read-only fault. > - Enable dirty logging, large spte not nuked because it is read-only. > - Write-fault on such memory causes guest to loop endlessly > (which must go down to level 1 because dirty logging is enabled). > > Fix by dropping large spte when necessary. > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index e50425d..9b53135 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -2672,6 +2672,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write, > break; > } > > + drop_large_spte(vcpu, iterator.sptep); > if (!is_shadow_present_pte(*iterator.sptep)) { > u64 base_addr = iterator.addr; > > > Queued for kvm/master, will push as soon as I test it. Thanks, Paolo ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-02-25 22:53 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-02-24 16:59 KVM: MMU: drop read-only large sptes when creating lower level sptes Marcelo Tosatti 2014-02-25 3:30 ` Xiao Guangrong 2014-02-25 13:13 ` Marcelo Tosatti 2014-02-25 15:11 ` Xiao Guangrong 2014-02-25 22:52 ` Marcelo Tosatti 2014-02-25 8:23 ` Paolo Bonzini
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).