From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrea Arcangeli
Subject: Re: [PATCH] KVM: PCIPT: direct mmio pfn check
Date: Wed, 25 Jun 2008 02:57:39 +0200
Message-ID: <20080625005739.GM6938@duo.random>
References: <1214232737-21267-1-git-send-email-benami@il.ibm.com>
 <1214232737-21267-2-git-send-email-benami@il.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: amit.shah@qumranet.com, kvm@vger.kernel.org, aliguori@us.ibm.com,
 allen.m.kay@intel.com, muli@il.ibm.com
To: benami@il.ibm.com, Avi Kivity
Return-path: 
Received: from host36-195-149-62.serverdedicati.aruba.it
 ([62.149.195.36]:35630 "EHLO mx.cpushare.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1752439AbYFYA5m (ORCPT );
 Tue, 24 Jun 2008 20:57:42 -0400
Content-Disposition: inline
In-Reply-To: <1214232737-21267-2-git-send-email-benami@il.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID: 

We must apply Ben-Ami's patch, which completes the pfn-mmio support with
the needed PageReserved checks. Given the way the mem_map has to be
allocated, there will always be non-ram pages that have mem_map array
backing; to speed up the pfn_valid check, the remaining non-ram check is
done with the PG_reserved bitflag. Whenever a non-ram page currently
used by linux returns true from the pfn_valid &&
PageReserved(pfn_to_page(pfn)) check, it's a kernel bug that needs
fixing. So the code below is quite final: if numa or non-numa fails to
provide this invariant, then it's a sparse bug that should be fixed in
the kernel tree, not worked around in kvm.

On top of Ben-Ami's patch I also need the incremental patch below
applied, because with the reserved-ram patch (which allows
pci-passthrough w/o VT-d) qemu has to do the emulated DMA on those
non-ram pages (reserved-ram isn't used by linux, so it also has to be
marked PageReserved). It's ok to export those PageReserved pages through
vm->fault; the put_page will do nothing on them when the kvm->mmap vma
is torn down.
So the kvm export of the guest physical memory to qemu userland for dma
will work as long as the reserved-ram patch ensures that 'struct page'
backing exists for all pages where there's dma (i.e. the virtualized
ram). My current reserved-ram patch ensures this by reserving the ram
early in the e820 map, so the initial pagetables are allocated above the
kernel .text relocation; then I make the sparse code think the
reserved-ram is actually available (so struct pages are allocated), and
finally I reserve those pages in the bootmem allocator so they remain
PageReserved but with 'struct page' backing. So things work fine.

Signed-off-by: Andrea Arcangeli

Index: virt/kvm/kvm_main.c
--- virt/kvm/kvm_main.c.orig	2008-06-25 02:39:51.000000000 +0200
+++ virt/kvm/kvm_main.c	2008-06-25 02:40:35.000000000 +0200
@@ -604,10 +604,9 @@ struct page *gfn_to_page(struct kvm *kvm
 	pfn_t pfn;
 
 	pfn = gfn_to_pfn(kvm, gfn);
-	if (!is_mmio_pfn(pfn))
+	if (pfn_valid(pfn))
 		return pfn_to_page(pfn);
-
-	WARN_ON(is_mmio_pfn(pfn));
+	WARN_ON(1);
 
 	get_page(bad_page);
 	return bad_page;

Thanks.

On Mon, Jun 23, 2008 at 05:52:17PM +0300, benami@il.ibm.com wrote:
> From: Ben-Ami Yassour
> 
> In some cases it is not enough to identify mmio memory slots by
> pfn_valid. This patch adds checking the PageReserved as well.
> 
> Signed-off-by: Ben-Ami Yassour
> Signed-off-by: Muli Ben-Yehuda
> ---
>  virt/kvm/kvm_main.c |   22 +++++++++++++++-------
>  1 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index f9427e2..27b2eff 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -76,6 +76,14 @@ static inline int valid_vcpu(int n)
>  	return likely(n >= 0 && n < KVM_MAX_VCPUS);
>  }
>  
> +static inline int is_mmio_pfn(pfn_t pfn)
> +{
> +	if (pfn_valid(pfn))
> +		return PageReserved(pfn_to_page(pfn));
> +
> +	return true;
> +}
> +
>  /*
>   * Switches to specified vcpu, until a matching vcpu_put()
>   */
> @@ -582,7 +590,7 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
>  	}
>  
>  	pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> -		BUG_ON(pfn_valid(pfn));
> +		BUG_ON(!is_mmio_pfn(pfn));
>  	} else
>  		pfn = page_to_pfn(page[0]);
>  
> @@ -596,10 +604,10 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
>  	pfn_t pfn;
>  
>  	pfn = gfn_to_pfn(kvm, gfn);
> -	if (pfn_valid(pfn))
> +	if (!is_mmio_pfn(pfn))
>  		return pfn_to_page(pfn);
>  
> -	WARN_ON(!pfn_valid(pfn));
> +	WARN_ON(is_mmio_pfn(pfn));
>  
>  	get_page(bad_page);
>  	return bad_page;
> @@ -615,7 +623,7 @@ EXPORT_SYMBOL_GPL(kvm_release_page_clean);
>  
>  void kvm_release_pfn_clean(pfn_t pfn)
>  {
> -	if (pfn_valid(pfn))
> +	if (!is_mmio_pfn(pfn))
>  		put_page(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
> @@ -641,7 +649,7 @@ EXPORT_SYMBOL_GPL(kvm_set_page_dirty);
>  
>  void kvm_set_pfn_dirty(pfn_t pfn)
>  {
> -	if (pfn_valid(pfn)) {
> +	if (!is_mmio_pfn(pfn)) {
>  		struct page *page = pfn_to_page(pfn);
>  		if (!PageReserved(page))
>  			SetPageDirty(page);
> @@ -651,14 +659,14 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
>  
>  void kvm_set_pfn_accessed(pfn_t pfn)
>  {
> -	if (pfn_valid(pfn))
> +	if (!is_mmio_pfn(pfn))
>  		mark_page_accessed(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
>  
>  void kvm_get_pfn(pfn_t pfn)
>  {
> -	if (pfn_valid(pfn))
> +	if (!is_mmio_pfn(pfn))
>  		get_page(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_get_pfn);
> -- 
> 1.5.5.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html