All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anthony Liguori <anthony@codemonkey.ws>
To: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Kay, Allen M" <allen.m.kay@intel.com>,
	kvm@vger.kernel.org, Amit Shah <amit.shah@qumranet.com>,
	Ben-Ami Yassour1 <BENAMI@il.ibm.com>,
	Avi Kivity <avik@qumranet.com>, Chris Wright <chrisw@redhat.com>,
	"Han, Weidong" <weidong.han@intel.com>
Subject: Re: [PATCH 4/4][VTD] vt-d specific files in KVM
Date: Tue, 10 Jun 2008 09:26:04 -0500	[thread overview]
Message-ID: <484E8EFC.10007@codemonkey.ws> (raw)
In-Reply-To: <20080610102759.GG7307@il.ibm.com>

Muli Ben-Yehuda wrote:
> On Mon, Jun 09, 2008 at 05:43:15PM -0700, Kay, Allen M wrote:
>   
>> vt-d specific files in KVM for contructing vt-d page tables and
>> programming vt-d context entries.
>>     
>
> Hi Allen,
>
> Some comments below, patches will follow up.
>  
>   
>> Signed-off-by: Allen M. Kay <allen.m.kay@intel.com>
>>     
>
>   
>> diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
>> new file mode 100644
>> index 0000000..634802c
>> --- /dev/null
>> +++ b/arch/x86/kvm/vtd.c
>> @@ -0,0 +1,197 @@
>> +/*
>> + * Copyright (c) 2006, Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
>> + * Place - Suite 330, Boston, MA 02111-1307 USA.
>> + *
>> + * Copyright (C) 2006-2008 Intel Corporation
>> + * Author: Allen M. Kay <allen.m.kay@intel.com>
>> + * Author: Weidong Han <weidong.han@intel.com>
>> + */
>> +
>> +#include <linux/list.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/pci.h>
>> +#include <linux/dmar.h>
>> +#include <linux/intel-iommu.h>
>> +#include "vtd.h"
>> +
>> +int kvm_iommu_map_pages(struct kvm *kvm,
>> +	gfn_t base_gfn, unsigned long npages)
>> +{
>> +	gfn_t gfn = base_gfn;
>> +	pfn_t pfn;
>> +	struct page *page;
>> +	int i, rc;
>> +
>> +	if (!kvm->arch.domain)
>> +		return -EFAULT;
>> +
>> +	printk(KERN_DEBUG "kvm_iommu_map_page: gpa = %lx\n",
>> +		gfn << PAGE_SHIFT);
>> +	printk(KERN_DEBUG "kvm_iommu_map_page: hpa = %lx\n",
>> +		gfn_to_pfn(kvm, base_gfn) << PAGE_SHIFT);
>> +	printk(KERN_DEBUG "kvm_iommu_map_page: size = %lx\n",
>> +		npages*PAGE_SIZE);
>> +
>> +	for (i = 0; i < npages; i++) {
>> +		pfn = gfn_to_pfn(kvm, gfn);
>> +		if (pfn_valid(pfn)) {
>>     
>
> Checking against pfn_valid() isn't enough to differentiate between RAM
> and MMIO areas. I think the consensus was that we also need to check
> PageReserved(), i.e.,
>
> if (pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn))) ...
>   

When checking the error return of gfn_to_pfn(), you should use 
is_error_pfn().  There's no need to differentiate mmio/ram pages in the 
code, the goal is just error checking.

>> +			rc = kvm_intel_iommu_page_mapping(kvm->arch.domain,
>> +				gfn << PAGE_SHIFT, pfn << PAGE_SHIFT,
>> +				PAGE_SIZE, DMA_PTE_READ | DMA_PTE_WRITE);
>> +			if (rc) {
>> +				page = gfn_to_page(kvm, gfn);
>> +				put_page(page);
>>     

kvm_release_pfn_clean() should be used here.

> If we fail to map some of the domain's memory, shouldn't we bail out
> of giving it pass-through access at all?
>
>   
>> +			}
>> +		} else {
>> +			printk(KERN_DEBUG "kvm_iommu_map_page:"
>> +				"invalid pfn=%lx\n", pfn);
>> +			return 0;
>>     
>
> I think we should BUG_ON() (or at least WARN_ON()) if we hit a slot
> that has both RAM and an MMIO region. 
>
>   
>> +		}
>> +		gfn++;
>> +	}
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_iommu_map_pages);
>> +
>> +static int kvm_iommu_map_memslots(struct kvm *kvm)
>> +{
>> +	int i, rc;
>> +	for (i = 0; i < kvm->nmemslots; i++) {
>> +		rc = kvm_iommu_map_pages(kvm, kvm->memslots[i].base_gfn,
>> +				kvm->memslots[i].npages);
>> +		if (rc)
>> +			return rc;
>> +	}
>> +	return 0;
>> +}
>> +
>> +static int kvm_iommu_unmap_memslots(struct kvm *kvm);
>> +int kvm_iommu_map_guest(struct kvm *kvm,
>> +	struct kvm_pci_passthrough_dev *pci_pt_dev)
>> +{
>> +	struct pci_dev *pdev = NULL;
>> +
>> +	printk(KERN_DEBUG "kvm_iommu_map_guest: host bdf = %x:%x:%x\n",
>> +		pci_pt_dev->host.busnr,
>> +		PCI_SLOT(pci_pt_dev->host.devfn),
>> +		PCI_FUNC(pci_pt_dev->host.devfn));
>> +
>> +	for_each_pci_dev(pdev) {
>> +		if ((pdev->bus->number == pci_pt_dev->host.busnr) &&
>> +			(pdev->devfn == pci_pt_dev->host.devfn))
>> +			goto found;
>>     
>
> We can stick the `found' stanza in a seperate function and call it
> here, which gets rid of one goto.
>
>   
>> +	}
>> +	if (kvm->arch.domain) {
>> +		kvm_intel_iommu_domain_exit(kvm->arch.domain);
>> +		kvm->arch.domain = NULL;
>> +	}
>> +	return -ENODEV;
>> +found:
>> +	kvm->arch.domain = kvm_intel_iommu_domain_alloc(pdev);
>> +	if (kvm->arch.domain == NULL)
>> +		printk(KERN_WARN "kvm_iommu_map_guest: domain == NULL\n");
>> +	else
>> +		printk(KERN_INFO "kvm_iommu_map_guest: domain = %p\n",
>> +			kvm->arch.domain);
>> +	if (kvm_iommu_map_memslots(kvm)) {
>>     
>
> We shouldn't call map_memslots if domain == NULL.
>
>   
>> +		kvm_iommu_unmap_memslots(kvm);
>> +		return -EFAULT;
>> +	}
>> +	kvm_intel_iommu_context_mapping(kvm->arch.domain, pdev);
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_iommu_map_guest);
>> +
>> +static int kvm_iommu_put_pages(struct kvm *kvm,
>> +	gfn_t base_gfn, unsigned long npages)
>> +{
>> +	gfn_t gfn = base_gfn;
>> +	struct page *page;
>> +	int i;
>> +
>> +	if (!kvm->arch.domain)
>> +		return -EFAULT;
>> +
>> +	printk(KERN_DEBUG "kvm_iommu_put_pages: gpa = %lx\n",
>> +		gfn << PAGE_SHIFT);
>> +	printk(KERN_DEBUG "kvm_iommu_put_pages: hpa = %lx\n",
>> +		gfn_to_pfn(kvm, gfn) << PAGE_SHIFT);
>> +	printk(KERN_DEBUG "kvm_iommu_put_pages: size = %lx\n",
>> +		npages*PAGE_SIZE);
>> +
>> +	for (i = 0; i < npages; i++) {
>> +		page = gfn_to_page(kvm, gfn);
>> +		put_page(page);
>> +		gfn++;
>>     

Likewise, you should use kvm_release_pfn_dirty() here.

Note, this patch series isn't bisect friendly.  In the third patch, you 
introduce a makefile change for vtd.o but don't introduce the file until 
the fourth patch.  This will break bisection.

Regards,

Anthony Liguori

  reply	other threads:[~2008-06-10 14:26 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-10  0:43 [PATCH 4/4][VTD] vt-d specific files in KVM Kay, Allen M
2008-06-10 10:27 ` Muli Ben-Yehuda
2008-06-10 14:26   ` Anthony Liguori [this message]
2008-06-10 14:56     ` Muli Ben-Yehuda
2008-06-10 15:02       ` Anthony Liguori
2008-06-10 15:15         ` Muli Ben-Yehuda
2008-06-10 15:24           ` Anthony Liguori
2008-06-10 16:07             ` Muli Ben-Yehuda
2008-06-20 18:24 ` Avi Kivity

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=484E8EFC.10007@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=BENAMI@il.ibm.com \
    --cc=allen.m.kay@intel.com \
    --cc=amit.shah@qumranet.com \
    --cc=avik@qumranet.com \
    --cc=chrisw@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=muli@il.ibm.com \
    --cc=weidong.han@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.