public inbox for kvm@vger.kernel.org
* device assignemnt: updated patches
@ 2008-07-22 12:13 Ben-Ami Yassour
  2008-07-22 12:13 ` [PATCH 1/4] KVM: Add irq ack notifier list Ben-Ami Yassour
  2008-07-26  9:05 ` device assignemnt: updated patches Avi Kivity
  0 siblings, 2 replies; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-22 12:13 UTC
  To: avi; +Cc: amit.shah, kvm, muli, benami, weidong.han, anthony

Following are the device assignment patches, with fixes for the
comments that were sent on the previous version.

Here is the list of changes that were made with respect to the previous
version:
1. Replace the interrupt ack hook patches with the notifiers list patch
by Avi.
2. Remove code from ioapic - the ack notifier is called before
checking the irr bit.
3. Remove the interrupt ack work queue and handle it directly.
4. Minimize the function for finding an assigned device.
5. Remove the pt_lock; after the changes above it is no longer
needed.
6. Move declarations from kvm_para.h
7. Add irq array to ioctl API. Note that this is only a change to the
API; currently only a single irq is supported.
8. Renaming: use "assigned device" and not "passthrough device"
9. Fix device release error path
10. Move the assigned devices list pointer into the device struct itself
and remove the extra structure.
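
To illustrate item 7, here is a minimal userspace sketch of the ioctl
payload. It mirrors struct kvm_assigned_dev_info from patch 2/4, with
uint32_t standing in for __u32. The name check_irq_count() is
hypothetical; it only reproduces the num_valid_irqs check that the
ioctl handler performs before doing anything else.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define ASSIGNED_DEV_MAX_IRQ 16

/* Userspace mirror of struct kvm_assigned_dev_info from patch 2/4;
 * uint32_t replaces __u32 (an assumption of this sketch).
 */
struct assigned_dev_info {
	uint32_t busnr;
	uint32_t devfn;
	uint32_t irq[ASSIGNED_DEV_MAX_IRQ];
	uint32_t num_valid_irqs;	/* currently only 1 is supported */
};

/* Hypothetical helper: the array is part of the API, but any count
 * other than 1 is rejected for now.
 */
int check_irq_count(const struct assigned_dev_info *host)
{
	return host->num_valid_irqs == 1 ? 0 : -EINVAL;
}
```

A caller would fill irq[0], set num_valid_irqs to 1, and leave the
remaining array slots zeroed until multiple irqs are supported.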

Pending comment: shared guest interrupts are not tested.

Comments are appreciated.

Regards,
Ben





* [PATCH 1/4] KVM: Add irq ack notifier list
  2008-07-22 12:13 device assignemnt: updated patches Ben-Ami Yassour
@ 2008-07-22 12:13 ` Ben-Ami Yassour
  2008-07-22 12:13   ` [PATCH 2/4] KVM: pci device assignment Ben-Ami Yassour
  2008-07-26  8:19   ` [PATCH 1/4] KVM: Add irq ack notifier list Avi Kivity
  2008-07-26  9:05 ` device assignemnt: updated patches Avi Kivity
  1 sibling, 2 replies; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-22 12:13 UTC
  To: avi; +Cc: amit.shah, kvm, muli, benami, weidong.han, anthony

From: Avi Kivity <avi@qumranet.com>

This can be used by kvm subsystems that are interested in when
interrupts are acked, for example time drift compensation.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 arch/x86/kvm/irq.c         |   22 ++++++++++++++++++++++
 arch/x86/kvm/irq.h         |    5 +++++
 include/asm-x86/kvm_host.h |    7 +++++++
 3 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 0d9e552..9091195 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -111,3 +111,25 @@ void kvm_set_irq(struct kvm *kvm, int irq, int level)
 	kvm_ioapic_set_irq(kvm->arch.vioapic, irq, level);
 	kvm_pic_set_irq(pic_irqchip(kvm), irq, level);
 }
+
+void kvm_notify_acked_irq(struct kvm *kvm, unsigned gsi)
+{
+	struct kvm_irq_ack_notifier *kian;
+	struct hlist_node *n;
+
+	hlist_for_each_entry(kian, n, &kvm->arch.irq_ack_notifier_list, link)
+		if (kian->gsi == gsi)
+			kian->irq_acked(kian);
+}
+
+void kvm_register_irq_ack_notifier(struct kvm *kvm,
+				   struct kvm_irq_ack_notifier *kian)
+{
+	hlist_add_head(&kian->link, &kvm->arch.irq_ack_notifier_list);
+}
+
+void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
+				     struct kvm_irq_ack_notifier *kian)
+{
+	hlist_del(&kian->link);
+}
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 07ff2ae..95fe718 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -83,6 +83,11 @@ static inline int irqchip_in_kernel(struct kvm *kvm)
 void kvm_pic_reset(struct kvm_kpic_state *s);
 
 void kvm_set_irq(struct kvm *kvm, int irq, int level);
+void kvm_notify_acked_irq(struct kvm *kvm, unsigned gsi);
+void kvm_register_irq_ack_notifier(struct kvm *kvm,
+				   struct kvm_irq_ack_notifier *kian);
+void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
+				     struct kvm_irq_ack_notifier *kian);
 
 void kvm_timer_intr_post(struct kvm_vcpu *vcpu, int vec);
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 4a47859..e2864e6 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -319,6 +319,12 @@ struct kvm_mem_alias {
 	gfn_t target_gfn;
 };
 
+struct kvm_irq_ack_notifier {
+	struct hlist_node link;
+	unsigned gsi;
+	void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
+};
+
 struct kvm_arch{
 	int naliases;
 	struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS];
@@ -334,6 +340,7 @@ struct kvm_arch{
 	struct kvm_pic *vpic;
 	struct kvm_ioapic *vioapic;
 	struct kvm_pit *vpit;
+	struct hlist_head irq_ack_notifier_list;
 
 	int round_robin_prev_vcpu;
 	unsigned int tss_addr;
-- 
1.5.6
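
For readers following along, here is a rough userspace sketch of the
notifier list added by the patch above. It assumes a plain singly
linked list in place of the kernel hlist, and the names used here
(ack_notifier, notifier_list, count_ack, the hits counter) are made up
for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct kvm_irq_ack_notifier. The hits field exists only
 * so the example callback has something observable to update.
 */
struct ack_notifier {
	struct ack_notifier *next;
	unsigned gsi;
	void (*irq_acked)(struct ack_notifier *n);
	int hits;
};

struct notifier_list {
	struct ack_notifier *head;
};

/* Add at the head, as hlist_add_head() does in the patch. */
void register_ack_notifier(struct notifier_list *l, struct ack_notifier *n)
{
	n->next = l->head;
	l->head = n;
}

/* Unlink n if it is on the list; a no-op otherwise. */
void unregister_ack_notifier(struct notifier_list *l, struct ack_notifier *n)
{
	struct ack_notifier **pp;

	for (pp = &l->head; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return;
		}
	}
}

/* Call every notifier registered for this gsi. */
void notify_acked_irq(struct notifier_list *l, unsigned gsi)
{
	struct ack_notifier *n;

	for (n = l->head; n; n = n->next)
		if (n->gsi == gsi)
			n->irq_acked(n);
}

/* Example callback: just count the acks. */
void count_ack(struct ack_notifier *n)
{
	n->hits++;
}
```

Matching on gsi inside the walk keeps registration trivial, at the
cost of visiting every notifier on each ack; that seems reasonable
while the list is expected to stay short.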



* [PATCH 2/4] KVM: pci device assignment
  2008-07-22 12:13 ` [PATCH 1/4] KVM: Add irq ack notifier list Ben-Ami Yassour
@ 2008-07-22 12:13   ` Ben-Ami Yassour
  2008-07-22 12:13     ` [PATCH 3/4] VT-d: changes to support KVM Ben-Ami Yassour
                       ` (3 more replies)
  2008-07-26  8:19   ` [PATCH 1/4] KVM: Add irq ack notifier list Avi Kivity
  1 sibling, 4 replies; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-22 12:13 UTC
  To: avi; +Cc: amit.shah, kvm, muli, benami, weidong.han, anthony

Based on a patch from: Amit Shah <amit.shah@qumranet.com>

This patch adds support for handling PCI devices that are assigned to
the guest.

The device to be assigned to the guest is registered in the host kernel
and interrupt delivery is handled. If a device is already assigned, or
the device driver for it is still loaded on the host, the device
assignment fails and -EBUSY is returned to userspace.

Devices that share their interrupt line are not supported at the moment.

By itself, this patch will not make devices work within the guest.
The VT-d extension is required to enable the device to perform DMA.
Another alternative is PVDMA.

Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
---
 arch/x86/kvm/i8259.c       |    5 +-
 arch/x86/kvm/irq.c         |    2 +-
 arch/x86/kvm/irq.h         |    3 +-
 arch/x86/kvm/x86.c         |  215 ++++++++++++++++++++++++++++++++++++++++++++
 include/asm-x86/kvm.h      |    1 +
 include/asm-x86/kvm_host.h |   20 ++++
 include/asm-x86/kvm_para.h |    1 -
 include/linux/kvm.h        |   21 +++++
 virt/kvm/ioapic.c          |    4 +
 virt/kvm/ioapic.h          |    1 +
 10 files changed, 269 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 55e179a..d6793f0 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -159,9 +159,10 @@ static inline void pic_intack(struct kvm_kpic_state *s, int irq)
 		s->irr &= ~(1 << irq);
 }
 
-int kvm_pic_read_irq(struct kvm_pic *s)
+int kvm_pic_read_irq(struct kvm *kvm)
 {
 	int irq, irq2, intno;
+	struct kvm_pic *s = pic_irqchip(kvm);
 
 	irq = pic_get_irq(&s->pics[0]);
 	if (irq >= 0) {
@@ -186,6 +187,8 @@ int kvm_pic_read_irq(struct kvm_pic *s)
 		irq = 7;
 		intno = s->pics[0].irq_base + irq;
 	}
+	kvm_notify_acked_irq(kvm, irq);
+
 	pic_update_irq(s);
 
 	return intno;
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 9091195..3c508af 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -72,7 +72,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
 		if (kvm_apic_accept_pic_intr(v)) {
 			s = pic_irqchip(v->kvm);
 			s->output = 0;		/* PIC */
-			vector = kvm_pic_read_irq(s);
+			vector = kvm_pic_read_irq(v->kvm);
 		}
 	}
 	return vector;
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 95fe718..479a3d2 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -63,11 +63,12 @@ struct kvm_pic {
 	void *irq_request_opaque;
 	int output;		/* intr from master PIC */
 	struct kvm_io_device dev;
+	void (*ack_notifier)(void *opaque, int irq);
 };
 
 struct kvm_pic *kvm_create_pic(struct kvm *kvm);
 void kvm_pic_set_irq(void *opaque, int irq, int level);
-int kvm_pic_read_irq(struct kvm_pic *s);
+int kvm_pic_read_irq(struct kvm *kvm);
 void kvm_pic_update_irq(struct kvm_pic *s);
 
 static inline struct kvm_pic *pic_irqchip(struct kvm *kvm)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 376ef73..d9aa931 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4,10 +4,12 @@
  * derived from drivers/kvm/kvm_main.c
  *
  * Copyright (C) 2006 Qumranet, Inc.
+ * Copyright (C) 2008 Qumranet, Inc.
  *
  * Authors:
  *   Avi Kivity   <avi@qumranet.com>
  *   Yaniv Kamay  <yaniv@qumranet.com>
+ *   Amit Shah    <amit.shah@qumranet.com>
  *
  * This work is licensed under the terms of the GNU GPL, version 2.  See
  * the COPYING file in the top-level directory.
@@ -23,8 +25,10 @@
 #include "x86.h"
 
 #include <linux/clocksource.h>
+#include <linux/interrupt.h>
 #include <linux/kvm.h>
 #include <linux/fs.h>
+#include <linux/pci.h>
 #include <linux/vmalloc.h>
 #include <linux/module.h>
 #include <linux/mman.h>
@@ -98,6 +102,204 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ NULL }
 };
 
+struct kvm_assigned_dev_kernel
+*kvm_find_assigned_dev(struct list_head *head,
+		     struct kvm_assigned_dev_info *assigned_dev_info)
+{
+	struct list_head *ptr;
+	struct kvm_assigned_dev_kernel *match;
+
+	list_for_each(ptr, head) {
+		match = list_entry(ptr, struct kvm_assigned_dev_kernel, list);
+		if ((match->host.busnr == assigned_dev_info->busnr) &&
+		    (match->host.devfn == assigned_dev_info->devfn))
+			return match;
+	}
+	return NULL;
+}
+
+static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
+{
+	struct kvm_assigned_dev_work *int_work;
+
+	int_work = container_of(work, struct kvm_assigned_dev_work, work);
+
+	/* This is taken to safely inject irq inside the guest. When
+	 * the interrupt injection (or the ioapic code) uses a
+	 * finer-grained lock, update this
+	 */
+	mutex_lock(&int_work->assigned_dev->kvm->lock);
+	kvm_set_irq(int_work->assigned_dev->kvm,
+		    int_work->assigned_dev->guest.irq[0], 1);
+	mutex_unlock(&int_work->assigned_dev->kvm->lock);
+	kvm_put_kvm(int_work->assigned_dev->kvm);
+}
+
+/* FIXME: Implement the OR logic needed to make shared interrupts on
+ * this line behave properly
+ */
+static irqreturn_t kvm_assigned_dev_intr(int irq, void *dev_id)
+{
+	struct kvm_assigned_dev_kernel *assigned_dev =
+		(struct kvm_assigned_dev_kernel *) dev_id;
+
+	kvm_get_kvm(assigned_dev->kvm);
+	schedule_work(&assigned_dev->int_work.work);
+	disable_irq_nosync(irq);
+	return IRQ_HANDLED;
+}
+
+/* Ack the irq line for an assigned device */
+static void kvm_assigned_dev_ack_irq(struct kvm_irq_ack_notifier *kian)
+{
+	struct kvm_assigned_dev_kernel *dev;
+
+	if (kian->gsi == -1)
+		return;
+
+	dev = container_of(kian, struct kvm_assigned_dev_kernel,
+			   ack_notifier);
+	kvm_set_irq(dev->kvm, dev->guest.irq[0], 0);
+	enable_irq(dev->host.irq[0]);
+}
+
+static int
+kvm_vm_ioctl_device_assignment(struct kvm *kvm,
+			       struct kvm_assigned_dev *assigned_dev)
+{
+	int r = 0;
+	struct kvm_assigned_dev_kernel *match;
+	struct pci_dev *dev;
+
+	if (assigned_dev->host.num_valid_irqs != 1) {
+		printk(KERN_INFO "%s: Unsupported number of irqs %d\n",
+		       __func__, assigned_dev->host.num_valid_irqs);
+		return -EINVAL;
+	}
+
+	mutex_lock(&kvm->lock);
+
+	/* Check if this is a request to update the irq of the device
+	 * in the guest (BIOS/ kernels can dynamically reprogram irq
+	 * numbers).  This also protects us from adding the same
+	 * device twice.
+	 */
+	match = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
+				      &assigned_dev->host);
+	if (match) {
+		match->guest.irq[0] = assigned_dev->guest.irq[0];
+		match->ack_notifier.gsi = assigned_dev->guest.irq[0];
+		goto out;
+	}
+
+	match = kzalloc(sizeof(struct kvm_assigned_dev_kernel), GFP_KERNEL);
+	if (match == NULL) {
+		printk(KERN_INFO "%s: Couldn't allocate memory\n",
+		       __func__);
+		r = -ENOMEM;
+		goto out;
+	}
+	dev = pci_get_bus_and_slot(assigned_dev->host.busnr,
+				   assigned_dev->host.devfn);
+	if (!dev) {
+		printk(KERN_INFO "%s: host device not found\n", __func__);
+		r = -EINVAL;
+		goto out_free;
+	}
+	if (pci_enable_device(dev)) {
+		printk(KERN_INFO "%s: Could not enable PCI device\n", __func__);
+		r = -EBUSY;
+		goto out_put;
+	}
+	r = pci_request_regions(dev, "kvm_assigned_device");
+	if (r) {
+		printk(KERN_INFO "%s: Could not get access to device regions\n",
+		       __func__);
+		goto out_disable;
+	}
+	match->guest.busnr = assigned_dev->guest.busnr;
+	match->guest.devfn = assigned_dev->guest.devfn;
+	match->host.busnr = assigned_dev->host.busnr;
+	match->host.devfn = assigned_dev->host.devfn;
+	match->dev = dev;
+
+	INIT_WORK(&match->int_work.work,
+		  kvm_assigned_dev_interrupt_work_handler);
+
+	match->kvm = kvm;
+	match->int_work.assigned_dev = match;
+
+	list_add(&match->list, &kvm->arch.assigned_dev_head);
+
+	if (irqchip_in_kernel(kvm)) {
+		match->guest.irq[0] = assigned_dev->guest.irq[0];
+		match->host.irq[0] = dev->irq;
+		match->ack_notifier.gsi = assigned_dev->guest.irq[0];
+		match->ack_notifier.irq_acked = kvm_assigned_dev_ack_irq;
+		kvm_register_irq_ack_notifier(kvm, &match->ack_notifier);
+
+		/* Even though this is PCI, we don't want to use shared
+		 * interrupts. Sharing host devices with guest-assigned devices
+		 * on the same interrupt line is not a happy situation: there
+		 * are going to be long delays in accepting, acking, etc.
+		 */
+		if (request_irq(dev->irq, kvm_assigned_dev_intr, 0,
+				"kvm_assigned_device", (void *)match)) {
+			printk(KERN_INFO "%s: couldn't allocate irq for pv "
+			       "device\n", __func__);
+			r = -EIO;
+			goto out_list_del;
+		}
+	}
+
+
+out:
+	mutex_unlock(&kvm->lock);
+	return r;
+out_list_del:
+	list_del(&match->list);
+	pci_release_regions(dev);
+out_disable:
+	pci_disable_device(dev);
+out_put:
+	pci_dev_put(dev);
+out_free:
+	kfree(match);
+	mutex_unlock(&kvm->lock);
+	return r;
+}
+
+static void kvm_free_assigned_devices(struct kvm *kvm)
+{
+	struct list_head *ptr, *ptr2;
+	struct kvm_assigned_dev_kernel *assigned_dev;
+
+	list_for_each_safe(ptr, ptr2, &kvm->arch.assigned_dev_head) {
+		assigned_dev = list_entry(ptr,
+					  struct kvm_assigned_dev_kernel,
+					  list);
+
+		if (irqchip_in_kernel(kvm) && assigned_dev->host.irq[0])
+			free_irq(assigned_dev->host.irq[0],
+				 (void *)assigned_dev);
+
+		kvm_unregister_irq_ack_notifier(kvm,
+						&assigned_dev->ack_notifier);
+
+		if (cancel_work_sync(&assigned_dev->int_work.work))
+			/* We had pending work. That means we will have to take
+			 * care of kvm_put_kvm.
+			 */
+			kvm_put_kvm(kvm);
+
+		pci_release_regions(assigned_dev->dev);
+		pci_disable_device(assigned_dev->dev);
+		pci_dev_put(assigned_dev->dev);
+
+		list_del(&assigned_dev->list);
+		kfree(assigned_dev);
+	}
+}
 
 unsigned long segment_base(u16 selector)
 {
@@ -1746,6 +1948,17 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_UPDATE_ASSIGNED_DEVICE: {
+		struct kvm_assigned_dev assigned_dev;
+
+		r = -EFAULT;
+		if (copy_from_user(&assigned_dev, argp, sizeof assigned_dev))
+			goto out;
+		r = kvm_vm_ioctl_device_assignment(kvm, &assigned_dev);
+		if (r)
+			goto out;
+		break;
+	}
 	case KVM_GET_PIT: {
 		struct kvm_pit_state ps;
 		r = -EFAULT;
@@ -3925,6 +4138,7 @@ struct  kvm *kvm_arch_create_vm(void)
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
+	INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
 
 	return kvm;
 }
@@ -3957,6 +4171,7 @@ static void kvm_free_vcpus(struct kvm *kvm)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+	kvm_free_assigned_devices(kvm);
 	kvm_free_pit(kvm);
 	kfree(kvm->arch.vpic);
 	kfree(kvm->arch.vioapic);
diff --git a/include/asm-x86/kvm.h b/include/asm-x86/kvm.h
index 8f13749..12b4b25 100644
--- a/include/asm-x86/kvm.h
+++ b/include/asm-x86/kvm.h
@@ -208,4 +208,5 @@ struct kvm_pit_channel_state {
 struct kvm_pit_state {
 	struct kvm_pit_channel_state channels[3];
 };
+
 #endif
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index e2864e6..34eb3e7 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -325,6 +325,25 @@ struct kvm_irq_ack_notifier {
 	void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
 };
 
+/* For assigned devices, we schedule work in the system workqueue to
+ * inject interrupts into the guest when an interrupt occurs on the
+ * physical device and also when the guest acks the interrupt.
+ */
+struct kvm_assigned_dev_work {
+	struct work_struct work;
+	struct kvm_assigned_dev_kernel *assigned_dev;
+};
+
+struct kvm_assigned_dev_kernel {
+	struct kvm_irq_ack_notifier ack_notifier;
+	struct list_head list;
+	struct kvm_assigned_dev_info guest;
+	struct kvm_assigned_dev_info host;
+	struct kvm_assigned_dev_work int_work;
+	struct pci_dev *dev;
+	struct kvm *kvm;
+};
+
 struct kvm_arch{
 	int naliases;
 	struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS];
@@ -337,6 +356,7 @@ struct kvm_arch{
 	 * Hash table of struct kvm_mmu_page.
 	 */
 	struct list_head active_mmu_pages;
+	struct list_head assigned_dev_head;
 	struct kvm_pic *vpic;
 	struct kvm_ioapic *vioapic;
 	struct kvm_pit *vpit;
diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
index 76f3921..3aa1731 100644
--- a/include/asm-x86/kvm_para.h
+++ b/include/asm-x86/kvm_para.h
@@ -143,5 +143,4 @@ static inline unsigned int kvm_arch_para_features(void)
 }
 
 #endif
-
 #endif
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 6edba45..c436c08 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -382,6 +382,7 @@ struct kvm_trace_rec {
 #define KVM_CAP_PV_MMU 13
 #define KVM_CAP_MP_STATE 14
 #define KVM_CAP_COALESCED_MMIO 15
+#define KVM_CAP_DEVICE_ASSIGNMENT 16
 
 /*
  * ioctls for VM fds
@@ -411,6 +412,8 @@ struct kvm_trace_rec {
 			_IOW(KVMIO,  0x67, struct kvm_coalesced_mmio_zone)
 #define KVM_UNREGISTER_COALESCED_MMIO \
 			_IOW(KVMIO,  0x68, struct kvm_coalesced_mmio_zone)
+#define KVM_UPDATE_ASSIGNED_DEVICE _IOR(KVMIO, 0x69,		\
+					struct kvm_assigned_dev)
 
 /*
  * ioctls for vcpu fds
@@ -475,4 +478,22 @@ struct kvm_trace_rec {
 #define KVM_TRC_STLB_INVAL       (KVM_TRC_HANDLER + 0x18)
 #define KVM_TRC_PPC_INSTR        (KVM_TRC_HANDLER + 0x19)
 
+#define ASSIGNED_DEV_MAX_IRQ 16
+
+/* Stores information for identifying host PCI devices assigned to the
+ * guest: this is used in the host kernel and in the userspace.
+ */
+struct kvm_assigned_dev_info {
+	__u32 busnr;
+	__u32 devfn;
+	__u32 irq[ASSIGNED_DEV_MAX_IRQ];
+	__u32 num_valid_irqs; /* currently only 1 is supported */
+};
+
+/* Mapping between host and guest PCI device */
+struct kvm_assigned_dev {
+	struct kvm_assigned_dev_info guest;
+	struct kvm_assigned_dev_info host;
+};
+
 #endif
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index c0d2287..5d68d0b 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -39,6 +39,7 @@
 
 #include "ioapic.h"
 #include "lapic.h"
+#include "irq.h"
 
 #if 0
 #define ioapic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg)
@@ -293,6 +294,9 @@ static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int gsi)
 	ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
 
 	ent->fields.remote_irr = 0;
+
+	kvm_notify_acked_irq(ioapic->kvm, gsi);
+
 	if (!ent->fields.mask && (ioapic->irr & (1 << gsi)))
 		ioapic_service(ioapic, gsi);
 }
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 7f16675..a42743f 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -58,6 +58,7 @@ struct kvm_ioapic {
 	} redirtbl[IOAPIC_NUM_PINS];
 	struct kvm_io_device dev;
 	struct kvm *kvm;
+	void (*ack_notifier)(void *opaque, int irq);
 };
 
 #ifdef DEBUG
-- 
1.5.6
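
The interrupt flow in the patch above can be pictured with a small toy
model. This is only a sketch: the struct and function names are
hypothetical, and the deferred work item that performs the actual
injection is folded into the interrupt step for brevity. In the patch
itself the corresponding calls are disable_irq_nosync(), kvm_set_irq()
and enable_irq().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a single assigned device's interrupt line. */
struct toy_line {
	bool host_irq_enabled;	/* is the host line unmasked? */
	int guest_level;	/* level presented to the guest irqchip */
};

/* Host interrupt fires: mask the host line and raise the guest line.
 * Returns false when the line is masked and nothing is delivered.
 */
bool toy_host_interrupt(struct toy_line *l)
{
	if (!l->host_irq_enabled)
		return false;
	l->host_irq_enabled = false;
	l->guest_level = 1;
	return true;
}

/* Guest acks the interrupt: lower the guest line, unmask the host line. */
void toy_guest_ack(struct toy_line *l)
{
	l->guest_level = 0;
	l->host_irq_enabled = true;
}
```

Keeping the host line masked until the guest acks means a
level-triggered device cannot re-raise the interrupt on the host while
the guest is still servicing it.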



* [PATCH 3/4] VT-d: changes to support KVM
  2008-07-22 12:13   ` [PATCH 2/4] KVM: pci device assignment Ben-Ami Yassour
@ 2008-07-22 12:13     ` Ben-Ami Yassour
  2008-07-22 12:13       ` [PATCH 4/4] KVM: Device assignemnt with VT-d Ben-Ami Yassour
  2008-07-26  8:46     ` [PATCH 2/4] KVM: pci device assignment Avi Kivity
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-22 12:13 UTC
  To: avi; +Cc: amit.shah, kvm, muli, benami, weidong.han, anthony, Kay, Allen M

From: Kay, Allen M <allen.m.kay@intel.com>

This patch extends the VT-d driver to support KVM.

[Ben: fixed memory pinning]

Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
---
 drivers/pci/dmar.c                           |    4 +-
 drivers/pci/intel-iommu.c                    |  117 +++++++++++++++++++++++++-
 drivers/pci/iova.c                           |    2 +-
 {drivers/pci => include/linux}/intel-iommu.h |   11 +++
 {drivers/pci => include/linux}/iova.h        |    0 
 5 files changed, 127 insertions(+), 7 deletions(-)
 rename {drivers/pci => include/linux}/intel-iommu.h (94%)
 rename {drivers/pci => include/linux}/iova.h (100%)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index f941f60..a58a5b0 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -26,8 +26,8 @@
 
 #include <linux/pci.h>
 #include <linux/dmar.h>
-#include "iova.h"
-#include "intel-iommu.h"
+#include <linux/iova.h>
+#include <linux/intel-iommu.h>
 
 #undef PREFIX
 #define PREFIX "DMAR:"
diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 3f7b81c..212bc30 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -20,6 +20,7 @@
  * Author: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
  */
 
+#undef DEBUG
 #include <linux/init.h>
 #include <linux/bitmap.h>
 #include <linux/debugfs.h>
@@ -33,8 +34,8 @@
 #include <linux/dma-mapping.h>
 #include <linux/mempool.h>
 #include <linux/timer.h>
-#include "iova.h"
-#include "intel-iommu.h"
+#include <linux/iova.h>
+#include <linux/intel-iommu.h>
 #include <asm/proto.h> /* force_iommu in this header in x86-64*/
 #include <asm/cacheflush.h>
 #include <asm/gart.h>
@@ -160,7 +161,7 @@ static inline void *alloc_domain_mem(void)
 	return iommu_kmem_cache_alloc(iommu_domain_cache);
 }
 
-static inline void free_domain_mem(void *vaddr)
+static void free_domain_mem(void *vaddr)
 {
 	kmem_cache_free(iommu_domain_cache, vaddr);
 }
@@ -1414,7 +1415,7 @@ static void domain_remove_dev_info(struct dmar_domain *domain)
  * find_domain
  * Note: we use struct pci_dev->dev.archdata.iommu stores the info
  */
-struct dmar_domain *
+static struct dmar_domain *
 find_domain(struct pci_dev *pdev)
 {
 	struct device_domain_info *info;
@@ -2430,3 +2431,111 @@ int __init intel_iommu_init(void)
 	return 0;
 }
 
+void intel_iommu_domain_exit(struct dmar_domain *domain)
+{
+	u64 end;
+
+	/* Domain 0 is reserved, so don't process it */
+	if (!domain)
+		return;
+
+	end = DOMAIN_MAX_ADDR(domain->gaw);
+	end = end & (~PAGE_MASK_4K);
+
+	/* clear ptes */
+	dma_pte_clear_range(domain, 0, end);
+
+	/* free page tables */
+	dma_pte_free_pagetable(domain, 0, end);
+
+	iommu_free_domain(domain);
+	free_domain_mem(domain);
+}
+EXPORT_SYMBOL_GPL(intel_iommu_domain_exit);
+
+struct dmar_domain *intel_iommu_domain_alloc(struct pci_dev *pdev)
+{
+	struct dmar_drhd_unit *drhd;
+	struct dmar_domain *domain;
+	struct intel_iommu *iommu;
+
+	drhd = dmar_find_matched_drhd_unit(pdev);
+	if (!drhd) {
+		printk(KERN_ERR "intel_iommu_domain_alloc: drhd == NULL\n");
+		return NULL;
+	}
+
+	iommu = drhd->iommu;
+	if (!iommu) {
+		printk(KERN_ERR
+			"intel_iommu_domain_alloc: iommu == NULL\n");
+		return NULL;
+	}
+	domain = iommu_alloc_domain(iommu);
+	if (!domain) {
+		printk(KERN_ERR
+			"intel_iommu_domain_alloc: domain == NULL\n");
+		return NULL;
+	}
+	if (domain_init(domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+		printk(KERN_ERR
+			"intel_iommu_domain_alloc: domain_init() failed\n");
+		intel_iommu_domain_exit(domain);
+		return NULL;
+	}
+	return domain;
+}
+EXPORT_SYMBOL_GPL(intel_iommu_domain_alloc);
+
+int intel_iommu_context_mapping(
+	struct dmar_domain *domain, struct pci_dev *pdev)
+{
+	int rc;
+	rc = domain_context_mapping(domain, pdev);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(intel_iommu_context_mapping);
+
+int intel_iommu_page_mapping(
+	struct dmar_domain *domain, dma_addr_t iova,
+	u64 hpa, size_t size, int prot)
+{
+	int rc;
+	rc = domain_page_mapping(domain, iova, hpa, size, prot);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(intel_iommu_page_mapping);
+
+void intel_iommu_detach_dev(struct dmar_domain *domain, u8 bus, u8 devfn)
+{
+	detach_domain_for_dev(domain, bus, devfn);
+}
+EXPORT_SYMBOL_GPL(intel_iommu_detach_dev);
+
+struct dmar_domain *
+intel_iommu_find_domain(struct pci_dev *pdev)
+{
+	return find_domain(pdev);
+}
+EXPORT_SYMBOL_GPL(intel_iommu_find_domain);
+
+int intel_iommu_found(void)
+{
+	return g_num_of_iommus;
+}
+EXPORT_SYMBOL_GPL(intel_iommu_found);
+
+u64 intel_iommu_iova_to_pfn(struct dmar_domain *domain, u64 iova)
+{
+	struct dma_pte *pte;
+	u64 pfn;
+
+	pfn = 0;
+	pte = addr_to_dma_pte(domain, iova);
+
+	if (pte)
+		pfn = dma_pte_addr(*pte);
+
+	return pfn >> PAGE_SHIFT_4K;
+}
+EXPORT_SYMBOL_GPL(intel_iommu_iova_to_pfn);
diff --git a/drivers/pci/iova.c b/drivers/pci/iova.c
index 3ef4ac0..2287116 100644
--- a/drivers/pci/iova.c
+++ b/drivers/pci/iova.c
@@ -7,7 +7,7 @@
  * Author: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
  */
 
-#include "iova.h"
+#include <linux/iova.h>
 
 void
 init_iova_domain(struct iova_domain *iovad, unsigned long pfn_32bit)
diff --git a/drivers/pci/intel-iommu.h b/include/linux/intel-iommu.h
similarity index 94%
rename from drivers/pci/intel-iommu.h
rename to include/linux/intel-iommu.h
index afc0ad9..1490fc0 100644
--- a/drivers/pci/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -341,4 +341,15 @@ static inline void iommu_prepare_gfx_mapping(void)
 }
 #endif /* !CONFIG_DMAR_GFX_WA */
 
+void intel_iommu_domain_exit(struct dmar_domain *domain);
+struct dmar_domain *intel_iommu_domain_alloc(struct pci_dev *pdev);
+int intel_iommu_context_mapping(struct dmar_domain *domain,
+				struct pci_dev *pdev);
+int intel_iommu_page_mapping(struct dmar_domain *domain, dma_addr_t iova,
+			     u64 hpa, size_t size, int prot);
+void intel_iommu_detach_dev(struct dmar_domain *domain, u8 bus, u8 devfn);
+struct dmar_domain *intel_iommu_find_domain(struct pci_dev *pdev);
+int intel_iommu_found(void);
+u64 intel_iommu_iova_to_pfn(struct dmar_domain *domain, u64 iova);
+
 #endif
diff --git a/drivers/pci/iova.h b/include/linux/iova.h
similarity index 100%
rename from drivers/pci/iova.h
rename to include/linux/iova.h
-- 
1.5.6
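
As a rough picture of what the exported intel_iommu_page_mapping() and
intel_iommu_iova_to_pfn() pair provides, the sketch below uses a toy
flat table. This is an assumption made for brevity: real VT-d hardware
walks multi-level page tables, and every toy_* name here is
hypothetical. Only the 4K page shift and the "return 0 when nothing is
mapped" behaviour are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* 4K pages, as PAGE_SHIFT_4K in the patch */
#define TOY_NPAGES 64		/* arbitrary size of the toy address space */

/* Toy translation table: one slot per iova page. */
struct toy_domain {
	uint64_t hpa[TOY_NPAGES];	/* page-aligned host physical address */
	uint8_t present[TOY_NPAGES];
};

/* Map one 4K page. Returns 0 on success, -1 if iova is out of range. */
int toy_page_mapping(struct toy_domain *d, uint64_t iova, uint64_t hpa)
{
	uint64_t idx = iova >> TOY_PAGE_SHIFT;

	if (idx >= TOY_NPAGES)
		return -1;
	d->hpa[idx] = hpa & ~((1ULL << TOY_PAGE_SHIFT) - 1);
	d->present[idx] = 1;
	return 0;
}

/* Return the host pfn behind an iova, or 0 when nothing is mapped. */
uint64_t toy_iova_to_pfn(const struct toy_domain *d, uint64_t iova)
{
	uint64_t idx = iova >> TOY_PAGE_SHIFT;

	if (idx >= TOY_NPAGES || !d->present[idx])
		return 0;
	return d->hpa[idx] >> TOY_PAGE_SHIFT;
}
```

Patch 4/4 uses gfn << PAGE_SHIFT as the iova, so in that setup the
guest physical address doubles as the I/O virtual address seen by the
device.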



* [PATCH 4/4] KVM: Device assignemnt with VT-d
  2008-07-22 12:13     ` [PATCH 3/4] VT-d: changes to support KVM Ben-Ami Yassour
@ 2008-07-22 12:13       ` Ben-Ami Yassour
  2008-07-22 12:18         ` device assignment - userspace part Ben-Ami Yassour
                           ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-22 12:13 UTC
  To: avi; +Cc: amit.shah, kvm, muli, benami, weidong.han, anthony, Kay, Allen M

From: Kay, Allen M <allen.m.kay@intel.com>

This patch includes the functions to support VT-d for passthrough
devices.

[Ben: fixed memory pinning, cleanup]

Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
---
 arch/x86/kvm/Makefile      |    2 +-
 arch/x86/kvm/vtd.c         |  182 ++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c         |   11 +++
 include/asm-x86/kvm_host.h |    3 +
 include/linux/kvm_host.h   |    6 ++
 virt/kvm/kvm_main.c        |    8 ++-
 6 files changed, 210 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/kvm/vtd.c

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index d0e940b..5d9d079 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -11,7 +11,7 @@ endif
 EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
 
 kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o \
-	i8254.o
+	i8254.o vtd.o
 obj-$(CONFIG_KVM) += kvm.o
 kvm-intel-objs = vmx.o
 obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
new file mode 100644
index 0000000..7a3cf4e
--- /dev/null
+++ b/arch/x86/kvm/vtd.c
@@ -0,0 +1,182 @@
+/*
+ * Copyright (c) 2006, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * Copyright (C) 2006-2008 Intel Corporation
+ * Author: Allen M. Kay <allen.m.kay@intel.com>
+ * Author: Weidong Han <weidong.han@intel.com>
+ */
+
+#include <linux/list.h>
+#include <linux/kvm_host.h>
+#include <linux/pci.h>
+#include <linux/dmar.h>
+#include <linux/intel-iommu.h>
+
+static int kvm_iommu_unmap_memslots(struct kvm *kvm);
+
+int kvm_iommu_map_pages(struct kvm *kvm,
+			gfn_t base_gfn, unsigned long npages)
+{
+	gfn_t gfn = base_gfn;
+	pfn_t pfn;
+	int i, rc;
+	struct dmar_domain *domain = kvm->arch.intel_iommu_domain;
+
+	if (!domain)
+		return -EFAULT;
+
+	for (i = 0; i < npages; i++) {
+		pfn = gfn_to_pfn(kvm, gfn);
+		if (!is_mmio_pfn(pfn)) {
+			rc = intel_iommu_page_mapping(domain,
+						      gfn << PAGE_SHIFT,
+						      pfn << PAGE_SHIFT,
+						      PAGE_SIZE,
+						      DMA_PTE_READ |
+						      DMA_PTE_WRITE);
+			if (rc)
+				kvm_release_pfn_clean(pfn);
+		} else {
+			printk(KERN_DEBUG "kvm_iommu_map_page:"
+			       "invalid pfn=%lx\n", pfn);
+			return 0;
+		}
+
+		gfn++;
+	}
+	return 0;
+}
+
+static int kvm_iommu_map_memslots(struct kvm *kvm)
+{
+	int i, rc;
+	for (i = 0; i < kvm->nmemslots; i++) {
+		rc = kvm_iommu_map_pages(kvm, kvm->memslots[i].base_gfn,
+					 kvm->memslots[i].npages);
+		if (rc)
+			return rc;
+	}
+	return 0;
+}
+
+int kvm_iommu_map_guest(struct kvm *kvm,
+			struct kvm_assigned_dev *assigned_dev)
+{
+	struct pci_dev *pdev = NULL;
+
+	printk(KERN_DEBUG "VT-d direct map: host bdf = %x:%x:%x\n",
+	       assigned_dev->host.busnr,
+	       PCI_SLOT(assigned_dev->host.devfn),
+	       PCI_FUNC(assigned_dev->host.devfn));
+
+	for_each_pci_dev(pdev) {
+		if ((pdev->bus->number == assigned_dev->host.busnr) &&
+		    (pdev->devfn == assigned_dev->host.devfn)) {
+			break;
+		}
+	}
+
+	if (pdev == NULL) {
+		if (kvm->arch.intel_iommu_domain) {
+			intel_iommu_domain_exit(kvm->arch.intel_iommu_domain);
+			kvm->arch.intel_iommu_domain = NULL;
+		}
+		return -ENODEV;
+	}
+
+	kvm->arch.intel_iommu_domain = intel_iommu_domain_alloc(pdev);
+
+	if (kvm_iommu_map_memslots(kvm)) {
+		kvm_iommu_unmap_memslots(kvm);
+		return -EFAULT;
+	}
+
+	intel_iommu_detach_dev(kvm->arch.intel_iommu_domain,
+			       pdev->bus->number, pdev->devfn);
+
+	if (intel_iommu_context_mapping(kvm->arch.intel_iommu_domain,
+					pdev)) {
+		printk(KERN_ERR "Domain context map for %s failed",
+		       pci_name(pdev));
+		return -EFAULT;
+	}
+	return 0;
+}
+
+static int kvm_iommu_put_pages(struct kvm *kvm,
+			       gfn_t base_gfn, unsigned long npages)
+{
+	gfn_t gfn = base_gfn;
+	pfn_t pfn;
+	struct dmar_domain *domain = kvm->arch.intel_iommu_domain;
+	int i;
+
+	if (!domain)
+		return -EFAULT;
+
+	for (i = 0; i < npages; i++) {
+		pfn = (pfn_t)intel_iommu_iova_to_pfn(domain,
+						     gfn << PAGE_SHIFT);
+		kvm_release_pfn_clean(pfn);
+		gfn++;
+	}
+	return 0;
+}
+
+static int kvm_iommu_unmap_memslots(struct kvm *kvm)
+{
+	int i, rc;
+	for (i = 0; i < kvm->nmemslots; i++) {
+		rc = kvm_iommu_put_pages(kvm, kvm->memslots[i].base_gfn,
+					 kvm->memslots[i].npages);
+		if (rc)
+			return rc;
+	}
+	return 0;
+}
+
+int kvm_iommu_unmap_guest(struct kvm *kvm)
+{
+	struct kvm_assigned_dev_kernel *entry;
+	struct pci_dev *pdev = NULL;
+	struct dmar_domain *domain = kvm->arch.intel_iommu_domain;
+
+	if (!domain)
+		return 0;
+
+	list_for_each_entry(entry, &kvm->arch.assigned_dev_head, list) {
+		printk(KERN_DEBUG "VT-d unmap: host bdf = %x:%x:%x\n",
+		       entry->host.busnr,
+		       PCI_SLOT(entry->host.devfn),
+		       PCI_FUNC(entry->host.devfn));
+
+		for_each_pci_dev(pdev) {
+			if ((pdev->bus->number == entry->host.busnr) &&
+			    (pdev->devfn == entry->host.devfn))
+				break;
+		}
+
+		if (pdev == NULL)
+			return -ENODEV;
+
+		/* detach kvm dmar domain */
+		intel_iommu_detach_dev(domain,
+				       pdev->bus->number, pdev->devfn);
+	}
+	kvm_iommu_unmap_memslots(kvm);
+	intel_iommu_domain_exit(domain);
+	return 0;
+}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d9aa931..e57c9e4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -33,6 +33,7 @@
 #include <linux/module.h>
 #include <linux/mman.h>
 #include <linux/highmem.h>
+#include <linux/intel-iommu.h>
 
 #include <asm/uaccess.h>
 #include <asm/msr.h>
@@ -253,9 +254,17 @@ kvm_vm_ioctl_device_assignment(struct kvm *kvm,
 	}
 
 
+	if (intel_iommu_found()) {
+		r = kvm_iommu_map_guest(kvm, assigned_dev);
+		if (r)
+			goto out_intr;
+	}
+
 out:
 	mutex_unlock(&kvm->lock);
 	return r;
+out_intr:
+	free_irq(dev->irq, match);
 out_list_del:
 	list_del(&match->list);
 	pci_release_regions(dev);
@@ -4171,6 +4180,8 @@ static void kvm_free_vcpus(struct kvm *kvm)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+	if (intel_iommu_found())
+		kvm_iommu_unmap_guest(kvm);
 	kvm_free_assigned_devices(kvm);
 	kvm_free_pit(kvm);
 	kfree(kvm->arch.vpic);
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 34eb3e7..75b088b 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -357,6 +357,7 @@ struct kvm_arch{
 	 */
 	struct list_head active_mmu_pages;
 	struct list_head assigned_dev_head;
+	struct dmar_domain *intel_iommu_domain;
 	struct kvm_pic *vpic;
 	struct kvm_ioapic *vioapic;
 	struct kvm_pit *vpit;
@@ -506,6 +507,8 @@ int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
 int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
 		  gpa_t addr, unsigned long *ret);
 
+int is_mmio_pfn(pfn_t pfn);
+
 extern bool tdp_enabled;
 
 enum emulation_result {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3798097..506b595 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -279,6 +279,12 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v);
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
 
+int kvm_iommu_map_pages(struct kvm *kvm, gfn_t base_gfn,
+			unsigned long npages);
+int kvm_iommu_map_guest(struct kvm *kvm,
+			struct kvm_assigned_dev *assigned_dev);
+int kvm_iommu_unmap_guest(struct kvm *kvm);
+
 static inline void kvm_guest_enter(void)
 {
 	account_system_vtime(current);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5b470a1..49c4ed3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -41,6 +41,7 @@
 #include <linux/pagemap.h>
 #include <linux/mman.h>
 #include <linux/swap.h>
+#include <linux/intel-iommu.h>
 
 #include <asm/processor.h>
 #include <asm/io.h>
@@ -76,7 +77,7 @@ static inline int valid_vcpu(int n)
 	return likely(n >= 0 && n < KVM_MAX_VCPUS);
 }
 
-static inline int is_mmio_pfn(pfn_t pfn)
+inline int is_mmio_pfn(pfn_t pfn)
 {
 	if (pfn_valid(pfn))
 		return PageReserved(pfn_to_page(pfn));
@@ -431,6 +432,11 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	}
 
 	kvm_free_physmem_slot(&old, &new);
+
+	/* map the pages in iommu page table */
+	if (intel_iommu_found())
+		kvm_iommu_map_pages(kvm, base_gfn, npages);
+
 	return 0;
 
 out_free:
-- 
1.5.6



* device assignment - userspace part
  2008-07-22 12:13       ` [PATCH 4/4] KVM: Device assignemnt with VT-d Ben-Ami Yassour
@ 2008-07-22 12:18         ` Ben-Ami Yassour
  2008-07-22 12:18           ` [PATCH 1/1] KVM/userspace: Support for assigning PCI devices to guest Ben-Ami Yassour
  2008-07-26  9:02         ` [PATCH 4/4] KVM: Device assignemnt with VT-d Avi Kivity
  2008-07-28  6:49         ` Han, Weidong
  2 siblings, 1 reply; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-22 12:18 UTC (permalink / raw)
  To: avi; +Cc: amit.shah, kvm, muli, benami, weidong.han, anthony

Following is the userspace patch for device assignment.




* [PATCH 1/1] KVM/userspace: Support for assigning PCI devices to guest
  2008-07-22 12:18         ` device assignment - userspace part Ben-Ami Yassour
@ 2008-07-22 12:18           ` Ben-Ami Yassour
  0 siblings, 0 replies; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-22 12:18 UTC (permalink / raw)
  To: avi
  Cc: amit.shah, kvm, muli, benami, weidong.han, anthony, Nir Peleg,
	Glauber de Oliveira Costa

Based on a patch from: Amit Shah <amit.shah@qumranet.com>

We can assign a device from the host machine to a guest. The
original code comes from Neocleus.

A new command-line option, -pcidevice, is added.
For example, to invoke it for an Ethernet device sitting at
PCI bus:dev.fn 04:08.0 with host IRQ 18, use this:

        -pcidevice Ethernet/04:08.0

The host Ethernet driver must be unloaded before the device is
assigned to a guest. Otherwise the device assignment fails, but the
guest continues to run without the assigned device.

If kvm uses the in-kernel irqchip, interrupts are routed to the
guest via the kvm module (the accompanying kernel changes are
required).
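
As an illustration of the name/bus:dev.func syntax described above, here
is a minimal standalone sketch of how such an argument could be parsed
and turned into a devfn value. It is not the code from this patch:
struct pcidev_arg, parse_hex_field() and parse_pcidevice_arg() are
hypothetical names, and the range checks (bus <= 0xff, dev <= 0x1f,
func <= 0x7) are an assumption about what a stricter parser might
enforce. PCI_DEVFN is the macro from include/linux/pci.h.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a parsed "-pcidevice" argument. */
struct pcidev_arg {
	char name[128];
	unsigned int bus, dev, func;
};

/* From include/linux/pci.h: pack slot and function into one devfn byte. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))

/*
 * Parse one hexadecimal field that must be followed by `term` and must
 * not exceed `max` (assumed limits, not taken from the patch).
 */
static int parse_hex_field(const char **cp, char term, unsigned long max,
			   unsigned int *out)
{
	char *end;
	unsigned long v;

	v = strtoul(*cp, &end, 16);
	if (end == *cp || *end != term || v > max)
		return -1;
	*out = (unsigned int)v;
	/* Step over the terminator unless it is the end of the string. */
	*cp = (term != '\0') ? end + 1 : end;
	return 0;
}

/* Returns 0 on success, -1 if arg is not of the form name/bus:dev.func. */
int parse_pcidevice_arg(const char *arg, struct pcidev_arg *out)
{
	const char *cp = strchr(arg, '/');
	size_t len;

	if (cp == NULL)
		return -1;

	/* Copy the name with an explicit bound instead of strcpy(). */
	len = (size_t)(cp - arg);
	if (len >= sizeof(out->name))
		return -1;
	memcpy(out->name, arg, len);
	out->name[len] = '\0';

	cp++;
	if (parse_hex_field(&cp, ':', 0xff, &out->bus) ||
	    parse_hex_field(&cp, '.', 0x1f, &out->dev) ||
	    parse_hex_field(&cp, '\0', 0x7, &out->func))
		return -1;
	return 0;
}
```

With the example argument, Ethernet/04:08.0 parses to bus 4, device 8,
function 0, and PCI_DEVFN(8, 0) evaluates to 0x40.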

Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Nir Peleg <nir@tutis.com>
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
---
 libkvm/libkvm-x86.c         |    8 +
 libkvm/libkvm.h             |   15 ++
 qemu/Makefile.target        |    1 +
 qemu/hw/device-assignment.c |  575 +++++++++++++++++++++++++++++++++++++++++++
 qemu/hw/device-assignment.h |   95 +++++++
 qemu/hw/isa.h               |    2 +
 qemu/hw/pc.c                |    9 +
 qemu/hw/pci.c               |   12 +
 qemu/hw/pci.h               |    1 +
 qemu/hw/piix_pci.c          |   19 ++
 qemu/vl.c                   |   17 ++
 11 files changed, 754 insertions(+), 0 deletions(-)
 create mode 100644 qemu/hw/device-assignment.c
 create mode 100644 qemu/hw/device-assignment.h

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index ea97bdd..ea4f0ef 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -126,6 +126,14 @@ static int kvm_init_tss(kvm_context_t kvm)
 	return 0;
 }
 
+#ifdef KVM_CAP_DEVICE_ASSIGNMENT
+int kvm_update_assigned_device(kvm_context_t kvm,
+			       struct kvm_assigned_dev *assigned_dev)
+{
+	return ioctl(kvm->vm_fd, KVM_UPDATE_ASSIGNED_DEVICE, assigned_dev);
+}
+#endif
+
 int kvm_arch_create_default_phys_mem(kvm_context_t kvm,
 				       unsigned long phys_mem_bytes,
 				       void **vm_mem)
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 9f06fcc..276f6f0 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -647,6 +647,21 @@ int kvm_disable_tpr_access_reporting(kvm_context_t kvm, int vcpu);
 
 int kvm_enable_vapic(kvm_context_t kvm, int vcpu, uint64_t vapic);
 
+#ifdef KVM_CAP_DEVICE_ASSIGNMENT
+/*!
+ * \brief Notifies host kernel about changes to a PCI device assigned to guest
+ *
+ * Used for PCI device assignment, this function notifies the host
+ * kernel about the assigning of the physical PCI device and the guest
+ * PCI parameters or updates to the PCI config space from the guest
+ * (mainly the device irq)
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param assigned_dev Parameters like irq, PCI bus, devfn number, etc
+ */
+int kvm_update_assigned_device(kvm_context_t kvm,
+			       struct kvm_assigned_dev *assigned_dev);
+#endif
 #endif
 
 #if defined(__s390__)
diff --git a/qemu/Makefile.target b/qemu/Makefile.target
index 54480e4..94a6393 100644
--- a/qemu/Makefile.target
+++ b/qemu/Makefile.target
@@ -602,6 +602,7 @@ OBJS+= ide.o pckbd.o ps2.o vga.o $(SOUND_HW) dma.o
 OBJS+= fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o
 OBJS+= cirrus_vga.o apic.o parallel.o acpi.o piix_pci.o
 OBJS+= usb-uhci.o vmmouse.o vmport.o vmware_vga.o extboot.o
+OBJS+= device-assignment.o
 ifeq ($(USE_KVM_PIT), 1)
 OBJS+= i8254-kvm.o
 endif
diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
new file mode 100644
index 0000000..ea98b18
--- /dev/null
+++ b/qemu/hw/device-assignment.c
@@ -0,0 +1,575 @@
+/*
+ * Copyright (c) 2007, Neocleus Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ *
+ *  Pass a PCI device from the host to a guest VM.
+ *
+ *  Adapted for KVM by Qumranet.
+ *
+ *  Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
+ *  Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
+ *  Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
+ */
+#include <stdio.h>
+#include <pthread.h>
+#include <sys/io.h>
+#include <sys/ioctl.h>
+#include <linux/types.h>
+
+/* From linux/ioport.h */
+#define IORESOURCE_IO		0x00000100	/* Resource type */
+#define IORESOURCE_MEM		0x00000200
+#define IORESOURCE_IRQ		0x00000400
+#define IORESOURCE_DMA		0x00000800
+#define IORESOURCE_PREFETCH	0x00001000	/* No side effects */
+
+#include "device-assignment.h"
+#include "irq.h"
+
+#include "qemu-kvm.h"
+#include <linux/kvm_para.h>
+extern FILE *logfile;
+
+/* #define DEVICE_ASSIGNMENT_DEBUG */
+
+#ifdef DEVICE_ASSIGNMENT_DEBUG
+#define DEBUG(fmt, args...) fprintf(stderr, "%s: " fmt, __func__ , ## args)
+#else
+#define DEBUG(fmt, args...)
+#endif
+
+#define assigned_dev_ioport_write(suffix)				\
+static void assigned_dev_ioport_write##suffix(void *opaque,             \
+					      uint32_t addr,            \
+                                              uint32_t value)		\
+{									\
+	assigned_dev_region_t *r_access =                               \
+                  (assigned_dev_region_t *)opaque;			\
+	uint32_t r_pio = (unsigned long)r_access->r_virtbase		\
+		+ (addr - r_access->e_physbase);			\
+	if (r_access->debug & DEVICE_ASSIGNMENT_DEBUG_PIO) {		\
+		fprintf(logfile, "assigned_dev_ioport_write" #suffix	\
+			": r_pio=%08x e_physbase=%08x"			\
+			" r_virtbase=%08lx value=%08x\n",		\
+			r_pio, (int)r_access->e_physbase,		\
+			(unsigned long)r_access->r_virtbase, value);	\
+	}								\
+	out##suffix(value, r_pio);					\
+}
+assigned_dev_ioport_write(b)
+assigned_dev_ioport_write(w)
+assigned_dev_ioport_write(l)
+
+#define assigned_dev_ioport_read(suffix)				\
+static uint32_t assigned_dev_ioport_read##suffix(void *opaque,          \
+                                                 uint32_t addr)  	\
+{									\
+	assigned_dev_region_t *r_access =                               \
+                    (assigned_dev_region_t *)opaque;     		\
+	uint32_t r_pio = (addr - r_access->e_physbase)			\
+		+ (unsigned long)r_access->r_virtbase;			\
+		uint32_t value = in##suffix(r_pio);			\
+		if (r_access->debug & DEVICE_ASSIGNMENT_DEBUG_PIO) {	\
+			fprintf(logfile, "assigned_dev_ioport_read"     \
+                                #suffix	": r_pio=%08x "                 \
+                                "e_physbase=%08x r_virtbase=%08lx "	\
+				"value=%08x\n",				\
+				r_pio, (int)r_access->e_physbase,	\
+				(unsigned long)r_access->r_virtbase,    \
+                                value); 				\
+		}							\
+		return value;						\
+}
+
+assigned_dev_ioport_read(b)
+assigned_dev_ioport_read(w)
+assigned_dev_ioport_read(l)
+
+void assigned_dev_iomem_map(PCIDevice * pci_dev, int region_num,
+			    uint32_t e_phys, uint32_t e_size, int type)
+{
+	assigned_dev_t *r_dev = (assigned_dev_t *) pci_dev;
+	assigned_dev_region_t *region = &r_dev->v_addrs[region_num];
+	int first_map = (region->e_size == 0);
+	int ret = 0;
+
+	DEBUG("e_phys=%08x r_virt=%p type=%d len=%08x region_num=%d \n",
+	      e_phys, r_dev->v_addrs[region_num].r_virtbase, type, e_size,
+	      region_num);
+
+	region->e_physbase = e_phys;
+	region->e_size = e_size;
+
+	if (!first_map)
+		kvm_destroy_phys_mem(kvm_context, e_phys, e_size);
+
+	if (e_size > 0)
+		ret = kvm_register_userspace_phys_mem(kvm_context,
+						      e_phys,
+						      region->r_virtbase,
+						      e_size,
+						      0);
+	if (ret != 0)
+		fprintf(logfile, "Error: create new mapping failed\n");
+}
+
+static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num,
+				    uint32_t addr, uint32_t size, int type)
+{
+	assigned_dev_t *r_dev = (assigned_dev_t *) pci_dev;
+	int i;
+	uint32_t ((*rf[])(void *, uint32_t)) =  { assigned_dev_ioport_readb,
+						  assigned_dev_ioport_readw,
+						  assigned_dev_ioport_readl
+	};
+	void ((*wf[])(void *, uint32_t, uint32_t)) =
+		{ assigned_dev_ioport_writeb,
+		  assigned_dev_ioport_writew,
+		  assigned_dev_ioport_writel
+		};
+
+	r_dev->v_addrs[region_num].e_physbase = addr;
+	DEBUG("assigned_dev_ioport_map: address=0x%x type=0x%x len=%d "
+	      "region_num=%d \n", addr, type, size, region_num);
+
+	for (i = 0; i < 3; i++) {
+		register_ioport_write(addr, size, 1<<i, wf[i],
+				      (void *) (r_dev->v_addrs + region_num));
+		register_ioport_read(addr, size, 1<<i, rf[i],
+				     (void *) (r_dev->v_addrs + region_num));
+	}
+}
+
+static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
+					  uint32_t val, int len)
+{
+	int fd, r;
+
+	DEBUG("(%x.%x): address=%04x val=0x%08x len=%d\n",
+	      ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7), (uint16_t) address,
+	      val, len);
+
+	if (address == 0x4)
+		pci_default_write_config(d, address, val, len);
+
+	if ((address >= 0x10 && address <= 0x24) || address == 0x34 ||
+	    address == 0x3c || address == 0x3d) {
+		/* used for update-mappings (BAR emulation) */
+		pci_default_write_config(d, address, val, len);
+		return;
+	}
+
+	DEBUG("NON BAR (%x.%x): address=%04x val=0x%08x len=%d\n",
+	      ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7), (uint16_t) address,
+	      val, len);
+	fd = ((assigned_dev_t *)d)->real_device.config_fd;
+	lseek(fd, address, SEEK_SET);
+again:
+	r = write(fd, &val, len);
+	if (r < 0) {
+		if (errno == EINTR || errno == EAGAIN)
+			goto again;
+		fprintf(stderr, "%s: write failed, errno = %d\n", __func__,
+			errno);
+	}
+}
+
+static uint32_t assigned_dev_pci_read_config(PCIDevice *d, uint32_t address,
+					     int len)
+{
+	uint32_t val = 0;
+	int fd, r;
+
+	if ((address >= 0x10 && address <= 0x24) || address == 0x34 ||
+	    address == 0x3c || address == 0x3d) {
+		val = pci_default_read_config(d, address, len);
+		DEBUG("(%x.%x): address=%04x val=0x%08x len=%d\n",
+		      (d->devfn >> 3) & 0x1F, (d->devfn & 0x7), address, val,
+		      len);
+		return val;
+	}
+
+	/* vga specific, remove later */
+	if (address == 0xFC)
+		goto do_log;
+
+	fd = ((assigned_dev_t *)d)->real_device.config_fd;
+	lseek(fd, address, SEEK_SET);
+again:
+	r = read(fd, &val, len);
+	if (r < 0) {
+		if (errno == EINTR || errno == EAGAIN)
+			goto again;
+		fprintf(stderr, "%s: read failed, errno = %d\n", __func__,
+			errno);
+	}
+
+do_log:
+	DEBUG("(%x.%x): address=%04x val=0x%08x len=%d\n",
+	      (d->devfn >> 3) & 0x1F, (d->devfn & 0x7), address, val, len);
+
+	/* kill the special capabilities */
+	if (address == 4 && len == 4)
+		val &= ~0x100000;
+	else if (address == 6)
+		val &= ~0x10;
+
+	return val;
+}
+
+static int assigned_dev_register_regions(pci_region_t *io_regions,
+					 unsigned long regions_num,
+					 assigned_dev_t *pci_dev)
+{
+	uint32_t i;
+	pci_region_t *cur_region = io_regions;
+
+	for (i = 0; i < regions_num; i++, cur_region++) {
+		if (!cur_region->valid)
+			continue;
+#ifdef DEVICE_ASSIGNMENT_DEBUG
+		pci_dev->v_addrs[i].debug |= DEVICE_ASSIGNMENT_DEBUG_MMIO |
+			DEVICE_ASSIGNMENT_DEBUG_PIO;
+#endif
+		pci_dev->v_addrs[i].num = i;
+
+		/* handle memory io regions */
+		if (cur_region->type & IORESOURCE_MEM) {
+			int t = cur_region->type & IORESOURCE_PREFETCH
+				? PCI_ADDRESS_SPACE_MEM_PREFETCH
+				: PCI_ADDRESS_SPACE_MEM;
+
+			/* map physical memory */
+			pci_dev->v_addrs[i].e_physbase = cur_region->base_addr;
+			pci_dev->v_addrs[i].r_virtbase =
+				mmap(NULL,
+				     (cur_region->size + 0xFFF) & 0xFFFFF000,
+				     PROT_WRITE | PROT_READ, MAP_SHARED,
+				     cur_region->resource_fd, (off_t) 0);
+
+			if ((void *) -1 == pci_dev->v_addrs[i].r_virtbase) {
+				fprintf(stderr, "Error: Couldn't mmap 0x%x!\n",
+					(uint32_t) (cur_region->base_addr));
+				return -1;
+			}
+			pci_dev->v_addrs[i].r_size = cur_region->size;
+			pci_dev->v_addrs[i].e_size = 0;
+
+			/* add offset */
+			pci_dev->v_addrs[i].r_virtbase +=
+				(cur_region->base_addr & 0xFFF);
+
+			pci_register_io_region((PCIDevice *) pci_dev, i,
+					       cur_region->size, t,
+					       assigned_dev_iomem_map);
+
+			continue;
+		}
+		/* handle port io regions */
+
+		pci_register_io_region((PCIDevice *) pci_dev, i,
+				       cur_region->size, PCI_ADDRESS_SPACE_IO,
+				       assigned_dev_ioport_map);
+
+		pci_dev->v_addrs[i].e_physbase = cur_region->base_addr;
+		pci_dev->v_addrs[i].r_virtbase =
+			(void *)(long)cur_region->base_addr;
+		/* not relevant for port io */
+		pci_dev->v_addrs[i].memory_index = 0;
+	}
+
+	/* success */
+	return 0;
+
+}
+
+static int get_real_device(assigned_dev_t *pci_dev, uint8_t r_bus,
+			   uint8_t r_dev, uint8_t r_func)
+{
+	char dir[128], name[128], comp[16];
+	int fd, r = 0;
+	FILE *f;
+	unsigned long long start, end, size, flags;
+	pci_region_t *rp;
+	pci_dev_t *dev = &pci_dev->real_device;
+
+	dev->region_number = 0;
+
+	sprintf(dir, "/sys/bus/pci/devices/0000:%02x:%02x.%x/",
+		r_bus, r_dev, r_func);
+	strcpy(name, dir);
+	strcat(name, "config");
+	fd = open(name, O_RDWR);
+	if (fd == -1) {
+		fprintf(stderr, "%s: %m\n", name);
+		return 1;
+	}
+	dev->config_fd = fd;
+again:
+	r = read(fd, pci_dev->dev.config, sizeof pci_dev->dev.config);
+	if (r < 0) {
+		if (errno == EINTR || errno == EAGAIN)
+			goto again;
+		fprintf(stderr, "%s: read failed, errno = %d\n", __func__,
+			errno);
+	}
+
+	strcpy(name, dir);
+	strcat(name, "resource");
+	f = fopen(name, "r");
+	if (f == NULL) {
+		fprintf(stderr, "%s: %m\n", name);
+		return 1;
+	}
+
+	for (r = 0; fscanf(f, "%lli %lli %lli\n", &start, &end, &flags) == 3;
+	     r++) {
+		rp = dev->regions + r;
+		rp->valid = 0;
+		size = end - start + 1;
+		flags &= IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH;
+		if (size == 0 || (flags & ~IORESOURCE_PREFETCH) == 0)
+			continue;
+		if (flags & IORESOURCE_MEM) {
+			flags &= ~IORESOURCE_IO;
+			sprintf(comp, "resource%d", r);
+			strcpy(name, dir);
+			strcat(name, comp);
+			fd = open(name, O_RDWR);
+			if (fd == -1)
+				continue;		/* probably ROM */
+			rp->resource_fd = fd;
+		} else
+			flags &= ~IORESOURCE_PREFETCH;
+
+		rp->type = flags;
+		rp->valid = 1;
+		rp->base_addr = start;
+		rp->size = size;
+		DEBUG("region %d size %d start 0x%x type %d "
+		      "resource_fd %d\n", r, rp->size, start, rp->type,
+		      rp->resource_fd);
+	}
+	fclose(f);
+
+	dev->region_number = r;
+	return 0;
+}
+
+static assigned_dev_t *register_real_device(PCIBus *e_bus,
+					    const char *e_dev_name,
+					    int e_devfn, uint8_t r_bus,
+					    uint8_t r_dev,
+					    uint8_t r_func)
+{
+	int rc;
+	assigned_dev_t *pci_dev;
+	uint8_t e_device, e_intx;
+
+	DEBUG("register_real_device: Registering real physical "
+	      "device %s (devfn=0x%x)\n", e_dev_name, e_devfn);
+	
+	pci_dev = (assigned_dev_t *)
+		pci_register_device(e_bus, e_dev_name,
+				    sizeof(assigned_dev_t), e_devfn,
+				    assigned_dev_pci_read_config,
+				    assigned_dev_pci_write_config);
+
+	if (NULL == pci_dev) {
+		fprintf(stderr, "register_real_device: Error: Couldn't "
+			"register real device %s\n", e_dev_name);
+		return NULL;
+	}
+	if (get_real_device(pci_dev, r_bus, r_dev, r_func)) {
+		fprintf(stderr, "register_real_device: Error: Couldn't get "
+			"real device (%s)!\n", e_dev_name);
+		return NULL;
+	}
+
+	/* handle real device's MMIO/PIO BARs */
+	if (assigned_dev_register_regions(pci_dev->real_device.regions,
+					  pci_dev->real_device.region_number,
+					  pci_dev))
+		return NULL;
+
+	/* handle interrupt routing */
+	e_device = (pci_dev->dev.devfn >> 3) & 0x1f;
+	e_intx = pci_dev->dev.config[0x3d] - 1;
+	pci_dev->intpin = e_intx;
+	pci_dev->run = 0;
+	pci_dev->girq = 0;
+	pci_dev->h_busnr = r_bus;
+	pci_dev->h_devfn = PCI_DEVFN(r_dev, r_func);
+
+#ifdef KVM_CAP_DEVICE_ASSIGNMENT
+	if (kvm_enabled()) {
+		struct kvm_assigned_dev assigned_dev_data;
+
+		memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+		assigned_dev_data.guest.busnr = pci_bus_num(e_bus);
+		assigned_dev_data.guest.devfn = PCI_DEVFN(e_device, r_func);
+		assigned_dev_data.guest.num_valid_irqs = 1;
+		assigned_dev_data.host.busnr  = pci_dev->h_busnr;
+		assigned_dev_data.host.devfn  = pci_dev->h_devfn;
+		assigned_dev_data.host.num_valid_irqs = 1;
+		/* We'll set the value of the guest irq as and when
+		 * the piix config gets updated. See assigned_dev_update_irq.
+		 * The host irq field never gets used anyway
+		 */
+		rc = kvm_update_assigned_device(kvm_context,
+						&assigned_dev_data);
+		if (rc < 0) {
+			fprintf(stderr, "Could not notify kernel about "
+				"assigned device\n");
+			perror("pt-ioctl");
+			return NULL;
+		}
+	}
+#endif
+
+	fprintf(logfile, "Registered host PCI device %02x:%02x.%1x "
+		"as guest device %02x:%02x.%1x\n",
+		r_bus, r_dev, r_func,
+		pci_bus_num(e_bus), e_device, r_func);
+
+	return pci_dev;
+}
+
+#define	MAX_ASSIGNED_DEVS 4
+struct {
+	char name[128];
+	int bus;
+	int dev;
+	int func;
+	assigned_dev_t *assigned_dev;
+} assigned_devices[MAX_ASSIGNED_DEVS];
+
+int num_assigned_devices;
+extern int piix_get_irq(int);
+
+#ifdef KVM_CAP_DEVICE_ASSIGNMENT
+/* The pci config space got updated. Check if irq numbers have changed
+ * for our devices
+ */
+void assigned_dev_update_irq(PCIDevice *d)
+{
+	int i, irq, r;
+	assigned_dev_t *assigned_dev;
+
+	for (i = 0; i < num_assigned_devices; i++) {
+		assigned_dev = assigned_devices[i].assigned_dev;
+		if (assigned_dev == NULL)
+			continue;
+
+		irq = pci_map_irq(&assigned_dev->dev, assigned_dev->intpin);
+		irq = piix_get_irq(irq);
+		if (irq != assigned_dev->girq) {
+			struct kvm_assigned_dev assigned_dev_data;
+
+			memset(&assigned_dev_data, 0,
+			       sizeof(assigned_dev_data));
+			assigned_dev_data.guest.irq[0] = irq;
+			assigned_dev_data.guest.num_valid_irqs = 1;
+			assigned_dev_data.host.busnr = assigned_dev->h_busnr;
+			assigned_dev_data.host.devfn = assigned_dev->h_devfn;
+			assigned_dev_data.host.num_valid_irqs = 1;
+			r = kvm_update_assigned_device(kvm_context,
+						       &assigned_dev_data);
+			if (r < 0) {
+				perror("assigned_dev_update_irq");
+				continue;
+			}
+			assigned_dev->girq = irq;
+		}
+	}
+}
+#endif
+
+int init_device_assignment(void)
+{
+	/* Do we have any devices to be assigned? */
+	if (num_assigned_devices == 0)
+		return -1;
+
+	iopl(3);
+
+	return 0;
+}
+
+int init_assigned_device(PCIBus *bus, int *index)
+{
+	assigned_dev_t *dev = NULL;
+	int i, ret = 0;
+
+	if (*index == -1) {
+		if (init_device_assignment() < 0)
+			return -1;
+
+		*index = num_assigned_devices - 1;
+	}
+	i = *index;
+
+	dev = register_real_device(bus, assigned_devices[i].name, -1,
+				   assigned_devices[i].bus,
+				   assigned_devices[i].dev,
+				   assigned_devices[i].func);
+	if (dev == NULL) {
+		fprintf(stderr, "Error: Couldn't register device %s\n",
+			assigned_devices[i].name);
+		ret = -1;
+	}
+	assigned_devices[i].assigned_dev = dev;
+
+	--*index;
+	return ret;
+}
+
+void add_assigned_device(const char *arg)
+{
+	/* name/bus:dev.func */
+	char *cp, *cp1;
+
+	if (num_assigned_devices >= MAX_ASSIGNED_DEVS) {
+		fprintf(stderr, "Too many assigned devices (max %d)\n",
+			MAX_ASSIGNED_DEVS);
+		return;
+	}
+	strcpy(assigned_devices[num_assigned_devices].name, arg);
+	cp = strchr(assigned_devices[num_assigned_devices].name, '/');
+	if (cp == NULL)
+		goto bad;
+	*cp++ = 0;
+
+	assigned_devices[num_assigned_devices].bus = strtoul(cp, &cp1, 16);
+	if (*cp1 != ':')
+		goto bad;
+	cp = cp1 + 1;
+
+	assigned_devices[num_assigned_devices].dev = strtoul(cp, &cp1, 16);
+	if (*cp1 != '.')
+		goto bad;
+	cp = cp1 + 1;
+
+	assigned_devices[num_assigned_devices].func = strtoul(cp, &cp1, 16);
+	if (*cp1 != 0)
+		goto bad;
+
+	num_assigned_devices++;
+	return;
+bad:
+	fprintf(stderr, "assigned device arg (%s) not in the form of "
+		"name/bus:dev.func\n", arg);
+}
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
new file mode 100644
index 0000000..f80a1d5
--- /dev/null
+++ b/qemu/hw/device-assignment.h
@@ -0,0 +1,95 @@
+/*
+ * Copyright (c) 2007, Neocleus Corporation.
+ * Copyright (c) 2007, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ *  Data structures for storing PCI state
+ *
+ *  Adapted to kvm by Qumranet
+ *
+ *  Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
+ *  Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
+ *  Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
+ */
+
+#ifndef __DEVICE_ASSIGNMENT_H__
+#define __DEVICE_ASSIGNMENT_H__
+
+#include <sys/mman.h>
+#include "qemu-common.h"
+#include "pci.h"
+#include <linux/types.h>
+
+#define DEVICE_ASSIGNMENT_DEBUG_PIO	(0x01)
+#define DEVICE_ASSIGNMENT_DEBUG_MMIO	(0x02)
+
+/* From include/linux/pci.h in the kernel sources */
+#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
+
+typedef uint32_t pciaddr_t;
+
+#define MAX_IO_REGIONS			(6)
+
+typedef struct pci_region_s {
+	int type;	/* Memory or port I/O */
+	int valid;
+	pciaddr_t base_addr;
+	pciaddr_t size;		/* size of the region */
+	int resource_fd;
+} pci_region_t;
+
+typedef struct pci_dev_s {
+	uint8_t bus, dev, func;	/* Bus inside domain, device and function */
+	int irq;		/* IRQ number */
+	uint16_t region_number;	/* number of active regions */
+
+	/* Port I/O or MMIO Regions */
+	pci_region_t regions[MAX_IO_REGIONS];
+	int config_fd;
+} pci_dev_t;
+
+typedef struct assigned_dev_region_s {
+	target_phys_addr_t e_physbase;
+	uint32_t memory_index;
+	void *r_virtbase;	/* mmapped access address */
+	int num;		/* our index within v_addrs[] */
+	uint32_t e_size;        /* emulated size of region in bytes */
+	uint32_t r_size;        /* real size of region in bytes */
+	uint32_t debug;
+} assigned_dev_region_t;
+
+typedef struct assigned_dev_s {
+	PCIDevice dev;
+	int intpin;
+	uint8_t debug_flags;
+	assigned_dev_region_t v_addrs[PCI_NUM_REGIONS];
+	pci_dev_t real_device;
+	int run;
+	int girq;
+	char sirq[4];
+	unsigned char h_busnr;
+	unsigned int h_devfn;
+	int bound;
+} assigned_dev_t;
+
+/* Initialization functions */
+int init_assigned_device(PCIBus *bus, int *index);
+void add_assigned_device(const char *arg);
+void assigned_dev_set_vector(int irq, int vector);
+void assigned_dev_ack_mirq(int vector);
+
+#define logfile stderr
+
+#endif				/* __DEVICE_ASSIGNMENT_H__ */
diff --git a/qemu/hw/isa.h b/qemu/hw/isa.h
index 89b3004..c720f5e 100644
--- a/qemu/hw/isa.h
+++ b/qemu/hw/isa.h
@@ -1,5 +1,7 @@
 /* ISA bus */
 
+#include "hw.h"
+
 extern target_phys_addr_t isa_mem_base;
 
 int register_ioport_read(int start, int length, int size,
diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
index da60199..71a491d 100644
--- a/qemu/hw/pc.c
+++ b/qemu/hw/pc.c
@@ -32,6 +32,7 @@
 #include "smbus.h"
 #include "boards.h"
 #include "console.h"
+#include "device-assignment.h"
 
 #include "qemu-kvm.h"
 
@@ -994,6 +995,14 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
         }
     }
 
+    /* Initialize device assignment */
+    if (pci_enabled) {
+	    int r = -1;
+	    do {
+		    init_assigned_device(pci_bus, &r);
+	    } while (r >= 0);
+    }
+
     rtc_state = rtc_init(0x70, i8259[8]);
 
     qemu_register_boot_set(pc_boot_set, rtc_state);
diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index 92683d1..d45d0ce 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -50,6 +50,7 @@ struct PCIBus {
 
 static void pci_update_mappings(PCIDevice *d);
 static void pci_set_irq(void *opaque, int irq_num, int level);
+void assigned_dev_update_irq(PCIDevice *d);
 
 target_phys_addr_t pci_mem_base;
 static int pci_irq_index;
@@ -453,6 +454,12 @@ void pci_default_write_config(PCIDevice *d,
         val >>= 8;
     }
 
+#ifdef KVM_CAP_DEVICE_ASSIGNMENT
+    if (kvm_enabled() && qemu_kvm_irqchip_in_kernel() &&
+	address >= 0x60 && address <= 0x63)
+	assigned_dev_update_irq(d);
+#endif
+
     end = address + len;
     if (end > PCI_COMMAND && address < (PCI_COMMAND + 2)) {
         /* if the command register is modified, we must modify the mappings */
@@ -555,6 +562,11 @@ static void pci_set_irq(void *opaque, int irq_num, int level)
     bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] != 0);
 }
 
+int pci_map_irq(PCIDevice *pci_dev, int pin)
+{
+	return pci_dev->bus->map_irq(pci_dev, pin);
+}
+
 /***********************************************************/
 /* monitor info on PCI */
 
diff --git a/qemu/hw/pci.h b/qemu/hw/pci.h
index 60e4094..e11fbbf 100644
--- a/qemu/hw/pci.h
+++ b/qemu/hw/pci.h
@@ -81,6 +81,7 @@ void pci_register_io_region(PCIDevice *pci_dev, int region_num,
                             uint32_t size, int type,
                             PCIMapIORegionFunc *map_func);
 
+int pci_map_irq(PCIDevice *pci_dev, int pin);
 uint32_t pci_default_read_config(PCIDevice *d,
                                  uint32_t address, int len);
 void pci_default_write_config(PCIDevice *d,
diff --git a/qemu/hw/piix_pci.c b/qemu/hw/piix_pci.c
index 90cb3a6..9ba1d8e 100644
--- a/qemu/hw/piix_pci.c
+++ b/qemu/hw/piix_pci.c
@@ -237,6 +237,25 @@ static void piix3_set_irq(qemu_irq *pic, int irq_num, int level)
     }
 }
 
+int piix3_get_pin(int pic_irq)
+{
+    int i;
+    for (i = 0; i < 4; i++)
+	    if (piix3_dev->config[0x60+i] == pic_irq)
+		    return i;
+    return -1;
+}
+
+int piix_get_irq(int pin)
+{
+    if (piix3_dev)
+	    return piix3_dev->config[0x60+pin];
+    if (piix4_dev)
+	    return piix4_dev->config[0x60+pin];
+
+    return 0;
+}
+
 static void piix3_reset(PCIDevice *d)
 {
     uint8_t *pci_conf = d->config;
diff --git a/qemu/vl.c b/qemu/vl.c
index d9b7db2..04ca724 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -37,6 +37,7 @@
 #include "qemu-char.h"
 #include "block.h"
 #include "audio/audio.h"
+#include "hw/device-assignment.h"
 #include "migration.h"
 #include "qemu-kvm.h"
 
@@ -7953,6 +7954,11 @@ static void help(int exitcode)
 #endif
 	   "-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n"
 	   "-no-kvm-pit	    disable KVM kernel mode PIT\n"
+#if defined(TARGET_I386) || defined(TARGET_X86_64)
+	   "-pcidevice name/bus:dev.func\n"
+	   "                expose a PCI device to the guest OS.\n"
+	   "                'name' is just used for debug logs.\n"
+#endif
 #endif
 #ifdef TARGET_I386
            "-std-vga        simulate a standard VGA card with VESA Bochs Extensions\n"
@@ -8076,6 +8082,9 @@ enum {
     QEMU_OPTION_no_kvm,
     QEMU_OPTION_no_kvm_irqchip,
     QEMU_OPTION_no_kvm_pit,
+#if defined(TARGET_I386) || defined(TARGET_X86_64)
+    QEMU_OPTION_pcidevice,
+#endif
     QEMU_OPTION_no_reboot,
     QEMU_OPTION_no_shutdown,
     QEMU_OPTION_show_cursor,
@@ -8165,6 +8174,9 @@ const QEMUOption qemu_options[] = {
 #endif
     { "no-kvm-irqchip", 0, QEMU_OPTION_no_kvm_irqchip },
     { "no-kvm-pit", 0, QEMU_OPTION_no_kvm_pit },
+#if defined(TARGET_I386) || defined(TARGET_X86_64)
+    { "pcidevice", HAS_ARG, QEMU_OPTION_pcidevice },
+#endif
 #endif
 #if defined(TARGET_PPC) || defined(TARGET_SPARC)
     { "g", 1, QEMU_OPTION_g },
@@ -9047,6 +9059,11 @@ int main(int argc, char **argv)
 		kvm_pit = 0;
 		break;
 	    }
+#if defined(TARGET_I386) || defined(TARGET_X86_64)
+	    case QEMU_OPTION_pcidevice:
+		add_assigned_device(optarg);
+		break;
+#endif
 #endif
             case QEMU_OPTION_usb:
                 usb_enabled = 1;
-- 
1.5.6


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] KVM: Add irq ack notifier list
  2008-07-22 12:13 ` [PATCH 1/4] KVM: Add irq ack notifier list Ben-Ami Yassour
  2008-07-22 12:13   ` [PATCH 2/4] KVM: pci device assignment Ben-Ami Yassour
@ 2008-07-26  8:19   ` Avi Kivity
  1 sibling, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2008-07-26  8:19 UTC (permalink / raw)
  To: Ben-Ami Yassour; +Cc: amit.shah, kvm, muli, weidong.han, anthony

Ben-Ami Yassour wrote:
> From: Avi Kivity <avi@qumranet.com>
>
> This can be used by kvm subsystems that are interested in when interrupts
> are acked, for example time drift compensation.
>
>   

Please add the pic and ioapic glue code to this patch.

> Signed-off-by: Avi Kivity <avi@qumranet.com>
>   

(You need to add your signoff to patches you send, even if you didn't 
modify them.)


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] KVM: pci device assignment
  2008-07-22 12:13   ` [PATCH 2/4] KVM: pci device assignment Ben-Ami Yassour
  2008-07-22 12:13     ` [PATCH 3/4] VT-d: changes to support KVM Ben-Ami Yassour
@ 2008-07-26  8:46     ` Avi Kivity
  2008-07-28  7:27     ` Yang, Sheng
  2008-07-28  8:33     ` Amit Shah
  3 siblings, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2008-07-26  8:46 UTC (permalink / raw)
  To: Ben-Ami Yassour; +Cc: amit.shah, kvm, muli, weidong.han, anthony

Ben-Ami Yassour wrote:
> Based on a patch from: Amit Shah <amit.shah@qumranet.com>
>
> This patch adds support for handling PCI devices that are assigned to
> the guest.
>
> The device to be assigned to the guest is registered in the host kernel
> and interrupt delivery is handled. If a device is already assigned, or
> the device driver for it is still loaded on the host, the device
> assignment
> is failed by conveying a -EBUSY reply to the userspace.
>
> Devices that share their interrupt line are not supported at the moment.
>
> By itself, this patch will not make devices work within the guest.
> The VT-d extension is required to enable the device to perform DMA.
> Another alternative is PVDMA.
>
>  
> +struct kvm_assigned_dev_kernel
> +*kvm_find_assigned_dev(struct list_head *head,
>   

Keep these two on the same line.

> +
> +static int
> +kvm_vm_ioctl_device_assignment(struct kvm *kvm,
>   

Ditto.

> +			       struct kvm_assigned_dev *assigned_dev)
> +{
> +	int r = 0;
> +	struct kvm_assigned_dev_kernel *match;
> +	struct pci_dev *dev;
> +
> +	if (assigned_dev->host.num_valid_irqs != 1) {
> +		printk(KERN_INFO "%s: Unsupported number of irqs %d\n",
> +		       __func__, assigned_dev->host.num_valid_irqs);
> +		return -EINVAL;
> +	}
>   

We also support zero irqs.  While this patch doesn't do much with the 
information (it only claims the device), this is still useful.


> diff --git a/include/asm-x86/kvm.h b/include/asm-x86/kvm.h
> index 8f13749..12b4b25 100644
> --- a/include/asm-x86/kvm.h
> +++ b/include/asm-x86/kvm.h
> @@ -208,4 +208,5 @@ struct kvm_pit_channel_state {
>  struct kvm_pit_state {
>  	struct kvm_pit_channel_state channels[3];
>  };
> +
>  #endif
>   

Unrelated.

> diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
> index e2864e6..34eb3e7 100644
> --- a/include/asm-x86/kvm_host.h
> +++ b/include/asm-x86/kvm_host.h
> @@ -325,6 +325,25 @@ struct kvm_irq_ack_notifier {
>  	void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
>  };
>  
> +/* For assigned devices, we schedule work in the system workqueue to
> + * inject interrupts into the guest when an interrupt occurs on the
> + * physical device and also when the guest acks the interrupt.
> + */
> +struct kvm_assigned_dev_work {
> +	struct work_struct work;
> +	struct kvm_assigned_dev_kernel *assigned_dev;
> +};
> +
> +struct kvm_assigned_dev_kernel {
> +	struct kvm_irq_ack_notifier ack_notifier;
> +	struct list_head list;
> +	struct kvm_assigned_dev_info guest;
> +	struct kvm_assigned_dev_info host;
> +	struct kvm_assigned_dev_work int_work;
>   

Just put the work_struct here, and use container_of() instead of 
->assigned_dev.
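
To illustrate the suggestion above, here is a minimal userspace sketch of the embedded-work_struct pattern. The container_of() macro and struct work_struct below are simplified stand-ins for the kernel versions, and struct assigned_dev_sketch and its fields are hypothetical, not the layout from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Minimal stand-in for the kernel's struct work_struct. */
struct work_struct {
	void (*func)(struct work_struct *work);
};

/* Hypothetical, trimmed-down version of the struct under review. */
struct assigned_dev_sketch {
	int host_irq;
	struct work_struct int_work;	/* embedded directly, no wrapper struct */
	int injected;			/* counts handled work items */
};

/* Work handler: recover the enclosing device from the work pointer. */
static void assigned_dev_work_fn(struct work_struct *work)
{
	struct assigned_dev_sketch *dev =
		container_of(work, struct assigned_dev_sketch, int_work);

	dev->injected++;
}

/* Runs the work item once, the way a workqueue eventually would. */
static int run_work_once(struct assigned_dev_sketch *dev)
{
	assert(dev != NULL);
	dev->int_work.func = assigned_dev_work_fn;
	dev->int_work.func(&dev->int_work);
	return dev->injected;
}
```

With this layout the handler needs no back-pointer: the address of the embedded member is enough to find the device.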

>  struct kvm_arch{
>  	int naliases;
>  	struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS];
> @@ -337,6 +356,7 @@ struct kvm_arch{
>  	 * Hash table of struct kvm_mmu_page.
>  	 */
>  	struct list_head active_mmu_pages;
> +	struct list_head assigned_dev_head;
>   

This is useful for non-x86 (except s390).  It's okay to leave it to the 
ia64/ppc maintainers, though.

>  	struct kvm_pic *vpic;
>  	struct kvm_ioapic *vioapic;
>  	struct kvm_pit *vpit;
> diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
> index 76f3921..3aa1731 100644
> --- a/include/asm-x86/kvm_para.h
> +++ b/include/asm-x86/kvm_para.h
> @@ -143,5 +143,4 @@ static inline unsigned int kvm_arch_para_features(void)
>  }
>  
>  #endif
> -
>   

Spurious.

>  #endif
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index 6edba45..c436c08 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -382,6 +382,7 @@ struct kvm_trace_rec {
>  #define KVM_CAP_PV_MMU 13
>  #define KVM_CAP_MP_STATE 14
>  #define KVM_CAP_COALESCED_MMIO 15
> +#define KVM_CAP_DEVICE_ASSIGNMENT 16
>  
>  /*
>   * ioctls for VM fds
> @@ -411,6 +412,8 @@ struct kvm_trace_rec {
>  			_IOW(KVMIO,  0x67, struct kvm_coalesced_mmio_zone)
>  #define KVM_UNREGISTER_COALESCED_MMIO \
>  			_IOW(KVMIO,  0x68, struct kvm_coalesced_mmio_zone)
> +#define KVM_UPDATE_ASSIGNED_DEVICE _IOR(KVMIO, 0x69,		\
> +					struct kvm_assigned_dev)
>  
>  /*
>   * ioctls for vcpu fds
> @@ -475,4 +478,22 @@ struct kvm_trace_rec {
>  #define KVM_TRC_STLB_INVAL       (KVM_TRC_HANDLER + 0x18)
>  #define KVM_TRC_PPC_INSTR        (KVM_TRC_HANDLER + 0x19)
>  
> +#define ASSIGNED_DEV_MAX_IRQ 16
> +
> +/* Stores information for identifying host PCI devices assigned to the
> + * guest: this is used in the host kernel and in the userspace.
> + */
> +struct kvm_assigned_dev_info {
> +	__u32 busnr;
> +	__u32 devfn;
> +	__u32 irq[ASSIGNED_DEV_MAX_IRQ];
> +	__u32 num_valid_irqs; /* currently only 1 is supported */
> +};
>   

Move num_valid_irqs before the array.  Add padding so the structure is a 
multiple of 64 bits (needed so that the i386 and x86_64 ABIs are identical).
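
A sketch of what the reordered structure could look like, assuming one 32-bit pad field is enough; the name pad and the _sketch suffix are illustrative, and uint32_t stands in for __u32 so the example builds in userspace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ASSIGNED_DEV_MAX_IRQ 16

/* Hypothetical layout; field names follow the patch, the pad is added. */
struct assigned_dev_info_sketch {
	uint32_t busnr;
	uint32_t devfn;
	uint32_t num_valid_irqs;	/* moved ahead of the array */
	uint32_t pad;			/* keeps sizeof a multiple of 8 bytes */
	uint32_t irq[ASSIGNED_DEV_MAX_IRQ];
};

/* Compile-time check: the size must be a multiple of 64 bits. */
_Static_assert(sizeof(struct assigned_dev_info_sketch) % 8 == 0,
	       "structure size must be a multiple of 64 bits");

/* Returns the structure size so callers can check the ABI at run time. */
static size_t assigned_dev_info_size(void)
{
	return sizeof(struct assigned_dev_info_sketch);
}
```

Without the pad, the structure holds 19 32-bit words (76 bytes), so a 64-bit build could round it up where a 32-bit build would not once it is embedded next to 64-bit fields.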


> +
> +/* Mapping between host and guest PCI device */
> +struct kvm_assigned_dev {
> +	struct kvm_assigned_dev_info guest;
> +	struct kvm_assigned_dev_info host;
> +};
> +
>   


guest.busnr and guest.devfn are meaningless; that's a wart.

What do you think of this API:

  struct kvm_assigned_pci_dev {
      __u32 assigned_dev_id;
      __u32 busnr;
      __u32 devfn;
      __u32 flags;
  };

  struct kvm_assigned_irq {
      __u32 assigned_dev_id;
      __u32 host_irq;
      __u32 guest_irq;
      __u32 flags;
  };

This automatically handles zero, one, or many irqs, and also allows for 
non-pci devices.
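
A small userspace sketch of how the proposed split might be used: the device is described once, and each irq is a separate record keyed by assigned_dev_id. The two struct layouts are copied from the proposal above (with uint32_t in place of __u32); count_assigned_irqs() is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Struct layouts copied from the proposal; __u32 becomes uint32_t here. */
struct kvm_assigned_pci_dev {
	uint32_t assigned_dev_id;
	uint32_t busnr;
	uint32_t devfn;
	uint32_t flags;
};

struct kvm_assigned_irq {
	uint32_t assigned_dev_id;
	uint32_t host_irq;
	uint32_t guest_irq;
	uint32_t flags;
};

/* Hypothetical helper: count the irq records that refer to one device. */
static size_t count_assigned_irqs(const struct kvm_assigned_irq *irqs,
				  size_t nr_irqs, uint32_t assigned_dev_id)
{
	size_t i, count = 0;

	assert(irqs != NULL || nr_irqs == 0);
	for (i = 0; i < nr_irqs; i++)
		if (irqs[i].assigned_dev_id == assigned_dev_id)
			count++;
	return count;
}
```

A device with no irq records is simply claimed, and a device with several records gets several irqs, so no array or count field is needed in the ABI.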


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] KVM: Device assignemnt with VT-d
  2008-07-22 12:13       ` [PATCH 4/4] KVM: Device assignemnt with VT-d Ben-Ami Yassour
  2008-07-22 12:18         ` device assignment - userspace part Ben-Ami Yassour
@ 2008-07-26  9:02         ` Avi Kivity
  2008-07-28  6:49         ` Han, Weidong
  2 siblings, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2008-07-26  9:02 UTC (permalink / raw)
  To: Ben-Ami Yassour; +Cc: amit.shah, kvm, muli, weidong.han, anthony, Kay, Allen M

Ben-Ami Yassour wrote:
> From: Kay, Allen M <allen.m.kay@intel.com>
>
> This patch includes the functions to support VT-d for passthrough
> devices.
>
> [Ben: fixed memory pinning, cleanup]
>
> Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
> Signed-off-by: Weidong Han <weidong.han@intel.com>
> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
> ---
>  arch/x86/kvm/Makefile      |    2 +-
>  arch/x86/kvm/vtd.c         |  182 ++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c         |   11 +++
>  include/asm-x86/kvm_host.h |    3 +
>  include/linux/kvm_host.h   |    6 ++
>  virt/kvm/kvm_main.c        |    8 ++-
>  6 files changed, 210 insertions(+), 2 deletions(-)
>  create mode 100644 arch/x86/kvm/vtd.c
>
> diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> index d0e940b..5d9d079 100644
> --- a/arch/x86/kvm/Makefile
> +++ b/arch/x86/kvm/Makefile
> @@ -11,7 +11,7 @@ endif
>  EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
>  
>  kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o \
> -	i8254.o
> +	i8254.o vtd.o
>  obj-$(CONFIG_KVM) += kvm.o
>  kvm-intel-objs = vmx.o
>  obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
> diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
> new file mode 100644
> index 0000000..7a3cf4e
> --- /dev/null
> +++ b/arch/x86/kvm/vtd.c
> @@ -0,0 +1,182 @@
> +/*
> + * Copyright (c) 2006, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + *
> + * Copyright (C) 2006-2008 Intel Corporation
> + * Author: Allen M. Kay <allen.m.kay@intel.com>
> + * Author: Weidong Han <weidong.han@intel.com>
> + */
> +
> +#include <linux/list.h>
> +#include <linux/kvm_host.h>
> +#include <linux/pci.h>
> +#include <linux/dmar.h>
> +#include <linux/intel-iommu.h>
> +
> +static int kvm_iommu_unmap_memslots(struct kvm *kvm);
> +
> +int kvm_iommu_map_pages(struct kvm *kvm,
> +			gfn_t base_gfn, unsigned long npages)
> +{
> +	gfn_t gfn = base_gfn;
> +	pfn_t pfn;
> +	int i, rc;
> +	struct dmar_domain *domain = kvm->arch.intel_iommu_domain;
> +
> +	if (!domain)
> +		return -EFAULT;
> +
> +	for (i = 0; i < npages; i++) {
> +		pfn = gfn_to_pfn(kvm, gfn);
> +		if (!is_mmio_pfn(pfn)) {
> +			rc = intel_iommu_page_mapping(domain,
> +						      gfn << PAGE_SHIFT,
> +						      pfn << PAGE_SHIFT,
>   

This overflows on i386, where gfn and pfn are longs while gpas and hpas 
are u64s.

gfn_to_gpa() gets the first one right.  There should also be a 
pfn_to_phys(), but there isn't.
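
The overflow can be reproduced in userspace with a short sketch. It assumes a 32-bit frame-number type, as described above for i386; gfn32_t and the two function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Model the 32-bit frame-number type described for i386. */
typedef uint32_t gfn32_t;
typedef uint64_t gpa_t;

/* Buggy form: the shift happens in 32 bits and drops the high bits. */
static gpa_t gfn_to_gpa_truncating(gfn32_t gfn)
{
	return (gpa_t)(uint32_t)(gfn << PAGE_SHIFT);
}

/* Fixed form: widen to 64 bits first, then shift. */
static gpa_t gfn_to_gpa_widening(gfn32_t gfn)
{
	return (gpa_t)gfn << PAGE_SHIFT;
}
```

For the frame at the 4 GiB boundary (gfn 0x100000), the truncating form yields address 0 while the widening form yields 0x100000000.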

> +						      PAGE_SIZE,
> +						      DMA_PTE_READ |
> +						      DMA_PTE_WRITE);
> +			if (rc)
> +				kvm_release_pfn_clean(pfn);
>   

You're never actually returning rc, so any errors will be swallowed 
silently.
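
One possible shape for the loop that propagates the error, shown as a userspace sketch. map_one_page() is a hypothetical stand-in for the per-page IOMMU mapping call, and the bad_gfn parameter exists only to simulate a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page mapping call; fails on one page. */
static int map_one_page(unsigned long gfn, unsigned long bad_gfn)
{
	return gfn == bad_gfn ? -ENOMEM : 0;
}

/*
 * Map npages pages starting at base_gfn. On the first failure, stop and
 * hand the callee's error code back to the caller unchanged.
 */
static int map_pages(unsigned long base_gfn, unsigned long npages,
		     unsigned long bad_gfn, unsigned long *mapped)
{
	unsigned long i;
	int rc;

	assert(mapped != NULL);
	*mapped = 0;
	for (i = 0; i < npages; i++) {
		rc = map_one_page(base_gfn + i, bad_gfn);
		if (rc)
			return rc;	/* propagate, do not swallow */
		(*mapped)++;
	}
	return 0;
}
```

The caller sees the original error code rather than a generic one, and the mapped count tells it how many pages need to be undone.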

> +		} else {
> +			printk(KERN_DEBUG "kvm_iommu_map_page:"
> +			       "invalid pfn=%lx\n", pfn);
> +			return 0;
> +		}
> +
> +		gfn++;
> +	}
> +	return 0;
>   

Isn't this slow on large guests? (not a merge barrier).

> +}
> +
> +int kvm_iommu_map_guest(struct kvm *kvm,
> +			struct kvm_assigned_dev *assigned_dev)
> +{
> +	struct pci_dev *pdev = NULL;
> +
> +	printk(KERN_DEBUG "VT-d direct map: host bdf = %x:%x:%x\n",
> +	       assigned_dev->host.busnr,
> +	       PCI_SLOT(assigned_dev->host.devfn),
> +	       PCI_FUNC(assigned_dev->host.devfn));
> +
> +	for_each_pci_dev(pdev) {
> +		if ((pdev->bus->number == assigned_dev->host.busnr) &&
> +		    (pdev->devfn == assigned_dev->host.devfn)) {
> +			break;
> +		}
> +	}
>   

Why not keep pdev in kvm_assigned_dev?  We've claimed it, so it's ours.

> +
> +	if (pdev == NULL) {
> +		if (kvm->arch.intel_iommu_domain) {
> +			intel_iommu_domain_exit(kvm->arch.intel_iommu_domain);
> +			kvm->arch.intel_iommu_domain = NULL;
> +		}
> +		return -ENODEV;
> +	}
> +
> +	kvm->arch.intel_iommu_domain = intel_iommu_domain_alloc(pdev);
> +
> +	if (kvm_iommu_map_memslots(kvm)) {
> +		kvm_iommu_unmap_memslots(kvm);
> +		return -EFAULT;
>   

EFAULT is for the kernel touching an invalid userspace page.  You should 
simply propagate the original error.

> +	}
> +
> +	intel_iommu_detach_dev(kvm->arch.intel_iommu_domain,
> +			       pdev->bus->number, pdev->devfn);
> +
> +	if (intel_iommu_context_mapping(kvm->arch.intel_iommu_domain,
> +					pdev)) {
> +		printk(KERN_ERR "Domain context map for %s failed",
> +		       pci_name(pdev));
> +		return -EFAULT;
>   

Ditto.

> +	}
> +	return 0;
> +}
> +
> +static int kvm_iommu_put_pages(struct kvm *kvm,
> +			       gfn_t base_gfn, unsigned long npages)
> +{
> +	gfn_t gfn = base_gfn;
> +	pfn_t pfn;
> +	struct dmar_domain *domain = kvm->arch.intel_iommu_domain;
> +	int i;
> +
> +	if (!domain)
> +		return -EFAULT;
>   

Ditto.

> +
> +	for (i = 0; i < npages; i++) {
> +		pfn = (pfn_t)intel_iommu_iova_to_pfn(domain,
> +						     gfn << PAGE_SHIFT);
>   

Overflow.

> +		kvm_release_pfn_clean(pfn);
> +		gfn++;
> +	}
> +	return 0;
> +}
> +
> +static int kvm_iommu_unmap_memslots(struct kvm *kvm)
> +{
> +	int i, rc;
> +	for (i = 0; i < kvm->nmemslots; i++) {
> +		rc = kvm_iommu_put_pages(kvm, kvm->memslots[i].base_gfn,
> +					 kvm->memslots[i].npages);
> +		if (rc)
> +			return rc;
> +	}
> +	return 0;
> +}
> +
>   

Unmapping should be unfailable, since there's no way to recover from it.

I'm unhappy with this wanton taking-of-references, but as long as mmu 
notifiers are unmerged, I see no other way.


> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d9aa931..e57c9e4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -33,6 +33,7 @@
>  #include <linux/module.h>
>  #include <linux/mman.h>
>  #include <linux/highmem.h>
> +#include <linux/intel-iommu.h>
>  
>  #include <asm/uaccess.h>
>  #include <asm/msr.h>
> @@ -253,9 +254,17 @@ kvm_vm_ioctl_device_assignment(struct kvm *kvm,
>  	}
>  
>  
> +	if (intel_iommu_found()) {
> +		r = kvm_iommu_map_guest(kvm, assigned_dev);
> +		if (r)
> +			goto out_intr;
> +	}
> +
>  out:
>  	mutex_unlock(&kvm->lock);
>  	return r;
> +out_intr:
> +	free_irq(dev->irq, match);
>  out_list_del:
>  	list_del(&match->list);
>  	pci_release_regions(dev);
>   


The split between device assignment and irq assignment I suggested 
earlier would definitely help this function.



-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: device assignemnt: updated patches
  2008-07-22 12:13 device assignemnt: updated patches Ben-Ami Yassour
  2008-07-22 12:13 ` [PATCH 1/4] KVM: Add irq ack notifier list Ben-Ami Yassour
@ 2008-07-26  9:05 ` Avi Kivity
  2008-07-26  9:24   ` Han, Weidong
  1 sibling, 1 reply; 19+ messages in thread
From: Avi Kivity @ 2008-07-26  9:05 UTC (permalink / raw)
  To: Ben-Ami Yassour; +Cc: amit.shah, kvm, muli, weidong.han, anthony

Ben-Ami Yassour wrote:
> Following are the device assignment patches with the fixes of the
> comments that were sent for the previous version.
>
> Here is the list of changes that were made with respect to the previous
> version:
> 1. Replace the interrupt ack hook patches with the notifiers list patch
> by Avi.
> 2. Remove code from ioapic - the ack notifier is called before the
> checking the irr bit.
> 3. Remove the interrupt ack work queue and handle it directly.
> 4. Minimize the function for finding an assigned device.
> 5. Remove the pt_lock, after making the changes above it is no longer
> needed.
> 6. Move declarations from kvm_para.h
> 7. Add irq array to ioctl API. Note that this is only a change to the
> API, currently only single irq is supported.
> 8. Renaming: use "assigned device" and not "passthrough device"
> 9. Fix device release error path
> 10. Moving the assigned devices list pointer to the device struct itself
> and remove the extra structure. 
>
> Pending comment: shared guest interrupts are not tested.
>
>   

(and not implemented, either).

Good progress; there are still many issues, but this is inevitable with 
such a large and complex patchset.

How are we standing with merging the changes to the VT-d driver, which 
are a prerequisite?


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: device assignemnt: updated patches
  2008-07-26  9:05 ` device assignemnt: updated patches Avi Kivity
@ 2008-07-26  9:24   ` Han, Weidong
  2008-07-26  9:32     ` Avi Kivity
  0 siblings, 1 reply; 19+ messages in thread
From: Han, Weidong @ 2008-07-26  9:24 UTC (permalink / raw)
  To: Avi Kivity, Ben-Ami Yassour; +Cc: amit.shah, kvm, muli, anthony

Avi Kivity wrote:
> Ben-Ami Yassour wrote:
>> Following are the device assignment patches with the fixes of the
>> comments that were sent for the previous version.
>> 
>> Here is the list of changes that were made with respect to the
>> previous version: 
>> 1. Replace the interrupt ack hook patches with the notifiers list
>> patch by Avi. 
>> 2. Remove code from ioapic - the ack notifier is called before the
>> checking the irr bit. 
>> 3. Remove the interrupt ack work queue and handle it directly.
>> 4. Minimize the function for finding an assigned device.
>> 5. Remove the pt_lock, after making the changes above it is no
>> longer needed. 
>> 6. Move declarations from kvm_para.h
>> 7. Add irq array to ioctl API. Note that this is only a change to the
>> API, currently only single irq is supported.
>> 8. Renaming: use "assigned device" and not "passthrough device"
>> 9. Fix device release error path
>> 10. Moving the assigned devices list pointer to the device struct
>> itself and remove the extra structure.
>> 
>> Pending comment: shared guest interrupts are not tested.
>> 
>> 
> 
> (and not implemented, either).
> 
> Good progress; there are still many issues, but this is inevitable
> with 
> such a large and complex patchset.
> 
> How are we standing with merging the changes to the VT-d driver, which
> are a prerequisite?

You mean pushing the changes into the VT-d driver first? But the VT-d driver
modification patch may need some changes during the following development.
How about making it stable in KVM first, then pushing it into the VT-d driver?

Randy (Weidong)

> 
> 
> --
> Do not meddle in the internals of kernels, for they are subtle and
> quick to panic. 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: device assignemnt: updated patches
  2008-07-26  9:24   ` Han, Weidong
@ 2008-07-26  9:32     ` Avi Kivity
  2008-07-26  9:48       ` Han, Weidong
  0 siblings, 1 reply; 19+ messages in thread
From: Avi Kivity @ 2008-07-26  9:32 UTC (permalink / raw)
  To: Han, Weidong; +Cc: Ben-Ami Yassour, amit.shah, kvm, muli, anthony

Han, Weidong wrote:
>> How are we standing with merging the changes to the VT-d driver, which
>> are a prerequisite?
>>     
>
> You mean pushing the changes into the VT-d driver first? But the VT-d driver
> modification patch may need some changes during the following development.
> How about making it stable in KVM first, then pushing it into the VT-d driver?
>   

Yeah, we do have a chicken-and-egg situation here. I guess I can carry 
that patch while we're working this out.

Longer term, we will have to move to a vendor neutral API so as to 
support amd iommu and non-x86 iommus.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: device assignemnt: updated patches
  2008-07-26  9:32     ` Avi Kivity
@ 2008-07-26  9:48       ` Han, Weidong
  0 siblings, 0 replies; 19+ messages in thread
From: Han, Weidong @ 2008-07-26  9:48 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Ben-Ami Yassour, amit.shah, kvm, muli, anthony

Avi Kivity wrote:
> Han, Weidong wrote:
>>> How are we standing with merging the changes to the VT-d driver,
>>> which are a prerequisite? 
>>> 
>> 
>> You mean pushing the changes into the VT-d driver first? But the VT-d
>> driver modification patch may need some changes during the following
>> development. How about making it stable in KVM first, then pushing it
>> into the VT-d driver?
>> 
> 
> Yeah, we do have a chicken-and-egg situation here. I guess I can carry
> that patch while we're working this out.
> 
> Longer term, we will have to move to a vendor neutral API so as to
> support amd iommu and non-x86 iommus.

I agree that arch-independent support is the right direction. But we should
get one implementation working first, then make it generic later.

Randy (Weidong)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 4/4] KVM: Device assignemnt with VT-d
  2008-07-22 12:13       ` [PATCH 4/4] KVM: Device assignemnt with VT-d Ben-Ami Yassour
  2008-07-22 12:18         ` device assignment - userspace part Ben-Ami Yassour
  2008-07-26  9:02         ` [PATCH 4/4] KVM: Device assignemnt with VT-d Avi Kivity
@ 2008-07-28  6:49         ` Han, Weidong
  2008-07-28 16:34           ` Ben-Ami Yassour
  2 siblings, 1 reply; 19+ messages in thread
From: Han, Weidong @ 2008-07-28  6:49 UTC (permalink / raw)
  To: Ben-Ami Yassour, avi; +Cc: amit.shah, kvm, muli, anthony, Kay, Allen M

Ben-Ami Yassour wrote:
> From: Kay, Allen M <allen.m.kay@intel.com>
> 
> This patch includes the functions to support VT-d for passthrough
> devices.
> 
> [Ben: fixed memory pinning, cleanup]
> 
> Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
> Signed-off-by: Weidong Han <weidong.han@intel.com>
> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
> ---
>  arch/x86/kvm/Makefile      |    2 +-
>  arch/x86/kvm/vtd.c         |  182 ++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c         |   11 +++
>  include/asm-x86/kvm_host.h |    3 +
>  include/linux/kvm_host.h   |    6 ++
>  virt/kvm/kvm_main.c        |    8 ++-
>  6 files changed, 210 insertions(+), 2 deletions(-)
>  create mode 100644 arch/x86/kvm/vtd.c
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 5b470a1..49c4ed3 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -41,6 +41,7 @@
>  #include <linux/pagemap.h>
>  #include <linux/mman.h>
>  #include <linux/swap.h>
> +#include <linux/intel-iommu.h>
> 
>  #include <asm/processor.h>
>  #include <asm/io.h>
> @@ -76,7 +77,7 @@ static inline int valid_vcpu(int n)
>  	return likely(n >= 0 && n < KVM_MAX_VCPUS);
>  }
> 
> -static inline int is_mmio_pfn(pfn_t pfn)
> +inline int is_mmio_pfn(pfn_t pfn)
>  {
>  	if (pfn_valid(pfn))
>  		return PageReserved(pfn_to_page(pfn));

I cannot find this is_mmio_pfn() definition in the main KVM tree, so I
failed to apply the patch.

Randy (Weidong)




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] KVM: pci device assignment
  2008-07-22 12:13   ` [PATCH 2/4] KVM: pci device assignment Ben-Ami Yassour
  2008-07-22 12:13     ` [PATCH 3/4] VT-d: changes to support KVM Ben-Ami Yassour
  2008-07-26  8:46     ` [PATCH 2/4] KVM: pci device assignment Avi Kivity
@ 2008-07-28  7:27     ` Yang, Sheng
  2008-07-28 16:41       ` Ben-Ami Yassour
  2008-07-28  8:33     ` Amit Shah
  3 siblings, 1 reply; 19+ messages in thread
From: Yang, Sheng @ 2008-07-28  7:27 UTC (permalink / raw)
  To: kvm; +Cc: Ben-Ami Yassour, avi, amit.shah, muli, weidong.han, anthony

On Tuesday 22 July 2008 20:13:53 Ben-Ami Yassour wrote:
>
> -int kvm_pic_read_irq(struct kvm_pic *s)
> +int kvm_pic_read_irq(struct kvm *kvm)
>  {
>  	int irq, irq2, intno;
> +	struct kvm_pic *s = pic_irqchip(kvm);
>
>  	irq = pic_get_irq(&s->pics[0]);
>  	if (irq >= 0) {
> @@ -186,6 +187,8 @@ int kvm_pic_read_irq(struct kvm_pic *s)
>  		irq = 7;
>  		intno = s->pics[0].irq_base + irq;
>  	}
> +	kvm_notify_acked_irq(kvm, irq);
> +
>  	pic_update_irq(s);
>
>  	return intno;

One coding style suggestion:

struct kvm_pic has *irq_request_opaque, which is in fact a struct kvm. How 
about containing kvm in kvm_pic explicitly? (It seems cleaner, though it 
needs more modification.)
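
A minimal sketch of the explicit back-pointer idea, with hypothetical _sketch types standing in for struct kvm and struct kvm_pic; recording the irq in last_acked_irq stands in for the kvm_notify_acked_irq() call from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm. */
struct kvm_sketch {
	int last_acked_irq;
};

/* Hypothetical stand-in for struct kvm_pic. */
struct kvm_pic_sketch {
	struct kvm_sketch *kvm;	/* explicit back-pointer, not an opaque void * */
	int irq_base;
};

/* The PIC can reach its owning VM without the caller passing it in. */
static int pic_read_irq_sketch(struct kvm_pic_sketch *s, int irq)
{
	assert(s != NULL && s->kvm != NULL);
	s->kvm->last_acked_irq = irq;	/* stands in for kvm_notify_acked_irq() */
	return s->irq_base + irq;
}
```

With the back-pointer in place, the function can keep taking the PIC as its argument and still notify the VM.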


Another thing is about host shared IRQs: do we want to follow the 
straightforward way to add them? That is, return IRQ_HANDLED from the irq 
handler and wait for the EOI of the guest?

I found it's not easy to provide support for MSI without shared IRQ 
support. We don't know when the guest wants to enable MSI or when it decides 
to share the IRQ. Maybe we can intercept the unmask of the IOAPIC to know 
that, but it's a little tricky.

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] KVM: pci device assignment
  2008-07-22 12:13   ` [PATCH 2/4] KVM: pci device assignment Ben-Ami Yassour
                       ` (2 preceding siblings ...)
  2008-07-28  7:27     ` Yang, Sheng
@ 2008-07-28  8:33     ` Amit Shah
  3 siblings, 0 replies; 19+ messages in thread
From: Amit Shah @ 2008-07-28  8:33 UTC (permalink / raw)
  To: Ben-Ami Yassour; +Cc: avi, kvm, muli, weidong.han, anthony

* On Tuesday 22 Jul 2008 17:43:53 Ben-Ami Yassour wrote:
> Based on a patch from: Amit Shah <amit.shah@qumranet.com>
>
> This patch adds support for handling PCI devices that are assigned to
> the guest.
>
> The device to be assigned to the guest is registered in the host kernel
> and interrupt delivery is handled. If a device is already assigned, or
> the device driver for it is still loaded on the host, the device
> assignment
> is failed by conveying a -EBUSY reply to the userspace.
>
> Devices that share their interrupt line are not supported at the moment.
>
> By itself, this patch will not make devices work within the guest.
> The VT-d extension is required to enable the device to perform DMA.
> Another alternative is PVDMA.
>
> Signed-off-by: Amit Shah <amit.shah@qumranet.com>
> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
> Signed-off-by: Weidong Han <weidong.han@intel.com>


> +	if (pci_enable_device(dev)) {
> +		printk(KERN_INFO "%s: Could not enable PCI device\n", __func__);
> +		r = -EBUSY;
> +		goto out_put;
> +	}
> +	r = pci_request_regions(dev, "kvm_assigned_device");
> +	if (r) {
> +		printk(KERN_INFO "%s: Could not get access to device regions\n",
> +		       __func__);
> +		goto out_disable;
> +	}

A driver might already be loaded for this device. If we fail to get the 
regions, it might mean the device was enabled and in use. That's the reason I 
didn't originally have a pci_disable_device() when pci_request_regions fails.

> +out:
> +	mutex_unlock(&kvm->lock);
> +	return r;
> +out_list_del:
> +	list_del(&match->list);
> +	pci_release_regions(dev);
> +out_disable:
> +	pci_disable_device(dev);
> +out_put:
> +	pci_dev_put(dev);
> +out_free:
> +	kfree(match);
> +	mutex_unlock(&kvm->lock);
> +	return r;

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 4/4] KVM: Device assignemnt with VT-d
  2008-07-28  6:49         ` Han, Weidong
@ 2008-07-28 16:34           ` Ben-Ami Yassour
  0 siblings, 0 replies; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-28 16:34 UTC (permalink / raw)
  To: Han, Weidong; +Cc: avi, amit.shah, kvm, Muli Ben-Yehuda, anthony, Kay, Allen M

On Mon, 2008-07-28 at 14:49 +0800, Han, Weidong wrote:
> Ben-Ami Yassour wrote:
> > From: Kay, Allen M <allen.m.kay@intel.com>
> > 
> >  {
> >  	if (pfn_valid(pfn))
> >  		return PageReserved(pfn_to_page(pfn));
> 
> I cannot find this is_mmio_pfn() definition in the main KVM tree, so I
> failed to apply the patch.

This patch was missing; I have now resent all the patches, including the
missing one.

Thanks,
Ben



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] KVM: pci device assignment
  2008-07-28  7:27     ` Yang, Sheng
@ 2008-07-28 16:41       ` Ben-Ami Yassour
  0 siblings, 0 replies; 19+ messages in thread
From: Ben-Ami Yassour @ 2008-07-28 16:41 UTC (permalink / raw)
  To: Yang, Sheng; +Cc: kvm, avi, amit.shah, Muli Ben-Yehuda, weidong.han, anthony

On Mon, 2008-07-28 at 15:27 +0800, Yang, Sheng wrote:
> On Tuesday 22 July 2008 20:13:53 Ben-Ami Yassour wrote:
> >
> > -int kvm_pic_read_irq(struct kvm_pic *s)
> > +int kvm_pic_read_irq(struct kvm *kvm)
> >  {
> >  	int irq, irq2, intno;
> > +	struct kvm_pic *s = pic_irqchip(kvm);
> >
> >  	irq = pic_get_irq(&s->pics[0]);
> >  	if (irq >= 0) {
> > @@ -186,6 +187,8 @@ int kvm_pic_read_irq(struct kvm_pic *s)
> >  		irq = 7;
> >  		intno = s->pics[0].irq_base + irq;
> >  	}
> > +	kvm_notify_acked_irq(kvm, irq);
> > +
> >  	pic_update_irq(s);
> >
> >  	return intno;
> 
> One coding style suggestion:
> 
> struct kvm_pic has *irq_request_opaque, which is in fact a struct kvm. How 
> about containing kvm in kvm_pic explicitly? (It seems cleaner, though it 
> needs more modification.)

Your suggestion is applied in the next set of patches that I sent.

> 
> 
> Another thing is about host shared IRQs: do we want to follow the 
> straightforward way to add them? That is, return IRQ_HANDLED from the irq 
> handler and wait for the EOI of the guest?

That's what the code in the current patches does now. It also disables
the irq until the EOI of the guest. If we don't do that, then the
interrupt handler keeps getting called (which causes many guest exits)
until the guest finally interacts with the device to release the
interrupt line.
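
The behaviour described above can be modelled with a small userspace sketch. It is a simplified model, not the code from the patches: struct irq_line_sketch and both functions are hypothetical, and setting or clearing the enabled flag stands in for disabling and re-enabling the host irq.

```c
#include <assert.h>
#include <stddef.h>

#define IRQ_NONE    0
#define IRQ_HANDLED 1

/* Hypothetical model of one assigned, level-triggered interrupt line. */
struct irq_line_sketch {
	int enabled;		/* host side: is the line unmasked? */
	int pending_in_guest;	/* injected but not yet acked by the guest */
	int handler_calls;	/* how often the host handler actually ran */
};

/* Host handler: mask the line and mark the interrupt for injection. */
static int host_irq_handler(struct irq_line_sketch *line)
{
	assert(line != NULL);
	if (!line->enabled)
		return IRQ_NONE;	/* masked: the handler does no work */
	line->handler_calls++;
	line->enabled = 0;		/* stands in for disabling the host irq */
	line->pending_in_guest = 1;
	return IRQ_HANDLED;
}

/* Guest EOI (the ack notifier path): unmask the line again. */
static void guest_eoi(struct irq_line_sketch *line)
{
	assert(line != NULL);
	line->pending_in_guest = 0;
	line->enabled = 1;		/* stands in for re-enabling the host irq */
}
```

While the device keeps the line asserted, the handler runs once per guest EOI instead of once per host interrupt, which is what avoids the repeated guest exits.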

Thanks,
Ben


^ permalink raw reply	[flat|nested] 19+ messages in thread
