From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: [PATCH 2/4] KVM: pci device assignment Date: Sat, 26 Jul 2008 11:46:13 +0300 Message-ID: <488AE455.3000500@qumranet.com> References: <1216728835-19721-1-git-send-email-benami@il.ibm.com> <1216728835-19721-2-git-send-email-benami@il.ibm.com> <1216728835-19721-3-git-send-email-benami@il.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: amit.shah@qumranet.com, kvm@vger.kernel.org, muli@il.ibm.com, weidong.han@intel.com, anthony@codemonkey.ws To: Ben-Ami Yassour Return-path: Received: from il.qumranet.com ([212.179.150.194]:49932 "EHLO il.qumranet.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751239AbYGZIqQ (ORCPT ); Sat, 26 Jul 2008 04:46:16 -0400 In-Reply-To: <1216728835-19721-3-git-send-email-benami@il.ibm.com> Sender: kvm-owner@vger.kernel.org List-ID: Ben-Ami Yassour wrote: > Based on a patch from: Amit Shah > > This patch adds support for handling PCI devices that are assigned to > the guest. > > The device to be assigned to the guest is registered in the host kernel > and interrupt delivery is handled. If a device is already assigned, or > the device driver for it is still loaded on the host, the device > assignment > is failed by conveying a -EBUSY reply to the userspace. > > Devices that share their interrupt line are not supported at the moment. > > By itself, this patch will not make devices work within the guest. > The VT-d extension is required to enable the device to perform DMA. > Another alternative is PVDMA. > > > +struct kvm_assigned_dev_kernel > +*kvm_find_assigned_dev(struct list_head *head, > Keep these two on the same line. > + > +static int > +kvm_vm_ioctl_device_assignment(struct kvm *kvm, > Ditto. > + struct kvm_assigned_dev *assigned_dev) > +{ > + int r = 0; > + struct kvm_assigned_dev_kernel *match; > + struct pci_dev *dev; > + > + if (assigned_dev->host.num_valid_irqs != 1) { > + printk(KERN_INFO "%s: Unsupported number of irqs %d\n", > + __func__, assigned_dev->host.num_valid_irqs); > + return -EINVAL; > + } > We also support zero irqs. While this patch doesn't do much with the information (only claim the device), this is still useful. > diff --git a/include/asm-x86/kvm.h b/include/asm-x86/kvm.h > index 8f13749..12b4b25 100644 > --- a/include/asm-x86/kvm.h > +++ b/include/asm-x86/kvm.h > @@ -208,4 +208,5 @@ struct kvm_pit_channel_state { > struct kvm_pit_state { > struct kvm_pit_channel_state channels[3]; > }; > + > #endif > Unrelated. > diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h > index e2864e6..34eb3e7 100644 > --- a/include/asm-x86/kvm_host.h > +++ b/include/asm-x86/kvm_host.h > @@ -325,6 +325,25 @@ struct kvm_irq_ack_notifier { > void (*irq_acked)(struct kvm_irq_ack_notifier *kian); > }; > > +/* For assigned devices, we schedule work in the system workqueue to > + * inject interrupts into the guest when an interrupt occurs on the > + * physical device and also when the guest acks the interrupt. > + */ > +struct kvm_assigned_dev_work { > + struct work_struct work; > + struct kvm_assigned_dev_kernel *assigned_dev; > +}; > + > +struct kvm_assigned_dev_kernel { > + struct kvm_irq_ack_notifier ack_notifier; > + struct list_head list; > + struct kvm_assigned_dev_info guest; > + struct kvm_assigned_dev_info host; > + struct kvm_assigned_dev_work int_work; > Just put the work_struct here, and use container_of() instead of ->assigned_dev. > struct kvm_arch{ > int naliases; > struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS]; > @@ -337,6 +356,7 @@ struct kvm_arch{ > * Hash table of struct kvm_mmu_page. > */ > struct list_head active_mmu_pages; > + struct list_head assigned_dev_head; > This is useful for non-x86 (except s390). It's okay to leave it to the ia64/ppc maintainers, though. > struct kvm_pic *vpic; > struct kvm_ioapic *vioapic; > struct kvm_pit *vpit; > diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h > index 76f3921..3aa1731 100644 > --- a/include/asm-x86/kvm_para.h > +++ b/include/asm-x86/kvm_para.h > @@ -143,5 +143,4 @@ static inline unsigned int kvm_arch_para_features(void) > } > > #endif > - > Spurious. > #endif > diff --git a/include/linux/kvm.h b/include/linux/kvm.h > index 6edba45..c436c08 100644 > --- a/include/linux/kvm.h > +++ b/include/linux/kvm.h > @@ -382,6 +382,7 @@ struct kvm_trace_rec { > #define KVM_CAP_PV_MMU 13 > #define KVM_CAP_MP_STATE 14 > #define KVM_CAP_COALESCED_MMIO 15 > +#define KVM_CAP_DEVICE_ASSIGNMENT 16 > > /* > * ioctls for VM fds > @@ -411,6 +412,8 @@ struct kvm_trace_rec { > _IOW(KVMIO, 0x67, struct kvm_coalesced_mmio_zone) > #define KVM_UNREGISTER_COALESCED_MMIO \ > _IOW(KVMIO, 0x68, struct kvm_coalesced_mmio_zone) > +#define KVM_UPDATE_ASSIGNED_DEVICE _IOR(KVMIO, 0x69, \ > + struct kvm_assigned_dev) > > /* > * ioctls for vcpu fds > @@ -475,4 +478,22 @@ struct kvm_trace_rec { > #define KVM_TRC_STLB_INVAL (KVM_TRC_HANDLER + 0x18) > #define KVM_TRC_PPC_INSTR (KVM_TRC_HANDLER + 0x19) > > +#define ASSIGNED_DEV_MAX_IRQ 16 > + > +/* Stores information for identifying host PCI devices assigned to the > + * guest: this is used in the host kernel and in the userspace. > + */ > +struct kvm_assigned_dev_info { > + __u32 busnr; > + __u32 devfn; > + __u32 irq[ASSIGNED_DEV_MAX_IRQ]; > + __u32 num_valid_irqs; /* currently only 1 is supported */ > +}; > Move num_valid_irqs before the array. Add padding so the structure is a multiple of 64 bits (needed so that the i386 and x86_64 ABIs are identical). > + > +/* Mapping between host and guest PCI device */ > +struct kvm_assigned_dev { > + struct kvm_assigned_dev_info guest; > + struct kvm_assigned_dev_info host; > +}; > + > guest.busnr and guest.devfn are meaningless; that's a wart. What do you think of this API: struct kvm_assigned_pci_dev { __u32 assigned_dev_id; __u32 busnr; __u32 devfn; __u32 flags; }; struct kvm_assigned_irq { __u32 assigned_dev_id; __u32 host_irq; __u32 guest_irq; __u32 flags; }; This automatically handles zero, one, or many irqs, and also allows for non-pci devices. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.