From: Anthony Liguori <anthony@codemonkey.ws>
To: Amit Shah <amit.shah@qumranet.com>
Cc: muli@il.ibm.com, kvm@vger.kernel.org, weidong.han@intel.com,
allen.m.kay@intel.com,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
benami@il.ibm.com, Ian Jackson <Ian.Jackson@eu.citrix.com>
Subject: [Qemu-devel] Re: [PATCH 1/1] KVM/userspace: Support for assigning PCI devices to guests
Date: Wed, 27 Aug 2008 08:50:26 -0500 [thread overview]
Message-ID: <48B55BA2.7080409@codemonkey.ws> (raw)
In-Reply-To: <1219764544-29764-2-git-send-email-amit.shah@qumranet.com>
Hi Amit,
Amit Shah wrote:
> With this patch, we can assign a device on the host machine to a
> guest.
>
> A new command-line option, -pcidevice is added.
> For example, to invoke it for a device sitting at PCI bus:dev.fn
> 04:08.0 with host IRQ 18, use this:
>
> -pcidevice host=04:08.0
>
Let's make sure it also works via hotplug. We already have a pci_add
command. I would expect the syntax should be host:04:08.0 to be more in
line with usb.
> The host driver for the device, if any, is to be removed before
> assigning the device.
>
> This works only with the in-kernel irqchip method; to use the
> userspace irqchip, a kernel module (irqhook) and some extra changes
> are needed.
>
> Signed-off-by: Amit Shah <amit.shah@qumranet.com>
> ---
> libkvm/libkvm-x86.c | 14 +
> libkvm/libkvm.h | 27 ++
> qemu/Makefile.target | 1 +
> qemu/hw/device-assignment.c | 600 +++++++++++++++++++++++++++++++++++++++++++
> qemu/hw/device-assignment.h | 94 +++++++
> qemu/hw/isa.h | 2 +
> qemu/hw/pc.c | 9 +
> qemu/hw/pci.c | 12 +
> qemu/hw/pci.h | 1 +
> qemu/hw/piix_pci.c | 19 ++
> qemu/vl.c | 18 ++
> 11 files changed, 797 insertions(+), 0 deletions(-)
> create mode 100644 qemu/hw/device-assignment.c
> create mode 100644 qemu/hw/device-assignment.h
>
This patch is too big on it's on. It should be split into logical parts.
> diff --git a/qemu/Makefile.target b/qemu/Makefile.target
> index 2332fe3..750ecd5 100644
> --- a/qemu/Makefile.target
> +++ b/qemu/Makefile.target
> @@ -611,6 +611,7 @@ OBJS+= ide.o pckbd.o ps2.o vga.o $(SOUND_HW) dma.o
> OBJS+= fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o
> OBJS+= cirrus_vga.o apic.o parallel.o acpi.o piix_pci.o
> OBJS+= usb-uhci.o vmmouse.o vmport.o vmware_vga.o extboot.o
> +OBJS+= device-assignment.o
>
We should conditionally compile this into qemu only if we know we can
support it (has to be a Linux host for instance).
> ifeq ($(USE_KVM_PIT), 1)
> OBJS+= i8254-kvm.o
> endif
> diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
> new file mode 100644
> index 0000000..8bc4e5c
> --- /dev/null
> +++ b/qemu/hw/device-assignment.c
> @@ -0,0 +1,600 @@
> +/*
> + * Copyright (c) 2007, Neocleus Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + *
> + *
> + * Assign a PCI device from the host to a guest VM.
> + *
> + * Adapted for KVM by Qumranet.
> + *
> + * Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
> + * Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
> + * Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
> + */
> +#include <stdio.h>
> +#include <pthread.h>
>
Why is pthread being included?
> +#include <sys/io.h>
> +#include <sys/ioctl.h>
> +#include <linux/types.h>
> +
> +/* From linux/ioport.h */
> +#define IORESOURCE_IO 0x00000100 /* Resource type */
> +#define IORESOURCE_MEM 0x00000200
> +#define IORESOURCE_IRQ 0x00000400
> +#define IORESOURCE_DMA 0x00000800
> +#define IORESOURCE_PREFETCH 0x00001000 /* No side effects */
> +
> +#include "device-assignment.h"
> +#include "irq.h"
> +
> +#include "qemu-kvm.h"
> +#include <linux/kvm_para.h>
> +
> +extern FILE *logfile;
> +
> +/* #define DEVICE_ASSIGNMENT_DEBUG */
> +
> +#ifdef DEVICE_ASSIGNMENT_DEBUG
> +#define DEBUG(fmt, args...) fprintf(stderr, "%s: " fmt, __func__ , ## args)
> +#else
> +#define DEBUG(fmt, args...)
> +#endif
> +
> +#define assigned_dev_ioport_write(suffix) \
> + static void assigned_dev_ioport_write##suffix(void *opaque, uint32_t addr, \
> + uint32_t value) \
> + { \
> + assigned_dev_region_t *r_access = (assigned_dev_region_t *)opaque; \
> + uint32_t r_pio = (unsigned long)r_access->r_virtbase \
> + + (addr - r_access->e_physbase); \
> + if (r_access->debug & DEVICE_ASSIGNMENT_DEBUG_PIO) { \
> + fprintf(logfile, "assigned_dev_ioport_write" #suffix \
> + ": r_pio=%08x e_physbase=%08x" \
> + " r_virtbase=%08lx value=%08x\n", \
> + r_pio, (int)r_access->e_physbase, \
> + (unsigned long)r_access->r_virtbase, value); \
> + } \
> + iopl(3); \
> + out##suffix(value, r_pio); \
> + }
> +
> +assigned_dev_ioport_write(b)
> +assigned_dev_ioport_write(w)
> +assigned_dev_ioport_write(l)
>
Please don't do this sort of trickery with macros. Also don't write to
logfile, write to stderr. Debug should follow the same style as the
rest of the code.
Also, this whole file needs to be reformated to the QEMU style. How far
has this diverged from the Xen version of the code? If it's
sufficiently close, we should coordinate with those guys such that this
code can eventually go upstream.
> +
> +static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
> + uint32_t e_phys, uint32_t e_size, int type)
> +{
> + assigned_dev_t *r_dev = (assigned_dev_t *) pci_dev;
> + assigned_dev_region_t *region = &r_dev->v_addrs[region_num];
> + int first_map = (region->e_size == 0);
> + int ret = 0;
> +
> + DEBUG("e_phys=%08x r_virt=%p type=%d len=%08x region_num=%d \n",
> + e_phys, r_dev->v_addrs[region_num].r_virtbase, type, e_size,
> + region_num);
> +
> + region->e_physbase = e_phys;
> + region->e_size = e_size;
> +
> + if (!first_map)
> + kvm_destroy_phys_mem(kvm_context, e_phys, e_size);
>
We try to avoid open coding calls to libkvm functions directly within
QEMU. At the least, it should be wrapped with an if (kvm_enabled()).
> +static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num,
> + uint32_t addr, uint32_t size, int type)
> +{
> + assigned_dev_t *r_dev = (assigned_dev_t *) pci_dev;
> + int i;
> + uint32_t ((*rf[])(void *, uint32_t)) =
> + { assigned_dev_ioport_readb,
> + assigned_dev_ioport_readw,
> + assigned_dev_ioport_readl
> + };
> + void ((*wf[])(void *, uint32_t, uint32_t)) =
> + { assigned_dev_ioport_writeb,
> + assigned_dev_ioport_writew,
> + assigned_dev_ioport_writel
> + };
>
I don't think this any nicer (and certainly not less lines) than just
open coding the register_ioport_{read,write} calls.
> + r_dev->v_addrs[region_num].e_physbase = addr;
> + DEBUG("%s: address=0x%x type=0x%x len=%d region_num=%d \n",
> + __func__, addr, type, size, region_num);
> +
> + for (i = 0; i < 3; i++) {
> + register_ioport_write(addr, size, 1<<i, wf[i],
> + (void *) (r_dev->v_addrs + region_num));
> + register_ioport_read(addr, size, 1<<i, rf[i],
> + (void *) (r_dev->v_addrs + region_num));
> + }
> +}
> +
> +static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
> + uint32_t val, int len)
> +{
> + int fd, r;
> +
> + DEBUG("%s: (%x.%x): address=%04x val=0x%08x len=%d\n",
> + __func__, ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7),
> + (uint16_t) address, val, len);
> +
> + if (address == 0x4)
> + pci_default_write_config(d, address, val, len);
> +
> + if ((address >= 0x10 && address <= 0x24) || address == 0x34 ||
> + address == 0x3c || address == 0x3d) {
> + /* used for update-mappings (BAR emulation) */
> + pci_default_write_config(d, address, val, len);
> + return;
> + }
>
Is this first if check correct? It calls pci_default_write_config but
then doesn't return? This definitely needs a comment.
> + DEBUG("%s: NON BAR (%x.%x): address=%04x val=0x%08x len=%d\n",
> + __func__, ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7),
> + (uint16_t) address, val, len);
> + fd = ((assigned_dev_t *)d)->real_device.config_fd;
>
assigned_dev_t is not QEMU style. It should be AssignedDevice or
something similar.
> + lseek(fd, address, SEEK_SET);
>
Needs error checking.
I'm not going to go any further for now. The code needs an awful lot of
cleanup. Please try to do at least a couple passes of cleanup and post
again.
Regards,
Anthony Liguori
next parent reply other threads:[~2008-08-27 13:51 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1219764544-29764-1-git-send-email-amit.shah@qumranet.com>
[not found] ` <1219764544-29764-2-git-send-email-amit.shah@qumranet.com>
2008-08-27 13:50 ` Anthony Liguori [this message]
2008-08-27 14:02 ` [Qemu-devel] Re: [PATCH 1/1] KVM/userspace: Support for assigning PCI devices to guests Anthony Liguori
2008-08-27 14:20 ` Ian Jackson
2008-08-28 11:50 ` Amit Shah
2008-08-28 11:48 ` Amit Shah
2008-08-28 13:11 ` Anthony Liguori
2008-08-28 13:25 ` Amit Shah
2008-08-28 16:37 ` Ian Jackson
2008-08-29 16:54 ` Amit Shah
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48B55BA2.7080409@codemonkey.ws \
--to=anthony@codemonkey.ws \
--cc=Ian.Jackson@eu.citrix.com \
--cc=allen.m.kay@intel.com \
--cc=amit.shah@qumranet.com \
--cc=benami@il.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=muli@il.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=weidong.han@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).