From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [PATCHv5] uio: add generic driver for PCI 2.3 devices Date: Mon, 20 Jul 2009 22:52:27 +0300 Message-ID: <20090720195227.GB19009@redhat.com> References: <20090720072934.GA12509@redhat.com> <20090720190942.GB7604@local> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: anthony@codemonkey.ws, avi@redhat.com, kvm@vger.kernel.org, chrisw@redhat.com, gregkh@suse.de, linux-kernel@vger.kernel.org, Jesse Barnes To: "Hans J. Koch" Return-path: Received: from mx2.redhat.com ([66.187.237.31]:49507 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750922AbZGTTxi (ORCPT ); Mon, 20 Jul 2009 15:53:38 -0400 Content-Disposition: inline In-Reply-To: <20090720190942.GB7604@local> Sender: kvm-owner@vger.kernel.org List-ID: On Mon, Jul 20, 2009 at 09:09:43PM +0200, Hans J. Koch wrote: > On Mon, Jul 20, 2009 at 10:29:34AM +0300, Michael S. Tsirkin wrote: > > This adds a generic uio driver that can bind to any PCI device. Fi= rst > > user will be virtualization where a qemu userspace process needs to= give > > guest OS access to the device. > >=20 > > Interrupts are handled using the Interrupt Disable bit in the PCI c= ommand > > register and Interrupt Status bit in the PCI status register. All = devices > > compliant to PCI 2.3 (circa 2002) and all compliant PCI Express dev= ices should > > support these bits. Driver detects this support, and won't bind to= devices > > which do not support the Interrupt Disable Bit in the command regis= ter. > >=20 > > It's expected that more features of interest to virtualization will= be > > added to this driver in the future. Possibilities are: mmap for dev= ice > > resources, MSI/MSI-X, eventfd (to interface with kvm), iommu. >=20 > Thanks for adding the docs! Looks alright to me. >=20 > Thanks, > Hans >=20 > >=20 > > Signed-off-by: Michael S. Tsirkin > > Acked-by: Chris Wright >=20 > Signed-off-by: Hans J. Koch Jesse just acked this patch in a private mail, as well. Acked-by: Jesse Barnes > >=20 > > --- > > Greg, here's a combined patch including documentation, for upstream= inclusion. > >=20 > > This is intended to solve the problem in virtualization that shared > > interrupts do not work with assigned devices. Earlier versions of t= his > > patch have circulated on kvm@vger. > >=20 > > Changes since v4: > > - add documentation in Docbook format > > Changes since v3: > > - minor driver version fix > > Changes since v2: > > - remove irqcontrol: user can enable interrupts by > > writing command register directly > > - don't claim resources as we don't support mmap yet, > > but note the intent to do so in the commit log > > Changes since v1: > > - some naming changes > > - do a single read to get both command and status register > >=20 > > Documentation/DocBook/uio-howto.tmpl | 163 ++++++++++++++++++++++= ++++ > > MAINTAINERS | 8 ++ > > drivers/uio/Kconfig | 10 ++ > > drivers/uio/Makefile | 1 + > > drivers/uio/uio_pci_generic.c | 207 ++++++++++++++++++++++= ++++++++++++ > > include/linux/pci_regs.h | 1 + > > 6 files changed, 390 insertions(+), 0 deletions(-) > > create mode 100644 drivers/uio/uio_pci_generic.c > >=20 > > diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/D= ocBook/uio-howto.tmpl > > index 8f6e3b2..4d4ce0e 100644 > > --- a/Documentation/DocBook/uio-howto.tmpl > > +++ b/Documentation/DocBook/uio-howto.tmpl > > @@ -25,6 +25,10 @@ > > 2006-2008 > > Hans-J=FCrgen Koch. > > > > + > > + 2009 > > + Red Hat Inc, Michael S. Tsirkin (mst@redhat.com) > > + > > =20 > > > > > > @@ -42,6 +46,13 @@ GPL version 2. > > =20 > > > > > > + 0.9 > > + 2009-07-16 > > + mst > > + Added generic pci driver > > + > > + > > + > > 0.8 > > 2008-12-24 > > hjk > > @@ -809,6 +820,158 @@ framework to set up sysfs files for this regi= on. Simply leave it alone. > > =20 > > > > =20 > > + > > + > > +Generic PCI UIO driver > > + > > + The generic driver is a kernel module named uio_pci_generic. > > + It can work with any device compliant to PCI 2.3 (circa 2002) and > > + any compliant PCI Express device. Using this, you only need to > > + write the userspace driver, removing the need to write > > + a hardware-specific kernel module. > > + > > + > > + > > +Making the driver recognize the device > > + > > +Since the driver does not declare any device ids, it will not get = loaded > > +automatically and will not automatically bind to any devices, you = must load it > > +and allocate id to the driver yourself. For example: > > + > > + modprobe uio_pci_generic > > + echo "8086 10f5" > /sys/bus/pci/drivers/uio_pci_gene= ric/new_id > > + > > + > > + > > +If there already is a hardware specific kernel driver for your dev= ice, the > > +generic driver still won't bind to it, in this case if you want to= use the > > +generic driver (why would you?) you'll have to manually unbind the= hardware > > +specific driver and bind the generic driver, like this: > > + > > + echo -n 0000:00:19.0 > /sys/bus/pci/drivers/e1000e/unbind > > + echo -n 0000:00:19.0 > /sys/bus/pci/drivers/uio_pci_generic= /bind > > + > > + > > + > > +You can verify that the device has been bound to the driver > > +by looking for it in sysfs, for example like the following: > > + > > + ls -l /sys/bus/pci/devices/0000:00:19.0/driver > > + > > +Which if successful should print > > + > > + .../0000:00:19.0/driver -> ../../../bus/pci/drivers/uio_pci_g= eneric > > + > > +Note that the generic driver will not bind to old PCI 2.2 devices. > > +If binding the device failed, run the following command: > > + > > + dmesg > > + > > +and look in the output for failure reasons > > + > > + > > + > > + > > +Things to know about uio_pci_generic > > + > > +Interrupts are handled using the Interrupt Disable bit in the PCI = command > > +register and Interrupt Status bit in the PCI status register. All= devices > > +compliant to PCI 2.3 (circa 2002) and all compliant PCI Express de= vices should > > +support these bits. uio_pci_generic detects this support, and won= 't bind to > > +devices which do not support the Interrupt Disable Bit in the comm= and register. > > + > > + > > +On each interrupt, uio_pci_generic sets the Interrupt Disable bit. > > +This prevents the device from generating further interrupts > > +until the bit is cleared. The userspace driver should clear this > > +bit before blocking and waiting for more interrupts. > > + > > + > > + > > +Writing userspace driver using uio_pci_generic > > + > > +Userspace driver can use pci sysfs interface, or the > > +libpci libray that wraps it, to talk to the device and to > > +re-enable interrupts by writing to the command register. > > + > > + > > + > > +Example code using uio_pci_generic > > + > > +Here is some sample userspace driver code using uio_pci_generic: > > + > > +#include <stdlib.h> > > +#include <stdio.h> > > +#include <unistd.h> > > +#include <sys/types.h> > > +#include <sys/stat.h> > > +#include <fcntl.h> > > +#include <errno.h> > > + > > +int main() > > +{ > > + int uiofd; > > + int configfd; > > + int err; > > + int i; > > + unsigned icount; > > + unsigned char command_high; > > + > > + uiofd =3D open("/dev/uio0", O_RDONLY); > > + if (uiofd < 0) { > > + perror("uio open:"); > > + return errno; > > + } > > + configfd =3D open("/sys/class/uio/uio0/device/config", = O_RDWR); > > + if (uiofd < 0) { > > + perror("config open:"); > > + return errno; > > + } > > + > > + /* Read and cache command value */ > > + err =3D pread(configfd, &command_high, 1, 5); > > + if (err !=3D 1) { > > + perror("command config read:"); > > + return errno; > > + } > > + command_high &=3D ~0x4; > > + > > + for(i =3D 0;; ++i) { > > + /* Print out a message, for debugging. */ > > + if (i =3D=3D 0) > > + fprintf(stderr, "Started uio test driver.\n"); > > + else > > + fprintf(stderr, "Interrupts: %d\n", icount); > > + > > + /****************************************/ > > + /* Here we got an interrupt from the > > + device. Do something to it. */ > > + /****************************************/ > > + > > + /* Re-enable interrupts. */ > > + err =3D pwrite(configfd, &command_high, 1, 5); > > + if (err !=3D 1) { > > + perror("config write:"); > > + break; > > + } > > + > > + /* Wait for next interrupt. */ > > + err =3D read(uiofd, &icount, 4); > > + if (err !=3D 4) { > > + perror("uio read:"); > > + break; > > + } > > + > > + } > > + return errno; > > +} > > + > > + > > + > > + > > + > > + > > + > > > > Further information > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 18c3f0c..39c7207 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -2538,6 +2538,14 @@ T: git git://git.kernel.org/pub/scm/linux/ke= rnel/git/arnd/asm-generic.git > > S: Maintained > > F: include/asm-generic > > =20 > > +GENERIC UIO DRIVER FOR PCI DEVICES > > +P: Michael S. Tsirkin > > +M: mst@redhat.com > > +L: kvm@vger.kernel.org > > +L: linux-kernel@vger.kernel.org > > +S: Supported > > +F: drivers/uio/uio_pci_generic.c > > + > > GFS2 FILE SYSTEM > > P: Steven Whitehouse > > M: swhiteho@redhat.com > > diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig > > index 7f86534..0f14c8e 100644 > > --- a/drivers/uio/Kconfig > > +++ b/drivers/uio/Kconfig > > @@ -89,4 +89,14 @@ config UIO_SERCOS3 > > =20 > > If you compile this as a module, it will be called uio_sercos3. > > =20 > > +config UIO_PCI_GENERIC > > + tristate "Generic driver for PCI 2.3 and PCI Express cards" > > + depends on PCI > > + default n > > + help > > + Generic driver that you can bind, dynamically, to any > > + PCI 2.3 compliant and PCI Express card. It is useful, > > + primarily, for virtualization scenarios. > > + If you compile this as a module, it will be called uio_pci_gene= ric. > > + > > endif > > diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile > > index 5c2586d..73b2e75 100644 > > --- a/drivers/uio/Makefile > > +++ b/drivers/uio/Makefile > > @@ -5,3 +5,4 @@ obj-$(CONFIG_UIO_PDRV_GENIRQ) +=3D uio_pdrv_genirq.= o > > obj-$(CONFIG_UIO_SMX) +=3D uio_smx.o > > obj-$(CONFIG_UIO_AEC) +=3D uio_aec.o > > obj-$(CONFIG_UIO_SERCOS3) +=3D uio_sercos3.o > > +obj-$(CONFIG_UIO_PCI_GENERIC) +=3D uio_pci_generic.o > > diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_ge= neric.c > > new file mode 100644 > > index 0000000..313da35 > > --- /dev/null > > +++ b/drivers/uio/uio_pci_generic.c > > @@ -0,0 +1,207 @@ > > +/* uio_pci_generic - generic UIO driver for PCI 2.3 devices > > + * > > + * Copyright (C) 2009 Red Hat, Inc. > > + * Author: Michael S. Tsirkin > > + * > > + * This work is licensed under the terms of the GNU GPL, version 2= =2E > > + * > > + * Since the driver does not declare any device ids, you must allo= cate > > + * id and bind the device to the driver yourself. For example: > > + * > > + * # echo "8086 10f5" > /sys/bus/pci/drivers/uio_pci_generic/new_i= d > > + * # echo -n 0000:00:19.0 > /sys/bus/pci/drivers/e1000e/unbind > > + * # echo -n 0000:00:19.0 > /sys/bus/pci/drivers/uio_pci_generic/b= ind > > + * # ls -l /sys/bus/pci/devices/0000:00:19.0/driver > > + * .../0000:00:19.0/driver -> ../../../bus/pci/drivers/uio_pci_gen= eric > > + * > > + * Driver won't bind to devices which do not support the Interrupt= Disable Bit > > + * in the command register. All devices compliant to PCI 2.3 (circ= a 2002) and > > + * all compliant PCI Express devices should support this bit. > > + */ > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +#define DRIVER_VERSION "0.01.0" > > +#define DRIVER_AUTHOR "Michael S. Tsirkin " > > +#define DRIVER_DESC "Generic UIO driver for PCI 2.3 devices" > > + > > +struct uio_pci_generic_dev { > > + struct uio_info info; > > + struct pci_dev *pdev; > > + spinlock_t lock; /* guards command register accesses */ > > +}; > > + > > +static inline struct uio_pci_generic_dev * > > +to_uio_pci_generic_dev(struct uio_info *info) > > +{ > > + return container_of(info, struct uio_pci_generic_dev, info); > > +} > > + > > +/* Interrupt handler. Read/modify/write the command register to di= sable > > + * the interrupt. */ > > +static irqreturn_t irqhandler(int irq, struct uio_info *info) > > +{ > > + struct uio_pci_generic_dev *gdev =3D to_uio_pci_generic_dev(info)= ; > > + struct pci_dev *pdev =3D gdev->pdev; > > + irqreturn_t ret =3D IRQ_NONE; > > + u32 cmd_status_dword; > > + u16 origcmd, newcmd, status; > > + > > + /* We do a single dword read to retrieve both command and status. > > + * Document assumptions that make this possible. */ > > + BUILD_BUG_ON(PCI_COMMAND % 4); > > + BUILD_BUG_ON(PCI_COMMAND + 2 !=3D PCI_STATUS); > > + > > + spin_lock_irq(&gdev->lock); > > + pci_block_user_cfg_access(pdev); > > + > > + /* Read both command and status registers in a single 32-bit oper= ation. > > + * Note: we could cache the value for command and move the status= read > > + * out of the lock if there was a way to get notified of user cha= nges > > + * to command register through sysfs. Should be good for shared i= rqs. */ > > + pci_read_config_dword(pdev, PCI_COMMAND, &cmd_status_dword); > > + origcmd =3D cmd_status_dword; > > + status =3D cmd_status_dword >> 16; > > + > > + /* Check interrupt status register to see whether our device > > + * triggered the interrupt. */ > > + if (!(status & PCI_STATUS_INTERRUPT)) > > + goto done; > > + > > + /* We triggered the interrupt, disable it. */ > > + newcmd =3D origcmd | PCI_COMMAND_INTX_DISABLE; > > + if (newcmd !=3D origcmd) > > + pci_write_config_word(pdev, PCI_COMMAND, newcmd); > > + > > + /* UIO core will signal the user process. */ > > + ret =3D IRQ_HANDLED; > > +done: > > + > > + pci_unblock_user_cfg_access(pdev); > > + spin_unlock_irq(&gdev->lock); > > + return ret; > > +} > > + > > +/* Verify that the device supports Interrupt Disable bit in comman= d register, > > + * per PCI 2.3, by flipping this bit and reading it back: this bit= was readonly > > + * in PCI 2.2. */ > > +static int __devinit verify_pci_2_3(struct pci_dev *pdev) > > +{ > > + u16 orig, new; > > + int err =3D 0; > > + > > + pci_block_user_cfg_access(pdev); > > + pci_read_config_word(pdev, PCI_COMMAND, &orig); > > + pci_write_config_word(pdev, PCI_COMMAND, > > + orig ^ PCI_COMMAND_INTX_DISABLE); > > + pci_read_config_word(pdev, PCI_COMMAND, &new); > > + /* There's no way to protect against > > + * hardware bugs or detect them reliably, but as long as we know > > + * what the value should be, let's go ahead and check it. */ > > + if ((new ^ orig) & ~PCI_COMMAND_INTX_DISABLE) { > > + err =3D -EBUSY; > > + dev_err(&pdev->dev, "Command changed from 0x%x to 0x%x: " > > + "driver or HW bug?\n", orig, new); > > + goto err; > > + } > > + if (!((new ^ orig) & PCI_COMMAND_INTX_DISABLE)) { > > + dev_warn(&pdev->dev, "Device does not support " > > + "disabling interrupts: unable to bind.\n"); > > + err =3D -ENODEV; > > + goto err; > > + } > > + /* Now restore the original value. */ > > + pci_write_config_word(pdev, PCI_COMMAND, orig); > > +err: > > + pci_unblock_user_cfg_access(pdev); > > + return err; > > +} > > + > > +static int __devinit probe(struct pci_dev *pdev, > > + const struct pci_device_id *id) > > +{ > > + struct uio_pci_generic_dev *gdev; > > + int err; > > + > > + if (!pdev->irq) { > > + dev_warn(&pdev->dev, "No IRQ assigned to device: " > > + "no support for interrupts?\n"); > > + return -ENODEV; > > + } > > + > > + err =3D pci_enable_device(pdev); > > + if (err) { > > + dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n", > > + __func__, err); > > + return err; > > + } > > + > > + err =3D verify_pci_2_3(pdev); > > + if (err) > > + goto err_verify; > > + > > + gdev =3D kzalloc(sizeof(struct uio_pci_generic_dev), GFP_KERNEL); > > + if (!gdev) { > > + err =3D -ENOMEM; > > + goto err_alloc; > > + } > > + > > + gdev->info.name =3D "uio_pci_generic"; > > + gdev->info.version =3D DRIVER_VERSION; > > + gdev->info.irq =3D pdev->irq; > > + gdev->info.irq_flags =3D IRQF_SHARED; > > + gdev->info.handler =3D irqhandler; > > + gdev->pdev =3D pdev; > > + spin_lock_init(&gdev->lock); > > + > > + if (uio_register_device(&pdev->dev, &gdev->info)) > > + goto err_register; > > + pci_set_drvdata(pdev, gdev); > > + > > + return 0; > > +err_register: > > + kfree(gdev); > > +err_alloc: > > +err_verify: > > + pci_disable_device(pdev); > > + return err; > > +} > > + > > +static void remove(struct pci_dev *pdev) > > +{ > > + struct uio_pci_generic_dev *gdev =3D pci_get_drvdata(pdev); > > + > > + uio_unregister_device(&gdev->info); > > + pci_disable_device(pdev); > > + kfree(gdev); > > +} > > + > > +static struct pci_driver driver =3D { > > + .name =3D "uio_pci_generic", > > + .id_table =3D NULL, /* only dynamic id's */ > > + .probe =3D probe, > > + .remove =3D remove, > > +}; > > + > > +static int __init init(void) > > +{ > > + pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); > > + return pci_register_driver(&driver); > > +} > > + > > +static void __exit cleanup(void) > > +{ > > + pci_unregister_driver(&driver); > > +} > > + > > +module_init(init); > > +module_exit(cleanup); > > + > > +MODULE_VERSION(DRIVER_VERSION); > > +MODULE_LICENSE("GPL v2"); > > +MODULE_AUTHOR(DRIVER_AUTHOR); > > +MODULE_DESCRIPTION(DRIVER_DESC); > > diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h > > index fcaee42..dd0bed4 100644 > > --- a/include/linux/pci_regs.h > > +++ b/include/linux/pci_regs.h > > @@ -42,6 +42,7 @@ > > #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable = */ > > =20 > > #define PCI_STATUS 0x06 /* 16 bits */ > > +#define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */ > > #define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */ > > #define PCI_STATUS_66MHZ 0x20 /* Support 66 Mhz PCI 2.1 bus */ > > #define PCI_STATUS_UDF 0x40 /* Support User Definable Features [= obsolete] */ > > --=20 > > 1.6.2.5