All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Paul Brook <paul@codesourcery.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] New device API
Date: Fri, 8 May 2009 02:27:55 -0300	[thread overview]
Message-ID: <20090508052755.GA13230@amt.cnet> (raw)
In-Reply-To: <200905051231.09759.paul@codesourcery.com>

Paul,

On Tue, May 05, 2009 at 12:31:09PM +0100, Paul Brook wrote:
> The attached patch is my attempt at a new internal API for device creation in 
> qemu.
> 
> The long term goal here is fully dynamic machine construction, i.e. a user 
> supplied configuration file/tree rather than the current hardcoded board 
> configs.
> 
> As a first step towards this, I've implemented an API that allows devices to 
> be created and configured without requiring device specific knowledge.
> 
> There are two sides to this API.  The device side allows a device to 
> register/configure functionality (e.g. MMIO regions, IRQs, character devices, 
> etc). Most of the infrastructure for this is already present, we just need to 
> configure it consistently, rather than on an ad-hoc basis. I expect this API 
> to remain largely unchanged[1].
> 
> The board side allows creation on configuration of devices. In the medium term 
> I expect this to go away, and be replaced by data driven configuration.
> 
> I also expect that this device abstraction will let itself to a system bus 
> model to solve many of the IOMMU/DMA/address mapping issues that qemu 
> currently can't handle.
> 
> There are still a few rough edges. Busses aren't handled in a particularly 
> consistent way, PCI isn't particularly well integrated, and the 
> implementation of things like qdev_get_chardev is an ugly hack mapping onto 
> current commandline options. However I believe the device side API is fairly 
> solid, and is a necessary prerequisite for fixing the bigger configuration 
> problem.
> 
> I've converted a bunch of devices, anyone interested in seeing how it fits 
> together in practice can pull from
>   git://repo.or.cz/qemu/pbrook.git qdev
> 
> It you have objections to or suggestion about this approach please speak up 
> now, but please bear in ming that this code is still somewhat in flux and the 
> caveats mentioned above.
> 
> Paul
> 
> [1] I subscribe to the linux kernel theory that stable internal APIs are a 
> coincidence, not a feature. i.e. a consistent internal API is good, but that 
> it should be changed whenever technically desirable. I have no interest in 
> maintaining a fixed API for the benefit of third parties.

Makes lots of sense. And also its perfectly fine for the API to evolve.

> commit 11c765848af6cb345a81f2722f141b305538182d
> Author: Paul Brook <paul@codesourcery.com>
> Date:   Mon May 4 17:13:08 2009 +0100
> 
>     Basic qdev infrastructure.
>     
>     Signed-off-by: Paul Brook <paul@codesourcery.com>
> 
> diff --git a/Makefile.target b/Makefile.target
> index f735105..a35e724 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -498,6 +498,7 @@ endif #CONFIG_BSD_USER
>  # System emulator target
>  ifndef CONFIG_USER_ONLY
>  
> +DEVICES=
>  OBJS=vl.o osdep.o monitor.o pci.o loader.o isa_mmio.o machine.o dma-helpers.o
>  # virtio has to be here due to weird dependency between PCI and virtio-net.
>  # need to fix this properly
> @@ -686,6 +687,13 @@ ifeq ($(TARGET_BASE_ARCH), m68k)
>  OBJS+= an5206.o mcf5206.o mcf_uart.o mcf_intc.o mcf5208.o mcf_fec.o
>  OBJS+= m68k-semi.o dummy_m68k.o
>  endif
> +
> +DEVICE_OBJS=$(addsuffix .o,$(DEVICES))
> +$(DEVICE_OBJS): CPPFLAGS+=-DDEVICE_NAME=$(subst -,_,$(@:.o=))
> +
> +OBJS+=$(DEVICE_OBJS)
> +OBJS+=devices.o qdev.o
> +
>  ifdef CONFIG_GDBSTUB
>  OBJS+=gdbstub.o gdbstub-xml.o
>  endif
> @@ -752,6 +760,9 @@ endif
>  qemu-options.h: $(SRC_PATH)/qemu-options.hx
>  	$(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@,"  GEN   $(TARGET_DIR)$@")
>  
> +devices.c: gen_devices.sh Makefile.target config.mak
> +	$(call quiet-command,$(SHELL) $(SRC_PATH)/gen_devices.sh $@ $(subst -,_,$(DEVICES)),"  GEN   $(TARGET_DIR)$@")
> +
>  clean:
>  	rm -f *.o *.a *~ $(PROGS) nwfpe/*.o fpu/*.o qemu-options.h gdbstub-xml.c
>  	rm -f *.d */*.d tcg/*.o
> diff --git a/gen_devices.sh b/gen_devices.sh
> new file mode 100644
> index 0000000..1d6ec12
> --- /dev/null
> +++ b/gen_devices.sh
> @@ -0,0 +1,17 @@
> +#! /bin/sh
> +# Call device init functions.
> +
> +file="$1"
> +shift
> +devices="$@"
> +echo '/* Generated by gen_devices.sh */' > $file
> +echo '#include "sysemu.h"' >> $file
> +for x in $devices ; do
> +    echo "void ${x}_register(void);" >> $file
> +done
> +echo "void register_devices(void)" >> $file
> +echo "{" >> $file
> +for x in $devices ; do
> +    echo "    ${x}_register();" >> $file
> +done
> +echo "}" >> $file
> diff --git a/hw/qdev.c b/hw/qdev.c
> new file mode 100644
> index 0000000..eb8c172
> --- /dev/null
> +++ b/hw/qdev.c
> @@ -0,0 +1,294 @@
> +/*
> + *  Dynamic device configuration and creation.
> + *
> + *  Copyright (c) 2009 CodeSourcery
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> + */
> +
> +#include "hw.h"
> +#include "qdev.h"
> +#include "sysemu.h"
> +
> +#include <assert.h>
> +
> +struct DeviceProperty
> +{
> +    const char *name;
> +    union {
> +        int i;
> +        void *ptr;
> +    } value;
> +    DeviceProperty *next;
> +};
> +
> +struct DeviceType
> +{
> +    const char *name;
> +    qdev_initfn init;
> +    void *opaque;
> +    int size;
> +    DeviceType *next;
> +};
> +
> +static DeviceType *device_type_list;

Perhaps "device driver" is more suitable? Or "device emulator"??

> +
> +/* Register a new device type.  */
> +DeviceType *qdev_register(const char *name, int size, qdev_initfn init,
> +                          void *opaque)
> +{
> +    DeviceType *t;
> +
> +    assert(size >= sizeof(DeviceState));
> +
> +    t = qemu_mallocz(sizeof(DeviceType));
> +    t->next = device_type_list;
> +    device_type_list = t;
> +    t->name = qemu_strdup(name);
> +    t->size = size;
> +    t->init = init;
> +    t->opaque = opaque;
> +
> +    return t;
> +}

void virtio_blk_register(void)
{
    pci_qdev_register("virtio-blk", sizeof(VirtIOBlock), virtio_blk_init);
}

So you'd expect all "device types" to be registered by the time the
actual machine initialization starts. Similar to modular device drivers
in Linux.

    bs = qdev_init_bdrv(&pci_dev->qdev, IF_VIRTIO);
    s->vdev.get_config = virtio_blk_update_config;

And eventually you'd move qdev_init_bdrv (and all BlockDriverState
knowledge) to happen somewhere else other than device emulation code?

> +
> +/* Create a new device.  This only initializes the device state structure
> +   and allows properties to be set.  qdev_init should be called to
> +   initialize the actual device emulation.  */
> +DeviceState *qdev_create(const char *name)
> +{
> +    DeviceType *t;
> +    DeviceState *dev;
> +
> +    for (t = device_type_list; t; t = t->next) {
> +        if (strcmp(t->name, name) == 0) {
> +            break;
> +        }
> +    }
> +    if (!t) {
> +        fprintf(stderr, "Unknown device '%s'\n", name);
> +        exit(1);
> +    }
> +
> +    dev = qemu_mallocz(t->size);
> +    dev->name = name;
> +    dev->type = t;
> +    return dev;
> +}
> +
> +/* Initialize a device.  Device properties should be set before calling
> +   this function.  IRQs and MMIO regions should be connected/mapped after
> +   calling this function.  */
> +void qdev_init(DeviceState *dev)
> +{
> +    dev->type->init(dev, dev->type->opaque);
> +}
>
> +
> +void qdev_connect_irq(DeviceState *dev, int n, qemu_irq irq)
> +{
> +    assert(n >= 0 && n < dev->num_irq);
> +    dev->irqs[n] = 0;
> +    if (dev->irqp[n]) {
> +        *dev->irqp[n] = irq;
> +    }
> +}
> +
> +void qdev_mmio_map(DeviceState *dev, int n, target_phys_addr_t addr)
> +{
> +    assert(n >= 0 && n < dev->num_mmio);
> +
> +    if (dev->mmio[n].addr == addr) {
> +        /* ??? region already mapped here.  */
> +        return;
> +    }
> +    if (dev->mmio[n].addr != (target_phys_addr_t)-1) {
> +        /* Unregister previous mapping.  */
> +        cpu_register_physical_memory(dev->mmio[n].addr, dev->mmio[n].size,
> +                                     IO_MEM_UNASSIGNED);
> +    }
> +    dev->mmio[n].addr = addr;
> +    if (dev->mmio[n].cb) {
> +        dev->mmio[n].cb(dev, addr);
> +    } else {
> +        cpu_register_physical_memory(addr, dev->mmio[n].size,
> +                                     dev->mmio[n].iofunc);
> +    }
> +}
> +
> +void qdev_set_bus(DeviceState *dev, void *bus)
> +{
> +    assert(!dev->bus);
> +    dev->bus = bus;
> +}
> +
> +static DeviceProperty *create_prop(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    /* TODO: Check for duplicate properties.  */
> +    prop = qemu_mallocz(sizeof(*prop));
> +    prop->name = qemu_strdup(name);
> +    prop->next = dev->props;
> +    dev->props = prop;
> +
> +    return prop;
> +}
> +
> +void qdev_set_prop_int(DeviceState *dev, const char *name, int value)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = create_prop(dev, name);
> +    prop->value.i = value;
> +}
> +
> +void qdev_set_prop_ptr(DeviceState *dev, const char *name, void *value)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = create_prop(dev, name);
> +    prop->value.ptr = value;
> +}
> +
> +
> +qemu_irq qdev_get_irq_sink(DeviceState *dev, int n)
> +{
> +    assert(n >= 0 && n < dev->num_irq_sink);
> +    return dev->irq_sink[n];
> +}
> +
> +void qdev_init_mmio(DeviceState *dev, target_phys_addr_t size, int iofunc)
> +{
> +    int n;
> +
> +    assert(dev->num_mmio < QDEV_MAX_MMIO);
> +    n = dev->num_mmio++;
> +    dev->mmio[n].addr = -1;
> +    dev->mmio[n].size = size;
> +    dev->mmio[n].iofunc = iofunc;
> +}
> +
> +void qdev_init_mmio_cb(DeviceState *dev, target_phys_addr_t size,
> +                       mmio_mapfunc cb)
> +{
> +    int n;
> +
> +    assert(dev->num_mmio < QDEV_MAX_MMIO);
> +    n = dev->num_mmio++;
> +    dev->mmio[n].addr = -1;
> +    dev->mmio[n].size = size;
> +    dev->mmio[n].cb = cb;
> +}
> +
> +/* Request an IRQ source.  The actual IRQ object may be populated later.  */
> +void qdev_init_irq(DeviceState *dev, qemu_irq *p)
> +{
> +    int n;
> +
> +    assert(dev->num_irq < QDEV_MAX_IRQ);
> +    n = dev->num_irq++;
> +    dev->irqp[n] = p;
> +}
> +
> +/* Pass IRQs from a target device.  */
> +void qdev_pass_irq(DeviceState *dev, DeviceState *target)
> +{
> +    int i;
> +    assert(dev->num_irq == 0);
> +    dev->num_irq = target->num_irq;
> +    for (i = 0; i < dev->num_irq; i++) {
> +        dev->irqp[i] = target->irqp[i];
> +    }
> +}
> +
> +/* Register device IRQ sinks.  */
> +void qdev_init_irq_sink(DeviceState *dev, qemu_irq_handler handler, int nirq)
> +{
> +    dev->num_irq_sink = nirq;
> +    dev->irq_sink = qemu_allocate_irqs(handler, dev, nirq);
> +}
> +
> +/* Get a character (serial) device interface.  */
> +CharDriverState *qdev_init_chardev(DeviceState *dev)
> +{
> +    static int next_serial;
> +    static int next_virtconsole;
> +    if (strncmp(dev->name, "virtio", 6) == 0) {
> +        return virtcon_hds[next_virtconsole++];
> +    } else {
> +        return serial_hds[next_serial++];
> +    }
> +}
> +
> +void *qdev_get_bus(DeviceState *dev)
> +{
> +    return dev->bus;
> +}
> +
> +static DeviceProperty *find_prop(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    for (prop = dev->props; prop; prop = prop->next) {
> +        if (strcmp(prop->name, name) == 0) {
> +            return prop;
> +        }
> +    }
> +    /* TODO: When this comes from a config file we will need to handle
> +       missing properties gracefully.  */
> +    abort();
> +}
> +
> +int qdev_get_prop_int(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = find_prop(dev, name);
> +    return prop->value.i;
> +}
> +
> +void *qdev_get_prop_ptr(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = find_prop(dev, name);
> +    return prop->value.ptr;
> +}
> +
> +DeviceState *qdev_create_varargs(const char *name,
> +                                 target_phys_addr_t addr, ...)
> +{
> +    DeviceState *dev;
> +    va_list va;
> +    qemu_irq irq;
> +    int n;
> +
> +    dev = qdev_create(name);
> +    qdev_init(dev);
> +    if (addr != (target_phys_addr_t)-1) {
> +        qdev_mmio_map(dev, 0, addr);
> +    }
> +    va_start(va, addr);
> +    n = 0;
> +    while (1) {
> +        irq = va_arg(va, qemu_irq);
> +        if (!irq) {
> +            break;
> +        }
> +        qdev_connect_irq(dev, n, irq);
> +        n++;
> +    }
> +    return dev;
> +}
> diff --git a/hw/qdev.h b/hw/qdev.h
> new file mode 100644
> index 0000000..08b2713
> --- /dev/null
> +++ b/hw/qdev.h
> @@ -0,0 +1,89 @@
> +#ifndef QDEV_H
> +#define QDEV_H
> +
> +#include "irq.h"
> +
> +typedef struct DeviceType DeviceType;
> +
> +#define QDEV_MAX_MMIO 5
> +#define QDEV_MAX_IRQ 32
> +
> +typedef struct DeviceProperty DeviceProperty;
> +
> +typedef void (*mmio_mapfunc)(DeviceState *dev, target_phys_addr_t addr);
> +
> +/* This structure should not be accessed directly.  We declare it here
> +   so that it can be embedded in individual device state structures.  */
> +struct DeviceState
> +{
> +    const char *name;
> +    DeviceType *type;
> +    void *bus;

I suppose parent_bus is more descriptive?.

Here you would have a tree node, eventually.

> +    int num_irq;
> +    qemu_irq irqs[QDEV_MAX_IRQ];
> +    qemu_irq *irqp[QDEV_MAX_IRQ];
> +    int num_irq_sink;
> +    qemu_irq *irq_sink;

This is somewhat complicated. So you need irqp pointers and irqs array
due to the way IRQ initialization is performed? 

> +    struct {
> +        target_phys_addr_t addr;
> +        target_phys_addr_t size;
> +        mmio_mapfunc cb;
> +        int iofunc;
> +    } mmio[QDEV_MAX_MMIO];
> +    DeviceProperty *props;
> +};
> +
> +
> +/*** Board API.  This should go away once we have a machine config file.  ***/
> +
> +DeviceState *qdev_create(const char *name);
> +void qdev_init(DeviceState *dev);
> +
> +/* Set properties between creation and init.  */
> +void qdev_set_bus(DeviceState *dev, void *bus);
> +void qdev_set_prop_int(DeviceState *dev, const char *name, int value);
> +void qdev_set_prop_ptr(DeviceState *dev, const char *name, void *value);
> +
> +/* Configure device after init.   */
> +void qdev_connect_irq(DeviceState *dev, int n, qemu_irq irq);
> +void qdev_mmio_map(DeviceState *dev, int n, target_phys_addr_t addr);
> +qemu_irq qdev_get_irq_sink(DeviceState *dev, int n);
> +
> +
> +/*** Device API.  ***/
> +
> +typedef void (*qdev_initfn)(DeviceState *dev, void *opaque);
> +
> +DeviceType *qdev_register(const char *name, int size, qdev_initfn init,
> +                          void *opaque);

Don't you want to cover savevm/restore callbacks at this level too? 

Device destruction surely (for hotunplug). Passing a structure with
callbacks would be nicer.

> +
> +/* Register device properties.  */
> +void qdev_init_mmio(DeviceState *dev, target_phys_addr_t size, int iofunc);
> +void qdev_init_mmio_cb(DeviceState *dev, target_phys_addr_t size,
> +                       mmio_mapfunc cb);
> +void qdev_init_irq(DeviceState *dev, qemu_irq *p);
> +void qdev_pass_irq(DeviceState *dev, DeviceState *target);

Can you explain how init_irq_sink/pass_irq are supposed to work?

> +void qdev_init_scsi(DeviceState *dev, SCSIAttachFn attach)
> +{
> +   int bus = next_scsi_bus++;
> +   int unit;
> +   int index;
> +
> +   for (unit = 0; unit < MAX_SCSI_DEVS; unit++) {
> +       index = drive_get_index(IF_SCSI, bus, unit);
> +       if (index == -1) {
> +           continue;
> +       }
> +       attach(dev, drives_table[index].bdrv, unit);
> +   }
> +}
> diff --git a/hw/qdev.h b/hw/qdev.h
> index 08b2713..79eb22f 100644
> --- a/hw/qdev.h
> +++ b/hw/qdev.h
> @@ -54,6 +54,7 @@ qemu_irq qdev_get_irq_sink(DeviceState *dev, int n);
>  /*** Device API.  ***/
>  
>  typedef void (*qdev_initfn)(DeviceState *dev, void *opaque);
> +typedef void (*SCSIAttachFn)(void *opaque, BlockDriverState *bdrv, int unit);
>  
>  DeviceType *qdev_register(const char *name, int size, qdev_initfn init,
>                            void *opaque);
> @@ -65,6 +66,7 @@ void qdev_init_mmio_cb(DeviceState *dev, target_phys_addr_t size,
>  void qdev_init_irq(DeviceState *dev, qemu_irq *p);
>  void qdev_pass_irq(DeviceState *dev, DeviceState *target);
>  void qdev_init_irq_sink(DeviceState *dev, qemu_irq_handler handler, int nirq);
> +void qdev_init_scsi(DeviceState *dev, SCSIAttachFn attach);
>  
>  CharDriverState *qdev_init_chardev(DeviceState *dev);
>  

Looks like a good start to me.

Markus should have more substantial comments (he's on vacation
this week). The tree driven machine initialization introduces some
complications (such as parent_bus->child device dependencies) while
handling a lot of the problems you are ignoring (eg the ->config method
is responsible for linking host device -> emulated device information,
etc).

  parent reply	other threads:[~2009-05-08  5:43 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-05 11:31 [Qemu-devel] [RFC] New device API Paul Brook
2009-05-05 15:56 ` Blue Swirl
2009-05-05 16:17   ` Paul Brook
2009-05-05 16:26     ` Blue Swirl
2009-05-05 16:35       ` Paul Brook
2009-05-05 18:22 ` Anthony Liguori
2009-05-05 22:42   ` Edgar E. Iglesias
2009-05-06  0:52   ` Paul Brook
2009-05-06  1:04     ` Paul Brook
2009-05-06 13:35       ` Anthony Liguori
2009-05-09 20:55       ` Anthony Liguori
2009-05-09 21:06         ` Paul Brook
2009-05-10  1:34           ` Anthony Liguori
2009-05-09 22:52         ` malc
2009-05-10  1:35           ` Anthony Liguori
2009-05-10  6:50             ` Andreas Färber
2009-05-10 18:38             ` malc
2009-05-10  1:37           ` Anthony Liguori
2009-05-05 22:25 ` Edgar E. Iglesias
2009-05-08  1:54 ` Zachary Amsden
2009-05-08 11:28   ` Paul Brook
2009-05-08 13:47   ` Anthony Liguori
2009-05-09  1:21     ` Zachary Amsden
2009-05-09 13:36       ` Anthony Liguori
2009-05-08  5:27 ` Marcelo Tosatti [this message]
2009-05-08 10:44   ` Paul Brook
2009-05-28 13:53   ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090508052755.GA13230@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=paul@codesourcery.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.