* [PATCH 0/3] virtio implementation (draft VI)
@ 2007-09-24 9:13 Rusty Russell
[not found] ` <1190625194.27805.199.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Rusty Russell @ 2007-09-24 9:13 UTC (permalink / raw)
To: kvm-devel, virtualization
I'm not reposting the drivers this time since they haven't changed
significantly. See http://lguest.ozlabs.org/patches for the whole
thing, from new-io.patch onwards).
Changes since last version:
- Switch to our own bus implementation, rather than relying on
an implementation-specific bus to back this up.
- Make virtio_config_ops much higher-level, don't assume layout
of config space.
- Expose "struct virtqueue" again and move ops inside that.
- virtio_ring uses 64 bit address, 32 bit length, not addr64 +
offset16 + len16.
- flags in virtio_ring to allow suppression of interrupts (by
guest) or notifications (by host).
I'm not too unhappy with this as it stands. Feedback welcome!
Cheers,
Rusty.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
^ permalink raw reply [flat|nested] 20+ messages in thread[parent not found: <1190625194.27805.199.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* [PATCH 1/3] virtio interface [not found] ` <1190625194.27805.199.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-24 9:14 ` Rusty Russell [not found] ` <1190625256.27805.201.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-24 9:14 UTC (permalink / raw) To: kvm-devel; +Cc: virtualization This attempts to implement a "virtual I/O" layer which should allow common drivers to be efficiently used across most virtual I/O mechanisms. It will no-doubt need further enhancement. The virtio drivers add buffers to virtio queues; as the buffers are consumed the driver "interrupt" callbacks are invoked. There is also a generic implementation of config space which drivers can query to get setup information from the host. Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> --- drivers/Kconfig | 2 drivers/Makefile | 1 drivers/virtio/Kconfig | 3 drivers/virtio/Makefile | 1 drivers/virtio/config.c | 13 ++ drivers/virtio/virtio.c | 187 +++++++++++++++++++++++++++++++++++++++ include/linux/mod_devicetable.h | 6 + include/linux/virtio.h | 110 ++++++++++++++++++++++ include/linux/virtio_config.h | 111 +++++++++++++++++++++++ 9 files changed, 434 insertions(+) =================================================================== --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -87,4 +87,6 @@ source "drivers/kvm/Kconfig" source "drivers/kvm/Kconfig" source "drivers/uio/Kconfig" + +source "drivers/virtio/Kconfig" endmenu =================================================================== --- a/drivers/Makefile +++ b/drivers/Makefile @@ -88,3 +88,4 @@ obj-$(CONFIG_HID) += hid/ obj-$(CONFIG_HID) += hid/ obj-$(CONFIG_PPC_PS3) += ps3/ obj-$(CONFIG_OF) += of/ +obj-$(CONFIG_VIRTIO) += virtio/ =================================================================== --- /dev/null +++ b/drivers/virtio/Kconfig @@ -0,0 +1,3 @@ +# Virtio always gets selected by whoever wants it. +config VIRTIO + bool =================================================================== --- /dev/null +++ b/drivers/virtio/Makefile @@ -0,0 +1,1 @@ +obj-$(CONFIG_VIRTIO) += virtio.o =================================================================== --- /dev/null +++ b/drivers/virtio/config.c @@ -0,0 +1,13 @@ +/* Configuration space parsing helpers for virtio. + * + * The configuration is [type][len][... len bytes ...] fields. + * + * Copyright 2007 Rusty Russell, IBM Corporation. + * GPL v2 or later. + */ +#include <linux/err.h> +#include <linux/virtio.h> +#include <linux/virtio_config.h> +#include <linux/bug.h> +#include <asm/system.h> + =================================================================== --- /dev/null +++ b/drivers/virtio/virtio.c @@ -0,0 +1,187 @@ +#include <linux/virtio.h> +#include <linux/spinlock.h> +#include <linux/virtio_config.h> + +static ssize_t device_show(struct device *_d, + struct device_attribute *attr, char *buf) +{ + struct virtio_device *dev = container_of(_d,struct virtio_device,dev); + return sprintf(buf, "%hu", dev->id.device); +} +static ssize_t vendor_show(struct device *_d, + struct device_attribute *attr, char *buf) +{ + struct virtio_device *dev = container_of(_d,struct virtio_device,dev); + return sprintf(buf, "%hu", dev->id.vendor); +} +static ssize_t status_show(struct device *_d, + struct device_attribute *attr, char *buf) +{ + struct virtio_device *dev = container_of(_d,struct virtio_device,dev); + return sprintf(buf, "0x%08x", dev->config->get_status(dev)); +} +static struct device_attribute virtio_dev_attrs[] = { + __ATTR_RO(device), + __ATTR_RO(vendor), + __ATTR_RO(status), + __ATTR_NULL +}; + +static inline int virtio_id_match(const struct virtio_device *dev, + const struct virtio_device_id *id) +{ + if (id->device != dev->id.device) + return 0; + + return id->vendor == VIRTIO_DEV_ANY_ID || id->vendor != dev->id.vendor; +} + +/* This looks through all the IDs a driver claims to support. If any of them + * match, we return 1 and the kernel will call virtio_dev_probe(). */ +static int virtio_dev_match(struct device *_dv, struct device_driver *_dr) +{ + unsigned int i; + struct virtio_device *dev = container_of(_dv,struct virtio_device,dev); + const struct virtio_device_id *ids; + + ids = container_of(_dr, struct virtio_driver, driver)->id_table; + for (i = 0; ids[i].device; i++) + if (virtio_id_match(dev, &ids[i])) + return 1; + return 0; +} + +struct virtio_bus { + struct bus_type bus; + struct device dev; +}; + +static struct virtio_bus virtio_bus = { + .bus = { + .name = "virtio", + .match = virtio_dev_match, + .dev_attrs = virtio_dev_attrs, + }, + .dev = { + /* Can override this if you have a real bus behind it. */ + .parent = NULL, + .bus_id = "virtio", + } +}; + +static void add_status(struct virtio_device *dev, unsigned status) +{ + dev->config->set_status(dev, dev->config->get_status(dev) | status); +} + +static int virtio_dev_probe(struct device *_d) +{ + int err; + struct virtio_device *dev = container_of(_d,struct virtio_device,dev); + struct virtio_driver *drv = container_of(dev->dev.driver, + struct virtio_driver, driver); + + add_status(dev, VIRTIO_CONFIG_S_DRIVER); + err = drv->probe(dev); + if (err) + add_status(dev, VIRTIO_CONFIG_S_FAILED); + else + add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK); + return err; +} + +int register_virtio_driver(struct virtio_driver *driver) +{ + driver->driver.bus = &virtio_bus.bus; + driver->driver.probe = virtio_dev_probe; + return driver_register(&driver->driver); +} +EXPORT_SYMBOL_GPL(register_virtio_driver); + +void unregister_virtio_driver(struct virtio_driver *driver) +{ + driver_unregister(&driver->driver); +} +EXPORT_SYMBOL_GPL(unregister_virtio_driver); + +int register_virtio_device(struct virtio_device *dev) +{ + int err; + + /* If the device hasn't already been given a parent, our bus is it. */ + if (!dev->dev.parent) + dev->dev.parent = &virtio_bus.dev; + dev->dev.bus = &virtio_bus.bus; + sprintf(dev->dev.bus_id, "%u", dev->index); + + /* Acknowledge that we've seen the device. */ + add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE); + + /* device_register() causes the bus infrastructure to look for a + * matching driver. */ + err = device_register(&dev->dev); + if (err) + add_status(dev, VIRTIO_CONFIG_S_FAILED); + return err; +} +EXPORT_SYMBOL_GPL(register_virtio_device); + +void unregister_virtio_device(struct virtio_device *dev) +{ + device_unregister(&dev->dev); +} +EXPORT_SYMBOL_GPL(unregister_virtio_device); + +int __virtio_config_val(struct virtio_device *vdev, + u8 type, void *val, size_t size) +{ + void *token; + unsigned int len; + + token = vdev->config->find(vdev, type, &len); + if (!token) + return -ENOENT; + + if (len != size) + return -EIO; + + vdev->config->get(vdev, token, val, size); + return 0; +} +EXPORT_SYMBOL_GPL(__virtio_config_val); + +int virtio_use_bit(struct virtio_device *vdev, + void *token, unsigned int len, unsigned int bitnum) +{ + unsigned long bits[16]; + + /* This makes it convenient to pass-through find() results. */ + if (!token) + return 0; + + /* bit not in range of this bitfield? */ + if (bitnum * 8 >= len / 2) + return 0; + + /* Giant feature bitfields are silly. */ + BUG_ON(len > sizeof(bits)); + vdev->config->get(vdev, token, bits, len); + + if (!test_bit(bitnum, bits)) + return 0; + + /* Set acknowledge bit, and write it back. */ + set_bit(bitnum + len * 8 / 2, bits); + vdev->config->set(vdev, token, bits, len); + return 1; +} +EXPORT_SYMBOL_GPL(virtio_use_bit); + +static int virtio_init(void) +{ + if (bus_register(&virtio_bus.bus) != 0 + || device_register(&virtio_bus.dev) != 0) + panic("virtio bus registration failed"); + return 0; +} +core_initcall(virtio_init); =================================================================== --- a/include/linux/mod_devicetable.h +++ b/include/linux/mod_devicetable.h @@ -340,4 +340,10 @@ struct parisc_device_id { #define PA_HVERSION_ANY_ID 0xffff #define PA_SVERSION_ANY_ID 0xffffffff +struct virtio_device_id { + __u32 device; + __u32 vendor; +}; +#define VIRTIO_DEV_ANY_ID 0xffffffff + #endif /* LINUX_MOD_DEVICETABLE_H */ =================================================================== --- /dev/null +++ b/include/linux/virtio.h @@ -0,0 +1,110 @@ +#ifndef _LINUX_VIRTIO_H +#define _LINUX_VIRTIO_H +/* Everything a virtio driver needs to work with any particular virtio + * implementation. */ +#include <linux/types.h> +#include <linux/scatterlist.h> +#include <linux/spinlock.h> +#include <linux/device.h> +#include <linux/mod_devicetable.h> + +/** + * virtqueue - a queue to register buffers for sending or receiving. + * @callback: the function to call when buffers are consumed (can be NULL). + * If this returns false, callbacks are suppressed until vq_ops->restart + * is called. + * @vdev: the virtio device this queue was created for. + * @vq_ops: the operations for this virtqueue (see below). + * @priv: a pointer for the virtqueue implementation to use. + */ +struct virtqueue +{ + bool (*callback)(struct virtqueue *vq); + struct virtio_device *vdev; + struct virtqueue_ops *vq_ops; + void *priv; +}; + +/** + * virtqueue_ops - operations for virtqueue abstraction layer + * @add_buf: expose buffer to other end + * vq: the struct virtqueue we're talking about. + * sg: the description of the buffer(s). + * out_num: the number of sg readable by other side + * in_num: the number of sg which are writable (after readable ones) + * data: the token identifying the buffer. + * Returns 0 or an error. + * @kick: update after add_buf + * vq: the struct virtqueue + * After one or more add_buf calls, invoke this to kick the other side. + * @get_buf: get the next used buffer + * vq: the struct virtqueue we're talking about. + * len: the length written into the buffer + * Returns NULL or the "data" token handed to add_buf. + * @restart: restart callbacks ater callback returned false. + * vq: the struct virtqueue we're talking about. + * This returns "false" (and doesn't re-enable) if there are pending + * buffers in the queue, to avoid a race. + * @shutdown: "unadd" all buffers. + * vq: the struct virtqueue we're talking about. + * Remove everything from the queue. + * + * Locking rules are straightforward: the driver is responsible for + * locking. No two operations may be invoked simultaneously. + * + * All operations can be called in any context. + */ +struct virtqueue_ops { + int (*add_buf)(struct virtqueue *vq, + struct scatterlist sg[], + unsigned int out_num, + unsigned int in_num, + void *data); + + void (*kick)(struct virtqueue *vq); + + void *(*get_buf)(struct virtqueue *vq, unsigned int *len); + + bool (*restart)(struct virtqueue *vq); + + void (*shutdown)(struct virtqueue *vq); +}; + +/** + * virtio_device - representation of a device using virtio + * @index: unique position on the virtio bus + * @dev: underlying device. + * @id: the device type identification (used to match it with a driver). + * @config: the configuration ops for this device. + * @priv: private pointer for the driver's use. + */ +struct virtio_device +{ + int index; + struct device dev; + struct virtio_device_id id; + struct virtio_config_ops *config; + void *priv; +}; + +int register_virtio_device(struct virtio_device *dev); +void unregister_virtio_device(struct virtio_device *dev); + +/** + * virtio_driver - operations for a virtio I/O driver + * @driver: underlying device driver (populate name and owner). + * @id_table: the ids serviced by this driver. + * @probe: the function to call when a device is found. Returns a token for + * remove, or PTR_ERR(). + * @remove: the function when a device is removed. + */ +struct virtio_driver { + struct device_driver driver; + const struct virtio_device_id *id_table; + int (*probe)(struct virtio_device *dev); + void (*remove)(struct virtio_device *dev); +}; + +int register_virtio_driver(struct virtio_driver *drv); +void unregister_virtio_driver(struct virtio_driver *drv); +#endif /* _LINUX_VIRTIO_H */ =================================================================== --- /dev/null +++ b/include/linux/virtio_config.h @@ -0,0 +1,111 @@ +#ifndef _LINUX_VIRTIO_CONFIG_H +#define _LINUX_VIRTIO_CONFIG_H +/* Virtio devices use a standardized configuration space to define their + * features and pass configuration information, but each implementation can + * store and access that space differently. */ +#include <linux/types.h> + +/* Status byte for guest to report progress, and synchronize config. */ +/* We have seen device and processed generic fields (VIRTIO_CONFIG_F_VIRTIO) */ +#define VIRTIO_CONFIG_S_ACKNOWLEDGE 1 +/* We have found a driver for the device. */ +#define VIRTIO_CONFIG_S_DRIVER 2 +/* Driver has used its parts of the config, and is happy */ +#define VIRTIO_CONFIG_S_DRIVER_OK 4 +/* We've given up on this device. */ +#define VIRTIO_CONFIG_S_FAILED 0x80 + +/* Feature byte (actually 7 bits availabe): */ +/* Requirements/features of the virtio implementation. */ +#define VIRTIO_CONFIG_F_VIRTIO 1 +/* Requirements/features of the virtqueue (may have more than one). */ +#define VIRTIO_CONFIG_F_VIRTQUEUE 2 + +#ifdef __KERNEL__ +struct virtio_device; + +/** + * virtio_config_ops - operations for configuring a virtio device + * @find: search for the next configuration field of the given type. + * vdev: the virtio_device + * type: the feature type + * len: the (returned) length of the field if found. + * Returns a token if found, or NULL. Never returnes the same field twice + * (ie. it's used up). + * @get: read the value of a configuration field after find(). + * vdev: the virtio_device + * token: the token returned from find(). + * buf: the buffer to write the field value into. + * len: the length of the buffer (given by find()). + * Note that contents are conventionally little-endian. + * @set: write the value of a configuration field after find(). + * vdev: the virtio_device + * token: the token returned from find(). + * buf: the buffer to read the field value from. + * len: the length of the buffer (given by find()). + * Note that contents are conventionally little-endian. + * @get_status: read the status byte + * vdev: the virtio_device + * Returns the status byte + * @set_status: write the status byte + * vdev: the virtio_device + * status: the new status byte + * @find_vq: find the first VIRTIO_CONFIG_F_VIRTQUEUE and create a virtqueue. + * vdev: the virtio_device + * callback: the virqtueue callback + * Returns the new virtqueue or ERR_PTR(). + * @del_vq: free a virtqueue found by find_vq(). + */ +struct virtio_config_ops +{ + void *(*find)(struct virtio_device *vdev, u8 type, unsigned *len); + void (*get)(struct virtio_device *vdev, void *token, + void *buf, unsigned len); + void (*set)(struct virtio_device *vdev, void *token, + const void *buf, unsigned len); + u8 (*get_status)(struct virtio_device *vdev); + void (*set_status)(struct virtio_device *vdev, u8 status); + struct virtqueue *(*find_vq)(struct virtio_device *vdev, + bool (*callback)(struct virtqueue *)); + void (*del_vq)(struct virtqueue *vq); +}; + +/** + * virtio_config_val - get a single virtio config and mark it used. + * @config: the virtio config space + * @type: the type to search for. + * @val: a pointer to the value to fill in. + * + * Once used, the config type is marked with VIRTIO_CONFIG_F_USED so it can't + * be found again. This version does endian conversion. */ +#define virtio_config_val(vdev, type, v) ({ \ + int _err = __virtio_config_val((vdev),(type),(v),sizeof(*(v))); \ + \ + BUILD_BUG_ON(sizeof(*(v)) != 1 && sizeof(*(v)) != 2 \ + && sizeof(*(v)) != 4 && sizeof(*(v)) != 8); \ + if (!_err) { \ + switch (sizeof(*(v))) { \ + case 2: le16_to_cpus(v); break; \ + case 4: le32_to_cpus(v); break; \ + case 8: le64_to_cpus(v); break; \ + } \ + } \ + _err; \ +}) + +int __virtio_config_val(struct virtio_device *dev, + u8 type, void *val, size_t size); + +/** + * virtio_use_bit - helper to use a feature bit in a bitfield value. + * @dev: the virtio device + * @token: the token as returned from vdev->config->find(). + * @len: the length of the field. + * @bitnum: the bit to test. + * + * If handed a NULL token, it returns false, otherwise returns bit status. + * If it's one, it sets the mirroring acknowledgement bit. */ +int virtio_use_bit(struct virtio_device *vdev, + void *token, unsigned int len, unsigned int bitnum); +#endif /* __KERNEL__ */ +#endif /* _LINUX_VIRTIO_CONFIG_H */ ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190625256.27805.201.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* [PATCH 2/3] virtio ring implementation [not found] ` <1190625256.27805.201.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-24 9:15 ` Rusty Russell [not found] ` <1190625307.27805.203.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2007-09-24 22:18 ` [PATCH 1/3] virtio interface Arnd Bergmann 1 sibling, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-24 9:15 UTC (permalink / raw) To: kvm-devel; +Cc: virtualization These helper routines supply most of the virtqueue_ops for hypervisors which want to use a ring for virtio. Unlike the previous lguest implementation: 1) The rings are variable sized (2^n-1 elements). 2) They have an unfortunate limit of 65535 bytes per sg element. 3) The page numbers are always 64 bit (PAE anyone?) 4) They no longer place used[] on a separate page, just a separate cacheline. 5) We do a modulo on a variable. We could be tricky if we cared. 6) Interrupts and notifies are suppressed using flags within the rings. Users need only implement the new_vq and free_vq hooks (KVM wants the guest to allocate the rings, lguest does it sanely). Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> --- arch/i386/lguest/Kconfig | 1 drivers/virtio/Kconfig | 5 drivers/virtio/Makefile | 1 drivers/virtio/virtio_ring.c | 316 ++++++++++++++++++++++++++++++++++++++++++ include/linux/virtio_ring.h | 119 +++++++++++++++ 5 files changed, 442 insertions(+) =================================================================== --- a/arch/i386/lguest/Kconfig +++ b/arch/i386/lguest/Kconfig @@ -2,6 +2,7 @@ config LGUEST_GUEST bool "Lguest guest support" select PARAVIRT depends on !X86_PAE + select VIRTIO_RING help Lguest is a tiny in-kernel hypervisor. Selecting this will allow your kernel to boot under lguest. This option will increase =================================================================== --- a/drivers/virtio/Kconfig +++ b/drivers/virtio/Kconfig @@ -1,3 +1,8 @@ # Virtio always gets selected by whoever wants it. config VIRTIO bool + +# Similarly the virtio ring implementation. +config VIRTIO_RING + bool + depends on VIRTIO =================================================================== --- a/drivers/virtio/Makefile +++ b/drivers/virtio/Makefile @@ -1,1 +1,2 @@ obj-$(CONFIG_VIRTIO) += virtio.o obj-$(CONFIG_VIRTIO) += virtio.o +obj-$(CONFIG_VIRTIO_RING) += virtio_ring.o =================================================================== --- /dev/null +++ b/drivers/virtio/virtio_ring.c @@ -0,0 +1,316 @@ +/* Virtio ring implementation. + * + * Copyright 2007 Rusty Russell IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ +#include <linux/virtio.h> +#include <linux/virtio_ring.h> +#include <linux/device.h> + +#ifdef DEBUG +/* For development, we want to crash whenever the ring is screwed. */ +#define BAD_RING(vq, fmt...) \ + do { dev_err(&vq->vq.vdev->dev, fmt); BUG(); } while(0) +#define START_USE(vq) \ + do { if ((vq)->in_use) panic("in_use = %i\n", (vq)->in_use); (vq)->in_use = __LINE__; mb(); } while(0) +#define END_USE(vq) \ + do { BUG_ON(!(vq)->in_use); (vq)->in_use = 0; mb(); } while(0) +#else +#define BAD_RING(vq, fmt...) \ + do { dev_err(&vq->vq.vdev->dev, fmt); (vq)->broken = true; } while(0) +#define START_USE(vq) +#define END_USE(vq) +#endif + +struct vring_virtqueue +{ + struct virtqueue vq; + + /* Actual memory layout for this queue */ + struct vring vring; + + /* Other side has made a mess, don't try any more. */ + bool broken; + + /* Number of free buffers */ + unsigned int num_free; + /* Head of free buffer list. */ + unsigned int free_head; + /* Number we've added since last sync. */ + unsigned int num_added; + + /* Last used index we've seen. */ + unsigned int last_used_idx; + + /* How to notify other side. FIXME: commonalize hcalls! */ + void (*notify)(struct virtqueue *vq); + +#ifdef DEBUG + /* They're supposed to lock for us. */ + unsigned int in_use; +#endif + + /* Tokens for callbacks. */ + void *data[]; +}; + +#define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq) + +static int vring_add_buf(struct virtqueue *_vq, + struct scatterlist sg[], + unsigned int out, + unsigned int in, + void *data) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + unsigned int i, avail, head, uninitialized_var(prev); + + BUG_ON(data == NULL); + BUG_ON(out + in > vq->vring.num); + BUG_ON(out + in == 0); + + START_USE(vq); + + if (vq->num_free < out + in) { + pr_debug("Can't add buf len %i - avail = %i\n", + out + in, vq->num_free); + END_USE(vq); + return -ENOSPC; + } + + /* We're about to use some buffers from the free list. */ + vq->num_free -= out + in; + + head = vq->free_head; + for (i = vq->free_head; out; i = vq->vring.desc[i].next, out--) { + vq->vring.desc[i].flags = VRING_DESC_F_NEXT; + vq->vring.desc[i].addr = (page_to_pfn(sg->page) << PAGE_SHIFT) + + sg->offset; + vq->vring.desc[i].len = sg->length; + prev = i; + sg++; + } + for (; in; i = vq->vring.desc[i].next, in--) { + vq->vring.desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE; + vq->vring.desc[i].addr = (page_to_pfn(sg->page) << PAGE_SHIFT) + + sg->offset; + vq->vring.desc[i].len = sg->length; + prev = i; + sg++; + } + /* Last one doesn't continue. */ + vq->vring.desc[prev].flags &= ~VRING_DESC_F_NEXT; + + /* Update free pointer */ + vq->free_head = i; + + /* Set token. */ + vq->data[head] = data; + + /* Put entry in available array (but don't update avail->idx until they + * do sync). FIXME: avoid modulus here? */ + avail = (vq->vring.avail->idx + vq->num_added++) % vq->vring.num; + vq->vring.avail->ring[avail] = head; + + pr_debug("Added buffer head %i to %p\n", head, vq); + END_USE(vq); + return 0; +} + +static void vring_kick(struct virtqueue *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + START_USE(vq); + /* Descriptors and available array need to be set before we expose the + * new available array entries. */ + wmb(); + + vq->vring.avail->idx += vq->num_added; + vq->num_added = 0; + + /* Need to update avail index before checking if we should notify */ + mb(); + + if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY)) + /* Prod other side to tell it about changes. */ + vq->notify(&vq->vq); + + END_USE(vq); +} + +static void detach_buf(struct vring_virtqueue *vq, unsigned int head) +{ + unsigned int i; + + /* Clear data ptr. */ + vq->data[head] = NULL; + + /* Put back on free list: find end */ + i = head; + while (vq->vring.desc[i].flags & VRING_DESC_F_NEXT) { + i = vq->vring.desc[i].next; + vq->num_free++; + } + + vq->vring.desc[i].next = vq->free_head; + vq->free_head = head; + /* Plus final descriptor */ + vq->num_free++; +} + +/* FIXME: We need to tell other side about removal, to synchronize. */ +static void vring_shutdown(struct virtqueue *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + unsigned int i; + + for (i = 0; i < vq->vring.num; i++) + detach_buf(vq, i); +} + +static inline bool more_used(const struct vring_virtqueue *vq) +{ + return vq->last_used_idx != vq->vring.used->idx; +} + +static void *vring_get_buf(struct virtqueue *_vq, unsigned int *len) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + void *ret; + unsigned int i; + + START_USE(vq); + + if (!more_used(vq)) { + pr_debug("No more buffers in queue\n"); + END_USE(vq); + return NULL; + } + + i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id; + *len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len; + + if (unlikely(i >= vq->vring.num)) { + BAD_RING(vq, "id %u out of range\n", i); + return NULL; + } + if (unlikely(!vq->data[i])) { + BAD_RING(vq, "id %u is not a head!\n", i); + return NULL; + } + + /* detach_buf clears data, so grab it now. */ + ret = vq->data[i]; + detach_buf(vq, i); + vq->last_used_idx++; + END_USE(vq); + return ret; +} + +static bool vring_restart(struct virtqueue *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + + START_USE(vq); + BUG_ON(!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)); + + /* We optimistically turn back on interrupts, then check if there was + * more to do. */ + vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT; + mb(); + if (unlikely(more_used(vq))) { + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; + END_USE(vq); + return false; + } + + END_USE(vq); + return true; +} + +irqreturn_t vring_interrupt(int irq, void *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + + pr_debug("virtqueue interrupt for %p\n", vq); + + if (unlikely(vq->broken)) + return IRQ_HANDLED; + + if (more_used(vq)) { + pr_debug("virtqueue callback for %p (%p)\n", + vq, vq->vq.callback); + if (!vq->vq.callback) + return IRQ_NONE; + if (!vq->vq.callback(&vq->vq)) + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; + } else + pr_debug("virtqueue %p no more used\n", vq); + + return IRQ_HANDLED; +} + +static struct virtqueue_ops vring_vq_ops = { + .add_buf = vring_add_buf, + .get_buf = vring_get_buf, + .kick = vring_kick, + .restart = vring_restart, + .shutdown = vring_shutdown, +}; + +struct virtqueue *vring_new_virtqueue(unsigned int num, + struct virtio_device *vdev, + void *pages, + void (*notify)(struct virtqueue *), + bool (*callback)(struct virtqueue *)) +{ + struct vring_virtqueue *vq; + unsigned int i; + + vq = kmalloc(sizeof(*vq) + sizeof(void *)*num, GFP_KERNEL); + if (!vq) + return NULL; + + vring_init(&vq->vring, num, pages); + vq->vq.callback = callback; + vq->vq.vdev = vdev; + vq->vq.vq_ops = &vring_vq_ops; + vq->notify = notify; + vq->broken = false; + vq->last_used_idx = 0; + vq->num_added = 0; +#ifdef DEBUG + vq->in_use = false; +#endif + + /* No callback? Tell other side not to bother us. */ + if (!callback) + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; + + /* Put everything in free lists. */ + vq->num_free = num; + vq->free_head = 0; + for (i = 0; i < num-1; i++) + vq->vring.desc[i].next = i+1; + + return &vq->vq; +} + +void vring_del_virtqueue(struct virtqueue *vq) +{ + kfree(to_vvq(vq)); +} + =================================================================== --- /dev/null +++ b/include/linux/virtio_ring.h @@ -0,0 +1,119 @@ +#ifndef _LINUX_VIRTIO_RING_H +#define _LINUX_VIRTIO_RING_H +/* An interface for efficient virtio implementation, currently for use by KVM + * and lguest, but hopefully others soon. Do NOT change this since it will + * break existing servers and clients. + * + * This header is BSD licensed so anyone can use the definitions to implement + * compatible drivers/servers. + * + * Copyright Rusty Russell IBM Corporation 2007. */ +#include <linux/types.h> + +/* This marks a buffer as continuing via the next field. */ +#define VRING_DESC_F_NEXT 1 +/* This marks a buffer as write-only (otherwise read-only). */ +#define VRING_DESC_F_WRITE 2 + +/* This means don't notify other side when buffer added. */ +#define VRING_USED_F_NO_NOTIFY 1 +/* This means don't interrupt guest when buffer consumed. */ +#define VRING_AVAIL_F_NO_INTERRUPT 1 + +/* Virtio ring descriptors: 16 bytes. These can chain together via "next". */ +struct vring_desc +{ + /* Address (guest-physical). */ + __u64 addr; + /* Length. */ + __u32 len; + /* The flags as indicated above. */ + __u16 flags; + /* We chain unused descriptors via this, too */ + __u16 next; +}; + +struct vring_avail +{ + __u16 flags; + __u16 idx; + __u16 ring[]; +}; + +/* u32 is used here for ids for padding reasons. */ +struct vring_used_elem +{ + /* Index of start of used descriptor chain. */ + __u32 id; + /* Total length of the descriptor chain which was used (written to) */ + __u32 len; +}; + +struct vring_used +{ + __u16 flags; + __u16 idx; + struct vring_used_elem ring[]; +}; + +struct vring { + unsigned int num; + + struct vring_desc *desc; + + struct vring_avail *avail; + + struct vring_used *used; +}; + +/* The standard layout for the ring is a continuous chunk of memory which looks + * like this. The used fields will be aligned to a "num+1" boundary. + * + * struct vring + * { + * // The actual descriptors (16 bytes each) + * struct vring_desc desc[num]; + * + * // A ring of available descriptor heads with free-running index. + * __u16 avail_flags; + * __u16 avail_idx; + * __u16 available[num]; + * + * // Padding so a correctly-chosen num value will cache-align used_idx. + * char pad[sizeof(struct vring_desc) - sizeof(avail_flags)]; + * + * // A ring of used descriptor heads with free-running index. + * __u16 used_flags; + * __u16 used_idx; + * struct vring_used_elem used[num]; + * }; + */ +static inline void vring_init(struct vring *vr, unsigned int num, void *p) +{ + vr->num = num; + vr->desc = p; + vr->avail = p + num*sizeof(struct vring); + vr->used = p + (num+1)*(sizeof(struct vring) + sizeof(__u16)); +} + +static inline unsigned vring_size(unsigned int num) +{ + return (num + 1) * (sizeof(struct vring_desc) + sizeof(__u16)) + + sizeof(__u32) + num * sizeof(struct vring_used_elem); +} + +#ifdef __KERNEL__ +#include <linux/irqreturn.h> +struct virtio_device; +struct virtqueue; + +struct virtqueue *vring_new_virtqueue(unsigned int num, + struct virtio_device *vdev, + void *pages, + void (*notify)(struct virtqueue *vq), + bool (*callback)(struct virtqueue *vq)); +void vring_del_virtqueue(struct virtqueue *vq); + +irqreturn_t vring_interrupt(int irq, void *_vq); +#endif /* __KERNEL__ */ +#endif /* _LINUX_VIRTIO_RING_H */ ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190625307.27805.203.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* [PATCH 3/3] virtio module alias support [not found] ` <1190625307.27805.203.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-24 9:16 ` Rusty Russell [not found] ` <1190625394.27805.206.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2007-09-24 13:44 ` [PATCH 2/3] virtio ring implementation Dor Laor 1 sibling, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-24 9:16 UTC (permalink / raw) To: kvm-devel; +Cc: Greg KH, virtualization This adds the logic to convert the virtio ids into module aliases, and includes a modalias entry in sysfs. Unfortunately this does not seem sufficient to have the module autoprobed at startup on my Ubuntu system. Greg? Am I missing some udev magic? Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> --- drivers/virtio/virtio.c | 9 +++++++++ scripts/mod/file2alias.c | 18 ++++++++++++++++++ 2 files changed, 27 insertions(+) =================================================================== --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -20,10 +20,19 @@ static ssize_t status_show(struct device struct virtio_device *dev = container_of(_d,struct virtio_device,dev); return sprintf(buf, "0x%08x", dev->config->get_status(dev)); } +static ssize_t modalias_show(struct device *_d, + struct device_attribute *attr, char *buf) +{ + struct virtio_device *dev = container_of(_d,struct virtio_device,dev); + + return sprintf(buf, "virtio:d%08Xv%08X\n", + dev->id.device, dev->id.vendor); +} static struct device_attribute virtio_dev_attrs[] = { __ATTR_RO(device), __ATTR_RO(vendor), __ATTR_RO(status), + __ATTR_RO(modalias), __ATTR_NULL }; =================================================================== --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -484,6 +484,20 @@ static int do_parisc_entry(const char *f return 1; } +/* Looks like: virtio:dNvN */ +static int do_virtio_entry(const char *filename, struct virtio_device_id *id, + char *alias) +{ + id->device = TO_NATIVE(id->device); + id->vendor = TO_NATIVE(id->vendor); + + strcpy(alias, "virtio:"); + ADD(alias, "d", 1, id->device); + ADD(alias, "v", id->vendor != VIRTIO_DEV_ANY_ID, id->vendor); + + return 1; +} + /* Ignore any prefix, eg. v850 prepends _ */ static inline int sym_is(const char *symbol, const char *name) { @@ -599,6 +613,10 @@ void handle_moddevtable(struct module *m do_table(symval, sym->st_size, sizeof(struct parisc_device_id), "parisc", do_parisc_entry, mod); + else if (sym_is(symname, "__mod_virtio_device_table")) + do_table(symval, sym->st_size, + sizeof(struct virtio_device_id), "virtio", + do_virtio_entry, mod); } /* Now add out buffered information to the generated C source */ ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190625394.27805.206.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH 3/3] virtio module alias support [not found] ` <1190625394.27805.206.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-24 16:02 ` Greg KH [not found] ` <20070924160221.GB5846-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Greg KH @ 2007-09-24 16:02 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel, virtualization On Mon, Sep 24, 2007 at 07:16:34PM +1000, Rusty Russell wrote: > This adds the logic to convert the virtio ids into module aliases, and > includes a modalias entry in sysfs. > > Unfortunately this does not seem sufficient to have the module > autoprobed at startup on my Ubuntu system. Greg? Am I missing some > udev magic? You also need to pass the MODINFO environment variable to the hotplug call so that udev can pick it up. hope this helps, greg k-h ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20070924160221.GB5846-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH 3/3] virtio module alias support [not found] ` <20070924160221.GB5846-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> @ 2007-09-25 0:50 ` Rusty Russell [not found] ` <1190681405.27805.243.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-25 0:50 UTC (permalink / raw) To: Greg KH; +Cc: kvm-devel, virtualization On Mon, 2007-09-24 at 09:02 -0700, Greg KH wrote: > On Mon, Sep 24, 2007 at 07:16:34PM +1000, Rusty Russell wrote: > > This adds the logic to convert the virtio ids into module aliases, and > > includes a modalias entry in sysfs. > > > > Unfortunately this does not seem sufficient to have the module > > autoprobed at startup on my Ubuntu system. Greg? Am I missing some > > udev magic? > > You also need to pass the MODINFO environment variable to the hotplug > call so that udev can pick it up. > > hope this helps, Well, I finally decoded this as follows: You need to implement the bus_type.uevent hook, and use add_uevent_var() to add a "MODALIAS=virtio:..." var. See drivers/pci/hotplug.c. Not sure why I need the modalias here when it's in sysfs. But, it works. Thanks! Rusty. --- Module autoprobing support for virtio drivers. This adds the logic to convert the virtio ids into module aliases, and includes a modalias entry in sysfs and the env var to make probing work. Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> --- drivers/virtio/virtio.c | 9 +++++++++ scripts/mod/file2alias.c | 18 ++++++++++++++++++ 2 files changed, 27 insertions(+) diff -r 4b377c58d110 drivers/virtio/virtio.c --- a/drivers/virtio/virtio.c Mon Sep 24 18:46:28 2007 +1000 +++ b/drivers/virtio/virtio.c Tue Sep 25 10:42:18 2007 +1000 @@ -19,11 +19,20 @@ static ssize_t status_show(struct device { struct virtio_device *dev = container_of(_d,struct virtio_device,dev); return sprintf(buf, "0x%08x", dev->config->get_status(dev)); +} +static ssize_t modalias_show(struct device *_d, + struct device_attribute *attr, char *buf) +{ + struct virtio_device *dev = container_of(_d,struct virtio_device,dev); + + return sprintf(buf, "virtio:d%08Xv%08X\n", + dev->id.device, dev->id.vendor); } static struct device_attribute virtio_dev_attrs[] = { __ATTR_RO(device), __ATTR_RO(vendor), __ATTR_RO(status), + __ATTR_RO(modalias), __ATTR_NULL }; @@ -48,6 +57,22 @@ static int virtio_dev_match(struct devic for (i = 0; ids[i].device; i++) if (virtio_id_match(dev, &ids[i])) return 1; + return 0; +} + +static int virtio_uevent(struct device *_dv, char **envp, + int num_envp, char *buffer, int buffer_size) +{ + struct virtio_device *dev = container_of(_dv,struct virtio_device,dev); + int i = 0, err; + int length = 0; + + err = add_uevent_var(envp, num_envp, &i, buffer, buffer_size, &length, + "MODALIAS=virtio:d%08Xv%08X", + dev->id.device, dev->id.vendor); + if (err) + return err; + envp[i] = NULL; return 0; } @@ -61,6 +86,7 @@ static struct virtio_bus virtio_bus = { .name = "virtio", .match = virtio_dev_match, .dev_attrs = virtio_dev_attrs, + .uevent = virtio_uevent, }, .dev = { /* Can override this if you have a real bus behind it. */ diff -r 4b377c58d110 scripts/mod/file2alias.c --- a/scripts/mod/file2alias.c Mon Sep 24 18:46:28 2007 +1000 +++ b/scripts/mod/file2alias.c Mon Sep 24 18:46:28 2007 +1000 @@ -484,6 +484,20 @@ static int do_parisc_entry(const char *f return 1; } +/* Looks like: virtio:dNvN */ +static int do_virtio_entry(const char *filename, struct virtio_device_id *id, + char *alias) +{ + id->device = TO_NATIVE(id->device); + id->vendor = TO_NATIVE(id->vendor); + + strcpy(alias, "virtio:"); + ADD(alias, "d", 1, id->device); + ADD(alias, "v", id->vendor != VIRTIO_DEV_ANY_ID, id->vendor); + + return 1; +} + /* Ignore any prefix, eg. v850 prepends _ */ static inline int sym_is(const char *symbol, const char *name) { @@ -599,6 +613,10 @@ void handle_moddevtable(struct module *m do_table(symval, sym->st_size, sizeof(struct parisc_device_id), "parisc", do_parisc_entry, mod); + else if (sym_is(symname, "__mod_virtio_device_table")) + do_table(symval, sym->st_size, + sizeof(struct virtio_device_id), "virtio", + do_virtio_entry, mod); } /* Now add out buffered information to the generated C source */ ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190681405.27805.243.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH 3/3] virtio module alias support [not found] ` <1190681405.27805.243.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-25 1:57 ` Greg KH [not found] ` <20070925015747.GA16011-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Greg KH @ 2007-09-25 1:57 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel, virtualization On Tue, Sep 25, 2007 at 10:50:04AM +1000, Rusty Russell wrote: > On Mon, 2007-09-24 at 09:02 -0700, Greg KH wrote: > > On Mon, Sep 24, 2007 at 07:16:34PM +1000, Rusty Russell wrote: > > > This adds the logic to convert the virtio ids into module aliases, and > > > includes a modalias entry in sysfs. > > > > > > Unfortunately this does not seem sufficient to have the module > > > autoprobed at startup on my Ubuntu system. Greg? Am I missing some > > > udev magic? > > > > You also need to pass the MODINFO environment variable to the hotplug > > call so that udev can pick it up. > > > > hope this helps, > > Well, I finally decoded this as follows: > > You need to implement the bus_type.uevent hook, and use > add_uevent_var() to add a "MODALIAS=virtio:..." var. See > drivers/pci/hotplug.c. Ah, sorry, thanks for intrepreting it properly :) > Not sure why I need the modalias here when it's in sysfs. But, it > works. udev does not read from sysfs anymore, it just uses the environment variables. It's faster and race-free that way. thanks, greg k-h ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20070925015747.GA16011-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH 3/3] virtio module alias support [not found] ` <20070925015747.GA16011-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> @ 2007-09-25 2:11 ` Rusty Russell [not found] ` <1190686275.27805.255.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-25 2:11 UTC (permalink / raw) To: Greg KH; +Cc: kvm-devel, virtualization On Mon, 2007-09-24 at 18:57 -0700, Greg KH wrote: > On Tue, Sep 25, 2007 at 10:50:04AM +1000, Rusty Russell wrote: > > Not sure why I need the modalias here when it's in sysfs. But, it > > works. > > udev does not read from sysfs anymore, it just uses the environment > variables. It's faster and race-free that way. Should I leave the modalias entry in sysfs for back compat, or not? This will be for >= 2.6.24 only... Thanks, Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190686275.27805.255.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH 3/3] virtio module alias support [not found] ` <1190686275.27805.255.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-25 3:10 ` Greg KH 0 siblings, 0 replies; 20+ messages in thread From: Greg KH @ 2007-09-25 3:10 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel, virtualization On Tue, Sep 25, 2007 at 12:11:15PM +1000, Rusty Russell wrote: > On Mon, 2007-09-24 at 18:57 -0700, Greg KH wrote: > > On Tue, Sep 25, 2007 at 10:50:04AM +1000, Rusty Russell wrote: > > > Not sure why I need the modalias here when it's in sysfs. But, it > > > works. > > > > udev does not read from sysfs anymore, it just uses the environment > > variables. It's faster and race-free that way. > > Should I leave the modalias entry in sysfs for back compat, or not? > This will be for >= 2.6.24 only... No, leave it in sysfs, other tools use it (like older versions of udev). thanks, greg k-h ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] virtio ring implementation [not found] ` <1190625307.27805.203.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2007-09-24 9:16 ` [PATCH 3/3] virtio module alias support Rusty Russell @ 2007-09-24 13:44 ` Dor Laor [not found] ` <46F7BF41.9060705-atKUWr5tajBWk0Htik3J/w@public.gmane.org> 1 sibling, 1 reply; 20+ messages in thread From: Dor Laor @ 2007-09-24 13:44 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel, virtualization [-- Attachment #1.1: Type: text/plain, Size: 1872 bytes --] Rusty Russell wrote: > > These helper routines supply most of the virtqueue_ops for hypervisors > which want to use a ring for virtio. Unlike the previous lguest > implementation: > > 1) The rings are variable sized (2^n-1 elements). > 2) They have an unfortunate limit of 65535 bytes per sg element. > 3) The page numbers are always 64 bit (PAE anyone?) > 4) They no longer place used[] on a separate page, just a separate > cacheline. > 5) We do a modulo on a variable. We could be tricky if we cared. > 6) Interrupts and notifies are suppressed using flags within the rings. > > Users need only implement the new_vq and free_vq hooks (KVM wants the > guest to allocate the rings, lguest does it sanely). > > Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> > [snip] > > +irqreturn_t vring_interrupt(int irq, void *_vq) > +{ > + struct vring_virtqueue *vq = to_vvq(_vq); > + > + pr_debug("virtqueue interrupt for %p\n", vq); > + > + if (unlikely(vq->broken)) > + return IRQ_HANDLED; > + > + if (more_used(vq)) { > + pr_debug("virtqueue callback for %p (%p)\n", > + vq, vq->vq.callback); > + if (!vq->vq.callback) > + return IRQ_NONE; > + if (!vq->vq.callback(&vq->vq)) > + vq->vring.avail->flags |= > VRING_AVAIL_F_NO_INTERRUPT; > + } else > + pr_debug("virtqueue %p no more used\n", vq); > + > + return IRQ_HANDLED; > +} > + > Seems like there is a problem with shared irq line, the interrupt handler always returns IRQ_HANDLED (except for the trivial case were the callback is null). It can be solved by having a host irq counter (in the shared ring) and a guest irq counter and return mb(); return (host_counter!=guest_counter)? IRQ_HANDLED:IRQ_NONE; Dor. [-- Attachment #1.2: Type: text/html, Size: 3740 bytes --] [-- Attachment #2: Type: text/plain, Size: 228 bytes --] ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ [-- Attachment #3: Type: text/plain, Size: 186 bytes --] _______________________________________________ kvm-devel mailing list kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org https://lists.sourceforge.net/lists/listinfo/kvm-devel ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <46F7BF41.9060705-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH 2/3] virtio ring implementation [not found] ` <46F7BF41.9060705-atKUWr5tajBWk0Htik3J/w@public.gmane.org> @ 2007-09-24 23:43 ` Rusty Russell 2007-09-25 13:32 ` Rusty Russell 1 sibling, 0 replies; 20+ messages in thread From: Rusty Russell @ 2007-09-24 23:43 UTC (permalink / raw) To: dor.laor-atKUWr5tajBWk0Htik3J/w; +Cc: kvm-devel, virtualization On Mon, 2007-09-24 at 15:44 +0200, Dor Laor wrote: > Seems like there is a problem with shared irq line, the interrupt > handler always returns IRQ_HANDLED (except for the trivial case > were the callback is null). > > It can be solved by having a host irq counter (in the shared ring) and > a guest irq counter and return > mb(); return (host_counter!=guest_counter)? IRQ_HANDLED:IRQ_NONE; Or we could make the callback return irqreturn_t and have an explicit "disable" hook to disable interrupts. Using the return value of the callback to disable the queue has always made be a little uncomfortable, but it's slightly more efficient than a separate hook. Thanks, Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] virtio ring implementation [not found] ` <46F7BF41.9060705-atKUWr5tajBWk0Htik3J/w@public.gmane.org> 2007-09-24 23:43 ` Rusty Russell @ 2007-09-25 13:32 ` Rusty Russell [not found] ` <1190727156.27805.332.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 1 sibling, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-25 13:32 UTC (permalink / raw) To: dor.laor-atKUWr5tajBWk0Htik3J/w; +Cc: kvm-devel, virtualization On Mon, 2007-09-24 at 15:44 +0200, Dor Laor wrote: > Rusty Russell wrote: > > +irqreturn_t vring_interrupt(int irq, void *_vq) > > +{ > > + struct vring_virtqueue *vq = to_vvq(_vq); > > + > > + pr_debug("virtqueue interrupt for %p\n", vq); > > + > > + if (unlikely(vq->broken)) > > + return IRQ_HANDLED; > > + > > + if (more_used(vq)) { > > + pr_debug("virtqueue callback for %p (%p)\n", > > + vq, vq->vq.callback); > > + if (!vq->vq.callback) > > + return IRQ_NONE; > > + if (!vq->vq.callback(&vq->vq)) > > + vq->vring.avail->flags |= > > VRING_AVAIL_F_NO_INTERRUPT; > > + } else > > + pr_debug("virtqueue %p no more used\n", vq); > > + > > + return IRQ_HANDLED; > > +} > > + > > > Seems like there is a problem with shared irq line, the interrupt > handler always returns IRQ_HANDLED (except for the trivial case > were the callback is null). To reply for a second time, this time with thought. Sorry. Yes, this code is wrong. It should be: irqreturn_t vring_interrupt(int irq, void *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); if (!more_used(vq)) { pr_debug("virtqueue interrupt with no work for %p\n", vq); return IRQ_NONE; } if (unlikely(vq->broken)) return IRQ_HANDLED; pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback); if (vq->vq.callback && !vq->vq.callback(&vq->vq)) vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; return IRQ_HANDLED; } Cheers, Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190727156.27805.332.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH 2/3] virtio ring implementation [not found] ` <1190727156.27805.332.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-25 17:15 ` Dor Laor [not found] ` <46F94243.2000602-atKUWr5tajBWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Dor Laor @ 2007-09-25 17:15 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel, virtualization [-- Attachment #1.1: Type: text/plain, Size: 2468 bytes --] Rusty Russell wrote: > > On Mon, 2007-09-24 at 15:44 +0200, Dor Laor wrote: > > Rusty Russell wrote: > > > +irqreturn_t vring_interrupt(int irq, void *_vq) > > > +{ > > > + struct vring_virtqueue *vq = to_vvq(_vq); > > > + > > > + pr_debug("virtqueue interrupt for %p\n", vq); > > > + > > > + if (unlikely(vq->broken)) > > > + return IRQ_HANDLED; > > > + > > > + if (more_used(vq)) { > > > + pr_debug("virtqueue callback for %p (%p)\n", > > > + vq, vq->vq.callback); > > > + if (!vq->vq.callback) > > > + return IRQ_NONE; > > > + if (!vq->vq.callback(&vq->vq)) > > > + vq->vring.avail->flags |= > > > VRING_AVAIL_F_NO_INTERRUPT; > > > + } else > > > + pr_debug("virtqueue %p no more used\n", vq); > > > + > > > + return IRQ_HANDLED; > > > +} > > > + > > > > > Seems like there is a problem with shared irq line, the interrupt > > handler always returns IRQ_HANDLED (except for the trivial case > > were the callback is null). > > To reply for a second time, this time with thought. Sorry. > > Yes, this code is wrong. It should be: > > irqreturn_t vring_interrupt(int irq, void *_vq) > { > struct vring_virtqueue *vq = to_vvq(_vq); > > if (!more_used(vq)) { > pr_debug("virtqueue interrupt with no work for %p\n", vq); > return IRQ_NONE; > } > > if (unlikely(vq->broken)) > return IRQ_HANDLED; > > pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback); > if (vq->vq.callback && !vq->vq.callback(&vq->vq)) > vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > > return IRQ_HANDLED; > } > > Cheers, > Rusty. > At the moment it's not good enough, there is a potential race were the guest optimistically turn off the VRING_AVAIL_F_NO_INTERRUPT in the vring_restart and afterwards finds there are more_used so it consume them in the poll function. If in the middle, packets arrive the host will see the flag is off and send irq. In that case the above irq handler will report IRQ_NONE. It's also not trivial to cancel the optimistic approach in the restart function since then there might be another race that will result in missing irq reaching the guest. I'll try to think of something better than a hypercall for switching irq on/off. Cheers mate, Dor. [-- Attachment #1.2: Type: text/html, Size: 4910 bytes --] [-- Attachment #2: Type: text/plain, Size: 228 bytes --] ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ [-- Attachment #3: Type: text/plain, Size: 186 bytes --] _______________________________________________ kvm-devel mailing list kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org https://lists.sourceforge.net/lists/listinfo/kvm-devel ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <46F94243.2000602-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH 2/3] virtio ring implementation [not found] ` <46F94243.2000602-atKUWr5tajBWk0Htik3J/w@public.gmane.org> @ 2007-09-25 23:37 ` Rusty Russell [not found] ` <1190763475.2227.20.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-25 23:37 UTC (permalink / raw) To: dor.laor-atKUWr5tajBWk0Htik3J/w; +Cc: kvm-devel, virtualization On Tue, 2007-09-25 at 19:15 +0200, Dor Laor wrote: > At the moment it's not good enough, there is a potential race were the > guest optimistically turn off > the VRING_AVAIL_F_NO_INTERRUPT in the vring_restart and afterwards > finds there are more_used so > it consume them in the poll function. > If in the middle, packets arrive the host will see the flag is off and > send irq. > In that case the above irq handler will report IRQ_NONE. Good point. On the one hand, reporting IRQ_NONE occasionally is not fatal. On the other, it'd be nice to get this right. > It's also not trivial to cancel the optimistic approach in the restart > function since then there might be another race > that will result in missing irq reaching the guest. I did it optimistically because otherwise we need two barriers (and still have a window). > I'll try to think of something better than a hypercall for switching > irq on/off. Erk. That would be v. bad... Thanks, Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190763475.2227.20.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH 2/3] virtio ring implementation [not found] ` <1190763475.2227.20.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-26 9:08 ` Dor Laor 0 siblings, 0 replies; 20+ messages in thread From: Dor Laor @ 2007-09-26 9:08 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel, virtualization Rusty Russell wrote: > On Tue, 2007-09-25 at 19:15 +0200, Dor Laor wrote: > >> At the moment it's not good enough, there is a potential race were the >> guest optimistically turn off >> the VRING_AVAIL_F_NO_INTERRUPT in the vring_restart and afterwards >> finds there are more_used so >> it consume them in the poll function. >> If in the middle, packets arrive the host will see the flag is off and >> send irq. >> In that case the above irq handler will report IRQ_NONE. >> > > Good point. On the one hand, reporting IRQ_NONE occasionally is not > fatal. On the other, it'd be nice to get this right. > > >> It's also not trivial to cancel the optimistic approach in the restart >> function since then there might be another race >> that will result in missing irq reaching the guest. >> > > I did it optimistically because otherwise we need two barriers (and > still have a window). > > >> I'll try to think of something better than a hypercall for switching >> irq on/off. >> > > Erk. That would be v. bad... > Actually double barrier will satisfy non-shared irq lines and will report irq handling right. The more complex scenario is shared irqs (pci...), since several devices are raising the irq line (level triggered) in the same time, while the tap add receive packets it always leaves a potential race for irq handled return value. So seems like we better not have shared irq and if only if shared irq is a must we should do the naughty naughty things. If returning IRQ_NONE once in a while is not too bad (including the matching windows implementation) then there is no issue. Dor. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] virtio interface [not found] ` <1190625256.27805.201.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2007-09-24 9:15 ` [PATCH 2/3] virtio ring implementation Rusty Russell @ 2007-09-24 22:18 ` Arnd Bergmann [not found] ` <200709250018.12705.arnd-r2nGTMty4D4@public.gmane.org> 1 sibling, 1 reply; 20+ messages in thread From: Arnd Bergmann @ 2007-09-24 22:18 UTC (permalink / raw) To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f; +Cc: virtualization On Monday 24 September 2007, Rusty Russell wrote: > This attempts to implement a "virtual I/O" layer which should allow > common drivers to be efficiently used across most virtual I/O > mechanisms. It will no-doubt need further enhancement. Hi Rusty, Thanks for your update, as you can imagine, I'm much happier with this version ;-) > + > +struct virtio_bus { > + struct bus_type bus; > + struct device dev; > +}; > + > +static struct virtio_bus virtio_bus = { > + .bus = { > + .name = "virtio", > + .match = virtio_dev_match, > + .dev_attrs = virtio_dev_attrs, > + }, > + .dev = { > + /* Can override this if you have a real bus behind it. */ > + .parent = NULL, > + .bus_id = "virtio", > + } > +}; This is a pattern I've seen a few times before, but could never understand what it's good for. What is your reason for defining a new data structure that is used only once, instead of just static struct bus_type virtio_bus_type; static struct device virtio_root_dev; Also, I would not mix the two in a single source file. Instead, I think every driver that can provide virtio devices (pci, lguest, ...) should be responsible for setting the parent appropriately. > +#ifndef _LINUX_VIRTIO_CONFIG_H > +#define _LINUX_VIRTIO_CONFIG_H > +/* Virtio devices use a standardized configuration space to define their > + * features and pass configuration information, but each implementation can > + * store and access that space differently. */ > +#include <linux/types.h> > + > +/* Status byte for guest to report progress, and synchronize config. */ > +/* We have seen device and processed generic fields (VIRTIO_CONFIG_F_VIRTIO) */ > +#define VIRTIO_CONFIG_S_ACKNOWLEDGE 1 > +/* We have found a driver for the device. */ > +#define VIRTIO_CONFIG_S_DRIVER 2 > +/* Driver has used its parts of the config, and is happy */ > +#define VIRTIO_CONFIG_S_DRIVER_OK 4 > +/* We've given up on this device. */ > +#define VIRTIO_CONFIG_S_FAILED 0x80 > + > +/* Feature byte (actually 7 bits availabe): */ > +/* Requirements/features of the virtio implementation. */ > +#define VIRTIO_CONFIG_F_VIRTIO 1 > +/* Requirements/features of the virtqueue (may have more than one). */ > +#define VIRTIO_CONFIG_F_VIRTQUEUE 2 > + > +#ifdef __KERNEL__ > +struct virtio_device; > + > +/** > + * virtio_config_ops - operations for configuring a virtio device > + * @find: search for the next configuration field of the given type. > + * vdev: the virtio_device > + * type: the feature type > + * len: the (returned) length of the field if found. > + * Returns a token if found, or NULL. Never returnes the same field twice > + * (ie. it's used up). > + * @get: read the value of a configuration field after find(). > + * vdev: the virtio_device > + * token: the token returned from find(). > + * buf: the buffer to write the field value into. > + * len: the length of the buffer (given by find()). > + * Note that contents are conventionally little-endian. > + * @set: write the value of a configuration field after find(). > + * vdev: the virtio_device > + * token: the token returned from find(). > + * buf: the buffer to read the field value from. > + * len: the length of the buffer (given by find()). > + * Note that contents are conventionally little-endian. > + * @get_status: read the status byte > + * vdev: the virtio_device > + * Returns the status byte > + * @set_status: write the status byte > + * vdev: the virtio_device > + * status: the new status byte > + * @find_vq: find the first VIRTIO_CONFIG_F_VIRTQUEUE and create a virtqueue. > + * vdev: the virtio_device > + * callback: the virqtueue callback > + * Returns the new virtqueue or ERR_PTR(). > + * @del_vq: free a virtqueue found by find_vq(). > + */ > +struct virtio_config_ops > +{ > + void *(*find)(struct virtio_device *vdev, u8 type, unsigned *len); > + void (*get)(struct virtio_device *vdev, void *token, > + void *buf, unsigned len); > + void (*set)(struct virtio_device *vdev, void *token, > + const void *buf, unsigned len); > + u8 (*get_status)(struct virtio_device *vdev); > + void (*set_status)(struct virtio_device *vdev, u8 status); > + struct virtqueue *(*find_vq)(struct virtio_device *vdev, > + bool (*callback)(struct virtqueue *)); > + void (*del_vq)(struct virtqueue *vq); > +}; The configuration logic looks more complicated than it should be. Maybe more of this can be done using data structure definitions instead of dynamic callback. E.g. the virtqueues of a given device can be listed as members of the struct virtio_device. > +/** > + * virtio_config_val - get a single virtio config and mark it used. > + * @config: the virtio config space > + * @type: the type to search for. > + * @val: a pointer to the value to fill in. > + * > + * Once used, the config type is marked with VIRTIO_CONFIG_F_USED so it can't > + * be found again. This version does endian conversion. */ > +#define virtio_config_val(vdev, type, v) ({ \ > + int _err = __virtio_config_val((vdev),(type),(v),sizeof(*(v))); \ > + \ > + BUILD_BUG_ON(sizeof(*(v)) != 1 && sizeof(*(v)) != 2 \ > + && sizeof(*(v)) != 4 && sizeof(*(v)) != 8); \ > + if (!_err) { \ > + switch (sizeof(*(v))) { \ > + case 2: le16_to_cpus(v); break; \ > + case 4: le32_to_cpus(v); break; \ > + case 8: le64_to_cpus(v); break; \ > + } \ > + } \ > + _err; \ > +}) Especially the macro looks like something better avoided... Arnd <>< ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <200709250018.12705.arnd-r2nGTMty4D4@public.gmane.org>]
* Re: [PATCH 1/3] virtio interface [not found] ` <200709250018.12705.arnd-r2nGTMty4D4@public.gmane.org> @ 2007-09-24 23:37 ` Rusty Russell [not found] ` <1190677058.27805.225.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Rusty Russell @ 2007-09-24 23:37 UTC (permalink / raw) To: Arnd Bergmann; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, virtualization On Tue, 2007-09-25 at 00:18 +0200, Arnd Bergmann wrote: > On Monday 24 September 2007, Rusty Russell wrote: > > This attempts to implement a "virtual I/O" layer which should allow > > common drivers to be efficiently used across most virtual I/O > > mechanisms. It will no-doubt need further enhancement. > > Hi Rusty, > > Thanks for your update, as you can imagine, I'm much happier with > this version ;-) Hi Arnd, Yes, I thought you might be... > > + > > +struct virtio_bus { > > + struct bus_type bus; > > + struct device dev; > > +}; > > + > > +static struct virtio_bus virtio_bus = { > > + .bus = { > > + .name = "virtio", > > + .match = virtio_dev_match, > > + .dev_attrs = virtio_dev_attrs, > > + }, > > + .dev = { > > + /* Can override this if you have a real bus behind it. */ > > + .parent = NULL, > > + .bus_id = "virtio", > > + } > > +}; > > This is a pattern I've seen a few times before, but could never understand > what it's good for. What is your reason for defining a new data structure > that is used only once, instead of just > > static struct bus_type virtio_bus_type; > static struct device virtio_root_dev; It's copied from the lguest bus which was copied from somewhere else. Creating a struct like this is a quiet complaint about the requirements to do so: it's not clear to me why I need to create a fake device, rather than making the bus the parent of the device if it needs one. > Also, I would not mix the two in a single source file. Instead, I think > every driver that can provide virtio devices (pci, lguest, ...) should > be responsible for setting the parent appropriately. I don't mind: we could expose it. > > +struct virtio_config_ops > > +{ > > + void *(*find)(struct virtio_device *vdev, u8 type, unsigned *len); > > + void (*get)(struct virtio_device *vdev, void *token, > > + void *buf, unsigned len); > > + void (*set)(struct virtio_device *vdev, void *token, > > + const void *buf, unsigned len); > > + u8 (*get_status)(struct virtio_device *vdev); > > + void (*set_status)(struct virtio_device *vdev, u8 status); > > + struct virtqueue *(*find_vq)(struct virtio_device *vdev, > > + bool (*callback)(struct virtqueue *)); > > + void (*del_vq)(struct virtqueue *vq); > > +}; > > The configuration logic looks more complicated than it should be. > Maybe more of this can be done using data structure definitions instead > of dynamic callback. E.g. the virtqueues of a given device can be > listed as members of the struct virtio_device. I agree about the complexity, but the reason I didn't do that is the way the config system works. Drivers look for fields they want, and those fields are marked so the host can see what was used. This makes it fairly easy to add new fields and then tell if the guest understood them or not. This includes new virtqueues, so the core must not simply grab them all. My previous solution was to put a "new_vq" and "del_vq" inside virtqueue_ops: the driver found the config field and handed it through. That was a little messier for the driver. > > +/** > > + * virtio_config_val - get a single virtio config and mark it used. > > + * @config: the virtio config space > > + * @type: the type to search for. > > + * @val: a pointer to the value to fill in. > > + * > > + * Once used, the config type is marked with VIRTIO_CONFIG_F_USED so it can't > > + * be found again. This version does endian conversion. */ > > +#define virtio_config_val(vdev, type, v) ({ \ > > + int _err = __virtio_config_val((vdev),(type),(v),sizeof(*(v))); \ > > + \ > > + BUILD_BUG_ON(sizeof(*(v)) != 1 && sizeof(*(v)) != 2 \ > > + && sizeof(*(v)) != 4 && sizeof(*(v)) != 8); \ > > + if (!_err) { \ > > + switch (sizeof(*(v))) { \ > > + case 2: le16_to_cpus(v); break; \ > > + case 4: le32_to_cpus(v); break; \ > > + case 8: le64_to_cpus(v); break; \ > > + } \ > > + } \ > > + _err; \ > > +}) > > Especially the macro looks like something better avoided... C'mon, everyone loves optimization! (Which reminds me, I said little endian here but didn't actually do that in the lguest launcher, bad Rusty). Thanks, Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <1190677058.27805.225.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH 1/3] virtio interface [not found] ` <1190677058.27805.225.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-09-25 6:36 ` Arnd Bergmann [not found] ` <200709250836.29451.arnd-r2nGTMty4D4@public.gmane.org> 2007-09-25 8:18 ` Cornelia Huck 1 sibling, 1 reply; 20+ messages in thread From: Arnd Bergmann @ 2007-09-25 6:36 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, virtualization On Tuesday 25 September 2007, Rusty Russell wrote: > On Tue, 2007-09-25 at 00:18 +0200, Arnd Bergmann wrote: > > This is a pattern I've seen a few times before, but could never understand > > what it's good for. What is your reason for defining a new data structure > > that is used only once, instead of just > > > > static struct bus_type virtio_bus_type; > > static struct device virtio_root_dev; > > It's copied from the lguest bus which was copied from somewhere else. > Creating a struct like this is a quiet complaint about the requirements > to do so: it's not clear to me why I need to create a fake device, > rather than making the bus the parent of the device if it needs one. > > > Also, I would not mix the two in a single source file. Instead, I think > > every driver that can provide virtio devices (pci, lguest, ...) should > > be responsible for setting the parent appropriately. > > I don't mind: we could expose it. What I mean with setting the parent appropriately is not to have a global device that is used by the hv-specific probing code, but to make sure that each of them provides their own one. The bus_type should either be global or provide the wrappers for device_register and driver_register that you have, but the host bridge device belongs with the code that probes it. E.g. when all virtio devices are behind PCI bridges, there does not need to be an empty /virtio or /lguest device node at all. Arnd <>< ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <200709250836.29451.arnd-r2nGTMty4D4@public.gmane.org>]
* Re: [PATCH 1/3] virtio interface [not found] ` <200709250836.29451.arnd-r2nGTMty4D4@public.gmane.org> @ 2007-09-25 10:54 ` Rusty Russell 0 siblings, 0 replies; 20+ messages in thread From: Rusty Russell @ 2007-09-25 10:54 UTC (permalink / raw) To: Arnd Bergmann; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, virtualization On Tue, 2007-09-25 at 08:36 +0200, Arnd Bergmann wrote: > On Tuesday 25 September 2007, Rusty Russell wrote: > > I don't mind: we could expose it. > > What I mean with setting the parent appropriately is not to have a global > device that is used by the hv-specific probing code, but to make sure > that each of them provides their own one. The bus_type should either be > global or provide the wrappers for device_register and driver_register > that you have, but the host bridge device belongs with the code that > probes it. E.g. when all virtio devices are behind PCI bridges, there > does not need to be an empty /virtio or /lguest device node at all. Right, that makes sense. I'll shift this "virtio" device node out to the lguest code. Thanks! Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] virtio interface [not found] ` <1190677058.27805.225.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2007-09-25 6:36 ` Arnd Bergmann @ 2007-09-25 8:18 ` Cornelia Huck 1 sibling, 0 replies; 20+ messages in thread From: Cornelia Huck @ 2007-09-25 8:18 UTC (permalink / raw) To: Rusty Russell; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, virtualization On Tue, 25 Sep 2007 09:37:38 +1000, Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> wrote: > On Tue, 2007-09-25 at 00:18 +0200, Arnd Bergmann wrote: > > On Monday 24 September 2007, Rusty Russell wrote: > > > + > > > +struct virtio_bus { > > > + struct bus_type bus; > > > + struct device dev; > > > +}; > > > + > > > +static struct virtio_bus virtio_bus = { > > > + .bus = { > > > + .name = "virtio", > > > + .match = virtio_dev_match, > > > + .dev_attrs = virtio_dev_attrs, > > > + }, > > > + .dev = { > > > + /* Can override this if you have a real bus behind it. */ > > > + .parent = NULL, > > > + .bus_id = "virtio", > > > + } > > > +}; > > > > This is a pattern I've seen a few times before, but could never understand > > what it's good for. What is your reason for defining a new data structure > > that is used only once, instead of just > > > > static struct bus_type virtio_bus_type; > > static struct device virtio_root_dev; > > It's copied from the lguest bus which was copied from somewhere else. > Creating a struct like this is a quiet complaint about the requirements > to do so: it's not clear to me why I need to create a fake device, > rather than making the bus the parent of the device if it needs one. You may want to have different busses, all having their own root device, which share the same bus type. Or you may want to embed the root device into a more complex structure. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2007-09-26 9:08 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-24 9:13 [PATCH 0/3] virtio implementation (draft VI) Rusty Russell
[not found] ` <1190625194.27805.199.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-24 9:14 ` [PATCH 1/3] virtio interface Rusty Russell
[not found] ` <1190625256.27805.201.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-24 9:15 ` [PATCH 2/3] virtio ring implementation Rusty Russell
[not found] ` <1190625307.27805.203.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-24 9:16 ` [PATCH 3/3] virtio module alias support Rusty Russell
[not found] ` <1190625394.27805.206.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-24 16:02 ` Greg KH
[not found] ` <20070924160221.GB5846-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2007-09-25 0:50 ` Rusty Russell
[not found] ` <1190681405.27805.243.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-25 1:57 ` Greg KH
[not found] ` <20070925015747.GA16011-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2007-09-25 2:11 ` Rusty Russell
[not found] ` <1190686275.27805.255.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-25 3:10 ` Greg KH
2007-09-24 13:44 ` [PATCH 2/3] virtio ring implementation Dor Laor
[not found] ` <46F7BF41.9060705-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-09-24 23:43 ` Rusty Russell
2007-09-25 13:32 ` Rusty Russell
[not found] ` <1190727156.27805.332.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-25 17:15 ` Dor Laor
[not found] ` <46F94243.2000602-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-09-25 23:37 ` Rusty Russell
[not found] ` <1190763475.2227.20.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-26 9:08 ` Dor Laor
2007-09-24 22:18 ` [PATCH 1/3] virtio interface Arnd Bergmann
[not found] ` <200709250018.12705.arnd-r2nGTMty4D4@public.gmane.org>
2007-09-24 23:37 ` Rusty Russell
[not found] ` <1190677058.27805.225.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-09-25 6:36 ` Arnd Bergmann
[not found] ` <200709250836.29451.arnd-r2nGTMty4D4@public.gmane.org>
2007-09-25 10:54 ` Rusty Russell
2007-09-25 8:18 ` Cornelia Huck
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox