* [PATCH 01/10] IOQ: Adding basic definitions for IO-Queue logic
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
IOQ is a generic shared-memory-queue mechanism that happens to be friendly
to virtualization boundaries. Note that it is not virtualization-specific,
owing to its flexible signaling layer.
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
include/linux/ioq.h | 176 +++++++++++++++++++++++++++++++++++++++
lib/Kconfig | 11 ++
lib/Makefile | 1
lib/ioq.c | 228 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 416 insertions(+), 0 deletions(-)
diff --git a/include/linux/ioq.h b/include/linux/ioq.h
new file mode 100644
index 0000000..d3a18a1
--- /dev/null
+++ b/include/linux/ioq.h
@@ -0,0 +1,176 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * IOQ is a generic shared-memory-queue mechanism that happens to be friendly
+ * to virtualization boundaries. It can be used in a variety of ways, though
+ * its intended purpose is to become the low-level communication path for
+ * paravirtualized drivers. Note that it is not virtualization specific
+ * due to its flexible signaling layer.
+ *
+ * The following are a list of key design points:
+ *
+ * #) All shared memory is explicitly allocated on one side of the
+ * link. This would typically be the guest side in a VM/VMM scenario.
+ * #) The code has the concept of "north" and "south" where north denotes the
+ * memory-owner side (e.g. guest).
+ * #) An IOQ is "created" on the north side (which generates a unique ID), and
+ * is "connected" on the remote side via its ID. This facilitates call-path
+ * setup in a manner that is friendly across VM/VMM boundaries.
+ * #) An IOQ is manipulated using an iterator idiom.
+ * #) A "IOQ Manager" abstraction handles the translation between two
+ * endpoints. E.g. allocating "north" memory, signaling, translating
+ * addresses (e.g. GPA to PA)
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef _LINUX_IOQ_H
+#define _LINUX_IOQ_H
+
+#include <linux/sched.h>
+#include <linux/wait.h>
+#include <asm/types.h>
+
+struct ioq_mgr;
+
+/*
+ *---------
+ * The following structures represent data that is shared across boundaries
+ * which may be quite disparate from one another (e.g. Windows vs Linux,
+ * 32 vs 64 bit, etc). Therefore, care has been taken to make sure they
+ * present data in a manner that is independent of the environment.
+ *-----------
+ */
+typedef u64 ioq_id_t;
+
+struct ioq_ring_desc {
+ u64 cookie; /* for arbitrary use by north-side */
+ u64 ptr;
+ u64 len;
+ u64 alen;
+ u8 valid;
+ u8 sown; /* South owned = 1, North owned = 0 */
+};
+
+#define IOQ_RING_MAGIC 0x47fa2fe4
+#define IOQ_RING_VER 1
+
+struct ioq_ring_idx {
+ u32 head; /* 0 based index to head of ptr array */
+ u32 tail; /* 0 based index to tail of ptr array */
+ u8 full;
+};
+
+struct ioq_irq {
+ u8 enabled;
+ u8 pending;
+};
+
+enum ioq_locality {
+ ioq_locality_north,
+ ioq_locality_south,
+};
+
+struct ioq_ring_head {
+ u32 magic;
+ u32 ver;
+ ioq_id_t id;
+ u32 count;
+ u64 ptr; /* ptr to array of ioq_ring_desc[count] */
+ struct ioq_ring_idx idx[2];
+ struct ioq_irq irq[2];
+ u8 padding[16];
+};
+
+/* --- END SHARED STRUCTURES --- */
+
+enum ioq_idx_type {
+ ioq_idxtype_valid,
+ ioq_idxtype_inuse,
+ ioq_idxtype_invalid,
+};
+
+enum ioq_seek_type {
+ ioq_seek_tail,
+ ioq_seek_next,
+ ioq_seek_head,
+ ioq_seek_set
+};
+
+struct ioq_iterator {
+ struct ioq *ioq;
+ struct ioq_ring_idx *idx;
+ u32 pos;
+ struct ioq_ring_desc *desc;
+ int update;
+};
+
+int ioq_iter_seek(struct ioq_iterator *iter, enum ioq_seek_type type,
+ long offset, int flags);
+int ioq_iter_push(struct ioq_iterator *iter, int flags);
+int ioq_iter_pop(struct ioq_iterator *iter, int flags);
+
+struct ioq_notifier {
+ void (*signal)(struct ioq_notifier*);
+};
+
+struct ioq {
+ void (*destroy)(struct ioq *ioq);
+ int (*signal)(struct ioq *ioq);
+
+ ioq_id_t id;
+ enum ioq_locality locale;
+ struct ioq_mgr *mgr;
+ struct ioq_ring_head *head_desc;
+ struct ioq_ring_desc *ring;
+ wait_queue_head_t wq;
+ struct ioq_notifier *notifier;
+};
+
+static inline void ioq_init(struct ioq *ioq)
+{
+ memset(ioq, 0, sizeof(*ioq));
+ init_waitqueue_head(&ioq->wq);
+}
+
+int ioq_start(struct ioq *ioq, int flags);
+int ioq_stop(struct ioq *ioq, int flags);
+int ioq_signal(struct ioq *ioq, int flags);
+void ioq_wakeup(struct ioq *ioq); /* This should only be used internally */
+int ioq_count(struct ioq *ioq, enum ioq_idx_type type);
+int ioq_full(struct ioq *ioq, enum ioq_idx_type type);
+
+static inline int ioq_empty(struct ioq *ioq, enum ioq_idx_type type)
+{
+ return !ioq_count(ioq, type);
+}
+
+
+
+#define IOQ_ITER_AUTOUPDATE (1 << 0)
+int ioq_iter_init(struct ioq *ioq, struct ioq_iterator *iter,
+ enum ioq_idx_type type, int flags);
+
+struct ioq_mgr {
+ int (*create)(struct ioq_mgr *t, struct ioq **ioq,
+ size_t ringsize, int flags);
+ int (*connect)(struct ioq_mgr *t, ioq_id_t id, struct ioq **ioq,
+ int flags);
+};
+
+
+#endif /* _LINUX_IOQ_H */
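The shared `struct ioq_ring_desc` above implies a simple ownership handshake: the north side fills in `ptr`/`len`, issues a memory barrier, then flips `valid` and `sown` so the south side may consume the buffer; the south side records `alen` and hands ownership back. A minimal userspace sketch of that protocol (the `north_post`/`south_consume` names and the `barrier()` stand-in for the kernel's `mb()` are illustrative, not part of the patch):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors struct ioq_ring_desc from the patch (fixed-width, layout-stable). */
struct desc {
    uint64_t cookie;
    uint64_t ptr;
    uint64_t len;
    uint64_t alen;
    uint8_t  valid;
    uint8_t  sown;   /* 1 = south-owned, 0 = north-owned */
};

/* Stand-in for the kernel's mb(). */
static void barrier(void) { __sync_synchronize(); }

/* North side: publish a buffer to the south. */
static void north_post(struct desc *d, uint64_t buf, uint64_t len)
{
    d->ptr = buf;
    d->len = len;
    d->alen = 0;            /* south fills in the actual length */
    barrier();              /* payload must be visible before ownership flips */
    d->valid = 1;
    d->sown = 1;
    barrier();
}

/* South side: consume if owned, record the actual length, hand back. */
static int south_consume(struct desc *d, uint64_t alen)
{
    if (!d->sown)
        return 0;           /* nothing for us yet */
    d->alen = alen;
    barrier();
    d->sown = 0;            /* return ownership to the north */
    return 1;
}
```

This is the same ordering the later ioqnet patch relies on in `ioqnet_alloc_rx_desc()`: data first, barrier, then the ownership bits.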
diff --git a/lib/Kconfig b/lib/Kconfig
index 2e7ae6b..65c6d5d 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -124,4 +124,15 @@ config HAS_DMA
depends on !NO_DMA
default y
+config IOQ
+ boolean "IO-Queue library - Generic shared-memory queue"
+ default n
+ help
+ IOQ is a generic shared-memory-queue mechanism that happens to be
+ friendly to virtualization boundaries. It can be used in a variety
+ of ways, though its intended purpose is to become the low-level
+ communication path for paravirtualized drivers.
+
+ If unsure, say N
+
endmenu
diff --git a/lib/Makefile b/lib/Makefile
index c8c8e20..2bf3b5d 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -56,6 +56,7 @@ obj-$(CONFIG_TEXTSEARCH_BM) += ts_bm.o
obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o
obj-$(CONFIG_SMP) += percpu_counter.o
obj-$(CONFIG_AUDIT_GENERIC) += audit.o
+obj-$(CONFIG_IOQ) += ioq.o
obj-$(CONFIG_SWIOTLB) += swiotlb.o
obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
diff --git a/lib/ioq.c b/lib/ioq.c
new file mode 100644
index 0000000..b9ef75e
--- /dev/null
+++ b/lib/ioq.c
@@ -0,0 +1,228 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * See include/linux/ioq.h for documentation
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/sched.h>
+#include <linux/ioq.h>
+#include <asm/bitops.h>
+#include <linux/module.h>
+
+#ifndef NULL
+#define NULL 0
+#endif
+
+static int ioq_iter_setpos(struct ioq_iterator *iter, u32 pos)
+{
+ struct ioq *ioq = iter->ioq;
+
+ BUG_ON(pos >= ioq->head_desc->count);
+
+ iter->pos = pos;
+ iter->desc = &ioq->ring[pos];
+
+ return 0;
+}
+
+int ioq_iter_seek(struct ioq_iterator *iter, enum ioq_seek_type type,
+ long offset, int flags)
+{
+ struct ioq_ring_head *head_desc = iter->ioq->head_desc;
+ struct ioq_ring_idx *idx = iter->idx;
+ u32 pos;
+
+ switch (type) {
+ case ioq_seek_next:
+ pos = iter->pos + 1;
+ pos %= head_desc->count;
+ break;
+ case ioq_seek_tail:
+ pos = idx->tail;
+ break;
+ case ioq_seek_head:
+ pos = idx->head;
+ break;
+ case ioq_seek_set:
+ if (offset >= head_desc->count)
+ return -EINVAL;
+ pos = offset;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return ioq_iter_setpos(iter, pos);
+}
+EXPORT_SYMBOL(ioq_iter_seek);
+
+static int ioq_ring_count(struct ioq_ring_idx *idx, int count)
+{
+ if (idx->full && (idx->head == idx->tail))
+ return count;
+ else if (idx->head >= idx->tail)
+ return idx->head - idx->tail;
+ else
+ return (idx->head + count) - idx->tail;
+}
+
+int ioq_iter_push(struct ioq_iterator *iter, int flags)
+{
+ struct ioq_ring_head *head_desc = iter->ioq->head_desc;
+ struct ioq_ring_idx *idx = iter->idx;
+ int ret = -ENOSPC;
+
+ /*
+ * It's only valid to push if we are currently pointed at the head
+ */
+ if (iter->pos != idx->head)
+ return -EINVAL;
+
+ if (ioq_ring_count(idx, head_desc->count) < head_desc->count) {
+ idx->head++;
+ idx->head %= head_desc->count;
+
+ if (idx->head == idx->tail)
+ idx->full = 1;
+
+ mb();
+
+ ret = ioq_iter_seek(iter, ioq_seek_next, 0, flags);
+
+ if (iter->update)
+ ioq_signal(iter->ioq, 0);
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL(ioq_iter_push);
+
+int ioq_iter_pop(struct ioq_iterator *iter, int flags)
+{
+ struct ioq_ring_head *head_desc = iter->ioq->head_desc;
+ struct ioq_ring_idx *idx = iter->idx;
+ int ret = -ENOSPC;
+
+ /*
+ * It's only valid to pop if we are currently pointed at the tail
+ */
+ if (iter->pos != idx->tail)
+ return -EINVAL;
+
+ if (ioq_ring_count(idx, head_desc->count) != 0) {
+ idx->tail++;
+ idx->tail %= head_desc->count;
+
+ idx->full = 0;
+
+ mb();
+
+ ret = ioq_iter_seek(iter, ioq_seek_next, 0, flags);
+
+ if (iter->update)
+ ioq_signal(iter->ioq, 0);
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL(ioq_iter_pop);
+
+int ioq_iter_init(struct ioq *ioq, struct ioq_iterator *iter,
+ enum ioq_idx_type type, int flags)
+{
+ BUG_ON((type < 0) || (type >= ioq_idxtype_invalid));
+
+ iter->ioq = ioq;
+ iter->update = (flags & IOQ_ITER_AUTOUPDATE);
+ iter->idx = &ioq->head_desc->idx[type];
+ iter->pos = -1;
+ iter->desc = NULL;
+
+ return 0;
+}
+EXPORT_SYMBOL(ioq_iter_init);
+
+int ioq_start(struct ioq *ioq, int flags)
+{
+ struct ioq_irq *irq = &ioq->head_desc->irq[ioq->locale];
+
+ irq->enabled = 1;
+ mb();
+
+ if (irq->pending)
+ ioq_wakeup(ioq);
+
+ return 0;
+}
+EXPORT_SYMBOL(ioq_start);
+
+int ioq_stop(struct ioq *ioq, int flags)
+{
+ struct ioq_irq *irq = &ioq->head_desc->irq[ioq->locale];
+
+ irq->enabled = 0;
+ mb();
+
+ return 0;
+}
+EXPORT_SYMBOL(ioq_stop);
+
+int ioq_signal(struct ioq *ioq, int flags)
+{
+ /* Load the irq structure from the other locale */
+ struct ioq_irq *irq = &ioq->head_desc->irq[!ioq->locale];
+
+ irq->pending = 1;
+ mb();
+
+ if (irq->enabled)
+ ioq->signal(ioq);
+
+ return 0;
+}
+EXPORT_SYMBOL(ioq_signal);
+
+int ioq_count(struct ioq *ioq, enum ioq_idx_type type)
+{
+ BUG_ON((type < 0) || (type >= ioq_idxtype_invalid));
+
+ return ioq_ring_count(&ioq->head_desc->idx[type], ioq->head_desc->count);
+}
+EXPORT_SYMBOL(ioq_count);
+
+int ioq_full(struct ioq *ioq, enum ioq_idx_type type)
+{
+ BUG_ON((type < 0) || (type >= ioq_idxtype_invalid));
+
+ return ioq->head_desc->idx[type].full;
+}
+EXPORT_SYMBOL(ioq_full);
+
+void ioq_wakeup(struct ioq *ioq)
+{
+ struct ioq_irq *irq = &ioq->head_desc->irq[ioq->locale];
+
+ irq->pending = 0;
+ mb();
+
+ wake_up(&ioq->wq);
+ if (ioq->notifier)
+ ioq->notifier->signal(ioq->notifier);
+}
+EXPORT_SYMBOL(ioq_wakeup);
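The head/tail bookkeeping in `ioq_ring_count()` and the push/pop paths above can be exercised standalone. This userspace sketch reproduces that arithmetic (the `ring_*` names are illustrative); note how the `full` flag disambiguates `head == tail`, which otherwise means empty:

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors struct ioq_ring_idx plus the push/pop index updates in lib/ioq.c. */
struct ring_idx {
    uint32_t head;
    uint32_t tail;
    uint8_t  full;
};

static int ring_count(const struct ring_idx *idx, int count)
{
    if (idx->full && idx->head == idx->tail)
        return count;
    else if (idx->head >= idx->tail)
        return idx->head - idx->tail;
    else
        return (idx->head + count) - idx->tail;
}

/* Advance head (producer), as ioq_iter_push() does; -1 if the ring is full. */
static int ring_push(struct ring_idx *idx, int count)
{
    if (ring_count(idx, count) >= count)
        return -1;
    idx->head = (idx->head + 1) % count;
    if (idx->head == idx->tail)
        idx->full = 1;
    return 0;
}

/* Advance tail (consumer), as ioq_iter_pop() does; -1 if the ring is empty. */
static int ring_pop(struct ring_idx *idx, int count)
{
    if (ring_count(idx, count) == 0)
        return -1;
    idx->tail = (idx->tail + 1) % count;
    idx->full = 0;
    return 0;
}
```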
* [PATCH 02/10] PARAVIRTUALIZATION: Add support for a bus abstraction
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
PV usually comes in two flavors: device PV and "core" PV. The existing PV
ops deal in terms of the latter. However, it would be useful to add an
interface for a virtual bus with provisions for discovery/configuration of
backend PV devices. It is often desirable to run PV devices even when the
core is not operating with PVOPS. Therefore, we introduce a separate
interface to deal with the devices.
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
arch/i386/Kconfig | 2 +
arch/x86_64/Kconfig | 2 +
drivers/Makefile | 1
drivers/pvbus/Kconfig | 7 ++
drivers/pvbus/Makefile | 6 ++
drivers/pvbus/pvbus-driver.c | 120 ++++++++++++++++++++++++++++++++++++++++++
include/linux/pvbus.h | 59 +++++++++++++++++++++
7 files changed, 197 insertions(+), 0 deletions(-)
diff --git a/arch/i386/Kconfig b/arch/i386/Kconfig
index c2d54b8..acf4506 100644
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -1125,6 +1125,8 @@ source "drivers/pci/pcie/Kconfig"
source "drivers/pci/Kconfig"
+source "drivers/pvbus/Kconfig"
+
config ISA_DMA_API
bool
default y
diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index 145bb82..17d6c78 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -721,6 +721,8 @@ source "drivers/pcmcia/Kconfig"
source "drivers/pci/hotplug/Kconfig"
+source "drivers/pvbus/Kconfig"
+
endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index adad2f3..179e669 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -81,3 +81,4 @@ obj-$(CONFIG_GENERIC_TIME) += clocksource/
obj-$(CONFIG_DMA_ENGINE) += dma/
obj-$(CONFIG_HID) += hid/
obj-$(CONFIG_PPC_PS3) += ps3/
+obj-$(CONFIG_PVBUS) += pvbus/
diff --git a/drivers/pvbus/Kconfig b/drivers/pvbus/Kconfig
new file mode 100644
index 0000000..1ca094d
--- /dev/null
+++ b/drivers/pvbus/Kconfig
@@ -0,0 +1,7 @@
+#
+# PVBUS configuration
+#
+
+config PVBUS
+ bool "Paravirtual Bus"
+
diff --git a/drivers/pvbus/Makefile b/drivers/pvbus/Makefile
new file mode 100644
index 0000000..0df2c2e
--- /dev/null
+++ b/drivers/pvbus/Makefile
@@ -0,0 +1,6 @@
+#
+# Makefile for the PVBUS bus specific drivers.
+#
+
+obj-y += pvbus-driver.o
+
diff --git a/drivers/pvbus/pvbus-driver.c b/drivers/pvbus/pvbus-driver.c
new file mode 100644
index 0000000..3f6687d
--- /dev/null
+++ b/drivers/pvbus/pvbus-driver.c
@@ -0,0 +1,120 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * Paravirtualized-Bus - This is a generic infrastructure for virtual devices
+ * and their drivers. It is inspired by Rusty Russell's lguest_bus, but with
+ * the key difference that the bus is decoupled from the underlying hypervisor
+ * in both name and function.
+ *
+ * Instead, it is intended that external hypervisor support will register
+ * arbitrary devices. Generic drivers can then monitor this bus for
+ * compatible devices regardless of the hypervisor implementation.
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/pvbus.h>
+
+#define PVBUS_NAME "pvbus"
+
+/*
+ * This function is invoked whenever a new driver and/or device is added
+ * to check if there is a match
+ */
+static int pvbus_dev_match(struct device *_dev, struct device_driver *_drv)
+{
+ struct pvbus_device *dev = container_of(_dev, struct pvbus_device, dev);
+ struct pvbus_driver *drv = container_of(_drv, struct pvbus_driver, drv);
+
+ return !strcmp(dev->name, drv->name);
+}
+
+/*
+ * This function is invoked after the bus infrastructure has already made a
+ * match. The device will contain a reference to the paired driver which
+ * we will extract.
+ */
+static int pvbus_dev_probe(struct device *_dev)
+{
+ int ret = 0;
+ struct pvbus_device *dev = container_of(_dev, struct pvbus_device, dev);
+ struct pvbus_driver *drv = container_of(_dev->driver,
+ struct pvbus_driver, drv);
+
+ if (drv->probe)
+ ret = drv->probe(dev);
+
+ return ret;
+}
+
+static struct bus_type pv_bus = {
+ .name = PVBUS_NAME,
+ .match = pvbus_dev_match,
+};
+
+static struct device pvbus_rootdev = {
+ .parent = NULL,
+ .bus_id = PVBUS_NAME,
+};
+
+static int __init pvbus_init(void)
+{
+ int ret;
+
+ ret = bus_register(&pv_bus);
+ BUG_ON(ret < 0);
+
+ ret = device_register(&pvbus_rootdev);
+ BUG_ON(ret < 0);
+
+ return 0;
+}
+
+postcore_initcall(pvbus_init);
+
+int pvbus_device_register(struct pvbus_device *new)
+{
+ new->dev.parent = &pvbus_rootdev;
+ new->dev.bus = &pv_bus;
+
+ return device_register(&new->dev);
+}
+EXPORT_SYMBOL(pvbus_device_register);
+
+void pvbus_device_unregister(struct pvbus_device *dev)
+{
+ device_unregister(&dev->dev);
+}
+EXPORT_SYMBOL(pvbus_device_unregister);
+
+int pvbus_driver_register(struct pvbus_driver *new)
+{
+ new->drv.bus = &pv_bus;
+ new->drv.name = new->name;
+ new->drv.owner = new->owner;
+ new->drv.probe = pvbus_dev_probe;
+
+ return driver_register(&new->drv);
+}
+EXPORT_SYMBOL(pvbus_driver_register);
+
+void pvbus_driver_unregister(struct pvbus_driver *drv)
+{
+ driver_unregister(&drv->drv);
+}
+EXPORT_SYMBOL(pvbus_driver_unregister);
+
diff --git a/include/linux/pvbus.h b/include/linux/pvbus.h
new file mode 100644
index 0000000..471f500
--- /dev/null
+++ b/include/linux/pvbus.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * Paravirtualized-Bus
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef _LINUX_PVBUS_H
+#define _LINUX_PVBUS_H
+
+#include <linux/device.h>
+#include <linux/ioq.h>
+
+struct pvbus_device {
+ char *name;
+ u64 id;
+
+ void *priv; /* Used by drivers that allocated the dev */
+
+ int (*createqueue)(struct pvbus_device *dev, struct ioq **ioq,
+ size_t ringsize, int flags);
+ int (*call)(struct pvbus_device *dev, u32 func,
+ void *data, size_t len, int flags);
+
+ struct device dev;
+};
+
+int pvbus_device_register(struct pvbus_device *dev);
+void pvbus_device_unregister(struct pvbus_device *dev);
+
+struct pvbus_driver {
+ char *name;
+ struct module *owner;
+
+ int (*probe)(struct pvbus_device *pdev);
+ int (*remove)(struct pvbus_device *pdev);
+
+ struct device_driver drv;
+};
+
+int pvbus_driver_register(struct pvbus_driver *drv);
+void pvbus_driver_unregister(struct pvbus_driver *drv);
+
+#endif /* _LINUX_PVBUS_H */
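The bus pairs devices to drivers with a plain name comparison (see `pvbus_dev_match()` in the patch). A userspace sketch of that matching rule, with the structs reduced to just what the comparison needs (the `pv_*` names and the `ioqblk`/`foo` device names are hypothetical):

```c
#include <assert.h>
#include <string.h>

/* Reduced stand-ins for struct pvbus_device / struct pvbus_driver. */
struct pv_device { const char *name; };
struct pv_driver { const char *name; };

/* Same rule as pvbus_dev_match(): bind when the names are identical. */
static int pv_match(const struct pv_device *dev, const struct pv_driver *drv)
{
    return !strcmp(dev->name, drv->name);
}

/* What the driver core effectively does on device registration: walk the
 * registered drivers and probe the first one whose match() succeeds. */
static const struct pv_driver *pv_find_driver(const struct pv_device *dev,
                                              const struct pv_driver *drvs,
                                              int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (pv_match(dev, &drvs[i]))
            return &drvs[i];
    return NULL;  /* no driver claims this device */
}
```

Because matching is by name only, a hypervisor backend just has to register a `pvbus_device` whose `name` matches a generic driver (such as the ioqnet driver in the next patch) for the pairing to happen, regardless of which hypervisor is underneath.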
* [PATCH 03/10] IOQ: Add an IOQ network driver
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
drivers/net/Kconfig | 10 +
drivers/net/Makefile | 2
drivers/net/ioqnet/Makefile | 11 +
drivers/net/ioqnet/driver.c | 658 +++++++++++++++++++++++++++++++++++++++++++
include/linux/ioqnet.h | 44 +++
5 files changed, 725 insertions(+), 0 deletions(-)
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index fb99cd4..7ee7454 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2947,6 +2947,16 @@ config NETCONSOLE
If you want to log kernel messages over the network, enable this.
See <file:Documentation/networking/netconsole.txt> for details.
+config IOQNET
+ tristate "IOQNET (IOQ based paravirtualized network driver)"
+ select IOQ
+ select PVBUS
+
+config IOQNET_DEBUG
+ bool "IOQNET debugging"
+ depends on IOQNET
+ default n
+
endif #NETDEVICES
config NETPOLL
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index a77affa..4c8a918 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -224,6 +224,8 @@ obj-$(CONFIG_ENP2611_MSF_NET) += ixp2000/
obj-$(CONFIG_NETCONSOLE) += netconsole.o
+obj-$(CONFIG_IOQNET) += ioqnet/
+
obj-$(CONFIG_FS_ENET) += fs_enet/
obj-$(CONFIG_NETXEN_NIC) += netxen/
diff --git a/drivers/net/ioqnet/Makefile b/drivers/net/ioqnet/Makefile
new file mode 100644
index 0000000..d7020ee
--- /dev/null
+++ b/drivers/net/ioqnet/Makefile
@@ -0,0 +1,11 @@
+#
+# Makefile for the IOQNET ethernet driver
+#
+
+ioqnet-objs = driver.o
+obj-$(CONFIG_IOQNET) += ioqnet.o
+
+
+ifeq ($(CONFIG_IOQNET_DEBUG),y)
+EXTRA_CFLAGS += -DIOQNET_DEBUG
+endif
diff --git a/drivers/net/ioqnet/driver.c b/drivers/net/ioqnet/driver.c
new file mode 100644
index 0000000..8352029
--- /dev/null
+++ b/drivers/net/ioqnet/driver.c
@@ -0,0 +1,658 @@
+/*
+ * ioqnet - A paravirtualized network device based on the IOQ interface
+ *
+ * Copyright (C) 2007 Novell, Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * Derived from the SNULL example from the book "Linux Device
+ * Drivers" by Alessandro Rubini and Jonathan Corbet, published
+ * by O'Reilly & Associates.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+
+#include <linux/sched.h>
+#include <linux/kernel.h> /* printk() */
+#include <linux/slab.h> /* kmalloc() */
+#include <linux/errno.h> /* error codes */
+#include <linux/types.h> /* size_t */
+#include <linux/interrupt.h> /* mark_bh */
+
+#include <linux/in.h>
+#include <linux/netdevice.h> /* struct device, and other headers */
+#include <linux/etherdevice.h> /* eth_type_trans */
+#include <linux/ip.h> /* struct iphdr */
+#include <linux/tcp.h> /* struct tcphdr */
+#include <linux/skbuff.h>
+#include <linux/ioq.h>
+#include <linux/pvbus.h>
+
+#include <linux/in6.h>
+#include <asm/checksum.h>
+
+#include <linux/ioqnet.h>
+
+MODULE_AUTHOR("Gregory Haskins");
+MODULE_LICENSE("GPL");
+
+#undef PDEBUG /* undef it, just in case */
+#ifdef IOQNET_DEBUG
+# define PDEBUG(fmt, args...) printk( KERN_DEBUG "ioqnet: " fmt, ## args)
+#else
+# define PDEBUG(fmt, args...) /* not debugging: nothing */
+#endif
+
+#define RX_RINGLEN 64
+#define TX_RINGLEN 64
+#define TX_PTRS_PER_DESC 64
+
+struct ioqnet_queue {
+ struct ioq *queue;
+ struct ioq_notifier notifier;
+};
+
+struct ioqnet_tx_desc {
+ struct sk_buff *skb;
+ struct ioqnet_tx_ptr data[TX_PTRS_PER_DESC];
+};
+
+struct ioqnet_priv {
+ spinlock_t lock;
+ struct net_device *dev;
+ struct pvbus_device *pdev;
+ struct net_device_stats stats;
+ struct ioqnet_queue rxq;
+ struct ioqnet_queue txq;
+ struct tasklet_struct txtask;
+};
+
+static int ioqnet_queue_init(struct ioqnet_priv *priv,
+ struct ioqnet_queue *q,
+ size_t ringsize,
+ void (*func)(struct ioq_notifier*))
+{
+ int ret = priv->pdev->createqueue(priv->pdev, &q->queue, ringsize, 0);
+ if (ret < 0)
+ return ret;
+
+ q->notifier.signal = func;
+ q->queue->notifier = &q->notifier;
+
+ return 0;
+}
+
+/* Perform a hypercall to register/connect our queues */
+static int ioqnet_connect(struct ioqnet_priv *priv)
+{
+ struct ioqnet_connect data = {
+ .rxq = priv->rxq.queue->id,
+ .txq = priv->txq.queue->id,
+ };
+
+ return priv->pdev->call(priv->pdev, IOQNET_CONNECT,
+ &data, sizeof(data), 0);
+}
+
+static int ioqnet_disconnect(struct ioqnet_priv *priv)
+{
+ return priv->pdev->call(priv->pdev, IOQNET_DISCONNECT, NULL, 0, 0);
+}
+
+/* Perform a hypercall to get the assigned MAC addr */
+static int ioqnet_query_mac(struct ioqnet_priv *priv)
+{
+ return priv->pdev->call(priv->pdev,
+ IOQNET_QUERY_MAC,
+ priv->dev->dev_addr,
+ ETH_ALEN, 0);
+}
+
+
+/*
+ * Enable and disable receive interrupts.
+ */
+static void ioqnet_rx_ints(struct net_device *dev, int enable)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ struct ioq *ioq = priv->rxq.queue;
+ if (enable)
+ ioq_start(ioq, 0);
+ else
+ ioq_stop(ioq, 0);
+}
+
+static void ioqnet_alloc_rx_desc(struct ioq_ring_desc *desc, size_t len)
+{
+ struct sk_buff *skb = dev_alloc_skb(len + 2);
+ BUG_ON(!skb);
+
+ skb_reserve(skb, 2); /* align IP on 16B boundary */
+
+ desc->cookie = (u64)skb;
+ desc->ptr = (u64)__pa(skb->data);
+ desc->len = len; /* total length */
+ desc->alen = 0; /* actual length - to be filled in by host */
+
+ mb();
+ desc->valid = 1;
+ desc->sown = 1; /* give ownership to the south */
+ mb();
+}
+
+static void ioqnet_setup_rx(struct ioqnet_priv *priv)
+{
+ struct ioq *ioq = priv->rxq.queue;
+ struct ioq_iterator iter;
+ int ret;
+
+ /*
+ * We want to iterate on the "valid" index. By default the iterator
+ * will not "autoupdate" which means it will not hypercall the host
+ * with our changes. This is good, because we are really just
+ * initializing stuff here anyway. Note that you can always manually
+ * signal the host with ioq_signal() if the autoupdate feature is not
+ * used.
+ */
+ ret = ioq_iter_init(ioq, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * Seek to the head of the valid index (which should be our first
+ * item, since the queue is brand-new)
+ */
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * Now populate each descriptor with an empty SKB and mark it valid
+ */
+ while (!iter.desc->valid) {
+ ioqnet_alloc_rx_desc(iter.desc, priv->dev->mtu);
+
+ /*
+ * This push operation will simultaneously advance the
+ * valid-head index and increment our position in the queue
+ * by one.
+ */
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+}
+
+static void ioqnet_setup_tx(struct ioqnet_priv *priv)
+{
+ struct ioq *ioq = priv->txq.queue;
+ struct ioq_iterator iter;
+ int ret;
+ int i;
+
+ /*
+ * We setup the tx-desc in a similar way to how we did the rx SKBs
+ */
+ ret = ioq_iter_init(ioq, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ for (i = 0; i < TX_RINGLEN; ++i) {
+ struct ioq_ring_desc *desc = iter.desc;
+ struct ioqnet_tx_desc *txdesc = kzalloc(sizeof(*txdesc),
+ GFP_KERNEL | GFP_DMA);
+
+ desc->cookie = (u64)txdesc;
+ desc->ptr = (u64)__pa(&txdesc->data[0]);
+ desc->len = TX_PTRS_PER_DESC; /* "len" is "count" */
+ desc->alen = 0;
+ desc->valid = 0; /* mark it "invalid" since payload empty */
+ desc->sown = 0; /* retain ownership until "inuse" */
+
+ /*
+ * One big difference between the RX and TX ring is that
+ * we are going to do an "iter++" here instead of an
+ * "iter->push()". That is because we don't want to actually
+ * advance the valid-index. We use the valid index to
+ * determine the difference between outstanding consumed and
+ * outstanding unconsumed packets
+ */
+ ret = ioq_iter_seek(&iter, ioq_seek_next, 0, 0);
+ BUG_ON(ret < 0);
+ }
+}
+
+/*
+ * Open and close
+ */
+
+static int ioqnet_open(struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+
+ if (ioqnet_connect(priv) < 0)
+ printk("IOQNET: Could not initialize instance %lld\n",
+ priv->pdev->id);
+
+
+ netif_start_queue(dev);
+ return 0;
+}
+
+static void ioqnet_destroy_queue(struct ioq *ioq)
+{
+ ioq_stop(ioq, 0);
+ ioq->destroy(ioq);
+}
+
+static int ioqnet_release(struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+
+ netif_stop_queue(dev);
+
+ if (ioqnet_disconnect(priv) < 0)
+ printk("IOQNET: Could not initialize instance %lld\n",
+ priv->pdev->id);
+
+ ioqnet_destroy_queue(priv->rxq.queue);
+ ioqnet_destroy_queue(priv->txq.queue);
+
+ return 0;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+static int ioqnet_config(struct net_device *dev, struct ifmap *map)
+{
+ if (dev->flags & IFF_UP) /* can't act on a running interface */
+ return -EBUSY;
+
+ /* Don't allow changing the I/O address */
+ if (map->base_addr != dev->base_addr) {
+ printk(KERN_WARNING "ioqnet: Can't change I/O address\n");
+ return -EOPNOTSUPP;
+ }
+
+ /* ignore other fields */
+ return 0;
+}
+
+/*
+ * The poll implementation.
+ */
+static int ioqnet_poll(struct net_device *dev, int *budget)
+{
+ int npackets = 0, quota = min(dev->quota, *budget);
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ struct ioq_iterator iter;
+ unsigned long flags;
+ int ret;
+
+ PDEBUG("polling...\n");
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ /* We want to iterate on the tail of the in-use index */
+ ret = ioq_iter_init(priv->rxq.queue, &iter, ioq_idxtype_inuse, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_tail, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * We stop if we have met the quota or there are no more packets.
+ * The EOM is indicated by finding a packet that is still owned by
+ * the south side
+ */
+ while ((npackets < quota) && (!iter.desc->sown)) {
+ struct sk_buff *skb = (struct sk_buff*)iter.desc->cookie;
+
+ skb_push(skb, iter.desc->alen);
+
+ /* Maintain stats */
+ npackets++;
+ priv->stats.rx_packets++;
+ priv->stats.rx_bytes += iter.desc->alen;
+
+ /* Pass the buffer up to the stack */
+ skb->dev = dev;
+ skb->protocol = eth_type_trans(skb, dev);
+ netif_receive_skb(skb);
+
+ mb();
+
+ /* Grab a new buffer to put in the ring */
+ ioqnet_alloc_rx_desc(iter.desc, dev->mtu);
+
+ /* Advance the in-use tail */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+
+ /* Toggle the lock */
+ spin_unlock_irqrestore(&priv->lock, flags);
+ spin_lock_irqsave(&priv->lock, flags);
+ }
+
+ PDEBUG("poll: %d packets received\n", npackets);
+
+ /*
+ * If we processed all packets, we're done; tell the kernel and
+ * reenable ints
+ */
+ *budget -= npackets;
+ dev->quota -= npackets;
+ if (ioq_empty(priv->rxq.queue, ioq_idxtype_inuse)) {
+ /* FIXME: there is a race with enabling interrupts */
+ netif_rx_complete(dev);
+ ioqnet_rx_ints(dev, 1);
+ ret = 0;
+ } else
+ /* We couldn't process everything. */
+ ret = 1;
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+
+ /* And let the south side know that we changed the rx-queue */
+ ioq_signal(priv->rxq.queue, 0);
+
+ return ret;
+}
+
+/*
+ * Transmit a packet (called by the kernel)
+ */
+static int ioqnet_tx_start(struct sk_buff *skb, struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ struct ioq_iterator viter;
+ struct ioq_iterator uiter;
+ struct ioqnet_tx_desc *txdesc;
+ int ret;
+ int i;
+ unsigned long flags;
+
+ if (skb->len < ETH_ZLEN)
+ return -EINVAL;
+
+ PDEBUG("sending %d bytes\n", skb->len);
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ if (ioq_full(priv->txq.queue, ioq_idxtype_valid)) {
+ /*
+ * We must flow-control the kernel by disabling the queue
+ */
+ spin_unlock_irqrestore(&priv->lock, flags);
+ netif_stop_queue(dev);
+ return 0;
+ }
+
+ /*
+ * We want to iterate on the head of both the "inuse" and "valid" index
+ */
+ ret = ioq_iter_init(priv->txq.queue, &viter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+ ret = ioq_iter_init(priv->txq.queue, &uiter, ioq_idxtype_inuse, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&viter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+ ret = ioq_iter_seek(&uiter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /* The head pointers should move in lockstep */
+ BUG_ON(uiter.pos != viter.pos);
+
+ dev->trans_start = jiffies; /* save the timestamp */
+ skb_get(skb); /* add a refcount */
+
+ txdesc = (struct ioqnet_tx_desc*)uiter.desc->cookie;
+
+ /*
+ * We simply put the skb right onto the ring. We will get an interrupt
+ * later when the data has been consumed and we can reap the pointers
+ * at that time
+ */
+ for (i = 0; i < 1; ++i) { /* Someday we will support SG */
+ txdesc->data[i].len = (u64)skb->len;
+ txdesc->data[i].data = (u64)__pa(skb->data);
+
+ uiter.desc->alen++;
+ }
+
+ txdesc->skb = skb; /* save the skb for future release */
+
+ mb();
+ uiter.desc->valid = 1;
+ uiter.desc->sown = 1; /* give ownership to the south */
+ mb();
+
+ /* Advance both indexes together */
+ ret = ioq_iter_push(&viter, 0);
+ BUG_ON(ret < 0);
+ ret = ioq_iter_push(&uiter, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * This will signal the south side to consume the packet
+ */
+ ioq_signal(priv->txq.queue, 0);
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+
+ return 0;
+}
+
+/*
+ * called by the tx interrupt handler to indicate that one or more packets
+ * have been consumed
+ */
+static void ioqnet_tx_complete(unsigned long data)
+{
+ struct ioqnet_priv *priv = (struct ioqnet_priv*)data;
+ struct ioq_iterator iter;
+ unsigned long flags;
+ int ret;
+
+ PDEBUG("send complete\n");
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ /* We want to iterate on the tail of the valid index */
+ ret = ioq_iter_init(priv->txq.queue, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_tail, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * We are done once we find the first packet either invalid or still
+ * owned by the south-side
+ */
+ while (iter.desc->valid && !iter.desc->sown) {
+ struct ioqnet_tx_desc *txdesc;
+ struct sk_buff *skb;
+
+ txdesc = (struct ioqnet_tx_desc*)iter.desc->cookie;
+ skb = txdesc->skb;
+
+ /* Maintain stats */
+ priv->stats.tx_packets++;
+ priv->stats.tx_bytes += skb->len;
+
+ /* Reset the descriptor */
+ mb();
+ iter.desc->alen = 0;
+ iter.desc->valid = 0;
+ mb();
+
+ dev_kfree_skb(skb);
+
+ /* Advance the valid-index tail */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+
+ /* Briefly drop the lock to bound irq-off latency */
+ spin_unlock_irqrestore(&priv->lock, flags);
+ spin_lock_irqsave(&priv->lock, flags);
+ }
+
+ /*
+ * If we were previously stopped due to flow control, restart the
+ * processing
+ */
+ if (netif_queue_stopped(priv->dev)
+ && !ioq_full(priv->txq.queue, ioq_idxtype_inuse)) {
+
+ netif_wake_queue(priv->dev);
+ }
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+}
+
+/*
+ * Ioctl commands
+ */
+static int ioqnet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+ PDEBUG("ioctl\n");
+ return 0;
+}
+
+/*
+ * Return statistics to the caller
+ */
+static struct net_device_stats *ioqnet_stats(struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ return &priv->stats;
+}
+
+static void ioq_rx_notify(struct ioq_notifier *notifier)
+{
+ struct ioqnet_priv *priv;
+ struct net_device *dev;
+
+ priv = container_of(notifier, struct ioqnet_priv, rxq.notifier);
+ dev = priv->dev;
+
+ ioqnet_rx_ints(dev, 0); /* Disable further interrupts */
+ netif_rx_schedule(dev);
+}
+
+static void ioq_tx_notify(struct ioq_notifier *notifier)
+{
+ struct ioqnet_priv *priv;
+
+ priv = container_of(notifier, struct ioqnet_priv, txq.notifier);
+
+ PDEBUG("tx_notify for %lld\n", priv->pdev->id);
+
+ tasklet_schedule(&priv->txtask);
+}
+
+/*
+ * This is called whenever a new pvbus_device is added to the pvbus with
+ * the matching IOQNET_NAME
+ */
+static int ioqnet_probe(struct pvbus_device *pdev)
+{
+ struct net_device *dev;
+ struct ioqnet_priv *priv;
+ int ret;
+
+ printk(KERN_INFO "IOQNET: Found new device at %lld\n", pdev->id);
+
+ dev = alloc_etherdev(sizeof(struct ioqnet_priv));
+ if (!dev)
+ return -ENOMEM;
+
+ priv = netdev_priv(dev);
+ memset(priv, 0, sizeof(*priv));
+
+ spin_lock_init(&priv->lock);
+ priv->dev = dev;
+ priv->pdev = pdev;
+ tasklet_init(&priv->txtask, ioqnet_tx_complete, (unsigned long)priv);
+
+ ioqnet_queue_init(priv, &priv->rxq, RX_RINGLEN, ioq_rx_notify);
+ ioqnet_queue_init(priv, &priv->txq, TX_RINGLEN, ioq_tx_notify);
+
+ ioqnet_setup_rx(priv);
+ ioqnet_setup_tx(priv);
+
+ ioqnet_rx_ints(dev, 1); /* enable receive interrupts */
+ ioq_start(priv->txq.queue, 0); /* enable transmit interrupts */
+
+ ether_setup(dev); /* assign some of the fields */
+
+ dev->open = ioqnet_open;
+ dev->stop = ioqnet_release;
+ dev->set_config = ioqnet_config;
+ dev->hard_start_xmit = ioqnet_tx_start;
+ dev->do_ioctl = ioqnet_ioctl;
+ dev->get_stats = ioqnet_stats;
+ dev->poll = ioqnet_poll;
+ dev->weight = 2;
+ dev->hard_header_cache = NULL; /* Disable caching */
+
+ ret = ioqnet_query_mac(priv);
+ if (ret < 0) {
+ printk(KERN_ERR "IOQNET: Could not obtain MAC address for %lld\n",
+ priv->pdev->id);
+ goto out_free;
+ }
+
+ ret = register_netdev(dev);
+ if (ret < 0) {
+ printk(KERN_ERR "IOQNET: error %i registering device \"%s\"\n",
+ ret, dev->name);
+ goto out_free;
+ }
+
+ pdev->priv = priv;
+
+ return 0;
+
+ out_free:
+ free_netdev(dev);
+
+ return ret;
+}
+
+static int ioqnet_remove(struct pvbus_device *pdev)
+{
+ struct ioqnet_priv *priv = (struct ioqnet_priv*)pdev->priv;
+
+ unregister_netdev(priv->dev);
+ ioqnet_release(priv->dev);
+ free_netdev(priv->dev);
+
+ return 0;
+}
+
+/*
+ * Finally, the module stuff
+ */
+
+static struct pvbus_driver ioqnet_pvbus_driver = {
+ .name = IOQNET_NAME,
+ .owner = THIS_MODULE,
+ .probe = ioqnet_probe,
+ .remove = ioqnet_remove,
+};
+
+__init int ioqnet_init_module(void)
+{
+ return pvbus_driver_register(&ioqnet_pvbus_driver);
+}
+
+__exit void ioqnet_cleanup(void)
+{
+ pvbus_driver_unregister(&ioqnet_pvbus_driver);
+}
+
+module_init(ioqnet_init_module);
+module_exit(ioqnet_cleanup);
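
The poll and transmit paths above hinge on a descriptor ownership handshake: the north side publishes a descriptor by setting `valid` and `sown` (with barriers), the south side clears `sown` after consuming it, and the north reaps completed descriptors once `sown` is clear again. The following is a minimal user-space model of that handshake for illustration only — the struct fields mirror the driver, but the ring walk and the "south side" here are simplified stand-ins, not the real IOQ implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified descriptor; mirrors the sown/valid fields used above */
struct desc {
	int valid; /* descriptor carries data */
	int sown;  /* 1 = owned by the south (consumer) side */
	int len;
};

#define RINGLEN 4

/* North side: post a buffer, handing ownership to the south */
static void north_post(struct desc *d, int len)
{
	d->len = len;
	d->valid = 1;
	d->sown = 1; /* publish last, after the payload is in place */
}

/* South side: consume every descriptor it owns, returning the count */
static int south_consume(struct desc *ring)
{
	int n = 0;
	for (int i = 0; i < RINGLEN; i++) {
		if (ring[i].valid && ring[i].sown) {
			ring[i].sown = 0; /* hand ownership back north */
			n++;
		}
	}
	return n;
}

/* North side: reap descriptors that are valid but no longer south-owned */
static int north_reap(struct desc *ring)
{
	int n = 0;
	for (int i = 0; i < RINGLEN; i++) {
		if (ring[i].valid && !ring[i].sown) {
			ring[i].valid = 0; /* reset, as ioqnet_tx_complete() does */
			n++;
		}
	}
	return n;
}
```

In the real driver the two sides run on opposite ends of a virtualization boundary, which is why the `mb()` pairs around the `sown`/`valid` updates matter; the model above elides them since it is single-threaded.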
diff --git a/include/linux/ioqnet.h b/include/linux/ioqnet.h
new file mode 100644
index 0000000..7c73a26
--- /dev/null
+++ b/include/linux/ioqnet.h
@@ -0,0 +1,44 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * IOQ Network Driver
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef _IOQNET_H
+#define _IOQNET_H
+
+#define IOQNET_VERSION 1
+#define IOQNET_NAME "ioqnet"
+
+/* IOQNET functions (invoked via pvbus_device->call()) */
+#define IOQNET_CONNECT 1
+#define IOQNET_DISCONNECT 2
+#define IOQNET_QUERY_MAC 3
+
+struct ioqnet_connect {
+ ioq_id_t rxq;
+ ioq_id_t txq;
+};
+
+struct ioqnet_tx_ptr {
+ u64 len;
+ u64 data;
+};
+
+#endif /* _IOQNET_H */
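
The header above defines the connect handshake: the north side passes both of its queue IDs in one `struct ioqnet_connect` via the device's `call()` entry point. A hedged user-space sketch of that exchange follows; the `fake_dev_call()` backend and `ioqnet_do_connect()` helper are illustrative stand-ins (the real driver-side code is in driver.c, and the real backend is the loopback/host device), not APIs defined by this patch.

```c
#include <assert.h>
#include <string.h>

typedef unsigned long long ioq_id_t;

#define IOQNET_CONNECT   1
#define IOQNET_QUERY_MAC 3
#define ETH_ALEN 6

struct ioqnet_connect {
	ioq_id_t rxq;
	ioq_id_t txq;
};

/* Stand-in for the south side's pvbus_device->call() handler */
static int fake_dev_call(unsigned int func, void *data, size_t len)
{
	switch (func) {
	case IOQNET_CONNECT: {
		struct ioqnet_connect *cnct = data;
		/* a real backend would connect cnct->rxq / cnct->txq here */
		return (len == sizeof(*cnct)) ? 0 : -1;
	}
	case IOQNET_QUERY_MAC:
		if (len != ETH_ALEN)
			return -1;
		memset(data, 0xcc, ETH_ALEN); /* dummy MAC */
		return 0;
	}
	return -1;
}

/* North side: advertise both queue IDs in a single call */
static int ioqnet_do_connect(ioq_id_t rxq, ioq_id_t txq)
{
	struct ioqnet_connect cnct = { .rxq = rxq, .txq = txq };
	return fake_dev_call(IOQNET_CONNECT, &cnct, sizeof(cnct));
}
```

Note the convention used throughout the series: the queues are named from the north side's perspective, so the backend attaches its transmit path to the north's `rxq` and vice versa.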
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
^ permalink raw reply related [flat|nested] 41+ messages in thread

* [PATCH 04/10] IOQNET: Add a test harness infrastructure to IOQNET
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (2 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 03/10] IOQ: Add an IOQ network driver Gregory Haskins
@ 2007-08-16 23:14 ` Gregory Haskins
2007-08-16 23:14 ` [PATCH 05/10] IRQ: Export create_irq/destroy_irq Gregory Haskins
` (6 subsequent siblings)
10 siblings, 0 replies; 41+ messages in thread
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
We can add an IOQNET loopback device and register it with the PVBUS to test
many aspects of the system (IOQ, PVBUS, and IOQNET itself).
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
drivers/net/Kconfig | 10 +
drivers/net/ioqnet/Makefile | 3
drivers/net/ioqnet/loopback.c | 502 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 515 insertions(+), 0 deletions(-)
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 7ee7454..426947d 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2957,6 +2957,16 @@ config IOQNET_DEBUG
depends on IOQNET
default n
+config IOQNET_LOOPBACK
+ tristate "IOQNET loopback device test harness"
+ depends on IOQNET
+ default n
+ ---help---
+ This will install a special PVBUS device that implements two IOQNET
+ devices. The devices are, of course, linked to one another forming a
+ loopback mechanism. This allows many subsystems to be tested: IOQ,
+ PVBUS, and IOQNET itself. If unsure, say N.
+
endif #NETDEVICES
config NETPOLL
diff --git a/drivers/net/ioqnet/Makefile b/drivers/net/ioqnet/Makefile
index d7020ee..7d2d156 100644
--- a/drivers/net/ioqnet/Makefile
+++ b/drivers/net/ioqnet/Makefile
@@ -4,8 +4,11 @@
ioqnet-objs = driver.o
obj-$(CONFIG_IOQNET) += ioqnet.o
+ioqnet-loopback-objs = loopback.o
+obj-$(CONFIG_IOQNET_LOOPBACK) += ioqnet-loopback.o
ifeq ($(CONFIG_IOQNET_DEBUG),y)
EXTRA_CFLAGS += -DIOQNET_DEBUG
endif
diff --git a/drivers/net/ioqnet/loopback.c b/drivers/net/ioqnet/loopback.c
new file mode 100644
index 0000000..0e36b43
--- /dev/null
+++ b/drivers/net/ioqnet/loopback.c
@@ -0,0 +1,502 @@
+/*
+ * ioqnet test harness
+ *
+ * Copyright (C) 2007 Novell, Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/module.h>
+#include <linux/pvbus.h>
+#include <linux/ioq.h>
+#include <linux/kthread.h>
+#include <linux/ioqnet.h>
+#include <linux/interrupt.h>
+
+MODULE_AUTHOR("Gregory Haskins");
+MODULE_LICENSE("GPL");
+
+#ifndef ETH_ALEN
+#define ETH_ALEN 6
+#endif
+
+#undef PDEBUG /* undef it, just in case */
+#ifdef IOQNET_DEBUG
# define PDEBUG(fmt, args...) printk(KERN_DEBUG "ioqnet: " fmt, ## args)
+#else
+# define PDEBUG(fmt, args...) /* not debugging: nothing */
+#endif
+
+/*
+ * ---------------------------------------------------------------------
+ * First we must create an IOQ implementation to use while under test
+ * since these operations will all be local to the same host
+ * ---------------------------------------------------------------------
+ */
+
+struct ioqnet_lb_ioq {
+ struct ioq ioq;
+ struct ioqnet_lb_ioq *peer;
+ struct tasklet_struct task;
+};
+
+struct ioqnet_lb_ioqmgr {
+ struct ioq_mgr mgr;
+
+ /*
+ * Since this is just a test harness, we know ahead of time that
+ * we aren't going to need more than a handful of IOQs. So to keep
+ * lookups simple we will simply create a static array of them
+ */
+ struct ioqnet_lb_ioq ioqs[8];
+ int pos;
+};
+
+static struct ioqnet_lb_ioqmgr lb_ioqmgr;
+
+static struct ioqnet_lb_ioq *to_ioq(struct ioq *ioq)
+{
+ return container_of(ioq, struct ioqnet_lb_ioq, ioq);
+}
+
+static struct ioqnet_lb_ioqmgr *to_mgr(struct ioq_mgr *mgr)
+{
+ return container_of(mgr, struct ioqnet_lb_ioqmgr, mgr);
+}
+
+/*
+ * ------------------
+ * ioq implementation
+ * ------------------
+ */
+static void ioqnet_lb_ioq_wake(unsigned long data)
+{
+ struct ioqnet_lb_ioq *_ioq = (struct ioqnet_lb_ioq*)data;
+
+ if (_ioq->peer)
+ ioq_wakeup(&_ioq->peer->ioq);
+}
+
+static int ioqnet_lb_ioq_signal(struct ioq *ioq)
+{
+ struct ioqnet_lb_ioq *_ioq = to_ioq(ioq);
+
+ if (_ioq->peer)
+ tasklet_schedule(&_ioq->task);
+
+ return 0;
+}
+
+static void ioqnet_lb_ioq_destroy(struct ioq *ioq)
+{
+ struct ioqnet_lb_ioq *_ioq = to_ioq(ioq);
+
+ if (_ioq->peer) {
+ _ioq->peer->peer = NULL;
+ _ioq->peer = NULL;
+ }
+
+ if (_ioq->ioq.locale == ioq_locality_north) {
+ kfree(_ioq->ioq.ring);
+ kfree(_ioq->ioq.head_desc);
+ } else
+ kfree(_ioq);
+}
+
+/*
+ * ------------------
+ * ioqmgr implementation
+ * ------------------
+ */
+static int ioqnet_lb_ioq_create(struct ioq_mgr *t, struct ioq **ioq,
+ size_t ringsize, int flags)
+{
+ struct ioqnet_lb_ioqmgr *mgr = to_mgr(t);
+ struct ioqnet_lb_ioq *_ioq = NULL;
+ struct ioq_ring_head *head_desc = NULL;
+ void *ring = NULL;
+ int ret = -ENOMEM;
+ size_t ringlen;
+ ioq_id_t id;
+
+ ringlen = sizeof(struct ioq_ring_desc) * ringsize;
+
+ BUG_ON(mgr->pos >= (sizeof(mgr->ioqs)/sizeof(mgr->ioqs[0])));
+
+ id = (ioq_id_t)mgr->pos++;
+
+ _ioq = &mgr->ioqs[id];
+
+ head_desc = kzalloc(sizeof(*head_desc), GFP_KERNEL);
+ if (!head_desc)
+ goto error;
+
+ ring = kzalloc(ringlen, GFP_KERNEL);
+ if (!ring)
+ goto error;
+
+ head_desc->magic = IOQ_RING_MAGIC;
+ head_desc->ver = IOQ_RING_VER;
+ head_desc->id = id;
+ head_desc->count = ringsize;
+ head_desc->ptr = (u64)ring;
+
+ ioq_init(&_ioq->ioq);
+
+ _ioq->ioq.signal = ioqnet_lb_ioq_signal;
+ _ioq->ioq.destroy = ioqnet_lb_ioq_destroy;
+
+ _ioq->ioq.id = head_desc->id;
+ _ioq->ioq.locale = ioq_locality_north;
+ _ioq->ioq.mgr = t;
+ _ioq->ioq.head_desc = head_desc;
+ _ioq->ioq.ring = ring;
+
+ tasklet_init(&_ioq->task, ioqnet_lb_ioq_wake, (unsigned long)_ioq);
+
+ *ioq = &_ioq->ioq;
+
+ return 0;
+
+ error:
+ kfree(head_desc);
+ kfree(ring);
+
+ return ret;
+}
+
+static int ioqnet_lb_ioq_connect(struct ioq_mgr *t, ioq_id_t id,
+ struct ioq **ioq, int flags)
+{
+ struct ioqnet_lb_ioqmgr *mgr = to_mgr(t);
+ struct ioqnet_lb_ioq *peer_ioq = &mgr->ioqs[id];
+ struct ioqnet_lb_ioq *_ioq;
+
+ if (peer_ioq->peer)
+ return -EEXIST;
+
+ _ioq = kzalloc(sizeof(*_ioq), GFP_KERNEL);
+ if (!_ioq)
+ return -ENOMEM;
+
+ ioq_init(&_ioq->ioq);
+
+ _ioq->ioq.signal = ioqnet_lb_ioq_signal;
+ _ioq->ioq.destroy = ioqnet_lb_ioq_destroy;
+
+ _ioq->ioq.id = id;
+ _ioq->ioq.locale = ioq_locality_south;
+ _ioq->ioq.mgr = t;
+ _ioq->ioq.head_desc = peer_ioq->ioq.head_desc;
+ _ioq->ioq.ring = peer_ioq->ioq.ring;
+
+ _ioq->peer = peer_ioq;
+ peer_ioq->peer = _ioq;
+ tasklet_init(&_ioq->task, ioqnet_lb_ioq_wake, (unsigned long)_ioq);
+
+ *ioq = &_ioq->ioq;
+
+ return 0;
+}
+
+static void ioqnet_lb_ioqmgr_init(void)
+{
+ struct ioqnet_lb_ioqmgr *mgr = &lb_ioqmgr;
+
+ memset(mgr, 0, sizeof(*mgr));
+
+ mgr->mgr.create = ioqnet_lb_ioq_create;
+ mgr->mgr.connect = ioqnet_lb_ioq_connect;
+}
+
+/*
+ * ---------------------------------------------------------------------
+ * Next we create the loopback device in terms of our ioqnet_lb_ioq
+ * subsystem
+ * ---------------------------------------------------------------------
+ */
+
+struct ioqnet_lb_device {
+ int idx; /* 0 or 1 */
+ struct ioq *rxq;
+ struct ioq *txq;
+ char mac[ETH_ALEN];
+ struct task_struct *task;
+ struct ioqnet_lb_device *peer;
+
+ struct pvbus_device dev;
+};
+
+static struct ioqnet_lb_device *to_dev(struct pvbus_device *dev)
+{
+ return container_of(dev, struct ioqnet_lb_device, dev);
+}
+
+static void ioqnet_lb_xmit(struct ioqnet_lb_device *dev,
+ struct ioqnet_tx_ptr *ptr,
+ size_t count)
+{
+ DECLARE_WAITQUEUE(wait, current);
+ struct ioq_iterator iter;
+ int ret;
+ int i;
+ char *dest;
+
+ add_wait_queue(&dev->txq->wq, &wait);
+
+ /* We want to iterate on the head of the in-use index for reading */
+ ret = ioq_iter_init(dev->txq, &iter, ioq_idxtype_inuse,
+ IOQ_ITER_AUTOUPDATE);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /* Re-arm the task state each iteration to avoid a busy spin */
+ for (;;) {
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ if (iter.desc->valid && iter.desc->sown)
+ break;
+ schedule();
+ }
+
+ __set_current_state(TASK_RUNNING);
+
+ dest = __va(iter.desc->ptr);
+
+ for (i = 0; i < count; ++i) {
+ struct ioqnet_tx_ptr *p = &ptr[i];
+ void *d = __va(p->data);
+
+ memcpy(dest, d, p->len);
+ dest += p->len;
+ }
+
+ mb();
+ iter.desc->sown = 0;
+ mb();
+
+ /* Advance the in-use head */
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+}
+
+/*
+ * This is the daemon thread for each device that gets created once the guest
+ * side connects to us (via the pvbus_device->call(IOQNET_CONNECT) operation).
+ * We want to wait on packets to arrive on the rxq, and then send them to our
+ * peer's txq.
+ */
+static int ioqnet_lb_thread(void *data)
+{
+ DECLARE_WAITQUEUE(wait, current);
+ struct ioqnet_lb_device *dev = (struct ioqnet_lb_device*)data;
+ struct ioq_iterator iter;
+ int ret;
+
+ add_wait_queue(&dev->rxq->wq, &wait);
+
+ /* We want to iterate on the tail of the in-use index for reading */
+ ret = ioq_iter_init(dev->rxq, &iter, ioq_idxtype_inuse,
+ IOQ_ITER_AUTOUPDATE);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_tail, 0, 0);
+ BUG_ON(ret < 0);
+
+ while (1) {
+ struct ioq_ring_desc *desc = iter.desc;
+ struct ioqnet_tx_ptr *ptr;
+
+ PDEBUG("%d: Waiting...\n", dev->idx);
+
+ /* Re-arm the task state each iteration to avoid a busy spin */
+ for (;;) {
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ if (desc->sown)
+ break;
+ schedule();
+ }
+
+ __set_current_state(TASK_RUNNING);
+
+ PDEBUG("%d: Got a packet\n", dev->idx);
+
+ ptr = __va(desc->ptr);
+
+ /*
+ * If the peer is connected, we transmit it to their
+ * queue...otherwise we just drop it on the floor
+ */
+ if (dev->peer->txq)
+ ioqnet_lb_xmit(dev->peer, ptr, desc->alen);
+
+ mb();
+ desc->sown = 0;
+ mb();
+
+ /* Advance the in-use tail */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+
+ return 0;
+}
+
+static int ioqnet_lb_dev_createqueue(struct pvbus_device *dev,
+ struct ioq **ioq,
+ size_t ringsize, int flags)
+{
+ struct ioq_mgr *ioqmgr = &lb_ioqmgr.mgr;
+
+ return ioqmgr->create(ioqmgr, ioq, ringsize, flags);
+}
+
+static int ioqnet_lb_queue_connect(ioq_id_t id, struct ioq **ioq)
+{
+ int ret;
+ struct ioq_mgr *ioqmgr = &lb_ioqmgr.mgr;
+
+ ret = ioqmgr->connect(ioqmgr, id, ioq, 0);
+ if (ret < 0)
+ return ret;
+
+ ioq_start(*ioq, 0);
+
+ return 0;
+}
+
+static int ioqnet_lb_dev_connect(struct ioqnet_lb_device *dev,
+ void *data, size_t len)
+{
+ struct ioqnet_connect *cnct = (struct ioqnet_connect*)data;
+ int ret;
+
+ /* We connect the north's rxq to our txq */
+ ret = ioqnet_lb_queue_connect(cnct->rxq, &dev->txq);
+ if (ret < 0)
+ return ret;
+
+ /* And vice-versa */
+ ret = ioqnet_lb_queue_connect(cnct->txq, &dev->rxq);
+ if (ret < 0)
+ return ret;
+
+ dev->task = kthread_create(ioqnet_lb_thread, dev,
+ "ioqnet-lb/%d", dev->idx);
+ wake_up_process(dev->task);
+
+ return 0;
+}
+
+static int ioqnet_lb_dev_query_mac(struct ioqnet_lb_device *dev,
+ void *data, size_t len)
+{
+ if (len != ETH_ALEN)
+ return -EINVAL;
+
+ memcpy(data, dev->mac, ETH_ALEN);
+
+ return 0;
+}
+
+/*
+ * This function is invoked whenever a guest calls pvbus_ops->call() against
+ * our instance ID
+ */
+static int ioqnet_lb_dev_call(struct pvbus_device *dev, u32 func, void *data,
+ size_t len, int flags)
+{
+ struct ioqnet_lb_device *_dev = to_dev(dev);
+ int ret = -EINVAL;
+
+ switch (func) {
+ case IOQNET_CONNECT:
+ ret = ioqnet_lb_dev_connect(_dev, data, len);
+ break;
+ case IOQNET_QUERY_MAC:
+ ret = ioqnet_lb_dev_query_mac(_dev, data, len);
+ break;
+ }
+
+ return ret;
+}
+
+static int ioqnet_lb_dev_init(struct ioqnet_lb_device *dev,
+ int idx,
+ struct ioqnet_lb_device *peer)
+{
+ char mac[] = { 0x00, 0x30, 0xcc, 0x00, 0x00, idx };
+
+ memset(dev, 0, sizeof(*dev));
+ dev->idx = idx;
+ dev->peer = peer;
+ memcpy(dev->mac, mac, ETH_ALEN);
+
+ dev->dev.name = IOQNET_NAME;
+ dev->dev.id = idx;
+ dev->dev.createqueue = ioqnet_lb_dev_createqueue;
+ dev->dev.call = ioqnet_lb_dev_call;
+ sprintf(dev->dev.dev.bus_id, "%d", idx);
+
+ return 0;
+}
+
+/*
+ * ---------------------------------------------------------------------
+ * Finally we create the top-level object that binds it all together
+ * ---------------------------------------------------------------------
+ */
+
+
+struct ioqnet_lb {
+ struct ioqnet_lb_device devs[2];
+};
+
+static struct ioqnet_lb ioqnet_lb;
+
+__init int ioqnet_lb_init_module(void)
+{
+ int ret;
+ int i;
+
+ ioqnet_lb_ioqmgr_init();
+
+ /* First initialize both devices */
+ for (i = 0; i < 2; i++) {
+ ret = ioqnet_lb_dev_init(&ioqnet_lb.devs[i],
+ i,
+ &ioqnet_lb.devs[!i]);
+ BUG_ON(ret < 0);
+ }
+
+ /* Then register them together */
+ for (i = 0; i < 2; i++) {
+ ret = pvbus_device_register(&ioqnet_lb.devs[i].dev);
+ BUG_ON(ret < 0);
+ }
+
+ return 0;
+}
+
+__exit void ioqnet_lb_cleanup(void)
+{
+ int i;
+
+ for (i = 0; i < 2; i++)
+ pvbus_device_unregister(&ioqnet_lb.devs[i].dev);
+}
+
+module_init(ioqnet_lb_init_module);
+module_exit(ioqnet_lb_cleanup);
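
One detail worth seeing in isolation is the gather-copy in ioqnet_lb_xmit(): each `struct ioqnet_tx_ptr` names a fragment, and the loopback concatenates all fragments into the peer's single receive buffer. This is a user-space sketch of just that copy loop; plain pointers stand in for the `__pa()`/`__va()` physical-address round trip the real code performs, and `lb_gather()` is an illustrative name, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef unsigned long long u64;

/* Mirrors struct ioqnet_tx_ptr from include/linux/ioqnet.h */
struct ioqnet_tx_ptr {
	u64 len;
	u64 data; /* really a physical address in the driver */
};

/* Concatenate 'count' fragments into dest; returns total bytes copied */
static size_t lb_gather(char *dest, const struct ioqnet_tx_ptr *ptr,
			size_t count)
{
	size_t total = 0;
	for (size_t i = 0; i < count; i++) {
		memcpy(dest + total,
		       (const void *)(uintptr_t)ptr[i].data, ptr[i].len);
		total += ptr[i].len;
	}
	return total;
}
```

Since the driver currently posts a single fragment per packet ("Someday we will support SG"), `count` is always 1 today, but the loop already handles a future scatter-gather list.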
^ permalink raw reply related [flat|nested] 41+ messages in thread

* [PATCH 05/10] IRQ: Export create_irq/destroy_irq
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (3 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 04/10] IOQNET: Add a test harness infrastructure to IOQNET Gregory Haskins
@ 2007-08-16 23:14 ` Gregory Haskins
2007-08-16 23:14 ` [PATCH 06/10] KVM: Add a guest side driver for IOQ Gregory Haskins
` (5 subsequent siblings)
10 siblings, 0 replies; 41+ messages in thread
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
arch/x86_64/kernel/io_apic.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index d8bfe31..6bf8794 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -1849,6 +1849,7 @@ int create_irq(void)
}
return irq;
}
+EXPORT_SYMBOL(create_irq);
void destroy_irq(unsigned int irq)
{
@@ -1860,6 +1861,7 @@ void destroy_irq(unsigned int irq)
__clear_irq_vector(irq);
spin_unlock_irqrestore(&vector_lock, flags);
}
+EXPORT_SYMBOL(destroy_irq);
/*
* MSI mesage composition
^ permalink raw reply related [flat|nested] 41+ messages in thread

* [PATCH 06/10] KVM: Add a guest side driver for IOQ
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (4 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 05/10] IRQ: Export create_irq/destroy_irq Gregory Haskins
@ 2007-08-16 23:14 ` Gregory Haskins
2007-08-16 23:14 ` [PATCH 07/10] KVM: Add a gpa_to_hva helper function Gregory Haskins
` (4 subsequent siblings)
10 siblings, 0 replies; 41+ messages in thread
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
drivers/kvm/Kconfig | 28 +++
drivers/kvm/Makefile | 3
drivers/kvm/ioq.h | 39 +++++
drivers/kvm/ioq_guest.c | 195 +++++++++++++++++++++++
drivers/kvm/pvbus.h | 63 +++++++
drivers/kvm/pvbus_guest.c | 382 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/kvm.h | 4
7 files changed, 706 insertions(+), 8 deletions(-)
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index 22d0eb4..aca79d1 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -47,16 +47,32 @@ config KVM_BALLOON
The driver inflate/deflate guest physical memory on demand.
This ability provides memory over commit for the host
-config KVM_NET
- tristate "Para virtual network device"
- depends on KVM
- ---help---
- Provides support for guest paravirtualization networking
-
config KVM_NET_HOST
tristate "Para virtual network host device"
depends on KVM
---help---
Provides support for host paravirtualization networking
+config KVM_GUEST
+ bool "KVM Guest support"
+ depends on X86
+ default y
+
+config KVM_PVBUS_GUEST
+ tristate "Paravirtualized Bus (PVBUS) support"
+ depends on KVM_GUEST
+ select IOQ
+ select PVBUS
+ ---help---
+ PVBUS is an infrastructure for generic PV drivers to take advantage
+ of an underlying hypervisor without having to understand the details
+ of the hypervisor itself. You only need this option if you plan to
+ run this kernel as a KVM guest.
+
+config KVM_NET
+ tristate "Para virtual network device"
+ depends on KVM && KVM_GUEST
+ ---help---
+ Provides support for guest paravirtualization networking
+
endif # VIRTUALIZATION
diff --git a/drivers/kvm/Makefile b/drivers/kvm/Makefile
index 92600d8..c6a59bb 100644
--- a/drivers/kvm/Makefile
+++ b/drivers/kvm/Makefile
@@ -14,4 +14,5 @@ kvm-net-objs = kvm_net.o
obj-$(CONFIG_KVM_NET) += kvm-net.o
kvm-net-host-objs = kvm_net_host.o
obj-$(CONFIG_KVM_NET_HOST) += kvm_net_host.o
-
+kvm-pvbus-objs := ioq_guest.o pvbus_guest.o
+obj-$(CONFIG_KVM_PVBUS_GUEST) += kvm-pvbus.o
diff --git a/drivers/kvm/ioq.h b/drivers/kvm/ioq.h
new file mode 100644
index 0000000..7e955f1
--- /dev/null
+++ b/drivers/kvm/ioq.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * See include/linux/ioq.h for documentation
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef _KVM_IOQ_H_
+#define _KVM_IOQ_H_
+
+#include <linux/ioq.h>
+
+#define IOQHC_REGISTER 1
+#define IOQHC_UNREGISTER 2
+#define IOQHC_SIGNAL 3
+
+struct ioq_register {
+ ioq_id_t id;
+ u32 irq;
+ u64 ring;
+};
+
+#endif /* _KVM_IOQ_H_ */
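
The header above fixes the guest/host hypercall ABI: the guest passes an `IOQHC_*` function number plus the physical address of a parameter block (`struct ioq_register` for REGISTER, a bare `ioq_id_t` for SIGNAL/UNREGISTER). The toy host-side dispatcher below illustrates the register/signal/unregister lifecycle under that ABI; a single registration slot is an assumption for brevity — the real host side (in a later patch, not this hunk) would keep a table and inject `slot.irq` on signal.

```c
#include <assert.h>

typedef unsigned long long ioq_id_t;
typedef unsigned int u32;
typedef unsigned long long u64;

#define IOQHC_REGISTER   1
#define IOQHC_UNREGISTER 2
#define IOQHC_SIGNAL     3

/* Mirrors struct ioq_register from drivers/kvm/ioq.h */
struct ioq_register {
	ioq_id_t id;
	u32 irq;
	u64 ring;
};

static struct ioq_register slot; /* single slot for illustration */
static int registered;
static int signals;

/* Host-side dispatch keyed on the hypercall function number */
static int host_ioq_hypercall(unsigned long nr, void *data)
{
	switch (nr) {
	case IOQHC_REGISTER:
		slot = *(struct ioq_register *)data;
		registered = 1;
		return 0;
	case IOQHC_UNREGISTER:
		if (!registered || *(ioq_id_t *)data != slot.id)
			return -1;
		registered = 0;
		return 0;
	case IOQHC_SIGNAL:
		if (!registered || *(ioq_id_t *)data != slot.id)
			return -1;
		signals++; /* the host would inject slot.irq here */
		return 0;
	}
	return -1;
}
```

This matches the guest side in ioq_guest.c, where `ioq_hypercall()` forwards `__pa(data)` so the host can read the parameter block directly from guest memory.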
diff --git a/drivers/kvm/ioq_guest.c b/drivers/kvm/ioq_guest.c
new file mode 100644
index 0000000..068aeb1
--- /dev/null
+++ b/drivers/kvm/ioq_guest.c
@@ -0,0 +1,195 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * See include/linux/ioq.h for documentation
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/ioq.h>
+#include <asm/hypercall.h>
+
+#include "ioq.h"
+#include "kvm.h"
+
+struct kvmguest_ioq {
+ struct ioq ioq;
+ int irq;
+};
+
+static struct kvmguest_ioq *to_ioq(struct ioq *ioq)
+{
+ return container_of(ioq, struct kvmguest_ioq, ioq);
+}
+
+static int ioq_hypercall(unsigned long nr, void *data)
+{
+ return hypercall(2, __NR_hypercall_ioq, nr, __pa(data));
+}
+
+/*
+ * ------------------
+ * interrupt handler
+ * ------------------
+ */
+static irqreturn_t kvmguest_ioq_intr(int irq, void *dev)
+{
+ struct kvmguest_ioq *_ioq = to_ioq(dev);
+
+ ioq_wakeup(&_ioq->ioq);
+
+ return IRQ_HANDLED;
+}
+
+/*
+ * ------------------
+ * ioq implementation
+ * ------------------
+ */
+
+static int kvmguest_ioq_signal(struct ioq *ioq)
+{
+ return ioq_hypercall(IOQHC_SIGNAL, &ioq->id);
+}
+
+static void kvmguest_ioq_destroy(struct ioq *ioq)
+{
+ struct kvmguest_ioq *_ioq = to_ioq(ioq);
+ int ret;
+
+ ret = ioq_hypercall(IOQHC_UNREGISTER, &ioq->id);
+ BUG_ON(ret < 0);
+
+ free_irq(_ioq->irq, _ioq);
+ destroy_irq(_ioq->irq);
+
+ kfree(_ioq->ioq.ring);
+ kfree(_ioq->ioq.head_desc);
+ kfree(_ioq);
+}
+
+/*
+ * ------------------
+ * ioqmgr implementation
+ * ------------------
+ */
+static int kvmguest_ioq_register(struct kvmguest_ioq *ioq, ioq_id_t id,
+ int irq, void *ring)
+{
+ struct ioq_register data = {
+ .id = id,
+ .irq = irq,
+ .ring = (u64)__pa(ring),
+ };
+
+ return ioq_hypercall(IOQHC_REGISTER, &data);
+}
+
+static int kvmguest_ioq_create(struct ioq_mgr *t, struct ioq **ioq,
+ size_t ringsize, int flags)
+{
+ struct kvmguest_ioq *_ioq = NULL;
+ struct ioq_ring_head *head_desc = NULL;
+ void *ring = NULL;
+ size_t ringlen = sizeof(struct ioq_ring_desc) * ringsize;
+ int ret = -ENOMEM;
+
+ _ioq = kzalloc(sizeof(*_ioq), GFP_KERNEL);
+ if (!_ioq)
+ goto error;
+
+ head_desc = kzalloc(sizeof(*head_desc), GFP_KERNEL | GFP_DMA);
+ if (!head_desc)
+ goto error;
+
+ ring = kzalloc(ringlen, GFP_KERNEL | GFP_DMA);
+ if (!ring)
+ goto error;
+
+ head_desc->magic = IOQ_RING_MAGIC;
+ head_desc->ver = IOQ_RING_VER;
+ head_desc->id = (ioq_id_t)_ioq;
+ head_desc->count = ringsize;
+ head_desc->ptr = (u64)__pa(ring);
+
+ /* Dynamically assign a free IRQ to this resource */
+ _ioq->irq = create_irq();
+ if (_ioq->irq < 0)
+ goto error;
+
+ ioq_init(&_ioq->ioq);
+
+ _ioq->ioq.signal = kvmguest_ioq_signal;
+ _ioq->ioq.destroy = kvmguest_ioq_destroy;
+
+ _ioq->ioq.id = head_desc->id;
+ _ioq->ioq.locale = ioq_locality_north;
+ _ioq->ioq.mgr = t;
+ _ioq->ioq.head_desc = head_desc;
+ _ioq->ioq.ring = ring;
+
+ ret = request_irq(_ioq->irq, kvmguest_ioq_intr, 0, "KVM-IOQ", _ioq);
+ if (ret < 0)
+ goto error;
+
+ ret = kvmguest_ioq_register(_ioq, _ioq->ioq.id, _ioq->irq, ring);
+ if (ret < 0)
+ goto error;
+
+ *ioq = &_ioq->ioq;
+
+ return 0;
+
+ error:
+ /* kfree(NULL) is a no-op, so these may be called unconditionally */
+ kfree(ring);
+ kfree(head_desc);
+ kfree(_ioq);
+
+ /* FIXME: a successfully created irq is leaked on this path */
+
+ return ret;
+}
+
+static int kvmguest_ioq_connect(struct ioq_mgr *t, ioq_id_t id,
+ struct ioq **ioq, int flags)
+{
+ /* You cannot connect to queues on the guest */
+ return -EINVAL;
+}
+
+int kvmguest_ioqmgr_alloc(struct ioq_mgr **mgr)
+{
+ struct ioq_mgr *_mgr = kzalloc(sizeof(*_mgr), GFP_KERNEL);
+ if (!_mgr)
+ return -ENOMEM;
+
+ _mgr->create = kvmguest_ioq_create;
+ _mgr->connect = kvmguest_ioq_connect;
+
+ *mgr = _mgr;
+
+ return 0;
+}
+
+void kvmguest_ioqmgr_free(struct ioq_mgr *mgr)
+{
+ kfree(mgr);
+}
+
diff --git a/drivers/kvm/pvbus.h b/drivers/kvm/pvbus.h
new file mode 100644
index 0000000..3241ef0
--- /dev/null
+++ b/drivers/kvm/pvbus.h
@@ -0,0 +1,63 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef _KVM_PVBUS_H
+#define _KVM_PVBUS_H
+
+#include <linux/ioq.h>
+
+#define KVM_PVBUS_OP_REGISTER 1
+#define KVM_PVBUS_OP_UNREGISTER 2
+#define KVM_PVBUS_OP_CALL 3
+
+struct pvbus_register_params {
+ ioq_id_t qid;
+};
+
+struct pvbus_call_params {
+ u64 inst;
+ u32 func;
+ u64 data;
+ u64 len;
+};
+
+#define KVM_PVBUS_EVENT_ADD 1
+#define KVM_PVBUS_EVENT_DROP 2
+
+#define PVBUS_MAX_NAME 128
+
+struct pvbus_add_event {
+ char name[PVBUS_MAX_NAME];
+ u64 id;
+};
+
+struct pvbus_drop_event {
+ u64 id;
+};
+
+struct pvbus_event {
+ u32 eventid;
+ union {
+ struct pvbus_add_event add;
+ struct pvbus_drop_event drop;
+ } data;
+};
+
+#endif /* _KVM_PVBUS_H */
diff --git a/drivers/kvm/pvbus_guest.c b/drivers/kvm/pvbus_guest.c
new file mode 100644
index 0000000..56c3b50
--- /dev/null
+++ b/drivers/kvm/pvbus_guest.c
@@ -0,0 +1,382 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/module.h>
+#include <linux/pvbus.h>
+#include <linux/kvm_para.h>
+#include <linux/kvm.h>
+#include <linux/mm.h>
+#include <linux/ioq.h>
+#include <linux/interrupt.h>
+
+#include <asm/hypercall.h>
+
+#include "pvbus.h"
+
+MODULE_AUTHOR("Gregory Haskins");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1");
+
+int kvmguest_ioqmgr_alloc(struct ioq_mgr **mgr);
+void kvmguest_ioqmgr_free(struct ioq_mgr *mgr);
+
+static int kvm_pvbus_hypercall(unsigned long nr, void *data, unsigned long len)
+{
+ return hypercall(3, __NR_hypercall_pvbus, nr, __pa(data), len);
+}
+
+/*
+ * This is the vm-syscall address - to be patched by the host to
+ * VMCALL (Intel) or VMMCALL (AMD), depending on the CPU model:
+ */
+asm (
+ " .globl hypercall_addr \n"
+ " .align 4 \n"
+ " hypercall_addr: \n"
+ " movl $-38, %eax \n"
+ " ret \n"
+);
+
+extern unsigned char hypercall_addr[6];
+
+#ifndef CONFIG_X86_64
+static DEFINE_PER_CPU(struct kvm_vcpu_para_state, para_state);
+#endif
+
+static int __init kvm_pvbus_probe(void)
+{
+ struct page *hypercall_addr_page;
+ struct kvm_vcpu_para_state *para_state;
+
+#ifdef CONFIG_X86_64
+ struct page *pstate_page;
+ if ((pstate_page = alloc_page(GFP_KERNEL)) == NULL)
+ return -ENOMEM;
+ para_state = (struct kvm_vcpu_para_state*)page_address(pstate_page);
+#else
+ para_state = &__get_cpu_var(para_state); /* this CPU's instance */
+#endif
+ /*
+ * Try to write to a magic MSR (which is invalid on any real CPU),
+ * and thus signal to KVM that we wish to enter para-virtualized
+ * mode:
+ */
+ para_state->guest_version = KVM_PARA_API_VERSION;
+ para_state->host_version = -1;
+ para_state->size = sizeof(*para_state);
+ para_state->ret = -1;
+
+ hypercall_addr_page = vmalloc_to_page(hypercall_addr);
+ para_state->hypercall_gpa = page_to_pfn(hypercall_addr_page)
+ << PAGE_SHIFT | offset_in_page(hypercall_addr);
+ printk(KERN_DEBUG "kvm guest: hypercall gpa is 0x%lx\n",
+ (long)para_state->hypercall_gpa);
+
+ if (wrmsr_safe(MSR_KVM_API_MAGIC, __pa(para_state), 0)) {
+ printk(KERN_INFO "KVM guest: WRMSR probe failed.\n");
+ return -1;
+ }
+
+ printk(KERN_DEBUG "kvm guest: host returned %d\n",
+ para_state->ret);
+ printk(KERN_DEBUG "kvm guest: host version: %d\n",
+ para_state->host_version);
+ printk(KERN_DEBUG "kvm guest: syscall entry: %02x %02x %02x %02x\n",
+ hypercall_addr[0], hypercall_addr[1],
+ hypercall_addr[2], hypercall_addr[3]);
+
+ if (para_state->ret) {
+ printk(KERN_ERR "kvm guest: host refused registration.\n");
+ return -1;
+ }
+
+ return 0;
+}
+
+struct kvm_pvbus {
+ int connected;
+ struct ioq_mgr *ioqmgr;
+ struct ioq *ioq;
+ struct ioq_notifier ioqn;
+ struct tasklet_struct task;
+};
+
+static struct kvm_pvbus kvm_pvbus;
+
+struct kvm_pvbus_device {
+ struct pvbus_device pvbdev;
+ char name[PVBUS_MAX_NAME];
+};
+
+static int kvm_pvbus_createqueue(struct pvbus_device *dev, struct ioq **ioq,
+ size_t ringsize, int flags)
+{
+ struct ioq_mgr *ioqmgr = kvm_pvbus.ioqmgr;
+
+ return ioqmgr->create(ioqmgr, ioq, ringsize, flags);
+}
+
+static int kvm_pvbus_call(struct pvbus_device *dev, u32 func, void *data,
+ size_t len, int flags)
+{
+ struct pvbus_call_params params = {
+ .inst = dev->id,
+ .func = func,
+ .data = (u64)__pa(data),
+ .len = len,
+ };
+
+ return kvm_pvbus_hypercall(KVM_PVBUS_OP_CALL, &params, sizeof(params));
+}
+
+static void kvm_pvbus_add_event(struct pvbus_add_event *event)
+{
+ int ret;
+ struct kvm_pvbus_device *new = kzalloc(sizeof(*new), GFP_KERNEL);
+ if (!new) {
+ printk(KERN_ERR "KVM_PVBUS: Out of memory on add_event\n");
+ return;
+ }
+
+ memcpy(new->name, event->name, PVBUS_MAX_NAME);
+ new->pvbdev.name = new->name;
+ new->pvbdev.id = event->id;
+ new->pvbdev.createqueue = kvm_pvbus_createqueue;
+ new->pvbdev.call = kvm_pvbus_call;
+
+ sprintf(new->pvbdev.dev.bus_id, "%llu", (unsigned long long)event->id);
+
+ ret = pvbus_device_register(&new->pvbdev);
+ BUG_ON(ret < 0);
+}
+
+static void kvm_pvbus_drop_event(struct pvbus_drop_event *event)
+{
+#if 0 /* FIXME */
+ int ret = pvbus_device_unregister(event->id);
+ BUG_ON(ret < 0);
+#endif
+}
+
+/* INTR-Layer2: Invoked whenever layer 1 schedules our tasklet */
+static void kvm_pvbus_intr_l2(unsigned long _data)
+{
+ struct ioq_iterator iter;
+ int ret;
+
+ /* We want to iterate on the tail of the in-use index */
+ ret = ioq_iter_init(kvm_pvbus.ioq, &iter, ioq_idxtype_inuse, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_tail, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * The EOM is indicated by finding a packet that is still owned by
+ * the south side.
+ *
+ * FIXME: This in theory could run indefinitely if the host keeps
+ * feeding us events since there is nothing like a NAPI budget. We
+ * might need to address that
+ */
+ while (!iter.desc->sown) {
+ struct ioq_ring_desc *desc = iter.desc;
+ struct pvbus_event *event =
+ (struct pvbus_event *)(unsigned long)desc->cookie;
+
+ switch (event->eventid) {
+ case KVM_PVBUS_EVENT_ADD:
+ kvm_pvbus_add_event(&event->data.add);
+ break;
+ case KVM_PVBUS_EVENT_DROP:
+ kvm_pvbus_drop_event(&event->data.drop);
+ break;
+ default:
+ printk(KERN_WARNING "KVM_PVBUS: Unexpected event %d\n",
+ event->eventid);
+ break;
+ }
+
+ memset(event, 0, sizeof(*event));
+
+ mb();
+ desc->sown = 1; /* give ownership back to the south */
+ mb();
+
+ /* Advance the in-use tail */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+
+ /* And let the south side know that we changed the rx-queue */
+ ioq_signal(kvm_pvbus.ioq, 0);
+}
+
+/* INTR-Layer1: Invoked whenever the host issues an ioq_signal() */
+static void kvm_pvbus_intr_l1(struct ioq_notifier *ioqn)
+{
+ tasklet_schedule(&kvm_pvbus.task);
+}
+
+static int __init kvm_pvbus_register(void)
+{
+ struct pvbus_register_params params = {
+ .qid = kvm_pvbus.ioq->id,
+ };
+
+ return kvm_pvbus_hypercall(KVM_PVBUS_OP_REGISTER,
+ &params, sizeof(params));
+}
+
+static int __init kvm_pvbus_setup_ring(void)
+{
+ struct ioq *ioq = kvm_pvbus.ioq;
+ struct ioq_iterator iter;
+ int ret;
+
+ /*
+ * We want to iterate on the "valid" index. By default the iterator
+ * will not "autoupdate" which means it will not hypercall the host
+ * with our changes. This is good, because we are really just
+ * initializing stuff here anyway. Note that you can always manually
+ * signal the host with ioq_signal() if the autoupdate feature is not
+ * used.
+ */
+ ret = ioq_iter_init(ioq, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * Seek to the head of the valid index (which should be our first
+ * item since the queue is brand-new)
+ */
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * Now populate each descriptor with an empty pvbus_event and mark it
+ * valid
+ */
+ while (!iter.desc->valid) {
+ struct pvbus_event *event;
+ size_t len = sizeof(*event);
+ struct ioq_ring_desc *desc = iter.desc;
+
+ event = kzalloc(sizeof(*event), GFP_KERNEL);
+ if (!event)
+ return -ENOMEM;
+
+ desc->cookie = (u64)(unsigned long)event;
+ desc->ptr = (u64)__pa(event);
+ desc->len = len; /* total length */
+ desc->alen = 0; /* actual length - filled in by host */
+
+ /*
+ * We don't need any barriers here because the ring is not used
+ * yet
+ */
+ desc->valid = 1;
+ desc->sown = 1; /* give ownership to the south */
+
+ /*
+ * This push operation will simultaneously advance the
+ * valid-head index and increment our position in the queue
+ * by one.
+ */
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+
+ return 0;
+}
+
+int __init kvm_pvbus_init(void)
+{
+ struct ioq_mgr *ioqmgr = NULL;
+ int ret;
+
+ memset(&kvm_pvbus, 0, sizeof(kvm_pvbus));
+
+ ret = kvm_pvbus_probe();
+ if (ret < 0)
+ return ret;
+
+ kvm_pvbus.connected = 1;
+
+ /* Allocate an IOQ-manager to use for all operations */
+ ret = kvmguest_ioqmgr_alloc(&ioqmgr);
+ if (ret < 0) {
+ printk(KERN_ERR "KVM_PVBUS: Could not create ioqmgr\n");
+ return ret;
+ }
+
+ kvm_pvbus.ioqmgr = ioqmgr;
+
+ /* Now allocate an IOQ to use for hotplug notification */
+ ret = ioqmgr->create(ioqmgr, &kvm_pvbus.ioq, 32, 0);
+ if (ret < 0) {
+ printk(KERN_ERR "KVM_PVBUS: Could not create hotplug ioq\n");
+ goto out_fail;
+ }
+
+ ret = kvm_pvbus_setup_ring();
+ if (ret < 0) {
+ printk(KERN_ERR "KVM_PVBUS: Could not set up ring\n");
+ goto out_fail;
+ }
+
+ /* Setup our interrupt callback */
+ kvm_pvbus.ioqn.signal = kvm_pvbus_intr_l1;
+ kvm_pvbus.ioq->notifier = &kvm_pvbus.ioqn;
+ tasklet_init(&kvm_pvbus.task, kvm_pvbus_intr_l2, 0);
+
+ /*
+ * Finally register our queue on the host to start receiving hotplug
+ * updates
+ */
+ ret = kvm_pvbus_register();
+ if (ret < 0) {
+ printk(KERN_ERR "KVM_PVBUS: Could not register with host\n");
+ goto out_fail;
+ }
+
+ return 0;
+
+ out_fail:
+ kvmguest_ioqmgr_free(ioqmgr);
+
+ return ret;
+}
+
+static void __exit kvm_pvbus_exit(void)
+{
+ if (kvm_pvbus.connected)
+ kvm_pvbus_hypercall(KVM_PVBUS_OP_UNREGISTER, NULL, 0);
+
+ if (kvm_pvbus.ioq)
+ kvm_pvbus.ioq->destroy(kvm_pvbus.ioq);
+
+ kvmguest_ioqmgr_free(kvm_pvbus.ioqmgr);
+}
+
+module_init(kvm_pvbus_init);
+module_exit(kvm_pvbus_exit);
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 992aeec..bc2b51e 100755
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -377,13 +377,15 @@ struct kvm_pvnet_config {
* No registers are clobbered by the hypercall, except that the
* return value is in RAX.
*/
-#define KVM_NR_HYPERCALLS 5
+#define KVM_NR_HYPERCALLS 7
#define __NR_hypercall_test 0
#define __NR_hypercall_register_eth 1
#define __NR_hypercall_send_eth 2
#define __NR_hypercall_set_multicast_eth 3
#define __NR_hypercall_start_stop_eth 4
+#define __NR_hypercall_ioq 5
+#define __NR_hypercall_pvbus 6
#define __NR_hypercall_balloon (KVM_NR_HYPERCALLS + 0)
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
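The guest-side driver above drains its event ring by testing a per-descriptor ownership flag: kvm_pvbus_setup_ring() hands every descriptor to the south side (`sown = 1`), and kvm_pvbus_intr_l2() stops at the first descriptor the south still owns. That handshake can be modeled in a few lines of userspace C; this is a sketch only, with illustrative names, and it elides the mb() barriers and ring iterators the real code needs because guest and host run concurrently:

```c
#include <assert.h>

/* One IOQ-style ring descriptor: sown == 1 means the south side
 * (host) owns it and may fill it with the next event. */
struct desc {
    int sown;
    int event;      /* stand-in for the pvbus_event payload */
};

#define RING_SIZE 4
static struct desc ring[RING_SIZE];

/* North-side setup: hand every descriptor to the south, as
 * kvm_pvbus_setup_ring() does. */
static void setup(void)
{
    for (int i = 0; i < RING_SIZE; i++) {
        ring[i].event = 0;
        ring[i].sown = 1;
    }
}

/* South side posts an event into a descriptor it still owns. */
static int south_post(int i, int event)
{
    if (!ring[i].sown)
        return -1;          /* north has not given it back yet */
    ring[i].event = event;
    ring[i].sown = 0;       /* pass ownership north */
    return 0;
}

/* North-side drain: consume until the first south-owned descriptor,
 * the EOM condition in kvm_pvbus_intr_l2(). */
static int north_drain(void)
{
    int handled = 0;

    for (int i = 0; i < RING_SIZE && !ring[i].sown; i++) {
        ring[i].event = 0;
        ring[i].sown = 1;   /* give ownership back to the south */
        handled++;
    }
    return handled;
}
```

Because each descriptor has exactly one owner at a time, neither side ever needs a lock on the ring itself; only the ownership flip must be ordered, which is what the barriers in the patch provide.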
^ permalink raw reply related [flat|nested] 41+ messages in thread

* [PATCH 07/10] KVM: Add a gpa_to_hva helper function
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (5 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 06/10] KVM: Add a guest side driver for IOQ Gregory Haskins
@ 2007-08-16 23:14 ` Gregory Haskins
2007-08-16 23:14 ` [PATCH 08/10] KVM: Add support for IOQ Gregory Haskins
` (3 subsequent siblings)
10 siblings, 0 replies; 41+ messages in thread
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
drivers/kvm/kvm.h | 1 +
drivers/kvm/mmu.c | 12 ++++++++++++
2 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 9934f11..05d5be1 100755
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -475,6 +475,7 @@ void vcpu_load(struct kvm_vcpu *vcpu);
void vcpu_put(struct kvm_vcpu *vcpu);
hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
+void *gpa_to_hva(struct kvm *kvm, gpa_t gpa);
#define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
#define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB)
static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index e84c599..daaf0d2 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -766,6 +766,18 @@ hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
}
EXPORT_SYMBOL_GPL(gpa_to_hpa);
+void *gpa_to_hva(struct kvm *kvm, gpa_t gpa)
+{
+ struct page *page;
+
+ if (gpa & HPA_ERR_MASK)
+ return NULL;
+
+ page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
+ return kmap_atomic(page, KM_USER0) + offset_in_page(gpa);
+}
+EXPORT_SYMBOL_GPL(gpa_to_hva);
+
hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
{
gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, gva);
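gpa_to_hva() above splits a guest-physical address into a frame number (for gfn_to_page()) and a page offset (added back after the frame is mapped), and kvm_pvbus_probe() performs the inverse recombination for hypercall_gpa. The arithmetic can be checked in isolation with a small userspace sketch; 4 KiB x86 pages are assumed and the helper names are illustrative, not kernel API:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants: x86 4 KiB pages assumed. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Frame number of a guest-physical address (what gfn_to_page() is
 * handed after the shift in gpa_to_hva()). */
static uint64_t gfn_of(uint64_t gpa)
{
    return gpa >> PAGE_SHIFT;
}

/* Byte offset within the page (what must be added back once the
 * frame is mapped). */
static uint64_t offset_of(uint64_t gpa)
{
    return gpa & ~PAGE_MASK;
}

/* Recombination, as kvm_pvbus_probe() does for hypercall_gpa. */
static uint64_t make_gpa(uint64_t pfn, uint64_t off)
{
    return (pfn << PAGE_SHIFT) | off;
}
```

Splitting then recombining is lossless, which is why the host can reconstruct the exact guest address from (pfn, offset) pairs passed over the hypercall.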
^ permalink raw reply related [flat|nested] 41+ messages in thread

* [PATCH 08/10] KVM: Add support for IOQ
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (6 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 07/10] KVM: Add a gpa_to_hva helper function Gregory Haskins
@ 2007-08-16 23:14 ` Gregory Haskins
2007-08-16 23:14 ` [PATCH 09/10] KVM: Add PVBUS support to the KVM host Gregory Haskins
` (2 subsequent siblings)
10 siblings, 0 replies; 41+ messages in thread
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
IOQ is a shared-memory-queue interface for implementing PV driver
communication.
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
drivers/kvm/Kconfig | 5 +
drivers/kvm/Makefile | 3
drivers/kvm/ioq.h | 12 +-
drivers/kvm/ioq_host.c | 365 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/kvm/kvm.h | 5 +
drivers/kvm/kvm_main.c | 3
include/linux/kvm.h | 1
7 files changed, 393 insertions(+), 1 deletions(-)
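The IOQ manager added by this patch is a small ops table: each locale fills in only the operations legal on its side (create on the guest/north side, connect on the host/south side) and the other slot simply fails with -EINVAL. A toy userspace rendering of that dispatch shape follows; all names are illustrative and -22 stands in for -EINVAL:

```c
#include <assert.h>

struct mgr;
typedef int (*mgr_op)(struct mgr *m, int id);

/* Minimal model of struct ioq_mgr: two operation slots, and each
 * side of the link populates only the one it supports. */
struct mgr {
    mgr_op create;   /* legal on the north (guest) side only */
    mgr_op connect;  /* legal on the south (host) side only */
};

static int op_ok(struct mgr *m, int id)
{
    (void)m; (void)id;
    return 0;               /* operation accepted */
}

static int op_einval(struct mgr *m, int id)
{
    (void)m; (void)id;
    return -22;             /* stand-in for -EINVAL */
}

/* North: queues are created here, never connected to. */
static struct mgr guest_mgr = { .create = op_ok, .connect = op_einval };

/* South: queues are connected to by id, never created. */
static struct mgr host_mgr  = { .create = op_einval, .connect = op_ok };
```

Keeping the refusal inside the ops table means callers use one code path on both sides and the locale restriction cannot be bypassed by accident.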
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index aca79d1..d9def33 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -47,6 +47,11 @@ config KVM_BALLOON
The driver inflate/deflate guest physical memory on demand.
This ability provides memory over commit for the host
+config KVM_IOQ_HOST
+ boolean "Add IOQ support to KVM"
+ depends on KVM
+ select IOQ
+
config KVM_NET_HOST
tristate "Para virtual network host device"
depends on KVM
diff --git a/drivers/kvm/Makefile b/drivers/kvm/Makefile
index c6a59bb..2095061 100644
--- a/drivers/kvm/Makefile
+++ b/drivers/kvm/Makefile
@@ -4,6 +4,9 @@
EXTRA_CFLAGS :=
kvm-objs := kvm_main.o mmu.o x86_emulate.o
+ifeq ($(CONFIG_KVM_IOQ_HOST),y)
+kvm-objs += ioq_host.o
+endif
obj-$(CONFIG_KVM) += kvm.o
kvm-intel-objs = vmx.o
obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
diff --git a/drivers/kvm/ioq.h b/drivers/kvm/ioq.h
index 7e955f1..b942113 100644
--- a/drivers/kvm/ioq.h
+++ b/drivers/kvm/ioq.h
@@ -25,7 +25,17 @@
#include <linux/ioq.h>
-#define IOQHC_REGISTER 1
+struct kvm;
+
+#ifdef CONFIG_KVM_IOQ_HOST
+int kvmhost_ioqmgr_init(struct kvm *kvm);
+int kvmhost_ioqmgr_module_init(void);
+#else
+static inline int kvmhost_ioqmgr_init(struct kvm *kvm) { return 0; }
+static inline int kvmhost_ioqmgr_module_init(void) { return 0; }
+#endif
+
+#define IOQHC_REGISTER 1
#define IOQHC_UNREGISTER 2
#define IOQHC_SIGNAL 3
diff --git a/drivers/kvm/ioq_host.c b/drivers/kvm/ioq_host.c
new file mode 100644
index 0000000..413f103
--- /dev/null
+++ b/drivers/kvm/ioq_host.c
@@ -0,0 +1,365 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * See include/linux/ioq.h for documentation
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/ioq.h>
+#include <linux/rbtree.h>
+#include <linux/spinlock.h>
+#include <linux/highmem.h>
+
+#include <asm/atomic.h>
+
+#include "ioq.h"
+#include "kvm.h"
+
+struct kvmhost_ioq {
+ struct ioq ioq;
+ struct rb_node node;
+ atomic_t refcnt;
+ struct kvm_vcpu *vcpu;
+ int irq;
+};
+
+struct kvmhost_map {
+ spinlock_t lock;
+ struct rb_root root;
+};
+
+struct kvmhost_ioq_mgr {
+ struct ioq_mgr mgr;
+ struct kvm *kvm;
+ struct kvmhost_map map;
+};
+
+static struct kvmhost_ioq *to_ioq(struct ioq *ioq)
+{
+ return container_of(ioq, struct kvmhost_ioq, ioq);
+}
+
+static struct kvmhost_ioq_mgr *to_mgr(struct ioq_mgr *mgr)
+{
+ return container_of(mgr, struct kvmhost_ioq_mgr, mgr);
+}
+
+/*
+ * ------------------
+ * rb map management
+ * ------------------
+ */
+
+static void kvmhost_map_init(struct kvmhost_map *map)
+{
+ spin_lock_init(&map->lock);
+ map->root = RB_ROOT;
+}
+
+static int kvmhost_map_register(struct kvmhost_map *map,
+ struct kvmhost_ioq *ioq)
+{
+ int ret = 0;
+ struct rb_root *root;
+ struct rb_node **new, *parent = NULL;
+
+ spin_lock(&map->lock);
+
+ root = &map->root;
+ new = &(root->rb_node);
+
+ /* Figure out where to put new node */
+ while (*new) {
+ struct kvmhost_ioq *this;
+
+ this = container_of(*new, struct kvmhost_ioq, node);
+ parent = *new;
+
+ if (ioq->ioq.id < this->ioq.id)
+ new = &((*new)->rb_left);
+ else if (ioq->ioq.id > this->ioq.id)
+ new = &((*new)->rb_right);
+ else {
+ ret = -EEXIST;
+ break;
+ }
+ }
+
+ if (!ret) {
+ /* Add new node and rebalance tree. */
+ rb_link_node(&ioq->node, parent, new);
+ rb_insert_color(&ioq->node, root);
+ }
+
+ spin_unlock(&map->lock);
+
+ return ret;
+}
+
+static struct kvmhost_ioq* kvmhost_map_find(struct kvmhost_map *map,
+ ioq_id_t id)
+{
+ struct rb_node *node;
+ struct kvmhost_ioq *ioq = NULL;
+
+ spin_lock(&map->lock);
+
+ node = map->root.rb_node;
+
+ while (node) {
+ struct kvmhost_ioq *_ioq;
+
+ _ioq = container_of(node, struct kvmhost_ioq, node);
+
+ if (id < _ioq->ioq.id)
+ node = node->rb_left;
+ else if (id > _ioq->ioq.id)
+ node = node->rb_right;
+ else {
+ ioq = _ioq;
+ break;
+ }
+ }
+
+ spin_unlock(&map->lock);
+
+ return ioq;
+}
+
+static void kvmhost_map_erase(struct kvmhost_map *map,
+ struct kvmhost_ioq *ioq)
+{
+ spin_lock(&map->lock);
+ rb_erase(&ioq->node, &map->root);
+ spin_unlock(&map->lock);
+}
+
+/*
+ * ------------------
+ * ioq implementation
+ * ------------------
+ */
+
+static int kvmhost_ioq_signal(struct ioq *ioq)
+{
+ struct kvmhost_ioq *_ioq = to_ioq(ioq);
+ BUG_ON(!_ioq);
+
+ /*
+ * FIXME: Inject an interrupt to the guest for "id"
+ *
+ * We will have to decide if we will have 1:1 IOQ:IRQ, or if we
+ * will aggregate all IOQs through a single IRQ. For purposes of
+ * example, we will assume 1:1.
+ */
+
+ /* kvm_vcpu_send_interrupt(_ioq->vcpu, _ioq->irq); */
+
+ return 0;
+}
+
+static void kvmhost_ioq_destroy(struct ioq *ioq)
+{
+ struct kvmhost_ioq *_ioq = to_ioq(ioq);
+
+ if (atomic_dec_and_test(&_ioq->refcnt))
+ kfree(_ioq);
+}
+
+static struct kvmhost_ioq* kvmhost_ioq_alloc(struct ioq_mgr *t,
+ struct kvm_vcpu *vcpu,
+ ioq_id_t id, int irq, gpa_t ring)
+{
+ struct kvmhost_ioq *_ioq;
+ struct ioq *ioq;
+
+ _ioq = kzalloc(sizeof(*_ioq), GFP_KERNEL);
+ if (!_ioq)
+ return NULL;
+
+ ioq = &_ioq->ioq;
+
+ atomic_set(&_ioq->refcnt, 1);
+ _ioq->vcpu = vcpu;
+ _ioq->irq = irq;
+
+ ioq_init(&_ioq->ioq);
+
+ ioq->signal = kvmhost_ioq_signal;
+ ioq->destroy = kvmhost_ioq_destroy;
+
+ ioq->id = id;
+ ioq->locale = ioq_locality_south;
+ ioq->mgr = t;
+ ioq->head_desc = (struct ioq_ring_head*)gpa_to_hva(vcpu->kvm, ring);
+ ioq->ring = (struct ioq_ring_desc*)gpa_to_hva(vcpu->kvm,
+ ioq->head_desc->ptr);
+
+ return _ioq;
+}
+
+/*
+ * ------------------
+ * hypercall implementation
+ * ------------------
+ */
+
+static int kvmhost_ioq_hc_register(struct ioq_mgr *t, struct kvm_vcpu *vcpu,
+ ioq_id_t id, int irq, gpa_t ring)
+{
+ struct kvmhost_ioq *_ioq = kvmhost_ioq_alloc(t, vcpu, id, irq, ring);
+ int ret;
+
+ if (!_ioq)
+ return -ENOMEM;
+
+ ret = kvmhost_map_register(&to_mgr(t)->map, _ioq);
+ if (ret < 0)
+ kvmhost_ioq_destroy(&_ioq->ioq);
+
+ return ret;
+}
+
+static int kvmhost_ioq_hc_unregister(struct ioq_mgr *t, ioq_id_t id)
+{
+ struct kvmhost_ioq_mgr *_mgr = to_mgr(t);
+ struct kvmhost_ioq *_ioq = kvmhost_map_find(&_mgr->map, id);
+
+ if (!_ioq)
+ return -ENOENT;
+
+ kvmhost_map_erase(&_mgr->map, _ioq);
+ kvmhost_ioq_destroy(&_ioq->ioq);
+
+ return 0;
+}
+
+static int kvmhost_ioq_hc_signal(struct ioq_mgr *t, ioq_id_t id)
+{
+ struct kvmhost_ioq *_ioq = kvmhost_map_find(&to_mgr(t)->map, id);
+
+ if (!_ioq)
+ return -ENOENT;
+
+ ioq_wakeup(&_ioq->ioq);
+
+ return 0;
+}
+
+/*
+ * Our hypercall format will always follow with the call-id in arg[0] and
+ * a pointer to the arguments in arg[1]
+ */
+static unsigned long kvmhost_hc(struct kvm_vcpu *vcpu, unsigned long args[])
+{
+ struct ioq_mgr *t = vcpu->kvm->ioqmgr;
+ void *vdata = gpa_to_hva(vcpu->kvm, args[1]);
+ int ret = -EINVAL;
+
+ if (!vdata)
+ return -EINVAL;
+
+ /*
+ * FIXME: we need to make sure that the pointer is sane
+ * so a malicious guest cannot crash the host.
+ */
+
+ switch (args[0]) {
+ case IOQHC_REGISTER: {
+ struct ioq_register *data = (struct ioq_register*)vdata;
+ ret = kvmhost_ioq_hc_register(t, vcpu,
+ data->id,
+ data->irq,
+ data->ring);
+ break;
+ }
+ case IOQHC_UNREGISTER: {
+ ioq_id_t *id = (ioq_id_t*)vdata;
+ ret = kvmhost_ioq_hc_unregister(t, *id);
+ break;
+ }
+ case IOQHC_SIGNAL: {
+ ioq_id_t *id = (ioq_id_t*)vdata;
+ ret = kvmhost_ioq_hc_signal(t, *id);
+ break;
+ }
+ }
+
+ /* FIXME: unmap the vdata? */
+
+ return ret;
+}
+
+/*
+ * ------------------
+ * ioqmgr implementation
+ * ------------------
+ */
+
+static int kvmhost_ioq_create(struct ioq_mgr *t, struct ioq **ioq,
+ size_t ringsize, int flags)
+{
+ /* You cannot create queues on the host */
+ return -EINVAL;
+}
+
+static int kvmhost_ioq_connect(struct ioq_mgr *t, ioq_id_t id,
+ struct ioq **ioq, int flags)
+{
+ struct kvmhost_ioq *_ioq = kvmhost_map_find(&to_mgr(t)->map, id);
+
+ if (!_ioq)
+ return -ENOENT;
+
+ atomic_inc(&_ioq->refcnt);
+ *ioq = &_ioq->ioq;
+
+ return 0;
+}
+
+int kvmhost_ioqmgr_init(struct kvm *kvm)
+{
+ struct kvmhost_ioq_mgr *_mgr = kzalloc(sizeof(*_mgr), GFP_KERNEL);
+ if (!_mgr)
+ return -ENOMEM;
+
+ _mgr->kvm = kvm;
+ kvmhost_map_init(&_mgr->map);
+
+ _mgr->mgr.create = kvmhost_ioq_create;
+ _mgr->mgr.connect = kvmhost_ioq_connect;
+
+ kvm->ioqmgr = &_mgr->mgr;
+
+ return 0;
+}
+
+__init int kvmhost_ioqmgr_module_init(void)
+{
+ struct kvm_hypercall hc;
+
+ hc.hypercall = kvmhost_hc;
+ hc.idx = __NR_hypercall_ioq;
+
+ kvm_register_hypercall(THIS_MODULE, &hc);
+
+ return 0;
+}
+
diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 05d5be1..c38c84f 100755
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -16,6 +16,7 @@
#include <linux/netdevice.h>
#include "vmx.h"
+#include "ioq.h"
#include <linux/kvm.h>
#include <linux/kvm_para.h>
@@ -389,6 +390,10 @@ struct kvm {
struct list_head vm_list;
struct net_device *netdev;
struct file *filp;
+#ifdef CONFIG_KVM_IOQ_HOST
+ struct ioq_mgr *ioqmgr;
+#endif
+
};
struct descriptor_table {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index f252b39..fbffd2f 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -349,6 +349,7 @@ static struct kvm *kvm_create_vm(void)
list_add(&kvm->vm_list, &vm_list);
spin_unlock(&kvm_lock);
}
+ kvmhost_ioqmgr_init(kvm);
return kvm;
}
@@ -3614,6 +3615,8 @@ static __init int kvm_init(void)
bad_page_address = page_to_pfn(bad_page) << PAGE_SHIFT;
memset(__va(bad_page_address), 0, PAGE_SIZE);
+ kvmhost_ioqmgr_module_init();
+
return 0;
out:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index bc2b51e..2cceae3 100755
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -377,6 +377,7 @@ struct kvm_pvnet_config {
* No registers are clobbered by the hypercall, except that the
* return value is in RAX.
*/
+
#define KVM_NR_HYPERCALLS 7
#define __NR_hypercall_test 0
^ permalink raw reply related [flat|nested] 41+ messages in thread

* [PATCH 09/10] KVM: Add PVBUS support to the KVM host
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (7 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 08/10] KVM: Add support for IOQ Gregory Haskins
@ 2007-08-16 23:14 ` Gregory Haskins
2007-08-16 23:14 ` [PATCH 10/10] KVM: Add an IOQNET backend driver Gregory Haskins
2007-08-17 1:25 ` [PATCH 00/10] PV-IO v3 Rusty Russell
10 siblings, 0 replies; 41+ messages in thread
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
PVBUS allows VMM agnostic PV drivers to discover/configure virtual resources
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
drivers/kvm/Kconfig | 10 +
drivers/kvm/Makefile | 3
drivers/kvm/kvm.h | 4
drivers/kvm/kvm_main.c | 4
drivers/kvm/pvbus_host.c | 636 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/kvm/pvbus_host.h | 66 +++++
6 files changed, 723 insertions(+), 0 deletions(-)
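The pvbus_map introduced by this patch keeps its rbtree generic by delegating key extraction and comparison to `getkey`/`compare` callbacks supplied by the client. The same pattern can be modeled in userspace with a plain unbalanced binary tree; this is a sketch with illustrative names (the kernel version rebalances via rb_link_node()/rb_insert_color(), which is omitted here):

```c
#include <assert.h>
#include <stddef.h>

/* Generic map node; the map never looks inside the client object,
 * it asks the callbacks instead (same shape as pvbus_map). */
struct node {
    struct node *left, *right;
};

struct map {
    int (*compare)(const void *l, const void *r);
    const void *(*getkey)(struct node *n);
    struct node *root;
};

static int map_register(struct map *m, struct node *n)
{
    struct node **link = &m->root;

    n->left = n->right = NULL;
    while (*link) {
        int c = m->compare(m->getkey(n), m->getkey(*link));

        if (c < 0)
            link = &(*link)->left;
        else if (c > 0)
            link = &(*link)->right;
        else
            return -1;      /* stand-in for -EEXIST */
    }
    *link = n;
    return 0;
}

/* Lookup uses compare(key, node-key) so it descends the same way
 * map_register() does: smaller keys to the left. */
static struct node *map_find(struct map *m, const void *key)
{
    struct node *n = m->root;

    while (n) {
        int c = m->compare(key, m->getkey(n));

        if (c < 0)
            n = n->left;
        else if (c > 0)
            n = n->right;
        else
            break;
    }
    return n;
}

/* Example client keyed by an integer id. */
struct dev {
    int id;
    struct node node;
};

static const void *dev_getkey(struct node *n)
{
    struct dev *d = (struct dev *)((char *)n - offsetof(struct dev, node));
    return &d->id;
}

static int dev_compare(const void *l, const void *r)
{
    return *(const int *)l - *(const int *)r;
}

/* Fixtures for a quick self-check. */
static struct map devmap = { dev_compare, dev_getkey, NULL };
static struct dev d3 = { .id = 3 }, d5 = { .id = 5 }, d9 = { .id = 9 };
```

The crucial detail is that insert and lookup must agree on which comparison argument is the search key; mixing the argument order between the two walks silently inverts the traversal and lookups start failing for keys that are present.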
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index d9def33..9f2ef22 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -52,6 +52,16 @@ config KVM_IOQ_HOST
depends on KVM
select IOQ
+config KVM_PVBUS_HOST
+ boolean "Paravirtualized Bus (PVBUS) host support"
+ depends on KVM
+ select KVM_IOQ_HOST
+ ---help---
+ PVBUS is an infrastructure for generic PV drivers to take advantage
+ of an underlying hypervisor without having to understand the details
+ of the hypervisor itself. You only need this option if you plan to
+ run PVBUS based PV guests in KVM.
+
config KVM_NET_HOST
tristate "Para virtual network host device"
depends on KVM
diff --git a/drivers/kvm/Makefile b/drivers/kvm/Makefile
index 2095061..8926fa9 100644
--- a/drivers/kvm/Makefile
+++ b/drivers/kvm/Makefile
@@ -7,6 +7,9 @@ kvm-objs := kvm_main.o mmu.o x86_emulate.o
ifeq ($(CONFIG_KVM_IOQ_HOST),y)
kvm-objs += ioq_host.o
endif
+ifeq ($(CONFIG_KVM_PVBUS_HOST),y)
+kvm-objs += pvbus_host.o
+endif
obj-$(CONFIG_KVM) += kvm.o
kvm-intel-objs = vmx.o
obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index c38c84f..8dc9ac3 100755
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -14,6 +14,7 @@
#include <linux/sched.h>
#include <linux/mm.h>
#include <linux/netdevice.h>
+#include <linux/pvbus.h>
#include "vmx.h"
#include "ioq.h"
@@ -393,6 +394,9 @@ struct kvm {
#ifdef CONFIG_KVM_IOQ_HOST
struct ioq_mgr *ioqmgr;
#endif
+#ifdef CONFIG_KVM_PVBUS_HOST
+ struct kvm_pvbus *pvbus;
+#endif
};
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index fbffd2f..d35ce8d 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -44,6 +44,7 @@
#include "x86_emulate.h"
#include "segment_descriptor.h"
+#include "pvbus_host.h"
MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");
@@ -350,6 +351,7 @@ static struct kvm *kvm_create_vm(void)
spin_unlock(&kvm_lock);
}
kvmhost_ioqmgr_init(kvm);
+ kvm_pvbus_init(kvm);
return kvm;
}
@@ -3616,6 +3618,7 @@ static __init int kvm_init(void)
memset(__va(bad_page_address), 0, PAGE_SIZE);
kvmhost_ioqmgr_module_init();
+ kvm_pvbus_module_init();
return 0;
@@ -3637,6 +3640,7 @@ static __exit void kvm_exit(void)
mntput(kvmfs_mnt);
unregister_filesystem(&kvm_fs_type);
kvm_mmu_module_exit();
+ kvm_pvbus_module_exit();
}
module_init(kvm_init)
diff --git a/drivers/kvm/pvbus_host.c b/drivers/kvm/pvbus_host.c
new file mode 100644
index 0000000..cc506f4
--- /dev/null
+++ b/drivers/kvm/pvbus_host.c
@@ -0,0 +1,636 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/module.h>
+#include <linux/rbtree.h>
+#include <linux/spinlock.h>
+#include <linux/highmem.h>
+#include <linux/workqueue.h>
+
+#include "pvbus.h"
+#include "pvbus_host.h"
+#include "kvm.h"
+
+struct pvbus_map {
+ int (*compare)(const void *left, const void *right);
+ const void* (*getkey)(struct rb_node *node);
+
+ struct mutex lock;
+ struct rb_root root;
+ size_t count;
+};
+
+struct _pv_devtype {
+ struct kvm_pv_devtype *item;
+ struct rb_node node;
+};
+
+struct _pv_device {
+ struct kvm_pv_device *item;
+ struct rb_node node;
+ struct _pv_devtype *parent;
+ int synced;
+};
+
+static struct pvbus_map pvbus_typemap;
+
+struct kvm_pvbus_eventq {
+ struct mutex lock;
+ struct ioq *ioq;
+
+};
+
+struct kvm_pvbus {
+ struct mutex lock;
+ struct kvm *kvm;
+ struct pvbus_map devmap;
+ struct kvm_pvbus_eventq eventq;
+};
+
+/*
+ * ------------------
+ * generic rb map management
+ * ------------------
+ */
+
+static void pvbus_map_init(struct pvbus_map *map)
+{
+ mutex_init(&map->lock);
+ map->root = RB_ROOT;
+}
+
+static int pvbus_map_register(struct pvbus_map *map, struct rb_node *node)
+{
+ int ret = 0;
+ struct rb_root *root;
+ struct rb_node **new, *parent = NULL;
+
+ mutex_lock(&map->lock);
+
+ root = &map->root;
+ new = &(root->rb_node);
+
+ /* Figure out where to put new node */
+ while (*new) {
+ int result = map->compare(map->getkey(node),
+ map->getkey(*new));
+
+ parent = *new;
+
+ if (result < 0)
+ new = &((*new)->rb_left);
+ else if (result > 0)
+ new = &((*new)->rb_right);
+ else {
+ ret = -EEXIST;
+ break;
+ }
+ }
+
+ if (!ret) {
+ /* Add new node and rebalance tree. */
+ rb_link_node(node, parent, new);
+ rb_insert_color(node, root);
+ map->count++;
+ }
+
+ mutex_unlock(&map->lock);
+
+ return ret;
+}
+
+static struct rb_node* pvbus_map_find(struct pvbus_map *map, const void *key)
+{
+ struct rb_node *node;
+
+ mutex_lock(&map->lock);
+
+ node = map->root.rb_node;
+
+ while (node) {
+ int result = map->compare(key, map->getkey(node));
+
+ if (result < 0)
+ node = node->rb_left;
+ else if (result > 0)
+ node = node->rb_right;
+ else {
+ break;
+ }
+ }
+
+ mutex_unlock(&map->lock);
+
+ return node;
+}
+
+static void pvbus_map_erase(struct pvbus_map *map, struct rb_node *node)
+{
+ mutex_lock(&map->lock);
+ rb_erase(node, &map->root);
+ map->count--;
+ mutex_unlock(&map->lock);
+}
+
+/*
+ * ------------------
+ * pv_devtype rb map
+ * ------------------
+ */
+static int pv_devtype_map_compare(const void *left, const void *right)
+{
+ return strcmp((char*)left, (char*)right);
+}
+
+static const void* pv_devtype_map_getkey(struct rb_node *node)
+{
+ struct _pv_devtype *dt;
+
+ dt = container_of(node, struct _pv_devtype, node);
+
+ return dt->item->name;
+}
+
+static void pv_devtype_map_init(struct pvbus_map *map)
+{
+ pvbus_map_init(map);
+
+ map->compare = pv_devtype_map_compare;
+ map->getkey = pv_devtype_map_getkey;
+}
+
+static struct _pv_devtype* devtype_map_find(struct pvbus_map *map,
+ const void *key)
+{
+ struct rb_node *node = pvbus_map_find(map, key);
+ if (!node)
+ return NULL;
+
+ return container_of(node, struct _pv_devtype, node);
+}
+
+/*
+ * ------------------
+ * pv_device rb map
+ * ------------------
+ */
+static int pv_device_map_compare(const void *left, const void *right)
+{
+ u64 lid = *(const u64*)left;
+ u64 rid = *(const u64*)right;
+
+ if (lid < rid)
+ return -1;
+ return (lid > rid) ? 1 : 0;
+
+static const void* pv_device_map_getkey(struct rb_node *node)
+{
+ struct _pv_device *dev;
+
+ dev = container_of(node, struct _pv_device, node);
+
+ return &dev->item->id;
+}
+
+static void pv_device_map_init(struct pvbus_map *map)
+{
+ pvbus_map_init(map);
+
+ map->compare = pv_device_map_compare;
+ map->getkey = pv_device_map_getkey;
+}
+
+static struct _pv_device* device_map_find(struct pvbus_map *map,
+ const void *key)
+{
+ struct rb_node *node = pvbus_map_find(map, key);
+ if (!node)
+ return NULL;
+
+ return container_of(node, struct _pv_device, node);
+}
+
+/*
+ * ------------------
+ * event-inject code
+ * ------------------
+ */
+static void kvm_pvbus_inject_event(struct kvm_pvbus *pvbus, u32 eventid,
+ void *data, size_t len)
+{
+ DECLARE_WAITQUEUE(wait, current);
+ struct kvm_pvbus_eventq *eventq = &pvbus->eventq;
+ struct ioq_iterator iter;
+ struct pvbus_event *entry;
+ int ret;
+
+ add_wait_queue(&eventq->ioq->wq, &wait);
+
+ mutex_lock(&eventq->lock);
+
+ /* We want to iterate on the head of the in-use index */
+ ret = ioq_iter_init(eventq->ioq, &iter,
+ ioq_idxtype_inuse, IOQ_ITER_AUTOUPDATE);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ for (;;) {
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ if (iter.desc->sown)
+ break;
+ schedule();
+ }
+
+ set_current_state(TASK_RUNNING);
+
+ entry = (struct pvbus_event*)gpa_to_hva(pvbus->kvm, iter.desc->ptr);
+
+ entry->eventid = eventid;
+ memcpy(&entry->data, data, len);
+
+ mb();
+ iter.desc->sown = 0;
+ mb();
+
+ /*
+ * This will push the index AND signal the guest since AUTOUPDATE is
+ * enabled
+ */
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+
+ /* FIXME: Unmap the entry */
+
+ mutex_unlock(&eventq->lock);
+
+ remove_wait_queue(&eventq->ioq->wq, &wait);
+}
+
+static void kvm_pvbus_inject_add(struct kvm_pvbus *pvbus,
+ const char *name, u64 id)
+{
+ struct pvbus_add_event data = {
+ .id = id,
+ };
+
+ strncpy(data.name, name, PVBUS_MAX_NAME);
+
+ kvm_pvbus_inject_event(pvbus, KVM_PVBUS_EVENT_ADD,
+ &data, sizeof(data));
+}
+
+/*
+ * ------------------
+ * add-event code
+ * ------------------
+ */
+
+struct deferred_add {
+ struct kvm_pvbus *pvbus;
+ struct work_struct work;
+ size_t count;
+ struct pvbus_add_event data[1];
+};
+
+static void kvm_pvbus_deferred_resync(struct work_struct *work)
+{
+ struct deferred_add *event = container_of(work,
+ struct deferred_add,
+ work);
+ int i;
+
+
+ for (i = 0; i < event->count; i++) {
+ struct pvbus_add_event *entry = &event->data[i];
+
+ kvm_pvbus_inject_add(event->pvbus, entry->name, entry->id);
+ }
+
+ kfree(event);
+}
+
+#define for_each_rbnode(node, root) \
+ for (node = rb_first(root); node != NULL; node = rb_next(node))
+
+/*
+ * This function builds a list of all currently registered devices and
+ * sends it to a work-queue to be placed on the ioq. We do this as a
+ * two-step operation because work-queues can queue arbitrarily deep
+ * (given enough memory), whereas an IOQ can only queue as deep as the
+ * guest's allocation, after which we must sleep. Since we cannot sleep
+ * during registration, we have no real choice but to defer the work here.
+ */
+static int kvm_pvbus_resync(struct kvm_pvbus *pvbus)
+{
+ struct pvbus_map *map = &pvbus->devmap;
+ struct deferred_add *event;
+ struct rb_node *node;
+ size_t len;
+ int i = 0;
+ int ret = 0;
+
+ mutex_lock(&map->lock);
+
+ if (!map->count)
+ /* There are no items currently registered, so just exit */
+ goto out;
+
+ /*
+ * First allocate a structure large enough to hold our map->count
+ * number of entries that are pending
+ */
+
+ /* we subtract 1 because of item already in struct */
+ len = sizeof(struct pvbus_add_event) * (map->count - 1);
+ event = kzalloc(sizeof(*event) + len, GFP_KERNEL);
+ if (!event) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ event->pvbus = pvbus;
+ event->count = map->count;
+ INIT_WORK(&event->work, kvm_pvbus_deferred_resync);
+
+ /*
+ * Then cycle through the map and load each node discovered into
+ * the event
+ */
+ for_each_rbnode(node, &map->root) {
+ struct pvbus_add_event *entry = &event->data[i++];
+ struct _pv_device *dev = container_of(node,
+ struct _pv_device,
+ node);
+
+ strncpy(entry->name, dev->parent->item->name, PVBUS_MAX_NAME);
+ entry->id = dev->item->id;
+ }
+
+ /* Finally, fire off the work */
+ schedule_work(&event->work);
+
+ out:
+ mutex_unlock(&map->lock);
+
+ return ret;
+}
+
+
+/*
+ * ------------------
+ * hypercall implementation
+ * ------------------
+ */
+
+/*
+ * This function is invoked when the guest wants to start getting hotplug
+ * events from us to publish on the pvbus
+ */
+static int kvm_pvbus_register(struct kvm_pvbus *pvbus, ioq_id_t id)
+{
+ struct ioq_mgr *ioqmgr = pvbus->kvm->ioqmgr;
+ int ret = 0;
+
+ mutex_lock(&pvbus->lock);
+
+ /*
+ * Trying to register while someone else is already registered
+ * is just plain illegal
+ */
+ if (pvbus->eventq.ioq) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Open the IOQ channel back to the guest so we can deliver hotplug
+ * events as devices are registered
+ */
+ ret = ioqmgr->connect(ioqmgr, id, &pvbus->eventq.ioq, 0);
+ if (ret < 0)
+ goto out;
+
+ /*
+ * Enable interrupts on the queue
+ */
+ ioq_start(pvbus->eventq.ioq, 0);
+
+ /*
+ * Now we need to backfill the guest by sending any of our currently
+ * registered devices up as hotplug events as if they just happened
+ */
+ ret = kvm_pvbus_resync(pvbus);
+
+ out:
+ mutex_unlock(&pvbus->lock);
+
+ return ret;
+}
+
+/*
+ * This function is invoked whenever a driver calls pvbus_device->call()
+ */
+static int kvm_pvbus_call(struct kvm_pvbus *pvbus,
+ u64 instance, u32 func, void *data, size_t len)
+{
+ struct kvm_pv_device *dev;
+ struct _pv_device *_dev = device_map_find(&pvbus->devmap,
+ &instance);
+ if (!_dev)
+ return -ENOENT;
+
+ dev = _dev->item;
+
+ return dev->call(dev, func, data, len);
+}
+
+/*
+ * Our hypercall format will always follow with the call-id in arg[0],
+ * a pointer to the arguments in arg[1], and the argument length in arg[2]
+ */
+static unsigned long kvm_pvbus_hc(struct kvm_vcpu *vcpu,
+ unsigned long args[])
+{
+ struct kvm_pvbus *pvbus = vcpu->kvm->pvbus;
+ void *vdata = (void*)gpa_to_hva(vcpu->kvm, args[1]);
+ int ret = -EINVAL;
+
+ /* FIXME: We need to validate vdata so that malicious guests cannot
+ cause the host to segfault */
+
+ switch (args[0]) {
+ case KVM_PVBUS_OP_REGISTER: {
+ struct pvbus_register_params *params;
+
+ params = (struct pvbus_register_params*)vdata;
+
+ ret = kvm_pvbus_register(pvbus, params->qid);
+ break;
+ }
+ case KVM_PVBUS_OP_CALL: {
+ struct pvbus_call_params *params;
+ void *data;
+
+ params = (struct pvbus_call_params*)vdata;
+ data = gpa_to_hva(vcpu->kvm, params->data);
+
+ /*
+ * FIXME: Again, we should validate that
+ *
+ * params->data to params->data+len
+ *
+ * is a valid region owned by the guest
+ */
+
+ ret = kvm_pvbus_call(pvbus, params->inst, params->func,
+ data, params->len);
+
+ /* FIXME: Do we need to unmap the data */
+ break;
+ }
+ }
+
+ /* FIXME: Do we need to kunmap the vdata? */
+
+ return ret;
+}
+
+int kvm_pvbus_registertype(struct kvm_pv_devtype *devtype)
+{
+ struct _pv_devtype *_devtype = kzalloc(sizeof(*_devtype), GFP_KERNEL);
+ if (!_devtype)
+ return -ENOMEM;
+
+ _devtype->item = devtype;
+
+ return pvbus_map_register(&pvbus_typemap, &_devtype->node);
+}
+EXPORT_SYMBOL_GPL(kvm_pvbus_registertype);
+
+int kvm_pvbus_unregistertype(const char *name)
+{
+ /* FIXME: */
+ return -ENOSYS;
+}
+EXPORT_SYMBOL_GPL(kvm_pvbus_unregistertype);
+
+/*
+ * This function is invoked by an administrative operation which wants to
+ * instantiate a registered type into a device associated with a specific VM.
+ *
+ * For instance, QEMU may one day issue an ioctl that says
+ *
+ * createinstance("ioqnet", "mac = 00:30:cc:00:20:10");
+ *
+ * This would cause the system to search for any registered types called
+ * "ioqnet". If found, it would instantiate the device with a config string
+ * set to give it a specific MAC. Obviously the name and config string are
+ * specific to a particular driver type.
+ */
+int kvm_pvbus_createinstance(struct kvm *kvm, const char *name,
+ const char *cfg, u64 *id)
+{
+ struct kvm_pvbus *pvbus = kvm->pvbus;
+ struct _pv_devtype *_devtype;
+ struct kvm_pv_devtype *devtype;
+ struct _pv_device *_dev = NULL;
+ struct kvm_pv_device *dev;
+ u64 _id;
+ int ret = 0;
+
+ mutex_lock(&pvbus->lock);
+
+ _devtype = devtype_map_find(&pvbus_typemap, name);
+ if (!_devtype) {
+ ret = -ENOENT;
+ goto out_err;
+ }
+
+ devtype = _devtype->item;
+
+ _dev = kzalloc(sizeof(*_dev), GFP_KERNEL);
+ if (!_dev) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ /* We just use the pointer address as a unique id */
+ _id = (u64)(unsigned long)_dev;
+
+ ret = devtype->create(kvm, devtype, _id, cfg, &dev);
+ if (ret < 0)
+ goto out_err;
+
+ _dev->item = dev;
+ _dev->parent = _devtype;
+
+ pvbus_map_register(&pvbus->devmap, &_dev->node);
+
+ mutex_unlock(&pvbus->lock);
+
+ *id = _id;
+
+ kvm_pvbus_inject_add(pvbus, name, _id);
+
+ return 0;
+
+ out_err:
+ mutex_unlock(&pvbus->lock);
+
+ kfree(_dev);
+
+ return ret;
+}
+
+int kvm_pvbus_init(struct kvm *kvm)
+{
+ struct kvm_pvbus *pvbus = kzalloc(sizeof(*pvbus), GFP_KERNEL);
+ if (!pvbus)
+ return -ENOMEM;
+
+ mutex_init(&pvbus->lock);
+ pvbus->kvm = kvm;
+ pv_device_map_init(&pvbus->devmap);
+ mutex_init(&pvbus->eventq.lock);
+
+ kvm->pvbus = pvbus;
+
+ return 0;
+}
+
+__init int kvm_pvbus_module_init(void)
+{
+ struct kvm_hypercall hc;
+
+ pv_devtype_map_init(&pvbus_typemap);
+
+ /* Register our hypercall */
+ hc.hypercall = kvm_pvbus_hc;
+ hc.idx = __NR_hypercall_pvbus;
+
+ kvm_register_hypercall(THIS_MODULE, &hc);
+
+ return 0;
+}
+
+__exit void kvm_pvbus_module_exit(void)
+{
+ /* FIXME: Unregister our hypercall */
+}
+
+
+
+
diff --git a/drivers/kvm/pvbus_host.h b/drivers/kvm/pvbus_host.h
new file mode 100644
index 0000000..a3cc7a0
--- /dev/null
+++ b/drivers/kvm/pvbus_host.h
@@ -0,0 +1,66 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef _KVM_PVBUS_HOST_H
+#define _KVM_PVBUS_HOST_H
+
+#include <linux/rbtree.h>
+
+#ifdef CONFIG_KVM_PVBUS_HOST
+
+struct kvm;
+
+struct kvm_pvbus;
+
+struct kvm_pv_device {
+ int (*call)(struct kvm_pv_device *t, u32 func, void *data, size_t len);
+ void (*destroy)(struct kvm_pv_device *t);
+
+ u64 id;
+ u32 ver;
+
+};
+
+struct kvm_pv_devtype {
+ int (*create)(struct kvm *kvm,
+ struct kvm_pv_devtype *t, u64 id, const char *cfg,
+ struct kvm_pv_device **dev);
+ void (*destroy)(struct kvm_pv_devtype *t);
+
+ const char *name;
+};
+
+int kvm_pvbus_init(struct kvm *kvm);
+int kvm_pvbus_module_init(void);
+void kvm_pvbus_module_exit(void);
+int kvm_pvbus_registertype(struct kvm_pv_devtype *devtype);
+int kvm_pvbus_unregistertype(const char *name);
+int kvm_pvbus_createinstance(struct kvm *kvm, const char *name,
+ const char *config, u64 *id);
+
+#else /* CONFIG_KVM_PVBUS_HOST */
+
+#define kvm_pvbus_init(kvm) (0)
+#define kvm_pvbus_module_init() (0)
+#define kvm_pvbus_module_exit() do {} while (0)
+
+#endif /* CONFIG_KVM_PVBUS_HOST */
+
+#endif /* _KVM_PVBUS_HOST_H */
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
* [PATCH 10/10] KVM: Add an IOQNET backend driver
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (8 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 09/10] KVM: Add PVBUS support to the KVM host Gregory Haskins
@ 2007-08-16 23:14 ` Gregory Haskins
2007-08-17 1:25 ` [PATCH 00/10] PV-IO v3 Rusty Russell
10 siblings, 0 replies; 41+ messages in thread
From: Gregory Haskins @ 2007-08-16 23:14 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Signed-off-by: Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
---
drivers/kvm/Kconfig | 5
drivers/kvm/Makefile | 2
drivers/kvm/ioqnet_host.c | 566 +++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 573 insertions(+), 0 deletions(-)
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index 9f2ef22..19551a2 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -62,6 +62,11 @@ config KVM_PVBUS_HOST
of the hypervisor itself. You only need this option if you plan to
run PVBUS based PV guests in KVM.
+config KVM_IOQNET
+ tristate "IOQNET host support"
+ depends on KVM
+ select KVM_PVBUS_HOST
+
config KVM_NET_HOST
tristate "Para virtual network host device"
depends on KVM
diff --git a/drivers/kvm/Makefile b/drivers/kvm/Makefile
index 8926fa9..66e5272 100644
--- a/drivers/kvm/Makefile
+++ b/drivers/kvm/Makefile
@@ -22,3 +22,5 @@ kvm-net-host-objs = kvm_net_host.o
obj-$(CONFIG_KVM_NET_HOST) += kvm_net_host.o
kvm-pvbus-objs := ioq_guest.o pvbus_guest.o
obj-$(CONFIG_KVM_PVBUS_GUEST) += kvm-pvbus.o
+kvm-ioqnet-objs := ioqnet_host.o
+obj-$(CONFIG_KVM_IOQNET) += kvm-ioqnet.o
\ No newline at end of file
diff --git a/drivers/kvm/ioqnet_host.c b/drivers/kvm/ioqnet_host.c
new file mode 100644
index 0000000..0f4d055
--- /dev/null
+++ b/drivers/kvm/ioqnet_host.c
@@ -0,0 +1,566 @@
+/*
+ * Copyright 2007 Novell. All Rights Reserved.
+ *
+ * ioqnet - A paravirtualized network device based on the IOQ interface.
+ *
+ * This module represents the backend driver for an IOQNET driver on the KVM
+ * platform.
+ *
+ * Author:
+ * Gregory Haskins <ghaskins-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
+ *
+ * Derived in part from the SNULL example from the book "Linux Device
+ * Drivers" by Alessandro Rubini and Jonathan Corbet, published
+ * by O'Reilly & Associates.
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+
+#include <linux/sched.h>
+#include <linux/kernel.h> /* printk() */
+#include <linux/slab.h> /* kmalloc() */
+#include <linux/errno.h> /* error codes */
+#include <linux/types.h> /* size_t */
+#include <linux/interrupt.h> /* mark_bh */
+
+#include <linux/in.h>
+#include <linux/netdevice.h> /* struct device, and other headers */
+#include <linux/etherdevice.h> /* eth_type_trans */
+#include <linux/ip.h> /* struct iphdr */
+#include <linux/tcp.h> /* struct tcphdr */
+#include <linux/skbuff.h>
+#include <linux/ioq.h>
+#include <linux/pvbus.h>
+
+#include <linux/in6.h>
+#include <asm/checksum.h>
+#include <linux/ioqnet.h>
+#include <linux/highmem.h>
+
+#include "pvbus_host.h"
+#include "kvm.h"
+
+MODULE_AUTHOR("Gregory Haskins");
+MODULE_LICENSE("GPL");
+
+#define IOQNET_NAME "ioqnet"
+
+/*
+ * FIXME: Any "BUG_ON" code that can be triggered by a malicious guest must
+ * be turned into an inject_gp()
+ */
+
+struct ioqnet_queue {
+ struct ioq *queue;
+ struct ioq_notifier notifier;
+};
+
+struct ioqnet_priv {
+ spinlock_t lock;
+ struct kvm *kvm;
+ struct kvm_pv_device pvdev;
+ struct net_device *netdev;
+ struct net_device_stats stats;
+ struct ioqnet_queue rxq;
+ struct ioqnet_queue txq;
+ struct tasklet_struct txtask;
+ int connected;
+ int opened;
+};
+
+#undef PDEBUG /* undef it, just in case */
+#ifdef IOQNET_DEBUG
+# define PDEBUG(fmt, args...) printk( KERN_DEBUG "ioqnet: " fmt, ## args)
+#else
+# define PDEBUG(fmt, args...) /* not debugging: nothing */
+#endif
+
+/*
+ * Enable and disable receive interrupts.
+ */
+static void ioqnet_rx_ints(struct net_device *dev, int enable)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ struct ioq *ioq = priv->rxq.queue;
+
+ if (priv->connected) {
+ if (enable)
+ ioq_start(ioq, 0);
+ else
+ ioq_stop(ioq, 0);
+ }
+}
+
+/*
+ * Open and close
+ */
+
+int ioqnet_open(struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+
+ priv->opened = 1;
+ netif_start_queue(dev);
+
+ return 0;
+}
+
+int ioqnet_release(struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+
+ priv->opened = 0;
+ netif_stop_queue(dev);
+
+ return 0;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+int ioqnet_config(struct net_device *dev, struct ifmap *map)
+{
+ if (dev->flags & IFF_UP) /* can't act on a running interface */
+ return -EBUSY;
+
+ /* Don't allow changing the I/O address */
+ if (map->base_addr != dev->base_addr) {
+ printk(KERN_WARNING "ioqnet: Can't change I/O address\n");
+ return -EOPNOTSUPP;
+ }
+
+ /* ignore other fields */
+ return 0;
+}
+
+/*
+ * The poll implementation.
+ */
+static int ioqnet_poll(struct net_device *dev, int *budget)
+{
+ int npackets = 0, quota = min(dev->quota, *budget);
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ struct ioq_iterator iter;
+ unsigned long flags;
+ int ret;
+
+ if (!priv->connected)
+ return 0;
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ /* We want to iterate on the tail of the in-use index */
+ ret = ioq_iter_init(priv->rxq.queue, &iter, ioq_idxtype_inuse, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_tail, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * We stop if we have met the quota or there are no more packets.
+ * The EOM is indicated by finding a packet that is still owned by
+ * the north side
+ */
+ while ((npackets < quota) && iter.desc->sown) {
+ struct ioq_ring_desc *desc = iter.desc;
+ struct ioqnet_tx_ptr *ptr = gpa_to_hva(priv->kvm, desc->ptr);
+ struct sk_buff *skb;
+ int i;
+ size_t len = 0;
+
+ /* First figure out how much of an skb we need */
+ for (i = 0; i < desc->alen; ++i) {
+ len += ptr[i].len;
+ }
+
+ skb = dev_alloc_skb(len + 2);
+ if (!skb) {
+ /* FIXME: This leaks... */
+ printk(KERN_ERR "FATAL: Out of memory on IOQNET\n");
+ netif_stop_queue(dev);
+ spin_unlock_irqrestore(&priv->lock, flags);
+ return -ENOMEM;
+ }
+
+ skb_reserve(skb, 2);
+
+ /* Then copy the data out to our fresh SKB */
+ for (i = 0; i < desc->alen; ++i) {
+ struct ioqnet_tx_ptr *p = &ptr[i];
+ void *d = gpa_to_hva(priv->kvm,
+ p->data);
+
+ memcpy(skb_push(skb, p->len), d, p->len);
+ kunmap(d);
+ }
+
+ /* Maintain stats */
+ npackets++;
+ priv->stats.rx_packets++;
+ priv->stats.rx_bytes += len;
+
+ /* Pass the buffer up to the stack */
+ skb->dev = dev;
+ skb->protocol = eth_type_trans(skb, dev);
+ netif_receive_skb(skb);
+
+ /*
+ * Ensure that we have finished reading before marking the
+ * state of the queue
+ */
+ mb();
+ desc->sown = 0;
+ mb();
+
+ /* Advance the in-use tail */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+
+ /* Toggle the lock */
+ spin_unlock_irqrestore(&priv->lock, flags);
+ spin_lock_irqsave(&priv->lock, flags);
+ }
+
+ /*
+ * If we processed all packets, we're done; tell the kernel and
+ * reenable ints
+ */
+ *budget -= npackets;
+ dev->quota -= npackets;
+ if (ioq_empty(priv->rxq.queue, ioq_idxtype_inuse)) {
+ /* FIXME: there is a race with enabling interrupts */
+ netif_rx_complete(dev);
+ ioqnet_rx_ints(dev, 1);
+ ret = 0;
+ } else
+ /* We couldn't process everything. */
+ ret = 1;
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+
+ /* And let the north side know that we changed the rx-queue */
+ ioq_signal(priv->rxq.queue, 0);
+
+ return ret;
+}
+
+/*
+ * Transmit a packet (called by the kernel)
+ */
+int ioqnet_tx(struct sk_buff *skb, struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ struct ioq_iterator iter;
+ int ret;
+ unsigned long flags;
+ char *data;
+
+ if (skb->len < ETH_ZLEN)
+ return -EINVAL;
+
+ if (!priv->connected) {
+ dev_kfree_skb(skb);
+ return 0;
+ }
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ if (ioq_full(priv->txq.queue, ioq_idxtype_valid)) {
+ /*
+ * We must flow-control the kernel by disabling the queue
+ */
+ spin_unlock_irqrestore(&priv->lock, flags);
+ netif_stop_queue(dev);
+ return 0;
+ }
+
+ /*
+ * We want to iterate on the head of the "inuse" index
+ */
+ ret = ioq_iter_init(priv->txq.queue, &iter, ioq_idxtype_inuse, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ if (skb->len > iter.desc->len) {
+ spin_unlock_irqrestore(&priv->lock, flags);
+ return -EINVAL;
+ }
+
+ dev->trans_start = jiffies; /* save the timestamp */
+
+ /* Copy the data to the north-side buffer */
+ data = (char*)gpa_to_hva(priv->kvm, iter.desc->ptr);
+ memcpy(data, skb->data, skb->len);
+ kunmap(data);
+
+ /* Give ownership back to the north */
+ mb();
+ iter.desc->sown = 0;
+ mb();
+
+ /* Advance the index */
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * This will signal the north side to consume the packet
+ */
+ ioq_signal(priv->txq.queue, 0);
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+
+ dev_kfree_skb(skb);
+
+ return 0;
+}
+
+void ioqnet_tx_intr(unsigned long data)
+{
+ struct ioqnet_priv *priv = (struct ioqnet_priv*)data;
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ /*
+ * If we were previously stopped due to flow control, restart the
+ * processing
+ */
+ if (netif_queue_stopped(priv->netdev)
+ && !ioq_full(priv->txq.queue, ioq_idxtype_inuse)) {
+
+ netif_wake_queue(priv->netdev);
+ }
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+}
+
+/*
+ * Ioctl commands
+ */
+int ioqnet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+ PDEBUG("ioctl\n");
+ return 0;
+}
+
+/*
+ * Return statistics to the caller
+ */
+struct net_device_stats *ioqnet_stats(struct net_device *dev)
+{
+ struct ioqnet_priv *priv = netdev_priv(dev);
+ return &priv->stats;
+}
+
+static void ioq_rx_notify(struct ioq_notifier *notifier)
+{
+ struct ioqnet_priv *priv;
+ struct net_device *dev;
+
+ priv = container_of(notifier, struct ioqnet_priv, rxq.notifier);
+ dev = priv->netdev;
+
+ ioqnet_rx_ints(dev, 0); /* Disable further interrupts */
+ netif_rx_schedule(dev);
+}
+
+static void ioq_tx_notify(struct ioq_notifier *notifier)
+{
+ struct ioqnet_priv *priv;
+
+ priv = container_of(notifier, struct ioqnet_priv, txq.notifier);
+
+ tasklet_schedule(&priv->txtask);
+}
+
+/*
+ * The init function (sometimes called probe).
+ * It is invoked by register_netdev()
+ */
+void ioqnet_init(struct net_device *dev)
+{
+ ether_setup(dev); /* assign some of the fields */
+
+ dev->open = ioqnet_open;
+ dev->stop = ioqnet_release;
+ dev->set_config = ioqnet_config;
+ dev->hard_start_xmit = ioqnet_tx;
+ dev->do_ioctl = ioqnet_ioctl;
+ dev->get_stats = ioqnet_stats;
+ dev->poll = ioqnet_poll;
+ dev->weight = 2;
+ dev->hard_header_cache = NULL; /* Disable caching */
+
+ /* We go "link down" until the guest connects to us */
+ netif_carrier_off(dev);
+
+}
+
+
+/* -------------------------------------------------------------- */
+
+static inline struct ioqnet_priv* to_priv(struct kvm_pv_device *t)
+{
+ return container_of(t, struct ioqnet_priv, pvdev);
+}
+
+
+static int ioqnet_connect(struct ioqnet_priv *priv,
+ ioq_id_t id,
+ struct ioqnet_queue *q,
+ void (*func)(struct ioq_notifier*))
+{
+ int ret;
+ struct ioq_mgr *ioqmgr = priv->kvm->ioqmgr;
+
+ ret = ioqmgr->connect(ioqmgr, id, &q->queue, 0);
+ if (ret < 0)
+ return ret;
+
+ q->notifier.signal = func;
+
+ return 0;
+}
+
+static int ioqnet_pvbus_connect(struct ioqnet_priv *priv,
+ void *data, size_t len)
+{
+ struct ioqnet_connect *cnct = (struct ioqnet_connect*)data;
+ int ret;
+
+ /* We connect the north's rxq to our txq */
+ ret = ioqnet_connect(priv, cnct->rxq, &priv->txq, ioq_tx_notify);
+ if (ret < 0)
+ return ret;
+
+ /* And vice-versa */
+ ret = ioqnet_connect(priv, cnct->txq, &priv->rxq, ioq_rx_notify);
+ if (ret < 0)
+ return ret;
+
+ /*
+ * So now that the guest has connected we can send a "link up" event
+ * to the kernel.
+ */
+ netif_carrier_on(priv->netdev);
+
+ priv->connected = 1;
+
+ return 0;
+}
+
+static int ioqnet_pvbus_query_mac(struct ioqnet_priv *priv,
+ void *data, size_t len)
+{
+ if (len != ETH_ALEN)
+ return -EINVAL;
+
+ memcpy(data, priv->netdev->dev_addr, ETH_ALEN);
+
+ return 0;
+}
+
+/*
+ * This function is invoked whenever a guest calls pvbus_ops->call() against
+ * our instance ID
+ */
+static int ioqnet_pvbus_device_call(struct kvm_pv_device *t, u32 func,
+ void *data, size_t len)
+{
+ struct ioqnet_priv *priv = to_priv(t);
+ int ret = -EINVAL;
+
+ switch (func) {
+ case IOQNET_CONNECT:
+ ret = ioqnet_pvbus_connect(priv, data, len);
+ break;
+ case IOQNET_QUERY_MAC:
+ ret = ioqnet_pvbus_query_mac(priv, data, len);
+ break;
+ }
+
+ return ret;
+}
+
+static void ioqnet_pvbus_device_destroy(struct kvm_pv_device *t)
+{
+
+}
+
+/*
+ * This function is invoked whenever someone instantiates an IOQNET object
+ */
+static int ioqnet_pvbus_devtype_create(struct kvm *kvm,
+ struct kvm_pv_devtype *t, u64 id,
+ const char *cfg,
+ struct kvm_pv_device **pvdev)
+{
+ struct net_device *dev;
+ struct ioqnet_priv *priv;
+ int ret;
+
+ dev = alloc_netdev(sizeof(struct ioqnet_priv), "ioq%d",
+ ioqnet_init);
+ if (!dev)
+ return -ENOMEM;
+
+ priv = netdev_priv(dev);
+
+ memset(priv, 0, sizeof(*priv));
+
+ priv->pvdev.call = ioqnet_pvbus_device_call;
+ priv->pvdev.destroy = ioqnet_pvbus_device_destroy;
+ priv->pvdev.id = id;
+ priv->pvdev.ver = IOQNET_VERSION;
+
+ spin_lock_init(&priv->lock);
+ priv->kvm = kvm;
+ priv->netdev = dev;
+ tasklet_init(&priv->txtask, ioqnet_tx_intr, (unsigned long)priv);
+
+ ret = register_netdev(dev);
+ if (ret < 0) {
+ printk(KERN_ERR "ioqnet: error %i registering device \"%s\"\n",
+ ret, dev->name);
+ free_netdev(dev);
+ return ret;
+ }
+
+ *pvdev = &priv->pvdev;
+
+ return 0;
+}
+
+static void ioqnet_pvbus_devtype_destroy(struct kvm_pv_devtype *t)
+{
+
+}
+
+static struct kvm_pv_devtype ioqnet_devtype = {
+ .create = ioqnet_pvbus_devtype_create,
+ .destroy = ioqnet_pvbus_devtype_destroy,
+ .name = IOQNET_NAME,
+};
+
+static int __init ioqnet_init_module(void)
+{
+ return kvm_pvbus_registertype(&ioqnet_devtype);
+}
+
+static void __exit ioqnet_cleanup_module(void)
+{
+ kvm_pvbus_unregistertype(IOQNET_NAME);
+}
+
+module_init(ioqnet_init_module);
+module_exit(ioqnet_cleanup_module);
* Re: [PATCH 00/10] PV-IO v3
[not found] ` <20070816231357.8044.55943.stgit-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
` (9 preceding siblings ...)
2007-08-16 23:14 ` [PATCH 10/10] KVM: Add an IOQNET backend driver Gregory Haskins
@ 2007-08-17 1:25 ` Rusty Russell
[not found] ` <1187313953.6449.70.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
10 siblings, 1 reply; 41+ messages in thread
From: Rusty Russell @ 2007-08-17 1:25 UTC (permalink / raw)
To: Gregory Haskins
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, virtualization
On Thu, 2007-08-16 at 19:13 -0400, Gregory Haskins wrote:
> Here is the v3 release of the patch series for a generalized PV-IO
> infrastructure. It has v2 plus the following changes:
Hi Gregory,
This is a lot of code. I'm having trouble taking it all in, TBH. It
might help me if we could to go back to the basic transport
implementation questions.
Transport has several parts. What the hypervisor knows about (usually
shared memory and some interrupt mechanism and possibly "DMA") and what
is convention between users (eg. ringbuffer layouts). Whether it's 1:1
or n-way (if 1:1, is it symmetrical?). Whether it has to be host <->
guest, or can be inter-guest. Whether it requires trust between the
sides.
My personal thoughts are that we should be aiming for 1:1 untrusting. I
like N-way, but it adds complexity. And not having inter-guest is just
poor form (and putting it in later is impossible, as we'll see).
It seems that a shared-memory "ring-buffer of descriptors" is the
simplest implementation. But there are two problems with a simple
descriptor ring:
1) A ring buffer doesn't work well for things which process
out-of-order, such as a block device.
2) We either need huge descriptors or some chaining mechanism to
handle scatter-gather.
So we end up with an array of descriptors with next pointers, and two
ring buffers which refer to those descriptors: one for what descriptors
are pending, and one for what descriptors have been used (by the other
end).
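To make the shape concrete, here is a minimal sketch of that layout. The
field names and sizes are my own illustration, not taken from the patch:
each descriptor describes one scatter-gather element and chains to the
next by index, and a chain can be walked from its head entry:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DESCS   16
#define DESC_F_NEXT 1   /* another descriptor follows in this chain */

/* Hypothetical descriptor-table entry: one scatter-gather element,
 * chained to the next via an index rather than a pointer. */
struct desc {
	uint64_t addr;    /* guest-physical address of the buffer */
	uint32_t len;     /* buffer length in bytes */
	uint16_t flags;   /* DESC_F_NEXT, direction bits, ... */
	uint16_t next;    /* index of the next descriptor in the chain */
};

/* Total bytes described by the chain starting at head. */
static uint32_t chain_len(const struct desc *table, uint16_t head)
{
	uint32_t total = 0;
	uint16_t i = head;

	for (;;) {
		total += table[i].len;
		if (!(table[i].flags & DESC_F_NEXT))
			break;
		i = table[i].next;
	}
	return total;
}
```

The pending and used rings then carry only head indices, so a whole
scatter-gather chain is published with a single ring entry.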
This is sufficient for guest<->host, but care must be taken for guest
<-> guest. Let's dig down:
Consider a transport from A -> B. A populates the descriptor entries
corresponding to its sg, then puts the head descriptor entry in the
"pending" ring buffer and sends B an interrupt. B sees the new pending
entry, reads the descriptors, does the operation and reads or writes
into the memory pointed to by the descriptors. It then updates the
"used" ring buffer and sends A an interrupt.
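The handshake above can be sketched as index arithmetic on the two rings.
This is a userspace simulation with invented names (producer_publish,
consumer_poll); real code would add memory barriers and interrupts where
the comments indicate:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8	/* hypothetical; power of two so indices wrap cleanly */

/* One direction of the queue. A writes the avail side, B the used side. */
struct ring {
	uint32_t avail_idx;		/* written by A only */
	uint16_t avail[RING_SIZE];	/* head-descriptor indices A publishes */
	uint32_t used_idx;		/* written by B only */
	uint16_t used[RING_SIZE];	/* head indices B has completed */
};

/* A: publish a head descriptor, then (in real life) interrupt B. */
static void producer_publish(struct ring *r, uint16_t head)
{
	r->avail[r->avail_idx % RING_SIZE] = head;
	/* wmb() here: the entry must be visible before the index bump */
	r->avail_idx++;
}

/* B: consume one pending entry, process it, then mark it used. */
static int consumer_poll(struct ring *r, uint32_t *last_seen, uint16_t *head)
{
	if (*last_seen == r->avail_idx)
		return 0;			/* nothing pending */
	*head = r->avail[*last_seen % RING_SIZE];
	(*last_seen)++;
	/* ...B reads/writes the buffers the descriptor chain points at... */
	r->used[r->used_idx % RING_SIZE] = *head;
	r->used_idx++;			/* then interrupt A */
	return 1;
}
```

Because each side only ever writes its own index, no locks are needed
between A and B for a 1:1 queue.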
Now, if B is untrusted, this is more difficult. It needs to read the
descriptor entries and the "pending" ring buffer, and write to the
"used" ring buffer. We can use page protection to share these if we
arrange things carefully, like so:
struct desc_pages
{
	/* Page of descriptors. */
	struct lguest_desc desc[NUM_DESCS];

	/* Next page: how we tell other side what buffers are available. */
	unsigned int avail_idx;
	unsigned int available[NUM_DESCS];
	char pad[PAGE_SIZE - (NUM_DESCS+1) * sizeof(unsigned int)];

	/* Third page: how other side tells us what's used. */
	unsigned int used_idx;
	struct lguest_used used[NUM_DESCS];
};
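The page-protection trick only works if each section lands exactly on a
page boundary: B gets read access to the first two pages and write access
to the third. That constraint can be checked at build time. The concrete
descriptor and used-entry layouts below are assumptions (the mail doesn't
define lguest_desc or lguest_used); with a 16-byte descriptor, NUM_DESCS
of them exactly fill the first page:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* Hypothetical 16-byte descriptor; the real lguest_desc may differ. */
struct lguest_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

#define NUM_DESCS (PAGE_SIZE / sizeof(struct lguest_desc))	/* 256 */

/* Hypothetical used entry: which chain completed, and how much was written. */
struct lguest_used {
	uint16_t id;
	uint16_t len;
};

struct desc_pages {
	/* Page of descriptors. */
	struct lguest_desc desc[NUM_DESCS];

	/* Next page: how we tell other side what buffers are available. */
	unsigned int avail_idx;
	unsigned int available[NUM_DESCS];
	char pad[PAGE_SIZE - (NUM_DESCS + 1) * sizeof(unsigned int)];

	/* Third page: how other side tells us what's used. */
	unsigned int used_idx;
	struct lguest_used used[NUM_DESCS];
};
```

With these sizes, avail_idx starts at offset PAGE_SIZE and used_idx at
2 * PAGE_SIZE, so each protection domain is a whole page.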
But we still have the problem of an untrusted B having to read/write A's
memory pointed to by A's descriptors. At this point, my preferred
solution so far is as follows (note: I have not implemented this!):
(1) have the hypervisor be aware of the descriptor page format, location
and which guest can access it.
(2) have the descriptors themselves contain a type (read/write) and a
valid bit.
(3) have a "DMA" hypercall to copy to/from someone else's descriptors.
Note that this means we do a copy for the untrusted case which doesn't
exist for the trusted case. In theory the hypervisor could do some
tricky copy-on-write page-sharing for very large well-aligned buffers,
but it remains to be seen if that is actually useful.
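The hypervisor side of such a "DMA" hypercall might look like the sketch
below. Everything here is hypothetical (the call name, flag bits, and
using a plain pointer to stand in for a guest-physical address); the
point is only that the copy happens after the hypervisor validates the
valid bit, the direction, and the bounds A declared:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor flags per points (2) and (3) above. */
#define DESC_F_VALID  (1u << 0)
#define DESC_F_WRITE  (1u << 1)	/* B may write into this buffer */

struct desc {
	void     *addr;		/* stands in for a guest-physical address */
	uint32_t  len;
	uint32_t  flags;
};

/* Hypervisor side of a "DMA" hypercall: B asks to write into one of A's
 * descriptors. The hypervisor, not B, touches A's memory, and only after
 * checking the bits A set. Pure sketch, not from the patch series. */
static long hcall_dma_write(struct desc *d, const void *src, uint32_t len)
{
	if (!(d->flags & DESC_F_VALID))
		return -1;		/* descriptor not armed by A */
	if (!(d->flags & DESC_F_WRITE))
		return -1;		/* A marked this buffer read-only */
	if (len > d->len)
		return -1;		/* bounds check against A's stated size */
	memcpy(d->addr, src, len);	/* real code copies across guests */
	return len;
}
```

The extra copy is the price of distrust: B never maps A's buffers, so a
malicious B can at worst corrupt data A explicitly offered for writing.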
Sorry for the long mail, but I really want to get the mechanism correct.
Cheers,
Rusty.