* [patch/rfc] multiprotocol blkback drivers (32-on-64)
From: Gerd Hoffmann @ 2006-12-18 16:39 UTC
To: Xen devel list; +Cc: Jan Beulich
[-- Attachment #1: Type: text/plain, Size: 1790 bytes --]
Hi,
This is a patch for the block interface, frontend drivers, backend
drivers and tools to support multiple ring protocols. Right now there
are just two: the 32bit and the 64bit one. If needed it can be extended.
Interface changes (io/blkif.h):
* Have both request structs there, with "v1" and "v2" added to the
name. The old name is aliased to the native protocol of the
architecture.
* Add helper functions to convert v1/v2 requests to native.
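The difference between the two request layouts can be sketched outside the kernel. The following is only an illustration in Python, not code from the patch: the field widths (a 16-bit blkif_vdev_t, a 32-bit grant_ref_t, a 64-bit blkif_sector_t) are assumptions, and the helper names encode and decode are hypothetical. It models the v1 struct with 4-byte alignment of 64-bit fields and the v2 struct with 8-byte alignment, then converts either one to a common form.

```python
import struct

# Assumed field widths (not spelled out in the mail): blkif_vdev_t is 16 bits,
# grant_ref_t is 32 bits, blkif_sector_t is 64 bits.
MAX_SEGMENTS = 11  # BLKIF_MAX_SEGMENTS_PER_REQUEST in the patch

SEGMENT = "IBBxx"  # gref, first_sect, last_sect, 2 bytes of tail padding
LAYOUTS = {
    1: struct.Struct("<BBHQQ" + SEGMENT * MAX_SEGMENTS),    # i386: id at offset 4
    2: struct.Struct("<BBH4xQQ" + SEGMENT * MAX_SEGMENTS),  # x86_64: id at offset 8
}


def encode(protocol, operation, handle, req_id, sector, segments):
    """Pack one request the way a frontend speaking `protocol` would."""
    flat = []
    for gref, first_sect, last_sect in segments:
        flat += [gref, first_sect, last_sect]
    flat += [0] * (3 * (MAX_SEGMENTS - len(segments)))  # unused segment slots
    return LAYOUTS[protocol].pack(operation, len(segments), handle,
                                  req_id, sector, *flat)


def decode(protocol, raw):
    """Turn a v1 or v2 ring entry into one layout-independent dict."""
    fields = LAYOUTS[protocol].unpack(raw)
    operation, nr_segments, handle, req_id, sector = fields[:5]
    nr_segments = min(nr_segments, MAX_SEGMENTS)  # never trust the guest's count
    segs = [fields[5 + 3 * i: 8 + 3 * i] for i in range(nr_segments)]
    return {"operation": operation, "handle": handle, "id": req_id,
            "sector_number": sector, "seg": segs}
```

Under these assumptions a v1 request is 108 bytes and a v2 request is 112 bytes, so the same request occupies different bytes in the shared page depending on the frontend's word size, while decode() yields the same result for both.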
Frontend changes:
* Create a new xenstore node "protocol" and write the number of the
protocol the frontend speaks there.
Backend changes:
* Look at the "protocol" number of the frontend and switch ring
handling accordingly. If the protocol node isn't present it assumes
native protocol.
* As the request struct is copied anyway before being processed (for
security reasons) it is converted to native at that point so most
backend code doesn't need to know what the frontend speaks.
* In case of blktap this is completely transparent to userspace, the
kernel/userspace ring is always native no matter what the frontend
speaks.
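The backend's choice of protocol can be sketched as follows. This is a Python illustration only, not the kernel code: a plain dict stands in for the frontend's xenstore directory, and the names pick_protocol, NATIVE_PROTOCOL and KNOWN_PROTOCOLS are hypothetical. The native value of 2 assumes a 64-bit backend.

```python
NATIVE_PROTOCOL = 2       # assumption: the backend runs on x86_64
KNOWN_PROTOCOLS = (1, 2)  # 1 = 32bit layout, 2 = 64bit layout


def pick_protocol(frontend_nodes, native=NATIVE_PROTOCOL):
    """Return the ring protocol to use for one frontend.

    frontend_nodes is a dict standing in for the frontend's xenstore
    directory; values are strings, as they would be in xenstore.
    """
    value = frontend_nodes.get("protocol")
    if value is None:
        # Old frontend that does not advertise anything: assume native.
        return native
    protocol = int(value)
    if protocol not in KNOWN_PROTOCOLS:
        raise ValueError("unknown fe protocol %d" % protocol)
    return protocol
```

A frontend without the node falls back to the backend's native protocol, which is also why the config file option described under "Tools changes" is needed for old 32bit frontends on a 64bit backend.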
Tools changes:
* Add one more option to the disk configuration, so one can specify the
protocol the frontend speaks in the config file. This is needed for
old frontends which don't advertise the protocol they are speaking
themselves.
I'm not that happy with this approach, but it works for now and I'm
kind of lost in the stack of python classes doing domain and device
handling ...
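The extra disk field can be pictured with a small standalone sketch. It assumes the disk specifier is a comma-separated string with an optional fourth (backend) and fifth (protocol) field; parse_disk is a hypothetical helper for illustration, not a function in xm or xend.

```python
def parse_disk(spec):
    """Split a disk specifier into the five fields the tools would use.

    Missing trailing fields (backend, protocol) are filled with None,
    so old three- and four-field specifiers keep working.
    """
    fields = spec.split(",")
    if not 3 <= len(fields) <= 5:
        raise ValueError("Invalid disk specifier: " + spec)
    fields += [None] * (5 - len(fields))
    uname, dev, mode, backend, protocol = fields

    vbd = [["uname", uname], ["dev", dev], ["mode", mode]]
    if backend:
        vbd.append(["backend", backend])
    if protocol:
        vbd.append(["protocol", protocol])
    return vbd
```

With this shape, "phy:hda1,hda1,w" produces no protocol entry at all, so nothing is written for the frontend and the backend falls back to its native protocol.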
Consider the code experimental, not all frontend/backend combinations
are tested.
Comments? Questions? Suggestions?
cheers,
Gerd
PS: Anyone working on blkback/blktap code sharing? While walking
through the code I've noticed quite a lot of it is cut&paste ...
--
Gerd Hoffmann <kraxel@suse.de>
[-- Attachment #2: blkif-bimodal.diff --]
[-- Type: text/x-patch, Size: 23296 bytes --]
---
linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c | 79 ++++++++++++-----
linux-2.6-xen-sparse/drivers/xen/blkback/common.h | 4
linux-2.6-xen-sparse/drivers/xen/blkback/interface.c | 17 ++-
linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c | 14 +++
linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c | 6 +
linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c | 86 +++++++++++++------
linux-2.6-xen-sparse/drivers/xen/blktap/common.h | 4
linux-2.6-xen-sparse/drivers/xen/blktap/interface.c | 17 ++-
linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c | 14 +++
tools/python/xen/xend/server/blkif.py | 3
tools/python/xen/xm/create.py | 7 +
xen/include/public/io/blkif.h | 81 +++++++++++++++--
12 files changed, 264 insertions(+), 68 deletions(-)
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
@@ -298,17 +298,33 @@ irqreturn_t blkif_be_int(int irq, void *
static int do_block_io_op(blkif_t *blkif)
{
- blkif_back_ring_t *blk_ring = &blkif->blk_ring;
+ blkif_v1_back_ring_t *blk_v1_ring = NULL;
+ blkif_v2_back_ring_t *blk_v2_ring = NULL;
blkif_request_t req;
pending_req_t *pending_req;
RING_IDX rc, rp;
int more_to_do = 0;
- rc = blk_ring->req_cons;
- rp = blk_ring->sring->req_prod;
+ if (1 == blkif->blk_protocol) {
+ blk_v1_ring = &blkif->blk_v1_ring;
+ rc = blk_v1_ring->req_cons;
+ rp = blk_v1_ring->sring->req_prod;
+ } else {
+ blk_v2_ring = &blkif->blk_v2_ring;
+ rc = blk_v2_ring->req_cons;
+ rp = blk_v2_ring->sring->req_prod;
+ }
rmb(); /* Ensure we see queued requests up to 'rp'. */
- while ((rc != rp) && !RING_REQUEST_CONS_OVERFLOW(blk_ring, rc)) {
+ while ((rc != rp)) {
+
+ if (1 == blkif->blk_protocol) {
+ if (RING_REQUEST_CONS_OVERFLOW(blk_v1_ring, rc))
+ break;
+ } else {
+ if (RING_REQUEST_CONS_OVERFLOW(blk_v2_ring, rc))
+ break;
+ }
pending_req = alloc_req();
if (NULL == pending_req) {
@@ -317,8 +333,13 @@ static int do_block_io_op(blkif_t *blkif
break;
}
- memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req));
- blk_ring->req_cons = ++rc; /* before make_response() */
+ if (1 == blkif->blk_protocol) {
+ blkif_get_v1_req(&req, RING_GET_REQUEST(blk_v1_ring, rc));
+ blk_v1_ring->req_cons = ++rc; /* before make_response() */
+ } else {
+ blkif_get_v2_req(&req, RING_GET_REQUEST(blk_v2_ring, rc));
+ blk_v2_ring->req_cons = ++rc; /* before make_response() */
+ }
switch (req.operation) {
case BLKIF_OP_READ:
@@ -500,32 +521,48 @@ static void make_response(blkif_t *blkif
{
blkif_response_t *resp;
unsigned long flags;
- blkif_back_ring_t *blk_ring = &blkif->blk_ring;
+ blkif_v1_back_ring_t *blk_v1_ring = NULL;
+ blkif_v2_back_ring_t *blk_v2_ring = NULL;
int more_to_do = 0;
int notify;
spin_lock_irqsave(&blkif->blk_ring_lock, flags);
/* Place on the response ring for the relevant domain. */
- resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt);
+ if (1 == blkif->blk_protocol) {
+ blk_v1_ring = &blkif->blk_v1_ring;
+ resp = RING_GET_RESPONSE(blk_v1_ring, blk_v1_ring->rsp_prod_pvt);
+ } else {
+ blk_v2_ring = &blkif->blk_v2_ring;
+ resp = RING_GET_RESPONSE(blk_v2_ring, blk_v2_ring->rsp_prod_pvt);
+ }
resp->id = id;
resp->operation = op;
resp->status = st;
- blk_ring->rsp_prod_pvt++;
- RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify);
-
- if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) {
- /*
- * Tail check for pending requests. Allows frontend to avoid
- * notifications if requests are already in flight (lower
- * overheads and promotes batching).
- */
- RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do);
-
- } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) {
- more_to_do = 1;
+ if (1 == blkif->blk_protocol) {
+ blk_v1_ring->rsp_prod_pvt++;
+ RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_v1_ring, notify);
+ if (blk_v1_ring->rsp_prod_pvt == blk_v1_ring->req_cons) {
+ /*
+ * Tail check for pending requests. Allows frontend to avoid
+ * notifications if requests are already in flight (lower
+ * overheads and promotes batching).
+ */
+ RING_FINAL_CHECK_FOR_REQUESTS(blk_v1_ring, more_to_do);
+ } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_v1_ring)) {
+ more_to_do = 1;
+ }
+ } else {
+ blk_v2_ring->rsp_prod_pvt++;
+ RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_v2_ring, notify);
+ if (blk_v2_ring->rsp_prod_pvt == blk_v2_ring->req_cons) {
+ RING_FINAL_CHECK_FOR_REQUESTS(blk_v2_ring, more_to_do);
+ } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_v2_ring)) {
+ more_to_do = 1;
+ }
}
+
spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
if (more_to_do)
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/common.h
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/common.h
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/common.h
@@ -68,7 +68,9 @@ typedef struct blkif_st {
unsigned int evtchn;
unsigned int irq;
/* Comms information. */
- blkif_back_ring_t blk_ring;
+ int blk_protocol;
+ blkif_v1_back_ring_t blk_v1_ring;
+ blkif_v2_back_ring_t blk_v2_ring;
struct vm_struct *blk_ring_area;
/* The VBD attached to this interface. */
struct vbd vbd;
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c
@@ -95,7 +95,8 @@ static void unmap_frontend_page(blkif_t
int blkif_map(blkif_t *blkif, unsigned long shared_page, unsigned int evtchn)
{
- blkif_sring_t *sring;
+ blkif_v1_sring_t *sring_v1;
+ blkif_v2_sring_t *sring_v2;
int err;
struct evtchn_bind_interdomain bind_interdomain;
@@ -125,8 +126,13 @@ int blkif_map(blkif_t *blkif, unsigned l
blkif->evtchn = bind_interdomain.local_port;
- sring = (blkif_sring_t *)blkif->blk_ring_area->addr;
- BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE);
+ if (1 == blkif->blk_protocol) {
+ sring_v1 = (blkif_v1_sring_t *)blkif->blk_ring_area->addr;
+ BACK_RING_INIT(&blkif->blk_v1_ring, sring_v1, PAGE_SIZE);
+ } else {
+ sring_v2 = (blkif_v2_sring_t *)blkif->blk_ring_area->addr;
+ BACK_RING_INIT(&blkif->blk_v2_ring, sring_v2, PAGE_SIZE);
+ }
blkif->irq = bind_evtchn_to_irqhandler(
blkif->evtchn, blkif_be_int, 0, "blkif-backend", blkif);
@@ -150,10 +156,11 @@ void blkif_disconnect(blkif_t *blkif)
blkif->irq = 0;
}
- if (blkif->blk_ring.sring) {
+ if (blkif->blk_v1_ring.sring || blkif->blk_v2_ring.sring) {
unmap_frontend_page(blkif);
free_vm_area(blkif->blk_ring_area);
- blkif->blk_ring.sring = NULL;
+ blkif->blk_v1_ring.sring = NULL;
+ blkif->blk_v2_ring.sring = NULL;
}
}
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c
@@ -435,6 +435,7 @@ static int connect_ring(struct backend_i
struct xenbus_device *dev = be->dev;
unsigned long ring_ref;
unsigned int evtchn;
+ unsigned int protocol;
int err;
DPRINTK("%s", dev->otherend);
@@ -448,6 +449,19 @@ static int connect_ring(struct backend_i
return err;
}
+ err = xenbus_gather(XBT_NIL, dev->otherend, "protocol",
+ "%u", &protocol, NULL);
+ if (err)
+ protocol = BLKIF_NATIVE_PROTOCOL;
+ if (protocol < 1 || protocol > 2) {
+ xenbus_dev_fatal(dev, -EINVAL, "unknown fe protocol %u", protocol);
+ return -EINVAL;
+ }
+ be->blkif->blk_protocol = protocol;
+
+ printk("blkback: ring-ref %lu, event-channel %u, protocol %u\n",
+ ring_ref, evtchn, protocol);
+
/* Map the shared frame, irq etc. */
err = blkif_map(be->blkif, ring_ref, evtchn);
if (err) {
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c
@@ -180,6 +180,12 @@ again:
message = "writing event-channel";
goto abort_transaction;
}
+ err = xenbus_printf(xbt, dev->nodename,
+ "protocol", "%u", BLKIF_NATIVE_PROTOCOL);
+ if (err) {
+ message = "writing protocol";
+ goto abort_transaction;
+ }
err = xenbus_transaction_end(xbt, 0);
if (err) {
Index: build-64-release304-12901/xen/include/public/io/blkif.h
===================================================================
--- build-64-release304-12901.orig/xen/include/public/io/blkif.h
+++ build-64-release304-12901/xen/include/public/io/blkif.h
@@ -71,20 +71,36 @@
*/
#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11
-struct blkif_request {
+struct blkif_request_segment {
+ grant_ref_t gref; /* reference to I/O buffer frame */
+ /* @first_sect: first sector in frame to transfer (inclusive). */
+ /* @last_sect: last sector in frame to transfer (inclusive). */
+ uint8_t first_sect, last_sect;
+};
+
+/* i386 protocol version */
+#pragma pack(push, 4)
+struct blkif_v1_request {
uint8_t operation; /* BLKIF_OP_??? */
uint8_t nr_segments; /* number of segments */
blkif_vdev_t handle; /* only for read/write requests */
uint64_t id; /* private guest value, echoed in resp */
blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */
- struct blkif_request_segment {
- grant_ref_t gref; /* reference to I/O buffer frame */
- /* @first_sect: first sector in frame to transfer (inclusive). */
- /* @last_sect: last sector in frame to transfer (inclusive). */
- uint8_t first_sect, last_sect;
- } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+ struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
};
-typedef struct blkif_request blkif_request_t;
+typedef struct blkif_v1_request blkif_v1_request_t;
+#pragma pack(pop)
+
+/* x86_64 protocol version */
+struct blkif_v2_request {
+ uint8_t operation; /* BLKIF_OP_??? */
+ uint8_t nr_segments; /* number of segments */
+ blkif_vdev_t handle; /* only for read/write requests */
+ uint64_t __attribute__((__aligned__(8))) id;
+ blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */
+ struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+};
+typedef struct blkif_v2_request blkif_v2_request_t;
struct blkif_response {
uint64_t id; /* copied from request */
@@ -107,7 +123,54 @@ typedef struct blkif_response blkif_resp
* Generate blkif ring structures and types.
*/
-DEFINE_RING_TYPES(blkif, struct blkif_request, struct blkif_response);
+DEFINE_RING_TYPES(blkif_v1, struct blkif_v1_request, struct blkif_response);
+DEFINE_RING_TYPES(blkif_v2, struct blkif_v2_request, struct blkif_response);
+
+#if defined(__i386__)
+
+#define BLKIF_NATIVE_PROTOCOL 1
+typedef struct blkif_v1_request blkif_request_t;
+DEFINE_RING_TYPES(blkif, struct blkif_v1_request, struct blkif_response);
+
+#elif defined(__x86_64__)
+
+#define BLKIF_NATIVE_PROTOCOL 2
+typedef struct blkif_v2_request blkif_request_t;
+DEFINE_RING_TYPES(blkif, struct blkif_v2_request, struct blkif_response);
+
+#else
+# error arch fixup needed here
+#endif
+
+#ifdef __KERNEL__
+
+/* translate requests: v1/v2 to native */
+#if 1 == BLKIF_NATIVE_PROTOCOL
+static void inline blkif_get_v1_req(blkif_request_t *dst, blkif_v1_request_t *src)
+#else
+static void inline blkif_get_v2_req(blkif_request_t *dst, blkif_v2_request_t *src)
+#endif
+{
+ memcpy(dst, src, sizeof(*dst));
+}
+
+#if 1 == BLKIF_NATIVE_PROTOCOL
+static void inline blkif_get_v2_req(blkif_request_t *dst, blkif_v2_request_t *src)
+#else
+static void inline blkif_get_v1_req(blkif_request_t *dst, blkif_v1_request_t *src)
+#endif
+{
+ int i;
+ dst->operation = src->operation;
+ dst->nr_segments = src->nr_segments;
+ dst->handle = src->handle;
+ dst->id = src->id;
+ dst->sector_number = src->sector_number;
+ for (i = 0; i < src->nr_segments; i++)
+ dst->seg[i] = src->seg[i];
+}
+
+#endif
#define VDISK_CDROM 0x1
#define VDISK_REMOVABLE 0x2
Index: build-64-release304-12901/tools/python/xen/xend/server/blkif.py
===================================================================
--- build-64-release304-12901.orig/tools/python/xen/xend/server/blkif.py
+++ build-64-release304-12901/tools/python/xen/xend/server/blkif.py
@@ -38,6 +38,7 @@ class BlkifController(DevController):
"""@see DevController.getDeviceDetails"""
uname = config.get('uname', '')
dev = config.get('dev', '')
+ protocol = config.get('protocol')
if 'ioemu:' in dev:
(_, dev) = string.split(dev, ':', 1)
@@ -85,6 +86,8 @@ class BlkifController(DevController):
front = { 'virtual-device' : "%i" % devid,
'device-type' : dev_type
}
+ if protocol:
+ front.update({ 'protocol' : protocol });
return (devid, back, front)
Index: build-64-release304-12901/tools/python/xen/xm/create.py
===================================================================
--- build-64-release304-12901.orig/tools/python/xen/xm/create.py
+++ build-64-release304-12901/tools/python/xen/xm/create.py
@@ -531,7 +531,7 @@ def configure_image(vals):
def configure_disks(config_devs, vals):
"""Create the config for disks (virtual block devices).
"""
- for (uname, dev, mode, backend) in vals.disk:
+ for (uname, dev, mode, backend, protocol) in vals.disk:
if uname.startswith('tap:'):
cls = 'tap'
else:
@@ -543,6 +543,8 @@ def configure_disks(config_devs, vals):
['mode', mode ] ]
if backend:
config_vbd.append(['backend', backend])
+ if protocol:
+ config_vbd.append(['protocol', protocol])
config_devs.append(['device', config_vbd])
def configure_pci(config_devs, vals):
@@ -787,7 +789,10 @@ def preprocess_disk(vals):
n = len(d)
if n == 3:
d.append(None)
+ d.append(None)
elif n == 4:
+ d.append(None)
+ elif n == 5:
pass
else:
err('Invalid disk specifier: ' + v)
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c
@@ -1094,15 +1094,24 @@ irqreturn_t tap_blkif_be_int(int irq, vo
static int print_dbug = 1;
static int do_block_io_op(blkif_t *blkif)
{
- blkif_back_ring_t *blk_ring = &blkif->blk_ring;
+ blkif_v1_back_ring_t *blk_v1_ring = NULL;
+ blkif_v2_back_ring_t *blk_v2_ring = NULL;
blkif_request_t req;
pending_req_t *pending_req;
RING_IDX rc, rp;
- int more_to_do = 0;
+ int retval, more_to_do = 0;
tap_blkif_t *info;
- rc = blk_ring->req_cons;
- rp = blk_ring->sring->req_prod;
+ if (1 == blkif->blk_protocol) {
+ blk_v1_ring = &blkif->blk_v1_ring;
+ rc = blk_v1_ring->req_cons;
+ rp = blk_v1_ring->sring->req_prod;
+ } else {
+ blk_v2_ring = &blkif->blk_v2_ring;
+ rc = blk_v2_ring->req_cons;
+ rp = blk_v2_ring->sring->req_prod;
+ }
+
rmb(); /* Ensure we see queued requests up to 'rp'. */
/*Check blkif has corresponding UE ring*/
@@ -1133,8 +1142,11 @@ static int do_block_io_op(blkif_t *blkif
more_to_do = 1;
break;
}
-
- if (RING_REQUEST_CONS_OVERFLOW(blk_ring, rc)) {
+
+ retval = (1 == blkif->blk_protocol)
+ ? RING_REQUEST_CONS_OVERFLOW(blk_v1_ring, rc)
+ : RING_REQUEST_CONS_OVERFLOW(blk_v2_ring, rc);
+ if (retval) {
WPRINTK("RING_REQUEST_CONS_OVERFLOW!"
" More to do\n");
more_to_do = 1;
@@ -1148,8 +1160,13 @@ static int do_block_io_op(blkif_t *blkif
break;
}
- memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req));
- blk_ring->req_cons = ++rc; /* before make_response() */
+ if (1 == blkif->blk_protocol) {
+ blkif_get_v1_req(&req, RING_GET_REQUEST(blk_v1_ring, rc));
+ blk_v1_ring->req_cons = ++rc; /* before make_response() */
+ } else {
+ blkif_get_v2_req(&req, RING_GET_REQUEST(blk_v2_ring, rc));
+ blk_v2_ring->req_cons = ++rc; /* before make_response() */
+ }
switch (req.operation) {
case BLKIF_OP_READ:
@@ -1225,7 +1242,9 @@ static void dispatch_rw_block_io(blkif_t
WPRINTK("blktap: fe_ring is full, can't add "
"IO Request will be dropped. %d %d\n",
RING_SIZE(&info->ufe_ring),
- RING_SIZE(&blkif->blk_ring));
+ (1 == blkif->blk_protocol)
+ ? RING_SIZE(&blkif->blk_v1_ring)
+ : RING_SIZE(&blkif->blk_v2_ring));
goto fail_response;
}
@@ -1415,30 +1434,47 @@ static void make_response(blkif_t *blkif
{
blkif_response_t *resp;
unsigned long flags;
- blkif_back_ring_t *blk_ring = &blkif->blk_ring;
+ blkif_v1_back_ring_t *blk_v1_ring = NULL;
+ blkif_v2_back_ring_t *blk_v2_ring = NULL;
int more_to_do = 0;
int notify;
+
spin_lock_irqsave(&blkif->blk_ring_lock, flags);
- /* Place on the response ring for the relevant domain. */
- resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt);
+ /* Place on the response ring for the relevant domain. */
+ if (1 == blkif->blk_protocol) {
+ blk_v1_ring = &blkif->blk_v1_ring;
+ resp = RING_GET_RESPONSE(blk_v1_ring, blk_v1_ring->rsp_prod_pvt);
+ } else {
+ blk_v2_ring = &blkif->blk_v2_ring;
+ resp = RING_GET_RESPONSE(blk_v2_ring, blk_v2_ring->rsp_prod_pvt);
+ }
resp->id = id;
resp->operation = op;
resp->status = st;
- blk_ring->rsp_prod_pvt++;
- RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify);
-
- if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) {
- /*
- * Tail check for pending requests. Allows frontend to avoid
- * notifications if requests are already in flight (lower
- * overheads and promotes batching).
- */
- RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do);
- } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) {
- more_to_do = 1;
+ if (1 == blkif->blk_protocol) {
+ blk_v1_ring->rsp_prod_pvt++;
+ RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_v1_ring, notify);
+ if (blk_v1_ring->rsp_prod_pvt == blk_v1_ring->req_cons) {
+ /*
+ * Tail check for pending requests. Allows frontend to avoid
+ * notifications if requests are already in flight (lower
+ * overheads and promotes batching).
+ */
+ RING_FINAL_CHECK_FOR_REQUESTS(blk_v1_ring, more_to_do);
+ } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_v1_ring)) {
+ more_to_do = 1;
+ }
+ } else {
+ blk_v2_ring->rsp_prod_pvt++;
+ RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_v2_ring, notify);
+ if (blk_v2_ring->rsp_prod_pvt == blk_v2_ring->req_cons) {
+ RING_FINAL_CHECK_FOR_REQUESTS(blk_v2_ring, more_to_do);
+ } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_v2_ring)) {
+ more_to_do = 1;
+ }
+ }
- }
spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
if (more_to_do)
blkif_notify_work(blkif);
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/common.h
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/common.h
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/common.h
@@ -59,7 +59,9 @@ typedef struct blkif_st {
unsigned int evtchn;
unsigned int irq;
/* Comms information. */
- blkif_back_ring_t blk_ring;
+ int blk_protocol;
+ blkif_v1_back_ring_t blk_v1_ring;
+ blkif_v2_back_ring_t blk_v2_ring;
struct vm_struct *blk_ring_area;
/* Back pointer to the backend_info. */
struct backend_info *be;
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c
@@ -96,7 +96,8 @@ static void unmap_frontend_page(blkif_t
int tap_blkif_map(blkif_t *blkif, unsigned long shared_page,
unsigned int evtchn)
{
- blkif_sring_t *sring;
+ blkif_v1_sring_t *sring_v1;
+ blkif_v2_sring_t *sring_v2;
int err;
struct evtchn_bind_interdomain bind_interdomain;
@@ -126,8 +127,13 @@ int tap_blkif_map(blkif_t *blkif, unsign
blkif->evtchn = bind_interdomain.local_port;
- sring = (blkif_sring_t *)blkif->blk_ring_area->addr;
- BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE);
+ if (1 == blkif->blk_protocol) {
+ sring_v1 = (blkif_v1_sring_t *)blkif->blk_ring_area->addr;
+ BACK_RING_INIT(&blkif->blk_v1_ring, sring_v1, PAGE_SIZE);
+ } else {
+ sring_v2 = (blkif_v2_sring_t *)blkif->blk_ring_area->addr;
+ BACK_RING_INIT(&blkif->blk_v2_ring, sring_v2, PAGE_SIZE);
+ }
blkif->irq = bind_evtchn_to_irqhandler(
blkif->evtchn, tap_blkif_be_int, 0, "blkif-backend", blkif);
@@ -141,10 +147,11 @@ void tap_blkif_unmap(blkif_t *blkif)
unbind_from_irqhandler(blkif->irq, blkif);
blkif->irq = 0;
}
- if (blkif->blk_ring.sring) {
+ if (blkif->blk_v1_ring.sring || blkif->blk_v2_ring.sring) {
unmap_frontend_page(blkif);
free_vm_area(blkif->blk_ring_area);
- blkif->blk_ring.sring = NULL;
+ blkif->blk_v1_ring.sring = NULL;
+ blkif->blk_v2_ring.sring = NULL;
}
}
Index: build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c
===================================================================
--- build-64-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c
+++ build-64-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c
@@ -316,6 +316,7 @@ static int connect_ring(struct backend_i
struct xenbus_device *dev = be->dev;
unsigned long ring_ref;
unsigned int evtchn;
+ unsigned int protocol;
int err;
DPRINTK("%s\n", dev->otherend);
@@ -329,6 +330,19 @@ static int connect_ring(struct backend_i
return err;
}
+ err = xenbus_gather(XBT_NIL, dev->otherend, "protocol",
+ "%u", &protocol, NULL);
+ if (err)
+ protocol = BLKIF_NATIVE_PROTOCOL;
+ if (protocol < 1 || protocol > 2) {
+ xenbus_dev_fatal(dev, -EINVAL, "unknown fe protocol %u", protocol);
+ return -EINVAL;
+ }
+ be->blkif->blk_protocol = protocol;
+
+ printk("blktap: ring-ref %lu, event-channel %u, protocol %u\n",
+ ring_ref, evtchn, protocol);
+
/* Map the shared frame, irq etc. */
err = tap_blkif_map(be->blkif, ring_ref, evtchn);
if (err) {
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
From: Jan Beulich @ 2006-12-18 17:09 UTC
To: Gerd Hoffmann; +Cc: Xen devel list
I understand you favor this over the bi-modal approach I took? Any specific
advantages?
Jan
* Re: Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
From: Keir Fraser @ 2006-12-18 17:58 UTC
To: Jan Beulich, Gerd Hoffmann; +Cc: Xen devel list
Gerd's description is along the lines of what I would implement myself. How
does your bi-modal approach work?
-- Keir
* Re: Re: [patch/rfc] multiprotocol blkback drivers (32-on-64) 2006-12-18 17:58 ` Keir Fraser @ 2006-12-19 7:37 ` Jan Beulich 0 siblings, 0 replies; 12+ messages in thread From: Jan Beulich @ 2006-12-19 7:37 UTC (permalink / raw) To: Gerd Hoffmann, Keir Fraser; +Cc: Xen devel list [-- Attachment #1: Type: text/plain, Size: 2616 bytes --] By adding abstraction to the ring macros and the affected headers, and then replacing direct structure member accesses with appropriate macros. Reference patches attached (not checked whether they would apply cleanly on -unstable). Jan >>> Keir Fraser <keir@xensource.com> 18.12.06 18:58 >>> Gerd's description is along the lines of what I would implement myself. How does your bi-modal approach work? -- Keir On 18/12/06 17:09, "Jan Beulich" <jbeulich@novell.com> wrote: > I understand you favor this over the bi-modal approach I took? Any specific > advantages? Jan > >>>> Gerd Hoffmann <kraxel@suse.de> 18.12.06 17:39 >>> > Hi, > > This is a patch for the block interface, frontend drivers, backend > drivers and tools to support multiple ring protocols. Right there are > now just two: the 32bit and the 64bit one. If needed it can be extended. > > Interface changes (io/blkif.h) > * Have both request structs there, with "v1" and "v2" added to the > name. The old name is aliased to the native protocol of the > architecture. > * Add helper functions to convert v1/v2 requests to native. > > Frontend changes: > * Create a new node "protocol", add the protocol number it speaks > there. > > Backend changes: > * Look at the "protocol" number of the frontend and switch ring > handling accordingly. If the protocol node isn't present it assumes > native protocol. > * As the request struct is copied anyway before being processed (for > security reasons) it is converted to native at that point so most > backend code doesn't need to know what the frontend speaks. 
> * In case of blktap this is completely transparent to userspace, the
>   kernel/userspace ring is always native no matter what the frontend
>   speaks.
>
> Tools changes:
> * Add one more option to the disk configuration, so one can specify the
>   protocol the frontend speaks in the config file. This is needed for
>   old frontends which don't advertise the protocol they are speaking
>   themselves.
>   I'm not that happy with this approach, but it works for now and I'm
>   kinda lost in the stack of python classes doing domain and device
>   handling ...
>
> Consider the code experimental, not all frontend/backend combinations
> are tested.
>
> Comments? Questions? Suggestions?
>
> cheers,
> Gerd
>
> PS: Anyone working on blkback/blktap code sharing? While walking
> through the code I've noticed quite a lot of it is cut&paste ...
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
[-- Attachment #2: xen-bimodal-tpmback.patch --]
[-- Type: text/plain, Size: 10406 bytes --]
Enhance tpmback to be capable of dealing with bi-modal operation (frontend
running with a different word size than the backend).

Index: sle10-sp1-2006-12-05/drivers/xen/tpmback/common.h
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/tpmback/common.h	2006-12-07 16:43:42.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/tpmback/common.h	2006-12-07 16:45:08.000000000 +0100
@@ -12,7 +12,9 @@
 #include <linux/slab.h>
 #include <xen/evtchn.h>
 #include <xen/driver_util.h>
-#include <xen/interface/grant_table.h>
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define TPMIF_BIMODAL
+#endif
 #include <xen/interface/io/tpmif.h>
 #include <asm/io.h>
 #include <asm/pgalloc.h>
@@ -27,6 +29,9 @@ typedef struct tpmif_st {
 	struct list_head tpmif_list;
 	/* Unique identifier for this interface. */
 	domid_t domid;
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+	unsigned char native;
+#endif
 	unsigned int handle;

 	/* Physical parameters of the comms window. */
@@ -34,7 +39,7 @@ typedef struct tpmif_st {
 	unsigned int irq;

 	/* The shared rings and indexes. */
-	tpmif_tx_interface_t *tx;
+	tpmif_tx_interface_u tx;
 	struct vm_struct *tx_area;

 	/* Miscellaneous private stuff. */
@@ -54,6 +59,15 @@ typedef struct tpmif_st {
 	char devname[20];
 } tpmif_t;

+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define tpmif_request(tpmif, rq, n) \
+	((tpmif)->native \
+	 ? (void)(rq.nat = &(tpmif)->tx.nat->ring[n].req) \
+	 : (void)(rq.alt = &(tpmif)->tx.alt->ring[n].req))
+#else
+#define tpmif_request(tpmif, rq, n) ((void)(rq = &(tpmif)->tx->ring[n].req))
+#endif
+
 void tpmif_disconnect_complete(tpmif_t * tpmif);
 tpmif_t *tpmif_find(domid_t domid, struct backend_info *bi);
 void tpmif_interface_init(void);
Index: sle10-sp1-2006-12-05/drivers/xen/tpmback/interface.c
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/tpmback/interface.c	2006-12-07 16:43:42.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/tpmback/interface.c	2006-12-07 16:45:08.000000000 +0100
@@ -30,6 +30,9 @@ static tpmif_t *alloc_tpmif(domid_t domi

 	memset(tpmif, 0, sizeof (*tpmif));
 	tpmif->domid = domid;
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+	tpmif->native = 1; /* XXX */
+#endif
 	tpmif->status = DISCONNECTED;
 	tpmif->bi = bi;
 	snprintf(tpmif->devname, sizeof(tpmif->devname), "tpmif%d", domid);
@@ -115,6 +118,12 @@ static void unmap_frontend_page(tpmif_t
 	BUG_ON(ret);
 }

+#ifndef CONFIG_XEN_BIMODAL_BACKENDS
+#define tpmif_tx tpmif->tx
+#else
+#define tpmif_tx tpmif->tx.nat
+#endif
+
 int tpmif_map(tpmif_t *tpmif, unsigned long shared_page, unsigned int evtchn)
 {
 	int err;
@@ -147,7 +156,7 @@ int tpmif_map(tpmif_t *tpmif, unsigned l

 	tpmif->evtchn = bind_interdomain.local_port;

-	tpmif->tx = (tpmif_tx_interface_t *)tpmif->tx_area->addr;
+	tpmif_tx = (tpmif_tx_interface_t *)tpmif->tx_area->addr;

 	tpmif->irq = bind_evtchn_to_irqhandler(
 		tpmif->evtchn, tpmif_be_int, 0, tpmif->devname, tpmif);
@@ -162,7 +171,7 @@ void tpmif_disconnect_complete(tpmif_t *
 	if (tpmif->irq)
 		unbind_from_irqhandler(tpmif->irq, tpmif);

-	if (tpmif->tx) {
+	if (tpmif_tx) {
 		unmap_frontend_page(tpmif);
 		free_vm_area(tpmif->tx_area);
 	}
@@ -170,6 +179,8 @@ void tpmif_disconnect_complete(tpmif_t *
 	free_tpmif(tpmif);
 }

+#undef tpmif_tx
+
 void __init tpmif_interface_init(void)
 {
 	tpmif_cachep = kmem_cache_create("tpmif_cache", sizeof (tpmif_t),
Index: sle10-sp1-2006-12-05/drivers/xen/tpmback/tpmback.c
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/tpmback/tpmback.c	2006-12-07 16:43:42.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/tpmback/tpmback.c	2006-12-07 16:45:08.000000000 +0100
@@ -70,6 +70,12 @@ static int packet_read_shmem(struct pack
 			     char *buffer, int isuserbuffer, u32 left);
 static int vtpm_queue_packet(struct packet *pak);

+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define tx(op) (tpmif->native ? (tx.nat->op) : (tx.alt->op))
+#else
+#define tx(op) (tx->op)
+#endif
+
 /***************************************************************
  Buffer copying fo user and kernel space buffes.
 ***************************************************************/
@@ -244,17 +250,17 @@ int _packet_write(struct packet *pak,
 		unsigned int tocopy;
 		struct gnttab_map_grant_ref map_op;
 		struct gnttab_unmap_grant_ref unmap_op;
-		tpmif_tx_request_t *tx;
+		tpmif_tx_request_u tx;

-		tx = &tpmif->tx->ring[i].req;
+		tpmif_request(tpmif, tx, i);

-		if (0 == tx->addr) {
+		if (0 == tx(addr)) {
 			DPRINTK("ERROR: Buffer for outgoing packet NULL?! i=%d\n", i);
 			return 0;
 		}

 		gnttab_set_map_op(&map_op, idx_to_kaddr(tpmif, i),
-				  GNTMAP_host_map, tx->ref, tpmif->domid);
+				  GNTMAP_host_map, tx(ref), tpmif->domid);

 		if (unlikely(HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
 						       &map_op, 1))) {
@@ -271,12 +277,12 @@ int _packet_write(struct packet *pak,
 		tocopy = min_t(size_t, size - offset, PAGE_SIZE);

 		if (copy_from_buffer((void *)(idx_to_kaddr(tpmif, i) |
-					      (tx->addr & ~PAGE_MASK)),
+					      (unsigned long)tx(addr & ~PAGE_MASK)),
 				     &data[offset], tocopy, isuserbuffer)) {
 			tpmif_put(tpmif);
 			return -EFAULT;
 		}
-		tx->size = tocopy;
+		tx(size = tocopy);

 		gnttab_set_unmap_op(&unmap_op, idx_to_kaddr(tpmif, i),
 				    GNTMAP_host_map, handle);
@@ -375,9 +381,6 @@ static int packet_read_shmem(struct pack
 	u32 to_copy;
 	grant_handle_t handle;

-	tpmif_tx_request_t *tx;
-
-	tx = &tpmif->tx->ring[0].req;
 	/*
 	 * Start copying data at the page with index 'index'
 	 * and within that page at offset 'offset'.
@@ -385,14 +388,15 @@ static int packet_read_shmem(struct pack
 	 */
 	to_copy = min_t(u32, PAGE_SIZE - pg_offset, room_left);
 	while (to_copy > 0) {
+		tpmif_tx_request_u tx;
 		void *src;
 		struct gnttab_map_grant_ref map_op;
 		struct gnttab_unmap_grant_ref unmap_op;

-		tx = &tpmif->tx->ring[i].req;
+		tpmif_request(tpmif, tx, i);

 		gnttab_set_map_op(&map_op, idx_to_kaddr(tpmif, i),
-				  GNTMAP_host_map, tx->ref, tpmif->domid);
+				  GNTMAP_host_map, tx(ref), tpmif->domid);

 		if (unlikely(HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
 						       &map_op, 1))) {
@@ -406,19 +410,19 @@ static int packet_read_shmem(struct pack

 		handle = map_op.handle;

-		if (to_copy > tx->size) {
+		if (to_copy > tx(size)) {
 			/*
 			 * User requests more than what's available
 			 */
-			to_copy = min_t(u32, tx->size, to_copy);
+			to_copy = min_t(u32, tx(size), to_copy);
 		}

 		DPRINTK("Copying from mapped memory at %08lx\n",
 			(unsigned long)(idx_to_kaddr(tpmif, i) |
-			(tx->addr & ~PAGE_MASK)));
+			(unsigned long)tx(addr & ~PAGE_MASK)));

 		src = (void *)(idx_to_kaddr(tpmif, i) |
-			       ((tx->addr & ~PAGE_MASK) + pg_offset));
+			       ((unsigned long)tx(addr & ~PAGE_MASK) + pg_offset));
 		if (copy_to_buffer(&buffer[offset],
 				   src, to_copy, isuserbuffer)) {
 			return -EFAULT;
@@ -879,7 +883,7 @@ static void tpm_tx_action(unsigned long
 {
 	struct list_head *ent;
 	tpmif_t *tpmif;
-	tpmif_tx_request_t *tx;
+	tpmif_tx_request_u tx;

 	DPRINTK("%s: Getting data from front-end(s)!\n", __FUNCTION__);

@@ -890,10 +894,10 @@ static void tpm_tx_action(unsigned long
 		tpmif_get(tpmif);
 		remove_from_tpm_schedule_list(tpmif);

-		tx = &tpmif->tx->ring[0].req;
+		tpmif_request(tpmif, tx, 0);

 		/* pass it up */
-		vtpm_receive(tpmif, tx->size);
+		vtpm_receive(tpmif, tx(size));

 		tpmif_put(tpmif);
 	}
Index: sle10-sp1-2006-12-05/include/xen/interface/io/tpmif.h
===================================================================
--- sle10-sp1-2006-12-05.orig/include/xen/interface/io/tpmif.h	2006-12-07 16:43:42.000000000 +0100
+++ sle10-sp1-2006-12-05/include/xen/interface/io/tpmif.h	2006-12-07 16:45:08.000000000 +0100
@@ -32,17 +32,39 @@
  */

 #ifndef __XEN_PUBLIC_IO_TPMIF_H__
-#define __XEN_PUBLIC_IO_TPMIF_H__

 #include "../grant_table.h"

-struct tpmif_tx_request {
+#ifndef tpmif
+# ifdef TPMIF_BIMODAL
+#  define BACKEND_RING_BIMODAL
+#  if defined(__linux__) && defined(__KERNEL__)
+#   ifdef CONFIG_64BIT
+#    define long int
+#   else
+#    define long long long __attribute__((__aligned__(8)))
+#   endif
+#  else
+#   error Environment unsupported for bi-modal operation.
+#  endif
+#  define tpmif(x) tpmif_alt_##x
+#  include "tpmif.h"
+#  undef long
+# endif
+# define tpmif(x) tpmif_##x
+#endif
+
+#define __XEN_PUBLIC_IO_TPMIF_H__
+
+struct tpmif(tx_request) {
     unsigned long addr;   /* Machine address of packet. */
     grant_ref_t ref;      /* grant table access reference */
     uint16_t unused;
     uint16_t size;        /* Packet size in bytes. */
 };
-typedef struct tpmif_tx_request tpmif_tx_request_t;
+typedef struct tpmif(tx_request) tpmif(tx_request_t);
+
+#ifndef TPMIF_TX_RING_SIZE

 /*
  * The TPMIF_TX_RING_SIZE defines the number of pages the
@@ -50,20 +72,38 @@ typedef struct tpmif_tx_request tpmif_tx
  */
 typedef uint32_t TPMIF_RING_IDX;

+#ifdef TPMIF_BIMODAL
+typedef union {
+    struct tpmif_tx_request *nat;
+    struct tpmif_alt_tx_request *alt;
+} tpmif_tx_request_u;
+typedef union {
+    struct tpmif_tx_interface *nat;
+    struct tpmif_alt_tx_interface *alt;
+} tpmif_tx_interface_u;
+#else
+typedef struct tpmif_tx_request *tpmif_tx_request_u;
+typedef struct tpmif_tx_interface *tpmif_tx_interface_u;
+#endif
+
+#endif
+
 #define TPMIF_TX_RING_SIZE 10

 /* This structure must fit in a memory page. */

-struct tpmif_ring {
-    struct tpmif_tx_request req;
+struct tpmif(ring) {
+    struct tpmif(tx_request) req;
 };
-typedef struct tpmif_ring tpmif_ring_t;
+typedef struct tpmif(ring) tpmif(ring_t);

-struct tpmif_tx_interface {
-    struct tpmif_ring ring[TPMIF_TX_RING_SIZE];
+struct tpmif(tx_interface) {
+    struct tpmif(ring) ring[TPMIF_TX_RING_SIZE];
 };
-typedef struct tpmif_tx_interface tpmif_tx_interface_t;
+typedef struct tpmif(tx_interface) tpmif(tx_interface_t);

+#undef tpmif
+#undef TPMIF_BIMODAL
 #endif

 /*
[-- Attachment #3: xen-bimodal-blkif.patch --]
[-- Type: text/plain, Size: 21741 bytes --]
Enhance blkif and ring handling to be capable of dealing with bi-modal
operation (frontend running with a different word size than the backend).

Index: sle10-sp1-2006-12-05/drivers/xen/Kconfig
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/Kconfig	2006-12-07 16:43:54.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/Kconfig	2006-12-07 16:44:54.000000000 +0100
@@ -45,6 +45,14 @@ config XEN_BACKEND
 	  Support for backend device drivers that provide I/O services
 	  to other virtual machines.

+config XEN_BIMODAL_BACKENDS
+	bool "Bi-modal backend driver support"
+	depends on XEN_BACKEND && (X86_PAE || X86_64)
+	default y
+	help
+	  Build backend device drivers providing support for both
+	  native as well as compatibility guests.
+
 config XEN_BLKDEV_BACKEND
 	tristate "Block-device backend driver"
 	depends on XEN_BACKEND
Index: sle10-sp1-2006-12-05/include/xen/interface/io/blkif.h
===================================================================
--- sle10-sp1-2006-12-05.orig/include/xen/interface/io/blkif.h	2006-12-07 16:43:54.000000000 +0100
+++ sle10-sp1-2006-12-05/include/xen/interface/io/blkif.h	2006-12-07 16:44:54.000000000 +0100
@@ -25,6 +25,30 @@
  */

 #ifndef __XEN_PUBLIC_IO_BLKIF_H__
+#ifndef blkif
+# ifdef BLKIF_BIMODAL
+#  define BACKEND_RING_BIMODAL
+#  if defined(__linux__) && defined(__KERNEL__)
+#   ifdef CONFIG_64BIT
+#    pragma pack(push, 4)
+#   else
+#    define uint64_t uint64_t __attribute__((__aligned__(8)))
+#   endif
+#  else
+#   error Environment unsupported for bi-modal operation.
+#  endif
+#  define blkif(x) blkif_alt_##x
+#  include "blkif.h"
+#  if defined(__linux__) && defined(__KERNEL__)
+#   ifdef CONFIG_64BIT
+#    pragma pack(pop)
+#   else
+#    undef uint64_t
+#   endif
+#  endif
+# endif
+# define blkif(x) blkif_##x
+#endif
 #define __XEN_PUBLIC_IO_BLKIF_H__

 #include "ring.h"
@@ -71,27 +95,31 @@
  */
 #define BLKIF_MAX_SEGMENTS_PER_REQUEST 11

-struct blkif_request {
+#ifndef BLKIF_RSP_ERROR
+struct blkif_request_segment {
+    grant_ref_t gref;        /* reference to I/O buffer frame */
+    /* @first_sect: first sector in frame to transfer (inclusive). */
+    /* @last_sect: last sector in frame to transfer (inclusive). */
+    uint8_t first_sect, last_sect;
+};
+#endif
+
+struct blkif(request) {
     uint8_t operation;            /* BLKIF_OP_??? */
     uint8_t nr_segments;          /* number of segments */
     blkif_vdev_t handle;          /* only for read/write requests */
     uint64_t id;                  /* private guest value, echoed in resp */
     blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */
-    struct blkif_request_segment {
-        grant_ref_t gref;        /* reference to I/O buffer frame */
-        /* @first_sect: first sector in frame to transfer (inclusive). */
-        /* @last_sect: last sector in frame to transfer (inclusive). */
-        uint8_t first_sect, last_sect;
-    } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 };
-typedef struct blkif_request blkif_request_t;
+typedef struct blkif(request) blkif(request_t);

-struct blkif_response {
+struct blkif(response) {
     uint64_t id;                  /* copied from request */
     uint8_t operation;            /* copied from request */
     int16_t status;               /* BLKIF_RSP_??? */
 };
-typedef struct blkif_response blkif_response_t;
+typedef struct blkif(response) blkif(response_t);

 /*
  * STATUS RETURN CODES.
@@ -107,12 +135,16 @@ typedef struct blkif_response blkif_resp
  * Generate blkif ring structures and types.
  */

+#ifndef BLKIF_BIMODAL
 DEFINE_RING_TYPES(blkif, struct blkif_request, struct blkif_response);
+#endif

 #define VDISK_CDROM 0x1
 #define VDISK_REMOVABLE 0x2
 #define VDISK_READONLY 0x4

+#undef blkif
+#undef BLKIF_BIMODAL
 #endif /* __XEN_PUBLIC_IO_BLKIF_H__ */

 /*
Index: sle10-sp1-2006-12-05/include/xen/interface/io/ring.h
===================================================================
--- sle10-sp1-2006-12-05.orig/include/xen/interface/io/ring.h	2006-12-07 16:43:54.000000000 +0100
+++ sle10-sp1-2006-12-05/include/xen/interface/io/ring.h	2006-12-07 16:44:54.000000000 +0100
@@ -78,6 +78,7 @@ typedef unsigned int RING_IDX;
  */

 #define DEFINE_RING_TYPES(__name, __req_t, __rsp_t) \
+DEFINE_ALT_RING_TYPES(__name); \
 \
 /* Shared ring entry */ \
 union __name##_sring_entry { \
@@ -106,14 +107,72 @@ struct __name##_back_ring {
     RING_IDX rsp_prod_pvt; \
     RING_IDX req_cons; \
     unsigned int nr_ents; \
-    struct __name##_sring *sring; \
+    __name##_sring_u sring; \
 }; \
 \
+CHECK_RING_TYPES(__name) \
+ \
 /* Syntactic sugar */ \
 typedef struct __name##_sring __name##_sring_t; \
 typedef struct __name##_front_ring __name##_front_ring_t; \
 typedef struct __name##_back_ring __name##_back_ring_t

+#ifdef BACKEND_RING_BIMODAL
+#define DEFINE_ALT_RING_TYPES(__name) \
+ \
+/* Shared ring entry */ \
+union __name##_alt_sring_entry { \
+    struct __name##_alt_request req; \
+    struct __name##_alt_response rsp; \
+}; \
+ \
+/* Shared ring page */ \
+typedef struct __name##_alt_sring { \
+    RING_IDX req_prod, req_event; \
+    RING_IDX rsp_prod, rsp_event; \
+    uint8_t pad[48]; \
+    union __name##_alt_sring_entry ring[1]; /* variable-length */ \
+} __name##_alt_sring_t; \
+ \
+typedef union { \
+    struct __name##_sring *nat; \
+    struct __name##_alt_sring *alt; \
+} __name##_sring_u; \
+ \
+typedef union { \
+    struct __name##_request nat; \
+    struct __name##_alt_request alt; \
+} __name##_request_u; \
+ \
+typedef union { \
+    struct __name##_response *nat; \
+    struct __name##_alt_response *alt; \
+} __name##_response_u
+
+#define CHECK_SRING_FIELD(__name, __fld) \
+    BUILD_BUG_ON(&((struct __name ## _sring *)0)->__fld != \
+                 &((struct __name ## _alt_sring *)0)->__fld)
+
+#define CHECK_RING_TYPES(__name) \
+static inline void _##__name##_check_ring_types_(void) { \
+    CHECK_SRING_FIELD(__name, req_prod); \
+    CHECK_SRING_FIELD(__name, req_event); \
+    CHECK_SRING_FIELD(__name, rsp_prod); \
+    CHECK_SRING_FIELD(__name, rsp_event); \
+    CHECK_SRING_FIELD(__name, pad); \
+}
+#define BB_SRING(_s) _s.alt
+#else
+#define DEFINE_ALT_RING_TYPES(__name) \
+typedef struct __name##_sring *__name##_sring_u; \
+typedef struct __name##_request __name##_request_u; \
+typedef struct __name##_response *__name##_response_u
+
+#define CHECK_RING_TYPES(__name)
+
+#define BB_SRING(_s) _s
+#endif
+
 /*
  * Macros for manipulating rings.
  *
@@ -135,6 +194,7 @@ typedef struct __name##_back_ring __name
     (_s)->req_event = (_s)->rsp_event = 1; \
     memset((_s)->pad, 0, sizeof((_s)->pad)); \
 } while(0)
+#define BB_SHARED_RING_INIT(_s) SHARED_RING_INIT(BB_SRING(_s))

 #define FRONT_RING_INIT(_r, _s, __size) do { \
     (_r)->req_prod_pvt = 0; \
@@ -143,12 +203,24 @@ typedef struct __name##_back_ring __name
     (_r)->sring = (_s); \
 } while (0)

+#ifndef BACKEND_RING_BIMODAL
 #define BACK_RING_INIT(_r, _s, __size) do { \
     (_r)->rsp_prod_pvt = 0; \
     (_r)->req_cons = 0; \
     (_r)->nr_ents = __RING_SIZE(_s, __size); \
     (_r)->sring = (_s); \
 } while (0)
+#else
+#define BACK_RING_INIT(_r, _s, __size) do { \
+    (_r)->rsp_prod_pvt = 0; \
+    (_r)->req_cons = 0; \
+    if (ring_native(_r)) \
+        (_r)->nr_ents = __RING_SIZE((_s).nat, __size); \
+    else \
+        (_r)->nr_ents = __RING_SIZE((_s).alt, __size); \
+    (_r)->sring = (_s); \
+} while (0)
+#endif

 /* Initialize to existing shared indexes -- for recovery */
 #define FRONT_RING_ATTACH(_r, _s, __size) do { \
@@ -158,12 +230,24 @@ typedef struct __name##_back_ring __name
     (_r)->nr_ents = __RING_SIZE(_s, __size); \
 } while (0)

+#ifndef BACKEND_RING_BIMODAL
 #define BACK_RING_ATTACH(_r, _s, __size) do { \
     (_r)->sring = (_s); \
     (_r)->rsp_prod_pvt = (_s)->rsp_prod; \
     (_r)->req_cons = (_s)->req_prod; \
     (_r)->nr_ents = __RING_SIZE(_s, __size); \
 } while (0)
+#else
+#define BACK_RING_ATTACH(_r, _s, __size) do { \
+    (_r)->sring = (_s); \
+    (_r)->rsp_prod_pvt = (_s).alt->rsp_prod; \
+    (_r)->req_cons = (_s).alt->req_prod; \
+    if (ring_native(_r)) \
+        (_r)->nr_ents = __RING_SIZE((_s).nat, __size); \
+    else \
+        (_r)->nr_ents = __RING_SIZE((_s).alt, __size); \
+} while (0)
+#endif

 /* How big is this ring? */
 #define RING_SIZE(_r) \
@@ -180,11 +264,15 @@ typedef struct __name##_back_ring __name
     (RING_FREE_REQUESTS(_r) == 0)

 /* Test if there are outstanding messages to be processed on a ring. */
-#define RING_HAS_UNCONSUMED_RESPONSES(_r) \
+#define _RING_HAS_UNCONSUMED_RESPONSES(_r, sring) \
     ((_r)->sring->rsp_prod - (_r)->rsp_cons)
+#define RING_HAS_UNCONSUMED_RESPONSES(_r) \
+    _RING_HAS_UNCONSUMED_RESPONSES(_r, sring)
+#define BB_RING_HAS_UNCONSUMED_RESPONSES(_r) \
+    _RING_HAS_UNCONSUMED_RESPONSES(_r, BB_SRING(sring))

 #ifdef __GNUC__
-#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \
+#define _RING_HAS_UNCONSUMED_REQUESTS(_r, sring) ({ \
     unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \
     unsigned int rsp = RING_SIZE(_r) - \
         ((_r)->req_cons - (_r)->rsp_prod_pvt); \
@@ -192,33 +280,75 @@ typedef struct __name##_back_ring __name
 })
 #else
 /* Same as above, but without the nice GCC ({ ... }) syntax. */
-#define RING_HAS_UNCONSUMED_REQUESTS(_r) \
+#define _RING_HAS_UNCONSUMED_REQUESTS(_r, sring) \
     ((((_r)->sring->req_prod - (_r)->req_cons) < \
       (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ? \
      ((_r)->sring->req_prod - (_r)->req_cons) : \
      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
 #endif
+#define RING_HAS_UNCONSUMED_REQUESTS(_r) \
+    _RING_HAS_UNCONSUMED_REQUESTS(_r, sring)
+#define BB_RING_HAS_UNCONSUMED_REQUESTS(_r) \
+    _RING_HAS_UNCONSUMED_REQUESTS(_r, BB_SRING(sring))

 /* Direct access to individual ring elements, by index. */
 #define RING_GET_REQUEST(_r, _idx) \
     (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].req))
+#ifndef BACKEND_RING_BIMODAL
+#define BB_RING_GET_REQUEST(_req, _r, _idx) do { \
+    (_req) = RING_GET_REQUEST(_r, _idx); \
+} while (0)
+#else
+#define BB_RING_GET_REQUEST(_req, _r, _idx) do { \
+    if (ring_native(_r)) \
+        (_req).nat = &(_r)->sring.nat->ring[(_idx) & (RING_SIZE(_r) - 1)].req; \
+    else \
+        (_req).alt = &(_r)->sring.alt->ring[(_idx) & (RING_SIZE(_r) - 1)].req; \
+} while (0)
+#endif
+
+#define RING_COPY_REQUEST(_req, _r, _idx) \
+    memcpy(&(_req), RING_GET_REQUEST(_r, _idx), sizeof(_req))
+#ifndef BACKEND_RING_BIMODAL
+#define BB_RING_COPY_REQUEST RING_COPY_REQUEST
+#else
+#define BB_RING_COPY_REQUEST(_req, _r, _idx) (ring_native(_r) ? \
+    memcpy(&(_req), &(_r)->sring.nat->ring[(_idx) & (RING_SIZE(_r) - 1)].req, sizeof((_req).nat)) : \
+    memcpy(&(_req), &(_r)->sring.alt->ring[(_idx) & (RING_SIZE(_r) - 1)].req, sizeof((_req).alt)))
+#endif

 #define RING_GET_RESPONSE(_r, _idx) \
     (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].rsp))
+#ifndef BACKEND_RING_BIMODAL
+#define BB_RING_GET_RESPONSE(_rsp, _r, _idx) do { \
+    (_rsp) = RING_GET_RESPONSE(_r, _idx); \
+} while (0)
+#else
+#define BB_RING_GET_RESPONSE(_rsp, _r, _idx) do { \
+    if (ring_native(_r)) \
+        (_rsp).nat = &(_r)->sring.nat->ring[(_idx) & (RING_SIZE(_r) - 1)].rsp; \
+    else \
+        (_rsp).alt = &(_r)->sring.alt->ring[(_idx) & (RING_SIZE(_r) - 1)].rsp; \
+} while (0)
+#endif

 /* Loop termination condition: Would the specified index overflow the ring? */
 #define RING_REQUEST_CONS_OVERFLOW(_r, _cons) \
     (((_cons) - (_r)->rsp_prod_pvt) >= RING_SIZE(_r))

-#define RING_PUSH_REQUESTS(_r) do { \
+#define _RING_PUSH_REQUESTS(_r, sring) do { \
     wmb(); /* back sees requests /before/ updated producer index */ \
     (_r)->sring->req_prod = (_r)->req_prod_pvt; \
 } while (0)
+#define RING_PUSH_REQUESTS(_r) _RING_PUSH_REQUESTS(_r, sring)
+#define BB_RING_PUSH_REQUESTS(_r) _RING_PUSH_REQUESTS(_r, BB_SRING(sring))

-#define RING_PUSH_RESPONSES(_r) do { \
+#define _RING_PUSH_RESPONSES(_r, sring) do { \
     wmb(); /* front sees responses /before/ updated producer index */ \
     (_r)->sring->rsp_prod = (_r)->rsp_prod_pvt; \
 } while (0)
+#define RING_PUSH_RESPONSES(_r) _RING_PUSH_RESPONSES(_r, sring)
+#define BB_RING_PUSH_RESPONSES(_r) _RING_PUSH_RESPONSES(_r, BB_SRING(sring))

 /*
  * Notification hold-off (req_event and rsp_event):
@@ -250,7 +380,7 @@ typedef struct __name##_back_ring __name
  * field appropriately.
  */

-#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do { \
+#define _RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, sring, _notify) do { \
     RING_IDX __old = (_r)->sring->req_prod; \
     RING_IDX __new = (_r)->req_prod_pvt; \
     wmb(); /* back sees requests /before/ updated producer index */ \
@@ -259,8 +389,12 @@ typedef struct __name##_back_ring __name
     (_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) < \
                  (RING_IDX)(__new - __old)); \
 } while (0)
+#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) \
+    _RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, sring, _notify)
+#define BB_RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) \
+    _RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, BB_SRING(sring), _notify)

-#define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) do { \
+#define _RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, sring, _notify) do { \
     RING_IDX __old = (_r)->sring->rsp_prod; \
     RING_IDX __new = (_r)->rsp_prod_pvt; \
     wmb(); /* front sees responses /before/ updated producer index */ \
@@ -269,22 +403,34 @@ typedef struct __name##_back_ring __name
     (_notify) = ((RING_IDX)(__new - (_r)->sring->rsp_event) < \
                  (RING_IDX)(__new - __old)); \
 } while (0)
+#define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) \
+    _RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, sring, _notify)
+#define BB_RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) \
+    _RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, BB_SRING(sring), _notify)

-#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) do { \
-    (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \
+#define _RING_FINAL_CHECK_FOR_REQUESTS(_r, bb, sring, _work_to_do) do { \
+    (_work_to_do) = bb##RING_HAS_UNCONSUMED_REQUESTS(_r); \
     if (_work_to_do) break; \
     (_r)->sring->req_event = (_r)->req_cons + 1; \
     mb(); \
-    (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \
+    (_work_to_do) = bb##RING_HAS_UNCONSUMED_REQUESTS(_r); \
 } while (0)
+#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) \
+    _RING_FINAL_CHECK_FOR_REQUESTS(_r, , sring, _work_to_do)
+#define BB_RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) \
+    _RING_FINAL_CHECK_FOR_REQUESTS(_r, BB_, BB_SRING(sring), _work_to_do)

-#define RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) do { \
-    (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \
+#define _RING_FINAL_CHECK_FOR_RESPONSES(_r, bb, sring, _work_to_do) do {\
+    (_work_to_do) = bb##RING_HAS_UNCONSUMED_RESPONSES(_r); \
     if (_work_to_do) break; \
     (_r)->sring->rsp_event = (_r)->rsp_cons + 1; \
     mb(); \
-    (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \
+    (_work_to_do) = bb##RING_HAS_UNCONSUMED_RESPONSES(_r); \
 } while (0)
+#define RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) \
+    _RING_FINAL_CHECK_FOR_RESPONSES(_r, , sring, _work_to_do)
+#define BB_RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) \
+    _RING_FINAL_CHECK_FOR_RESPONSES(_r, BB_, BB_SRING(sring), _work_to_do)

 #endif /* __XEN_PUBLIC_IO_RING_H__ */

[-- Attachment #4: xen-bimodal-blktap.patch --]
[-- Type: text/plain, Size: 12015 bytes --]
Enhance blktap to be capable of dealing with bi-modal operation (frontend
running with a
different word size than backend).

Index: sle10-sp1-2006-12-05/drivers/xen/blktap/blktap.c
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/blktap/blktap.c	2006-12-07 16:43:47.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/blktap/blktap.c	2006-12-07 16:45:05.000000000 +0100
@@ -154,7 +154,7 @@ module_param(debug_lvl, int, 0644);
  */
 typedef struct {
 	blkif_t *blkif;
-	unsigned long id;
+	uint64_t id;
 	unsigned short mem_idx;
 	int nr_pages;
 	atomic_t pendcnt;
@@ -757,9 +757,9 @@ void blktap_kick_user(int idx)

 static int do_block_io_op(blkif_t *blkif);
 static void dispatch_rw_block_io(blkif_t *blkif,
-				 blkif_request_t *req,
+				 blkif_request_u *req,
 				 pending_req_t *pending_req);
-static void make_response(blkif_t *blkif, unsigned long id,
+static void make_response(blkif_t *blkif, uint64_t id,
 			  unsigned short op, int st);

 /******************************************************************
@@ -1085,18 +1085,26 @@ irqreturn_t tap_blkif_be_int(int irq, vo
 /******************************************************************
  * DOWNWARD CALLS -- These interface with the block-device layer proper.
  */
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define req(op) (blkif_native(blkif) ? (req.nat.op) : (req.alt.op))
+#define preq(op) (blkif_native(blkif) ? (req->nat.op) : (req->alt.op))
+#else
+#define req(op) (req.op)
+#define preq(op) (req->op)
+#endif
+
 static int print_dbug = 1;
 static int do_block_io_op(blkif_t *blkif)
 {
 	blkif_back_ring_t *blk_ring = &blkif->blk_ring;
-	blkif_request_t req;
+	blkif_request_u req;
 	pending_req_t *pending_req;
 	RING_IDX rc, rp;
 	int more_to_do = 0;
 	tap_blkif_t *info;

 	rc = blk_ring->req_cons;
-	rp = blk_ring->sring->req_prod;
+	rp = blk_ring->BB_SRING(sring)->req_prod;
 	rmb(); /* Ensure we see queued requests up to 'rp'. */

 	/*Check blkif has corresponding UE ring*/
@@ -1142,10 +1150,10 @@ static int do_block_io_op(blkif_t *blkif
 			break;
 		}

-		memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req));
+		BB_RING_COPY_REQUEST(req, blk_ring, rc);
 		blk_ring->req_cons = ++rc; /* before make_response() */

-		switch (req.operation) {
+		switch (req(operation)) {
 		case BLKIF_OP_READ:
 			blkif->st_rd_req++;
 			dispatch_rw_block_io(blkif, &req, pending_req);
@@ -1158,8 +1166,8 @@ static int do_block_io_op(blkif_t *blkif

 		default:
 			WPRINTK("unknown operation [%d]\n",
-				req.operation);
-			make_response(blkif, req.id, req.operation,
+				req(operation));
+			make_response(blkif, req(id), req(operation),
 				      BLKIF_RSP_ERROR);
 			free_req(pending_req);
 			break;
@@ -1172,11 +1180,11 @@ static int do_block_io_op(blkif_t *blkif
 }

 static void dispatch_rw_block_io(blkif_t *blkif,
-				 blkif_request_t *req,
+				 blkif_request_u *req,
 				 pending_req_t *pending_req)
 {
 	extern void ll_rw_block(int rw, int nr, struct buffer_head * bhs[]);
-	int op, operation = (req->operation == BLKIF_OP_WRITE) ? WRITE : READ;
+	int op, operation = (preq(operation) == BLKIF_OP_WRITE) ? WRITE : READ;
 	struct gnttab_map_grant_ref map[BLKIF_MAX_SEGMENTS_PER_REQUEST*2];
 	unsigned int nseg;
 	int ret, i;
@@ -1202,7 +1210,7 @@ static void dispatch_rw_block_io(blkif_t
 	}

 	/* Check that number of segments is sane. */
-	nseg = req->nr_segments;
+	nseg = preq(nr_segments);
 	if ( unlikely(nseg == 0) ||
 	    unlikely(nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST) ) {
 		WPRINTK("Bad number of segments in request (%d)\n", nseg);
@@ -1224,7 +1232,7 @@ static void dispatch_rw_block_io(blkif_t
 	}

 	pending_req->blkif = blkif;
-	pending_req->id = req->id;
+	pending_req->id = preq(id);
 	pending_req->operation = operation;
 	pending_req->status = BLKIF_RSP_OKAY;
 	pending_req->nr_pages = nseg;
@@ -1238,13 +1246,13 @@ static void dispatch_rw_block_io(blkif_t
 		uvaddr = MMAP_VADDR(info->user_vstart, usr_idx, i);
 		kvaddr = idx_to_kaddr(mmap_idx, pending_idx, i);

-		sector = req->sector_number + ((PAGE_SIZE / 512) * i);
+		sector = preq(sector_number) + ((PAGE_SIZE / 512) * i);
 		if( (blkif->sectors > 0) && (sector >= blkif->sectors) ) {
 			WPRINTK("BLKTAP: Sector request greater"
 			       "than size\n");
 			WPRINTK("BLKTAP: %s request sector"
 			       "[%llu,%llu], Total [%llu]\n",
-			       (req->operation ==
+			       (preq(operation) ==
 				BLKIF_OP_WRITE ? "WRITE" : "READ"),
 				(long long unsigned) sector,
 				(long long unsigned) sector>>9,
@@ -1255,7 +1263,7 @@ static void dispatch_rw_block_io(blkif_t
 		if (operation == WRITE)
 			flags |= GNTMAP_readonly;
 		gnttab_set_map_op(&map[op], kvaddr, flags,
-				  req->seg[i].gref, blkif->domid);
+				  preq(seg)[i].gref, blkif->domid);
 		op++;

 		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
@@ -1272,7 +1280,7 @@ static void dispatch_rw_block_io(blkif_t
 			if (operation == WRITE)
 				flags |= GNTMAP_readonly;
 			gnttab_set_map_op(&map[op], ptep, flags,
-					  req->seg[i].gref, blkif->domid);
+					  preq(seg)[i].gref, blkif->domid);
 			op++;
 		}
 	}
@@ -1356,7 +1364,7 @@ static void dispatch_rw_block_io(blkif_t
 	if (xen_feature(XENFEAT_auto_translated_physmap))
 		down_write(&info->vma->vm_mm->mmap_sem);
 	/* Mark mapped pages as reserved: */
-	for (i = 0; i < req->nr_segments; i++) {
+	for (i = 0; i < preq(nr_segments); i++) {
 		unsigned long kvaddr;
 		struct page *pg;

@@ -1383,7 +1391,21 @@ static void dispatch_rw_block_io(blkif_t
 	/* Finally, write the request message to the user ring. */
 	target = RING_GET_REQUEST(&info->ufe_ring,
 				  info->ufe_ring.req_prod_pvt);
-	memcpy(target, req, sizeof(*req));
+#ifndef CONFIG_XEN_BIMODAL_BACKENDS
+	memcpy(target, &req, sizeof(req));
+#else
+	if (blkif_native(blkif))
+		memcpy(target, &req->nat, sizeof(req->nat));
+	else {
+		target->operation = req->alt.operation;
+		target->nr_segments = req->alt.nr_segments;
+		target->handle = req->alt.handle;
+		target->sector_number = req->alt.sector_number;
+		BUILD_BUG_ON((typeof(target->seg)*)0 != (typeof(req->alt.seg)*)0);
+		memcpy(target->seg, req->alt.seg,
+		       target->nr_segments * sizeof(*target->seg));
+	}
+#endif
 	target->id = usr_idx;
 	wmb(); /* blktap_poll() reads req_prod_pvt asynchronously */
 	info->ufe_ring.req_prod_pvt++;
@@ -1393,10 +1415,11 @@ static void dispatch_rw_block_io(blkif_t
 	WPRINTK("Reached Fail_flush\n");
 	fast_flush_area(pending_req, pending_idx, usr_idx, blkif->dev_num);
  fail_response:
-	make_response(blkif, req->id, req->operation, BLKIF_RSP_ERROR);
+	make_response(blkif, preq(id), preq(operation), BLKIF_RSP_ERROR);
 	free_req(pending_req);
 }

+#undef req


 /******************************************************************
@@ -1404,10 +1427,15 @@ static void dispatch_rw_block_io(blkif_t
  */


-static void make_response(blkif_t *blkif, unsigned long id,
+static void make_response(blkif_t *blkif, uint64_t id,
 			  unsigned short op, int st)
 {
-	blkif_response_t *resp;
+	blkif_response_u resp;
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define resp(op) (blkif_native(blkif) ? (resp.nat->op) : (resp.alt->op))
+#else
+#define resp(op) (resp->op)
+#endif
 	unsigned long flags;
 	blkif_back_ring_t *blk_ring = &blkif->blk_ring;
 	int more_to_do = 0;
@@ -1415,12 +1443,12 @@ static void make_response(blkif_t *blkif

 	spin_lock_irqsave(&blkif->blk_ring_lock, flags);
 	/* Place on the response ring for the relevant domain. */
-	resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt);
-	resp->id = id;
-	resp->operation = op;
-	resp->status = st;
+	BB_RING_GET_RESPONSE(resp, blk_ring, blk_ring->rsp_prod_pvt);
+	resp(id = id);
+	resp(operation = op);
+	resp(status = st);
 	blk_ring->rsp_prod_pvt++;
-	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify);
+	BB_RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify);

 	if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) {
 		/*
@@ -1428,8 +1456,8 @@ static void make_response(blkif_t *blkif
 		 * notifications if requests are already in flight (lower
 		 * overheads and promotes batching).
 		 */
-		RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do);
-	} else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) {
+		BB_RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do);
+	} else if (BB_RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) {
 		more_to_do = 1;
 	}

@@ -1438,6 +1466,7 @@ static void make_response(blkif_t *blkif
 		blkif_notify_work(blkif);
 	if (notify)
 		notify_remote_via_irq(blkif->irq);
+#undef resp
 }

 static int __init blkif_init(void)
Index: sle10-sp1-2006-12-05/drivers/xen/blktap/common.h
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/blktap/common.h	2006-12-07 16:43:47.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/blktap/common.h	2006-12-07 16:45:05.000000000 +0100
@@ -39,8 +39,10 @@
 #include <asm/pgalloc.h>
 #include <xen/evtchn.h>
 #include <asm/hypervisor.h>
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define BLKIF_BIMODAL
+#endif
 #include <xen/interface/io/blkif.h>
-#include <xen/interface/io/ring.h>
 #include <xen/gnttab.h>
 #include <xen/driver_util.h>

@@ -54,6 +56,9 @@ struct backend_info;
 typedef struct blkif_st {
 	/* Unique identifier for this interface. */
 	domid_t domid;
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+	unsigned char native;
+#endif
 	unsigned int handle;
 	/* Physical parameters of the comms window. */
 	unsigned int evtchn;
@@ -87,6 +92,15 @@ typedef struct blkif_st {
 	uint64_t sectors;
 } blkif_t;

+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define blkif_native(blkif) ((blkif)->native)
+static inline int ring_native(const blkif_back_ring_t *ring) {
+	return blkif_native(container_of(ring, blkif_t, blk_ring));
+}
+#else
+#define blkif_native(blkif) 1
+#endif
+
 blkif_t *tap_alloc_blkif(domid_t domid);
 void tap_blkif_free(blkif_t *blkif);
 int tap_blkif_map(blkif_t *blkif, unsigned long shared_page,
Index: sle10-sp1-2006-12-05/drivers/xen/blktap/interface.c
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/blktap/interface.c	2006-12-07 16:43:47.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/blktap/interface.c	2006-12-07 16:45:05.000000000 +0100
@@ -46,6 +46,9 @@ blkif_t *tap_alloc_blkif(domid_t domid)

 	memset(blkif, 0, sizeof(*blkif));
 	blkif->domid = domid;
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+	blkif->native = 1; /* XXX */
+#endif
 	spin_lock_init(&blkif->blk_ring_lock);
 	atomic_set(&blkif->refcnt, 1);
 	init_waitqueue_head(&blkif->wq);
@@ -96,7 +99,7 @@ static void unmap_frontend_page(blkif_t
 int tap_blkif_map(blkif_t *blkif, unsigned long shared_page,
 		  unsigned int evtchn)
 {
-	blkif_sring_t *sring;
+	blkif_sring_u sring;
 	int err;
 	struct evtchn_bind_interdomain bind_interdomain;

@@ -126,7 +129,7 @@ int tap_blkif_map(blkif_t *blkif, unsign

 	blkif->evtchn = bind_interdomain.local_port;

-	sring = (blkif_sring_t *)blkif->blk_ring_area->addr;
+	BB_SRING(sring) = (void *)blkif->blk_ring_area->addr;
 	BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE);

 	blkif->irq = bind_evtchn_to_irqhandler(
@@ -141,10 +144,10 @@ void tap_blkif_unmap(blkif_t *blkif)
 		unbind_from_irqhandler(blkif->irq, blkif);
 		blkif->irq = 0;
 	}
-	if (blkif->blk_ring.sring) {
+	if (blkif->blk_ring.BB_SRING(sring)) {
 		unmap_frontend_page(blkif);
 		free_vm_area(blkif->blk_ring_area);
-		blkif->blk_ring.sring = NULL;
+		blkif->blk_ring.BB_SRING(sring) = NULL;
 	}
 }

[-- 
Attachment #5: xen-bimodal-blkback.patch --]
[-- Type: text/plain, Size: 10602 bytes --]

Enhance blkback to be capable of dealing with bi-modal operation (frontend
running with a different word size than the backend).

Index: sle10-sp1-2006-12-05/drivers/xen/blkback/blkback.c
===================================================================
--- sle10-sp1-2006-12-05.orig/drivers/xen/blkback/blkback.c 2006-12-07 16:43:52.000000000 +0100
+++ sle10-sp1-2006-12-05/drivers/xen/blkback/blkback.c 2006-12-07 16:45:02.000000000 +0100
@@ -70,7 +70,7 @@ module_param(debug_lvl, int, 0644);
  */
 typedef struct {
 	blkif_t *blkif;
-	unsigned long id;
+	uint64_t id;
 	int nr_pages;
 	atomic_t pendcnt;
 	unsigned short operation;
@@ -105,9 +105,9 @@ static inline unsigned long vaddr(pendin

 static int do_block_io_op(blkif_t *blkif);
 static void dispatch_rw_block_io(blkif_t *blkif,
-				 blkif_request_t *req,
+				 blkif_request_u *req,
 				 pending_req_t *pending_req);
-static void make_response(blkif_t *blkif, unsigned long id,
+static void make_response(blkif_t *blkif, uint64_t id,
 			  unsigned short op, int st);

 /******************************************************************
@@ -295,17 +295,24 @@ irqreturn_t blkif_be_int(int irq, void *
 /******************************************************************
  * DOWNWARD CALLS -- These interface with the block-device layer proper.
  */
+#ifdef CONFIG_XEN_BIMODAL_BACKENDS
+#define req(op) (blkif_native(blkif) ? (req.nat.op) : (req.alt.op))
+#define preq(op) (blkif_native(blkif) ? (req->nat.op) : (req->alt.op))
+#else
+#define req(op) (req.op)
+#define preq(op) (req->op)
+#endif

 static int do_block_io_op(blkif_t *blkif)
 {
 	blkif_back_ring_t *blk_ring = &blkif->blk_ring;
-	blkif_request_t req;
+	blkif_request_u req;
 	pending_req_t *pending_req;
 	RING_IDX rc, rp;
 	int more_to_do = 0;

 	rc = blk_ring->req_cons;
-	rp = blk_ring->sring->req_prod;
+	rp = blk_ring->BB_SRING(sring)->req_prod;
 	rmb(); /* Ensure we see queued requests up to 'rp'.
*/ while ((rc != rp) && !RING_REQUEST_CONS_OVERFLOW(blk_ring, rc)) { @@ -317,10 +324,10 @@ static int do_block_io_op(blkif_t *blkif break; } - memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req)); + BB_RING_COPY_REQUEST(req, blk_ring, rc); blk_ring->req_cons = ++rc; /* before make_response() */ - switch (req.operation) { + switch (req(operation)) { case BLKIF_OP_READ: blkif->st_rd_req++; dispatch_rw_block_io(blkif, &req, pending_req); @@ -334,8 +341,8 @@ static int do_block_io_op(blkif_t *blkif break; default: DPRINTK("error: unknown block io operation [%d]\n", - req.operation); - make_response(blkif, req.id, req.operation, + req(operation)); + make_response(blkif, req(id), req(operation), BLKIF_RSP_ERROR); free_req(pending_req); break; @@ -345,7 +352,7 @@ static int do_block_io_op(blkif_t *blkif } static void dispatch_rw_block_io(blkif_t *blkif, - blkif_request_t *req, + blkif_request_u *req, pending_req_t *pending_req) { extern void ll_rw_block(int rw, int nr, struct buffer_head * bhs[]); @@ -359,7 +366,7 @@ static void dispatch_rw_block_io(blkif_t int ret, i, nbio = 0; int operation; - switch (req->operation) { + switch (preq(operation)) { case BLKIF_OP_READ: operation = READ; break; @@ -375,31 +382,31 @@ static void dispatch_rw_block_io(blkif_t } /* Check that number of segments is sane. 
*/ - nseg = req->nr_segments; + nseg = preq(nr_segments); if (unlikely(nseg == 0) || unlikely(nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) { DPRINTK("Bad number of segments in request (%d)\n", nseg); goto fail_response; } - preq.dev = req->handle; - preq.sector_number = req->sector_number; + preq.dev = preq(handle); + preq.sector_number = preq(sector_number); preq.nr_sects = 0; pending_req->blkif = blkif; - pending_req->id = req->id; - pending_req->operation = req->operation; + pending_req->id = preq(id); + pending_req->operation = preq(operation); pending_req->status = BLKIF_RSP_OKAY; pending_req->nr_pages = nseg; for (i = 0; i < nseg; i++) { uint32_t flags; - seg[i].nsec = req->seg[i].last_sect - - req->seg[i].first_sect + 1; + seg[i].nsec = preq(seg)[i].last_sect - + preq(seg)[i].first_sect + 1; - if ((req->seg[i].last_sect >= (PAGE_SIZE >> 9)) || - (req->seg[i].last_sect < req->seg[i].first_sect)) + if ((preq(seg)[i].last_sect >= (PAGE_SIZE >> 9)) || + (preq(seg)[i].last_sect < preq(seg)[i].first_sect)) goto fail_response; preq.nr_sects += seg[i].nsec; @@ -407,7 +414,7 @@ static void dispatch_rw_block_io(blkif_t if (operation != READ) flags |= GNTMAP_readonly; gnttab_set_map_op(&map[i], vaddr(pending_req, i), flags, - req->seg[i].gref, blkif->domid); + preq(seg)[i].gref, blkif->domid); } ret = HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, map, nseg); @@ -429,7 +436,7 @@ static void dispatch_rw_block_io(blkif_t pending_req, i)) >> PAGE_SHIFT, FOREIGN_FRAME(map[i].dev_bus_addr >> PAGE_SHIFT)); seg[i].buf = map[i].dev_bus_addr | - (req->seg[i].first_sect << 9); + (preq(seg)[i].first_sect << 9); } if (ret) @@ -484,10 +491,11 @@ static void dispatch_rw_block_io(blkif_t fail_flush: fast_flush_area(pending_req); fail_response: - make_response(blkif, req->id, req->operation, BLKIF_RSP_ERROR); + make_response(blkif, preq(id), preq(operation), BLKIF_RSP_ERROR); free_req(pending_req); } +#undef req /****************************************************************** @@ 
-495,10 +503,15 @@ static void dispatch_rw_block_io(blkif_t */ -static void make_response(blkif_t *blkif, unsigned long id, +static void make_response(blkif_t *blkif, uint64_t id, unsigned short op, int st) { - blkif_response_t *resp; + blkif_response_u resp; +#ifdef CONFIG_XEN_BIMODAL_BACKENDS +#define resp(op) (blkif_native(blkif) ? (resp.nat->op) : (resp.alt->op)) +#else +#define resp(op) (resp->op) +#endif unsigned long flags; blkif_back_ring_t *blk_ring = &blkif->blk_ring; int more_to_do = 0; @@ -507,12 +520,12 @@ static void make_response(blkif_t *blkif spin_lock_irqsave(&blkif->blk_ring_lock, flags); /* Place on the response ring for the relevant domain. */ - resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt); - resp->id = id; - resp->operation = op; - resp->status = st; + BB_RING_GET_RESPONSE(resp, blk_ring, blk_ring->rsp_prod_pvt); + resp(id = id); + resp(operation = op); + resp(status = st); blk_ring->rsp_prod_pvt++; - RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify); + BB_RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify); if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) { /* @@ -520,9 +533,9 @@ static void make_response(blkif_t *blkif * notifications if requests are already in flight (lower * overheads and promotes batching). 
*/ - RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do); + BB_RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do); - } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) { + } else if (BB_RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) { more_to_do = 1; } @@ -532,6 +545,7 @@ static void make_response(blkif_t *blkif blkif_notify_work(blkif); if (notify) notify_remote_via_irq(blkif->irq); +#undef resp } static int __init blkif_init(void) Index: sle10-sp1-2006-12-05/drivers/xen/blkback/common.h =================================================================== --- sle10-sp1-2006-12-05.orig/drivers/xen/blkback/common.h 2006-12-07 16:43:52.000000000 +0100 +++ sle10-sp1-2006-12-05/drivers/xen/blkback/common.h 2006-12-07 16:45:02.000000000 +0100 @@ -40,8 +40,10 @@ #include <asm/pgalloc.h> #include <xen/evtchn.h> #include <asm/hypervisor.h> +#ifdef CONFIG_XEN_BIMODAL_BACKENDS +#define BLKIF_BIMODAL +#endif #include <xen/interface/io/blkif.h> -#include <xen/interface/io/ring.h> #include <xen/gnttab.h> #include <xen/driver_util.h> #include <xen/xenbus.h> @@ -63,6 +65,9 @@ struct backend_info; typedef struct blkif_st { /* Unique identifier for this interface. */ domid_t domid; +#ifdef CONFIG_XEN_BIMODAL_BACKENDS + unsigned char native; +#endif unsigned int handle; /* Physical parameters of the comms window. 
*/ unsigned int evtchn; @@ -96,6 +101,15 @@ typedef struct blkif_st { grant_ref_t shmem_ref; } blkif_t; +#ifdef CONFIG_XEN_BIMODAL_BACKENDS +#define blkif_native(blkif) ((blkif)->native) +static inline int ring_native(const blkif_back_ring_t *ring) { + return blkif_native(container_of(ring, blkif_t, blk_ring)); +} +#else +#define blkif_native(blkif) 1 +#endif + blkif_t *blkif_alloc(domid_t domid); void blkif_disconnect(blkif_t *blkif); void blkif_free(blkif_t *blkif); Index: sle10-sp1-2006-12-05/drivers/xen/blkback/interface.c =================================================================== --- sle10-sp1-2006-12-05.orig/drivers/xen/blkback/interface.c 2006-12-07 16:43:52.000000000 +0100 +++ sle10-sp1-2006-12-05/drivers/xen/blkback/interface.c 2006-12-07 16:45:02.000000000 +0100 @@ -46,6 +46,9 @@ blkif_t *blkif_alloc(domid_t domid) memset(blkif, 0, sizeof(*blkif)); blkif->domid = domid; +#ifdef CONFIG_XEN_BIMODAL_BACKENDS + blkif->native = 1; /* XXX */ +#endif spin_lock_init(&blkif->blk_ring_lock); atomic_set(&blkif->refcnt, 1); init_waitqueue_head(&blkif->wq); @@ -95,7 +98,7 @@ static void unmap_frontend_page(blkif_t int blkif_map(blkif_t *blkif, unsigned long shared_page, unsigned int evtchn) { - blkif_sring_t *sring; + blkif_sring_u sring; int err; struct evtchn_bind_interdomain bind_interdomain; @@ -125,7 +128,7 @@ int blkif_map(blkif_t *blkif, unsigned l blkif->evtchn = bind_interdomain.local_port; - sring = (blkif_sring_t *)blkif->blk_ring_area->addr; + BB_SRING(sring) = (void *)blkif->blk_ring_area->addr; BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE); blkif->irq = bind_evtchn_to_irqhandler( @@ -150,10 +153,10 @@ void blkif_disconnect(blkif_t *blkif) blkif->irq = 0; } - if (blkif->blk_ring.sring) { + if (blkif->blk_ring.BB_SRING(sring)) { unmap_frontend_page(blkif); free_vm_area(blkif->blk_ring_area); - blkif->blk_ring.sring = NULL; + blkif->blk_ring.BB_SRING(sring) = NULL; } } [-- Attachment #6: Type: text/plain, Size: 138 bytes --] 
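The accessor macros in the patch above (req(), preq(), resp()) all follow one pattern: a single slot is viewed through either the native or the alternate layout, and a ternary picks the view at run time. A minimal sketch of that pattern follows. The struct layouts, the RESP() macro and fill_response() are hypothetical stand-ins written for illustration; they are not the definitions from the patch, whose blkif_response_u is not shown in this part of the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical response layouts; the real ones live in io/blkif.h. */
struct resp_nat {               /* layout as the backend compiles it */
    uint64_t id;
    uint8_t  operation;
    int16_t  status;
};

#pragma pack(push, 4)
struct resp_alt {               /* same fields, 4-byte packing */
    uint64_t id;
    uint8_t  operation;
    int16_t  status;
};
#pragma pack(pop)

/* One ring slot pointer, viewed through either layout. */
union resp_ptr {
    struct resp_nat *nat;
    struct resp_alt *alt;
};

/* Apply "expr" (a member access or assignment) to the matching view. */
#define RESP(native, r, expr) \
    ((native) ? ((r).nat->expr) : ((r).alt->expr))

static void fill_response(int native, union resp_ptr r,
                          uint64_t id, uint8_t op, int16_t st)
{
    assert(r.nat != NULL);
    /* Each line expands to two assignments, only one of which runs. */
    RESP(native, r, id = id);
    RESP(native, r, operation = op);
    RESP(native, r, status = st);
}
```

Passing an assignment such as `id = id` as the macro argument is what lets one macro serve for both reads and writes, at the price of code that is harder to follow at the call site.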
* Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-18 17:09 ` Jan Beulich
2006-12-18 17:58 ` Keir Fraser
@ 2006-12-19 8:20 ` Gerd Hoffmann
1 sibling, 0 replies; 12+ messages in thread
From: Gerd Hoffmann @ 2006-12-19 8:20 UTC (permalink / raw)
To: Jan Beulich; +Cc: Xen devel list

Jan Beulich wrote:
> I understand you favor this over the bi-modal approach I took? Any specific
> advantages?

Jan

IMHO the code is more readable, and I'd rate its chance of being accepted
in lkml review higher. I don't like the approach of hiding a lot of the
logic in preprocessor magic.

It also leaves the door open to adding more protocols. Not that I see a need
right now. But maybe the lkml folks will ask us to consolidate on some struct
layout which doesn't look different on different architectures (i.e. sort
struct members by size); then we may have to support a third protocol ...

cheers,
Gerd

--
Gerd Hoffmann <kraxel@suse.de>
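To illustrate the "sort struct members by size" idea from the message above: if the 64-bit members come first and the tail is padded explicitly, the layout no longer depends on whether the ABI aligns uint64_t to 4 or 8 bytes. The struct below is a hypothetical example made up for this sketch (including the reserved member); it is not a proposed blkif layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request header, largest members first.
 * Field names mirror blkif_request, the ordering does not. */
struct sorted_request_hdr {
    uint64_t id;            /* offset  0 */
    uint64_t sector_number; /* offset  8 */
    uint16_t handle;        /* offset 16 */
    uint8_t  operation;     /* offset 18 */
    uint8_t  nr_segments;   /* offset 19 */
    uint32_t reserved;      /* offset 20, explicit tail padding */
};

/* Verify the offsets that are expected to hold on i386 and x86_64 alike. */
static void check_sorted_layout(void)
{
    assert(offsetof(struct sorted_request_hdr, id) == 0);
    assert(offsetof(struct sorted_request_hdr, sector_number) == 8);
    assert(offsetof(struct sorted_request_hdr, handle) == 16);
    assert(offsetof(struct sorted_request_hdr, operation) == 18);
    assert(offsetof(struct sorted_request_hdr, nr_segments) == 19);
    assert(sizeof(struct sorted_request_hdr) == 24);
}
```

Without the explicit reserved member the struct would be 20 bytes where uint64_t has 4-byte alignment and 24 bytes where it has 8-byte alignment, so sorting alone is not quite enough.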
* Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-18 16:39 [patch/rfc] multiprotocol blkback drivers (32-on-64) Gerd Hoffmann
2006-12-18 17:09 ` Jan Beulich
@ 2006-12-19 7:55 ` Jan Beulich
2006-12-19 8:35 ` Gerd Hoffmann
2006-12-19 13:32 ` Gerd Hoffmann
1 sibling, 2 replies; 12+ messages in thread
From: Jan Beulich @ 2006-12-19 7:55 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Xen devel list

There are a couple of things I'd like to see changed if this is what we want to
go with:
- "if (protocol == 1) {} else {}" should be switches, failing (or even BUGing)
  for all protocol versions other than 1 and 2
- assuming the abstraction is meant to scale to future protocol versions, adding
  many such explicit version handling code paths seems undesirable, as does
  adding extra version-specific variables or (non-union) structure members
- treating every possible error value returned from xenbus_gather as indicating
  an old frontend seems odd at least - one specific error value should be
  recognized here
- unconditionally using #pragma pack(), __attribute__(()), and __i386__ or
  __x86_64__ in public Xen headers is, in my opinion, a no-go (these headers
  should all be suitable for building e.g. Windows drivers, too - I know this
  isn't generally the case at present, but I don't think anything else can be
  the goal, and hence the situation shouldn't be made worse)

Jan

>>> Gerd Hoffmann <kraxel@suse.de> 18.12.06 17:39 >>>
Hi,

This is a patch for the block interface, frontend drivers, backend
drivers and tools to support multiple ring protocols. Right now there
are just two: the 32bit and the 64bit one. If needed it can be extended.

Interface changes (io/blkif.h)
* Have both request structs there, with "v1" and "v2" added to the
name. The old name is aliased to the native protocol of the
architecture.
* Add helper functions to convert v1/v2 requests to native.

Frontend changes:
* Create a new node "protocol", add the protocol number it speaks
there.

Backend changes:
* Look at the "protocol" number of the frontend and switch ring
handling accordingly. If the protocol node isn't present it assumes
native protocol.
* As the request struct is copied anyway before being processed (for
security reasons) it is converted to native at that point so most
backend code doesn't need to know what the frontend speaks.
* In case of blktap this is completely transparent to userspace, the
kernel/userspace ring is always native no matter what the frontend
speaks.

Tools changes:
* Add one more option to the disk configuration, so one can specify the
protocol the frontend speaks in the config file. This is needed for
old frontends which don't advertise the protocol they are speaking
themselves.
I'm not that happy with this approach, but it works for now and I'm
kind of lost in the stack of python classes doing domain and device
handling ...

Consider the code experimental, not all frontend/backend combinations
are tested.

Comments? Questions? Suggestions?

cheers,
Gerd

PS: Anyone working on blkback/blktap code sharing? While walking
through the code I've noticed quite a lot of it is cut&paste ...

--
Gerd Hoffmann <kraxel@suse.de>
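Two of the review points above (switch on the protocol number with a failing default, and treat only one specific error as "old frontend") can be sketched together. This is a stand-alone illustration, not backend code: select_protocol() and NATIVE_PROTOCOL are hypothetical names, and the sketch assumes that a missing "protocol" node is reported as -ENOENT, which would need checking against the xenbus code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the native protocol number of the backend. */
#define NATIVE_PROTOCOL 2

/*
 * read_err:   result of reading the frontend's "protocol" node
 *             (0 on success, negative errno otherwise)
 * advertised: value read from the node, valid only if read_err == 0
 * protocol:   receives the protocol number to use
 *
 * Returns 0 on success or a negative errno.
 */
static int select_protocol(int read_err, unsigned int advertised,
                           unsigned int *protocol)
{
    assert(protocol != NULL);

    /* Assumption: only a missing node means "old frontend, native ring". */
    if (read_err == -ENOENT) {
        *protocol = NATIVE_PROTOCOL;
        return 0;
    }
    /* Any other read failure is a real error and is passed on. */
    if (read_err)
        return read_err;

    switch (advertised) {
    case 1:
    case 2:
        *protocol = advertised;
        return 0;
    default:
        /* Unknown version: refuse rather than guess. */
        return -EINVAL;
    }
}
```

Returning an error for unknown versions keeps the decision in one place; a caller that has already validated the number could still use BUG() in later switch statements.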
* Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-19 7:55 ` Jan Beulich
@ 2006-12-19 8:35 ` Gerd Hoffmann
2006-12-19 13:32 ` Gerd Hoffmann
1 sibling, 0 replies; 12+ messages in thread
From: Gerd Hoffmann @ 2006-12-19 8:35 UTC (permalink / raw)
To: Jan Beulich; +Cc: Xen devel list

Jan Beulich wrote:
> There are a couple of things I'd like to see changed if this is what we want to
> go with:

Yep, it's a first, at least partly working rfc patch, certainly not final
yet ;)

> - "if (protocol == 1) {} else {}" should be switches, failing (or even BUGing) for
> all protocol versions other than 1 and 2

BUG() should be ok; in theory the code should never reach one of
the switches with an uninitialized protocol.

> - assuming the abstraction is meant to scale to future protocol versions, adding
> many such explicit version handling code paths seems undesirable, as seems
> adding extra version specific variables or (non-union) structure members

Switching to unions is one of the changes I plan to make.

> - using all possible error values returned from xenbus_gather to indicate an old
> frontend seems odd at least - one specific error value should be
> recognized here

Yep, would be a bit cleaner, although I don't see any reason other than
a nonexistent node for it to fail at that point ...

> - unconditionally using #pragma pack(), __attribute__(()), and __i386__ or
> __x86_64__ in public Xen headers is, in my opinion, a no-go (these headers
> should all be suitable for building e.g. Windows drivers, too - I know this isn't
> generally the case at present, but I don't think anything else can be the goal,
> and hence the situation shouldn't be made worse)

Yep, we need some solution for that. The Sun folks will veto at least
the attribute stuff too. Not sure whether pragma pack is a portability issue.

I think we need some compiler.h which provides that kind of stuff, i.e.

#ifdef __GNUC__
#define __align8__ __attribute__((__aligned__(8)))
#endif
#ifdef __suncc__
#define __align8__ something_else
#endif

Also, I think some of the bits I've placed into blkif.h for now should go
to a linux-kernel header instead.

cheers,
Gerd

--
Gerd Hoffmann <kraxel@suse.de>
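The compiler.h idea above could look roughly like the following. This is only a sketch under assumptions: DEMO_ALIGN8 and struct demo_request_hdr are made-up names, the GCC and MSVC spellings are the ones those compilers document, and no claim is made here about the right spelling for the Sun compiler, which is why unknown compilers hit an #error instead of a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compiler.h fragment: one name, per-compiler spelling. */
#if defined(__GNUC__)
# define DEMO_ALIGN8 __attribute__((__aligned__(8)))
#elif defined(_MSC_VER)
# define DEMO_ALIGN8 __declspec(align(8))
#else
# error "add an 8-byte alignment annotation for this compiler"
#endif

/* Header part of a request using the x86_64 layout on any word size. */
struct demo_request_hdr {
    uint8_t  operation;
    uint8_t  nr_segments;
    uint16_t handle;
    DEMO_ALIGN8 uint64_t id;    /* forced to offset 8, also on i386 */
};

/* Check that the annotation had the intended effect. */
static void demo_check_alignment(void)
{
    assert(offsetof(struct demo_request_hdr, id) == 8);
}
```

The annotation is placed in front of the member declaration because that position is accepted by both spellings used here.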
* Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-19 7:55 ` Jan Beulich
2006-12-19 8:35 ` Gerd Hoffmann
@ 2006-12-19 13:32 ` Gerd Hoffmann
2006-12-19 14:20 ` Keir Fraser
1 sibling, 1 reply; 12+ messages in thread
From: Gerd Hoffmann @ 2006-12-19 13:32 UTC (permalink / raw)
To: Jan Beulich; +Cc: Xen devel list
[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]

Next version attached. WARNING: compiles, but untested otherwise.

> - "if (protocol == 1) {} else {}" should be switches, failing (or even BUGing) for
> all protocol versions other than 1 and 2

Fixed.

> - assuming the abstraction is meant to scale to future protocol versions, adding
> many such explicit version handling code paths seems undesirable, as seems
> adding extra version specific variables or (non-union) structure members

Switch to unions done. I also killed some "switch (protocols)" statements
along the way, leaving only two places with different code paths: ring
initialization and copying requests out of the ring.

> - using all possible error values returned from xenbus_gather to indicate an old
> frontend seems odd at least - one specific error value should be
> recognized here

Hmm, I looked at netfront/back to figure out the correct error code. They
act the same way though, i.e. they treat errors on reading "feature-*" nodes
as "doesn't exist" ...

> - unconditionally using #pragma pack(), __attribute__(()), and __i386__ or
> __x86_64__ in public Xen headers is, in my opinion, a no-go (these headers
> should all be suitable for building e.g. Windows drivers, too - I know this isn't
> generally the case at present, but I don't think anything else can be the goal,
> and hence the situation shouldn't be made worse)

Ideas for that one? Ok to create xen/include/public/compiler.h with
that kind of stuff in it?

cheers,
Gerd

--
Gerd Hoffmann <kraxel@suse.de>

[-- Attachment #2: blkif-bimodal.diff --]
[-- Type: text/x-patch, Size: 22958 bytes --]

multiprotocol blkback drivers.

This is a patch for the block interface, frontend drivers, backend
drivers and tools to support multiple ring protocols. Right now there
are just two: the 32bit and the 64bit one. If needed it can be extended.

Interface changes (io/blkif.h)
* Have both request structs there, with "v1" and "v2" added to the
name. The old name is aliased to the native protocol of the
architecture.
* Add helper functions to convert v1/v2 requests to native.

Frontend changes:
* Create a new node "protocol", add the protocol number it speaks
there.

Backend changes:
* Look at the "protocol" number of the frontend and switch ring
handling accordingly. If the protocol node isn't present it assumes
native protocol.
* As the request struct is copied anyway before being processed (for
security reasons) it is converted to native at that point so most
backend code doesn't need to know what the frontend speaks.
* In case of blktap this is completely transparent to userspace, the
kernel/userspace ring is always native no matter what the frontend
speaks.

Tools changes:
* Add one more option to the disk configuration, so one can specify the
protocol the frontend speaks in the config file. This is needed for
old frontends which don't advertise the protocol they are speaking
themselves.
I'm not that happy with this approach, but it works for now and I'm
kind of lost in the stack of python classes doing domain and device
handling ...

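The "convert v1/v2 requests to native" step described above comes down to the two layouts placing id at different offsets (4 with 4-byte packing, 8 with 8-byte alignment), so a plain memcpy only works for the native case. The sketch below shows a field-by-field conversion in one direction. The type names and field widths are assumptions made for this example (a 32-bit grant reference, 16-bit handle, 64-bit sector number), and the clamp on the segment loop is an addition of this sketch that is not in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SEGS 11        /* mirrors BLKIF_MAX_SEGMENTS_PER_REQUEST */

struct demo_seg {
    uint32_t gref;              /* assumed width of grant_ref_t */
    uint8_t  first_sect, last_sect;
};

#pragma pack(push, 4)
struct demo_v1_request {        /* i386 layout: id at offset 4 */
    uint8_t  operation;
    uint8_t  nr_segments;
    uint16_t handle;
    uint64_t id;
    uint64_t sector_number;
    struct demo_seg seg[DEMO_MAX_SEGS];
};
#pragma pack(pop)

struct demo_v2_request {        /* x86_64 layout: id at offset 8 */
    uint8_t  operation;
    uint8_t  nr_segments;
    uint16_t handle;
    uint64_t __attribute__((__aligned__(8))) id;
    uint64_t sector_number;
    struct demo_seg seg[DEMO_MAX_SEGS];
};

/* Copy a v1 request into v2 layout, member by member. */
static void demo_v1_to_v2(struct demo_v2_request *dst,
                          const struct demo_v1_request *src)
{
    /* Read the count once and bound the loop by the array size; the
     * raw value is still copied so later sanity checks can reject it. */
    unsigned int i, n = src->nr_segments;

    assert(dst != NULL && src != NULL);
    dst->operation     = src->operation;
    dst->nr_segments   = src->nr_segments;
    dst->handle        = src->handle;
    dst->id            = src->id;
    dst->sector_number = src->sector_number;
    if (n > DEMO_MAX_SEGS)
        n = DEMO_MAX_SEGS;
    for (i = 0; i < n; i++)
        dst->seg[i] = src->seg[i];
}
```

With these definitions the v1 struct is 108 bytes and the v2 struct 112 bytes, so the number of requests that fit into one ring page also differs between the two protocols.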
---
 linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c   |   43 +++++---
 linux-2.6-xen-sparse/drivers/xen/blkback/common.h    |    3
 linux-2.6-xen-sparse/drivers/xen/blkback/interface.c |   25 +++-
 linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c    |   14 ++
 linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c |    6 +
 linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c     |   44 +++++---
 linux-2.6-xen-sparse/drivers/xen/blktap/common.h     |    3
 linux-2.6-xen-sparse/drivers/xen/blktap/interface.c  |   25 +++-
 linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c     |   14 ++
 tools/python/xen/xend/server/blkif.py                |    3
 tools/python/xen/xm/create.py                        |    7 +
 xen/include/public/io/blkif.h                        |   97 +++++++++++++++++--
 12 files changed, 229 insertions(+), 55 deletions(-)

Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
===================================================================
--- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
+++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
@@ -298,17 +298,20 @@ irqreturn_t blkif_be_int(int irq, void *

 static int do_block_io_op(blkif_t *blkif)
 {
-	blkif_back_ring_t *blk_ring = &blkif->blk_ring;
+	blkif_back_rings_t *blk_rings = &blkif->blk_rings;
 	blkif_request_t req;
 	pending_req_t *pending_req;
 	RING_IDX rc, rp;
 	int more_to_do = 0;

-	rc = blk_ring->req_cons;
-	rp = blk_ring->sring->req_prod;
+	rc = blk_rings->co.req_cons;
+	rp = blk_rings->co.sring->req_prod;
 	rmb(); /* Ensure we see queued requests up to 'rp'.
*/ - while ((rc != rp) && !RING_REQUEST_CONS_OVERFLOW(blk_ring, rc)) { + while ((rc != rp)) { + + if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->co, rc)) + break; pending_req = alloc_req(); if (NULL == pending_req) { @@ -317,8 +320,17 @@ static int do_block_io_op(blkif_t *blkif break; } - memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req)); - blk_ring->req_cons = ++rc; /* before make_response() */ + switch (blkif->blk_protocol) { + case 1: + blkif_get_v1_req(&req, RING_GET_REQUEST(&blk_rings->v1, rc)); + break; + case 2: + blkif_get_v2_req(&req, RING_GET_REQUEST(&blk_rings->v2, rc)); + break; + default: + BUG(); + } + blk_rings->co.req_cons = ++rc; /* before make_response() */ switch (req.operation) { case BLKIF_OP_READ: @@ -500,32 +512,31 @@ static void make_response(blkif_t *blkif { blkif_response_t *resp; unsigned long flags; - blkif_back_ring_t *blk_ring = &blkif->blk_ring; + blkif_back_rings_t *blk_rings = &blkif->blk_rings; int more_to_do = 0; int notify; spin_lock_irqsave(&blkif->blk_ring_lock, flags); /* Place on the response ring for the relevant domain. */ - resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt); + resp = RING_GET_RESPONSE(&blk_rings->co, blk_rings->co.rsp_prod_pvt); resp->id = id; resp->operation = op; resp->status = st; - blk_ring->rsp_prod_pvt++; - RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify); - - if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) { + blk_rings->co.rsp_prod_pvt++; + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blk_rings->co, notify); + if (blk_rings->co.rsp_prod_pvt == blk_rings->co.req_cons) { /* * Tail check for pending requests. Allows frontend to avoid * notifications if requests are already in flight (lower * overheads and promotes batching). 
*/ - RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do); - - } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) { + RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->co, more_to_do); + + } else if (RING_HAS_UNCONSUMED_REQUESTS(&blk_rings->co)) { more_to_do = 1; - } + spin_unlock_irqrestore(&blkif->blk_ring_lock, flags); if (more_to_do) Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/common.h =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/common.h +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/common.h @@ -68,7 +68,8 @@ typedef struct blkif_st { unsigned int evtchn; unsigned int irq; /* Comms information. */ - blkif_back_ring_t blk_ring; + int blk_protocol; + blkif_back_rings_t blk_rings; struct vm_struct *blk_ring_area; /* The VBD attached to this interface. */ struct vbd vbd; Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c @@ -95,7 +95,6 @@ static void unmap_frontend_page(blkif_t int blkif_map(blkif_t *blkif, unsigned long shared_page, unsigned int evtchn) { - blkif_sring_t *sring; int err; struct evtchn_bind_interdomain bind_interdomain; @@ -125,8 +124,24 @@ int blkif_map(blkif_t *blkif, unsigned l blkif->evtchn = bind_interdomain.local_port; - sring = (blkif_sring_t *)blkif->blk_ring_area->addr; - BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE); + switch (blkif->blk_protocol) { + case 1: + { + blkif_v1_sring_t *sring_v1; + sring_v1 = (blkif_v1_sring_t *)blkif->blk_ring_area->addr; + BACK_RING_INIT(&blkif->blk_rings.v1, sring_v1, PAGE_SIZE); + break; + } + case 2: + { + blkif_v2_sring_t *sring_v2; + sring_v2 = (blkif_v2_sring_t 
*)blkif->blk_ring_area->addr; + BACK_RING_INIT(&blkif->blk_rings.v2, sring_v2, PAGE_SIZE); + break; + } + default: + BUG(); + } blkif->irq = bind_evtchn_to_irqhandler( blkif->evtchn, blkif_be_int, 0, "blkif-backend", blkif); @@ -150,10 +165,10 @@ void blkif_disconnect(blkif_t *blkif) blkif->irq = 0; } - if (blkif->blk_ring.sring) { + if (blkif->blk_rings.co.sring) { unmap_frontend_page(blkif); free_vm_area(blkif->blk_ring_area); - blkif->blk_ring.sring = NULL; + blkif->blk_rings.co.sring = NULL; } } Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c @@ -435,6 +435,7 @@ static int connect_ring(struct backend_i struct xenbus_device *dev = be->dev; unsigned long ring_ref; unsigned int evtchn; + unsigned int protocol; int err; DPRINTK("%s", dev->otherend); @@ -448,6 +449,19 @@ static int connect_ring(struct backend_i return err; } + err = xenbus_gather(XBT_NIL, dev->otherend, "protocol", + "%u", &protocol, NULL); + if (err) + protocol = BLKIF_NATIVE_PROTOCOL; + if (protocol < 1 || protocol > 2) { + xenbus_dev_fatal(dev, err, "unknown fe protocol %d", protocol); + return -1; + } + be->blkif->blk_protocol = protocol; + + printk("blkback: ring-ref %ld, event-channel %d, protocol %d\n", + ring_ref, evtchn, protocol); + /* Map the shared frame, irq etc. 
*/ err = blkif_map(be->blkif, ring_ref, evtchn); if (err) { Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c @@ -180,6 +180,12 @@ again: message = "writing event-channel"; goto abort_transaction; } + err = xenbus_printf(xbt, dev->nodename, + "protocol", "%u", BLKIF_NATIVE_PROTOCOL); + if (err) { + message = "writing protocol"; + goto abort_transaction; + } err = xenbus_transaction_end(xbt, 0); if (err) { Index: build-32-release304-12901/xen/include/public/io/blkif.h =================================================================== --- build-32-release304-12901.orig/xen/include/public/io/blkif.h +++ build-32-release304-12901/xen/include/public/io/blkif.h @@ -71,20 +71,44 @@ */ #define BLKIF_MAX_SEGMENTS_PER_REQUEST 11 -struct blkif_request { +struct blkif_request_segment { + grant_ref_t gref; /* reference to I/O buffer frame */ + /* @first_sect: first sector in frame to transfer (inclusive). */ + /* @last_sect: last sector in frame to transfer (inclusive). */ + uint8_t first_sect, last_sect; +}; + +/* Not a real protocol. Used to generate ring structs which contain + * the elements common to all protocols only. This way we get a + * compiler-checkable way to use common struct elements, so we can + * avoid using switch(protocol) in a number of places. */ +struct blkif_co_request { + char dummy; +}; + +/* i386 protocol version */ +#pragma pack(push, 4) +struct blkif_v1_request { uint8_t operation; /* BLKIF_OP_??? 
*/ uint8_t nr_segments; /* number of segments */ blkif_vdev_t handle; /* only for read/write requests */ uint64_t id; /* private guest value, echoed in resp */ blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ - struct blkif_request_segment { - grant_ref_t gref; /* reference to I/O buffer frame */ - /* @first_sect: first sector in frame to transfer (inclusive). */ - /* @last_sect: last sector in frame to transfer (inclusive). */ - uint8_t first_sect, last_sect; - } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; + struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; }; -typedef struct blkif_request blkif_request_t; +typedef struct blkif_v1_request blkif_v1_request_t; +#pragma pack(pop) + +/* x86_64 protocol version */ +struct blkif_v2_request { + uint8_t operation; /* BLKIF_OP_??? */ + uint8_t nr_segments; /* number of segments */ + blkif_vdev_t handle; /* only for read/write requests */ + uint64_t __attribute__((__aligned__(8))) id; + blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ + struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; +}; +typedef struct blkif_v2_request blkif_v2_request_t; struct blkif_response { uint64_t id; /* copied from request */ @@ -107,7 +131,62 @@ typedef struct blkif_response blkif_resp * Generate blkif ring structures and types. 
*/ -DEFINE_RING_TYPES(blkif, struct blkif_request, struct blkif_response); +DEFINE_RING_TYPES(blkif_co, struct blkif_co_request, struct blkif_response); +DEFINE_RING_TYPES(blkif_v1, struct blkif_v1_request, struct blkif_response); +DEFINE_RING_TYPES(blkif_v2, struct blkif_v2_request, struct blkif_response); + +union blkif_back_rings { + blkif_co_back_ring_t co; + blkif_v1_back_ring_t v1; + blkif_v2_back_ring_t v2; +}; +typedef union blkif_back_rings blkif_back_rings_t; + +#if defined(__i386__) + +#define BLKIF_NATIVE_PROTOCOL 1 +typedef struct blkif_v1_request blkif_request_t; +DEFINE_RING_TYPES(blkif, struct blkif_v1_request, struct blkif_response); + +#elif defined(__x86_64__) + +#define BLKIF_NATIVE_PROTOCOL 2 +typedef struct blkif_v2_request blkif_request_t; +DEFINE_RING_TYPES(blkif, struct blkif_v2_request, struct blkif_response); + +#else +# error arch fixup needed here +#endif + +#ifdef __KERNEL__ + +/* translate requests: v1/v2 to native */ +#if 1 == BLKIF_NATIVE_PROTOCOL +static void inline blkif_get_v1_req(blkif_request_t *dst, blkif_v1_request_t *src) +#else +static void inline blkif_get_v2_req(blkif_request_t *dst, blkif_v2_request_t *src) +#endif +{ + memcpy(dst, src, sizeof(*dst)); +} + +#if 1 == BLKIF_NATIVE_PROTOCOL +static void inline blkif_get_v2_req(blkif_request_t *dst, blkif_v2_request_t *src) +#else +static void inline blkif_get_v1_req(blkif_request_t *dst, blkif_v1_request_t *src) +#endif +{ + int i; + dst->operation = src->operation; + dst->nr_segments = src->nr_segments; + dst->handle = src->handle; + dst->id = src->id; + dst->sector_number = src->sector_number; + for (i = 0; i < src->nr_segments; i++) + dst->seg[i] = src->seg[i]; +} + +#endif #define VDISK_CDROM 0x1 #define VDISK_REMOVABLE 0x2 Index: build-32-release304-12901/tools/python/xen/xend/server/blkif.py =================================================================== --- build-32-release304-12901.orig/tools/python/xen/xend/server/blkif.py +++ 
build-32-release304-12901/tools/python/xen/xend/server/blkif.py @@ -38,6 +38,7 @@ class BlkifController(DevController): """@see DevController.getDeviceDetails""" uname = config.get('uname', '') dev = config.get('dev', '') + protocol = config.get('protocol') if 'ioemu:' in dev: (_, dev) = string.split(dev, ':', 1) @@ -85,6 +86,8 @@ class BlkifController(DevController): front = { 'virtual-device' : "%i" % devid, 'device-type' : dev_type } + if protocol: + front.update({ 'protocol' : protocol }); return (devid, back, front) Index: build-32-release304-12901/tools/python/xen/xm/create.py =================================================================== --- build-32-release304-12901.orig/tools/python/xen/xm/create.py +++ build-32-release304-12901/tools/python/xen/xm/create.py @@ -531,7 +531,7 @@ def configure_image(vals): def configure_disks(config_devs, vals): """Create the config for disks (virtual block devices). """ - for (uname, dev, mode, backend) in vals.disk: + for (uname, dev, mode, backend, protocol) in vals.disk: if uname.startswith('tap:'): cls = 'tap' else: @@ -543,6 +543,8 @@ def configure_disks(config_devs, vals): ['mode', mode ] ] if backend: config_vbd.append(['backend', backend]) + if protocol: + config_vbd.append(['protocol', protocol]) config_devs.append(['device', config_vbd]) def configure_pci(config_devs, vals): @@ -787,7 +789,10 @@ def preprocess_disk(vals): n = len(d) if n == 3: d.append(None) + d.append(None) elif n == 4: + d.append(None) + elif n == 5: pass else: err('Invalid disk specifier: ' + v) Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c @@ -1094,15 +1094,15 @@ irqreturn_t tap_blkif_be_int(int irq, vo static int print_dbug = 1; static int do_block_io_op(blkif_t *blkif) { - 
blkif_back_ring_t *blk_ring = &blkif->blk_ring; + blkif_back_rings_t *blk_rings = &blkif->blk_rings; blkif_request_t req; pending_req_t *pending_req; RING_IDX rc, rp; int more_to_do = 0; tap_blkif_t *info; - rc = blk_ring->req_cons; - rp = blk_ring->sring->req_prod; + rc = blk_rings->co.req_cons; + rp = blk_rings->co.sring->req_prod; rmb(); /* Ensure we see queued requests up to 'rp'. */ /*Check blkif has corresponding UE ring*/ @@ -1133,8 +1133,8 @@ static int do_block_io_op(blkif_t *blkif more_to_do = 1; break; } - - if (RING_REQUEST_CONS_OVERFLOW(blk_ring, rc)) { + + if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->co, rc)) { WPRINTK("RING_REQUEST_CONS_OVERFLOW!" " More to do\n"); more_to_do = 1; @@ -1148,8 +1148,17 @@ static int do_block_io_op(blkif_t *blkif break; } - memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req)); - blk_ring->req_cons = ++rc; /* before make_response() */ + switch (blkif->blk_protocol) { + case 1: + blkif_get_v1_req(&req, RING_GET_REQUEST(&blk_rings->v1, rc)); + break; + case 2: + blkif_get_v2_req(&req, RING_GET_REQUEST(&blk_rings->v2, rc)); + break; + default: + BUG(); + } + blk_rings->co.req_cons = ++rc; /* before make_response() */ switch (req.operation) { case BLKIF_OP_READ: @@ -1225,7 +1234,7 @@ static void dispatch_rw_block_io(blkif_t WPRINTK("blktap: fe_ring is full, can't add " "IO Request will be dropped. %d %d\n", RING_SIZE(&info->ufe_ring), - RING_SIZE(&blkif->blk_ring)); + RING_SIZE(&blkif->blk_rings.co)); goto fail_response; } @@ -1415,30 +1424,31 @@ static void make_response(blkif_t *blkif { blkif_response_t *resp; unsigned long flags; - blkif_back_ring_t *blk_ring = &blkif->blk_ring; + blkif_back_rings_t *blk_rings = &blkif->blk_rings; int more_to_do = 0; int notify; + spin_lock_irqsave(&blkif->blk_ring_lock, flags); - /* Place on the response ring for the relevant domain. */ - resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt); + /* Place on the response ring for the relevant domain. 
*/ + resp = RING_GET_RESPONSE(&blk_rings->co, blk_rings->co.rsp_prod_pvt); resp->id = id; resp->operation = op; resp->status = st; - blk_ring->rsp_prod_pvt++; - RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify); + blk_rings->co.rsp_prod_pvt++; + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blk_rings->co, notify); - if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) { + if (blk_rings->co.rsp_prod_pvt == blk_rings->co.req_cons) { /* * Tail check for pending requests. Allows frontend to avoid * notifications if requests are already in flight (lower * overheads and promotes batching). */ - RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do); - } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) { + RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->co, more_to_do); + } else if (RING_HAS_UNCONSUMED_REQUESTS(&blk_rings->co)) { more_to_do = 1; + } - } spin_unlock_irqrestore(&blkif->blk_ring_lock, flags); if (more_to_do) blkif_notify_work(blkif); Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/common.h =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/common.h +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/common.h @@ -59,7 +59,8 @@ typedef struct blkif_st { unsigned int evtchn; unsigned int irq; /* Comms information. */ - blkif_back_ring_t blk_ring; + int blk_protocol; + blkif_back_rings_t blk_rings; struct vm_struct *blk_ring_area; /* Back pointer to the backend_info. 
*/ struct backend_info *be; Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c @@ -96,7 +96,6 @@ static void unmap_frontend_page(blkif_t int tap_blkif_map(blkif_t *blkif, unsigned long shared_page, unsigned int evtchn) { - blkif_sring_t *sring; int err; struct evtchn_bind_interdomain bind_interdomain; @@ -126,8 +125,24 @@ int tap_blkif_map(blkif_t *blkif, unsign blkif->evtchn = bind_interdomain.local_port; - sring = (blkif_sring_t *)blkif->blk_ring_area->addr; - BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE); + switch (blkif->blk_protocol) { + case 1: + { + blkif_v1_sring_t *sring_v1; + sring_v1 = (blkif_v1_sring_t *)blkif->blk_ring_area->addr; + BACK_RING_INIT(&blkif->blk_rings.v1, sring_v1, PAGE_SIZE); + break; + } + case 2: + { + blkif_v2_sring_t *sring_v2; + sring_v2 = (blkif_v2_sring_t *)blkif->blk_ring_area->addr; + BACK_RING_INIT(&blkif->blk_rings.v2, sring_v2, PAGE_SIZE); + break; + } + default: + BUG(); + } blkif->irq = bind_evtchn_to_irqhandler( blkif->evtchn, tap_blkif_be_int, 0, "blkif-backend", blkif); @@ -141,10 +156,10 @@ void tap_blkif_unmap(blkif_t *blkif) unbind_from_irqhandler(blkif->irq, blkif); blkif->irq = 0; } - if (blkif->blk_ring.sring) { + if (blkif->blk_rings.co.sring) { unmap_frontend_page(blkif); free_vm_area(blkif->blk_ring_area); - blkif->blk_ring.sring = NULL; + blkif->blk_rings.co.sring = NULL; } } Index: build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c =================================================================== --- build-32-release304-12901.orig/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c +++ build-32-release304-12901/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c @@ -316,6 +316,7 @@ static int connect_ring(struct 
backend_i struct xenbus_device *dev = be->dev; unsigned long ring_ref; unsigned int evtchn; + unsigned int protocol; int err; DPRINTK("%s\n", dev->otherend); @@ -329,6 +330,19 @@ static int connect_ring(struct backend_i return err; } + err = xenbus_gather(XBT_NIL, dev->otherend, "protocol", + "%u", &protocol, NULL); + if (err) + protocol = BLKIF_NATIVE_PROTOCOL; + if (protocol < 1 || protocol > 2) { + xenbus_dev_fatal(dev, err, "unknown fe protocol %d", protocol); + return -1; + } + be->blkif->blk_protocol = protocol; + + printk("blktap: ring-ref %ld, event-channel %d, protocol %d\n", + ring_ref, evtchn, protocol); + /* Map the shared frame, irq etc. */ err = tap_blkif_map(be->blkif, ring_ref, evtchn); if (err) { [-- Attachment #3: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
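The protocol selection that connect_ring() performs in the patch above can be restated as a small standalone function. This is only a sketch: the name sketch_pick_protocol() and the SKETCH_* constants are hypothetical and not part of the patch, and the xenbus lookup is replaced by a pointer that is NULL when the frontend wrote no "protocol" node.

```c
#include <assert.h>
#include <stddef.h>

/* Protocol numbers as used in the patch: 1 = i386 layout, 2 = x86_64 layout. */
#define SKETCH_PROTO_MIN 1u
#define SKETCH_PROTO_MAX 2u

/*
 * Hypothetical helper (not in the patch).
 *   advertised: value read from the frontend's "protocol" node,
 *               or NULL if the node is absent (old frontend).
 *   native:     the protocol the backend itself was compiled for.
 * Returns the protocol to use, or -1 if the advertised value is unknown.
 */
static int sketch_pick_protocol(const unsigned int *advertised,
                                unsigned int native)
{
    /* Missing node: assume the frontend speaks the backend's native protocol. */
    unsigned int protocol = advertised ? *advertised : native;

    /* The backend's own protocol must be one of the known ones. */
    assert(native >= SKETCH_PROTO_MIN && native <= SKETCH_PROTO_MAX);

    if (protocol < SKETCH_PROTO_MIN || protocol > SKETCH_PROTO_MAX)
        return -1;
    return (int)protocol;
}
```

The fallback to the native protocol is what keeps old frontends working in same-width setups; for a 32-bit frontend that does not advertise anything on a 64-bit backend, the value has to come from the disk configuration instead, as described in the patch notes.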
* Re: Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-19 13:32 ` Gerd Hoffmann
@ 2006-12-19 14:20 ` Keir Fraser
2006-12-20 15:12 ` Gerd Hoffmann
0 siblings, 1 reply; 12+ messages in thread
From: Keir Fraser @ 2006-12-19 14:20 UTC (permalink / raw)
To: Gerd Hoffmann, Jan Beulich; +Cc: Xen devel list

On 19/12/06 13:32, "Gerd Hoffmann" <kraxel@suse.de> wrote:
>> - unconditionally using #pragma pack(), __attribute__(()), and __i386__ or
>> __x86_64__ in public Xen headers is, in my opinion, a no-go (these header
>> should all be suitable for building e.g. Windows drivers, too - I know this
>> isn't generally the case at present, but I don't think anything else can be
>> the goal, and hence the situation shouldn't be made worse)
>
> Ideas for that one? Ok to create xen/include/public/compiler.h with
> that kind of stuff in?

I think it would be reasonable to put this stuff in a (new) Linux-specific
header file that wraps the Xen-public blkif.h. We could put just enough
support in blkif.h itself to allow it to be multiply-compiled. Then
different OSes can wrap or rewrite blkif.h as they see fit to get the
required layout for 32-bit and 64-bit ABIs. This conveniently sidesteps some
of these issues and allows you to concentrate on Linux and GCC, while not
constraining the implementation choices for anyone else.

-- Keir
^ permalink raw reply [flat|nested] 12+ messages in thread
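For readers following the discussion, the layout difference that makes the compiler-specific annotations necessary can be sketched in a few lines of GCC-flavoured C. The struct names are hypothetical stand-ins for the request structs in the patch, and the typedef widths are assumptions mirroring the Xen headers (16-bit device handle, 64-bit sector number, 32-bit grant reference).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed widths, mirroring the Xen public headers. */
typedef uint16_t blkif_vdev_t;
typedef uint64_t blkif_sector_t;
typedef uint32_t grant_ref_t;

#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11

struct blkif_request_segment {
    grant_ref_t gref;
    uint8_t first_sect, last_sect;
};

/* i386 ABI: 64-bit members only need 4-byte alignment. */
#pragma pack(push, 4)
struct sketch_v1_request {
    uint8_t operation;
    uint8_t nr_segments;
    blkif_vdev_t handle;
    uint64_t id;
    blkif_sector_t sector_number;
    struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
};
#pragma pack(pop)

/* x86_64 ABI: the 64-bit id is aligned to 8 bytes, adding 4 bytes of padding. */
struct sketch_v2_request {
    uint8_t operation;
    uint8_t nr_segments;
    blkif_vdev_t handle;
    uint64_t __attribute__((__aligned__(8))) id;
    blkif_sector_t sector_number;
    struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
};

static size_t sketch_v1_id_offset(void)
{
    return offsetof(struct sketch_v1_request, id);
}

static size_t sketch_v2_id_offset(void)
{
    return offsetof(struct sketch_v2_request, id);
}

/* Sanity check: the two layouts disagree from the id member onwards. */
static void sketch_check_layout(void)
{
    assert(sketch_v1_id_offset() != sketch_v2_id_offset());
    assert(sizeof(struct sketch_v1_request) != sizeof(struct sketch_v2_request));
}
```

Because both annotations are GCC extensions, keeping them in a Linux-specific wrapper rather than in the public header leaves other compilers free to express the same two layouts in their own way.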
* Re: Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-19 14:20 ` Keir Fraser
@ 2006-12-20 15:12 ` Gerd Hoffmann
2006-12-20 15:47 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: Gerd Hoffmann @ 2006-12-20 15:12 UTC (permalink / raw)
To: Keir Fraser; +Cc: Xen devel list, Jan Beulich
[-- Attachment #1: Type: text/plain, Size: 706 bytes --]
Hi,

> I think it would be reasonable to put this stuff in a (new) Linux-specific
> header file that wraps the Xen-public blkif.h. We could put just enough
> support in blkif.h itself to allow it to be multiply-compiled. Then
> different OSes can wrap or rewrite blkif.h as they see fit to get the
> required layout for 32-bit and 64-bit ABIs. This conveniently sidesteps some
> of these issues and allows you to concentrate on Linux and GCC, while not
> constraining the implementation choices for anyone else.

Next iteration with exactly that implemented, this time even tested with
blkback in all four combinations out of 32/64 dom0, 32/64 guest.

cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
[-- Attachment #2: blkif-bimodal.diff --]
[-- Type: text/x-patch, Size: 25593 bytes --]
multiprotocol blkback drivers.

This is a patch for the block interface, frontend drivers, backend
drivers and tools to support multiple ring protocols. Right now there
are just two: the 32bit and the 64bit one. If needed it can be extended.

Interface changes (io/blkif.h)
* Have both request structs there, with "v1" and "v2" added to the
name. The old name is aliased to the native protocol of the
architecture.
* Add helper functions to convert v1/v2 requests to native.

Frontend changes:
* Create a new node "protocol", add the protocol number it speaks
there.

Backend changes:
* Look at the "protocol" number of the frontend and switch ring
handling accordingly. If the protocol node isn't present it assumes
native protocol.
* As the request struct is copied anyway before being processed (for
security reasons) it is converted to native at that point so most
backend code doesn't need to know what the frontend speaks.
* In case of blktap this is completely transparent to userspace, the
kernel/userspace ring is always native no matter what the frontend
speaks.

Tools changes:
* Add one more option to the disk configuration, so one can specify the
protocol the frontend speaks in the config file. This is needed for
old frontends which don't advertise the protocol they are speaking
themselves.
I'm not that happy with this approach, but it works for now and I'm
kind of lost in the stack of python classes doing domain and device
handling ...
---
 linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c   |   64 ++++++++----
 linux-2.6-xen-sparse/drivers/xen/blkback/common.h    |    6 -
 linux-2.6-xen-sparse/drivers/xen/blkback/interface.c |   25 +++-
 linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c    |   14 ++
 linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c |    7 +
 linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c     |   63 ++++++++----
 linux-2.6-xen-sparse/drivers/xen/blktap/common.h     |    6 -
 linux-2.6-xen-sparse/drivers/xen/blktap/interface.c  |   25 +++-
 linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c     |   14 ++
 linux-2.6-xen-sparse/include/xen/blkif.h             |   99 +++++++++++++++++++
 tools/python/xen/xend/server/blkif.py                |    3
 tools/python/xen/xm/create.py                        |    7 +
 xen/include/public/io/blkif.h                        |   14 +-
 13 files changed, 282 insertions(+), 65 deletions(-)
Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c =================================================================== --- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c +++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c @@ -298,17 +298,20 @@ irqreturn_t blkif_be_int(int irq, void * static int do_block_io_op(blkif_t *blkif) { - blkif_back_ring_t *blk_ring = &blkif->blk_ring; + blkif_back_rings_t
*blk_rings = &blkif->blk_rings; blkif_request_t req; pending_req_t *pending_req; RING_IDX rc, rp; int more_to_do = 0; - rc = blk_ring->req_cons; - rp = blk_ring->sring->req_prod; + rc = blk_rings->co.req_cons; + rp = blk_rings->co.sring->req_prod; rmb(); /* Ensure we see queued requests up to 'rp'. */ - while ((rc != rp) && !RING_REQUEST_CONS_OVERFLOW(blk_ring, rc)) { + while ((rc != rp)) { + + if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->co, rc)) + break; pending_req = alloc_req(); if (NULL == pending_req) { @@ -317,8 +320,17 @@ static int do_block_io_op(blkif_t *blkif break; } - memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req)); - blk_ring->req_cons = ++rc; /* before make_response() */ + switch (blkif->blk_protocol) { + case 1: + blkif_get_v1_req(&req, RING_GET_REQUEST(&blk_rings->v1, rc)); + break; + case 2: + blkif_get_v2_req(&req, RING_GET_REQUEST(&blk_rings->v2, rc)); + break; + default: + BUG(); + } + blk_rings->co.req_cons = ++rc; /* before make_response() */ switch (req.operation) { case BLKIF_OP_READ: @@ -498,34 +510,44 @@ static void dispatch_rw_block_io(blkif_t static void make_response(blkif_t *blkif, unsigned long id, unsigned short op, int st) { - blkif_response_t *resp; + blkif_response_t resp; unsigned long flags; - blkif_back_ring_t *blk_ring = &blkif->blk_ring; + blkif_back_rings_t *blk_rings = &blkif->blk_rings; int more_to_do = 0; int notify; - spin_lock_irqsave(&blkif->blk_ring_lock, flags); - - /* Place on the response ring for the relevant domain. */ - resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt); - resp->id = id; - resp->operation = op; - resp->status = st; - blk_ring->rsp_prod_pvt++; - RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify); + resp.id = id; + resp.operation = op; + resp.status = st; - if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) { + spin_lock_irqsave(&blkif->blk_ring_lock, flags); + /* Place on the response ring for the relevant domain. 
*/ + switch (blkif->blk_protocol) { + case 1: + memcpy(RING_GET_RESPONSE(&blk_rings->v1, blk_rings->v1.rsp_prod_pvt), + &resp, sizeof(resp)); + break; + case 2: + memcpy(RING_GET_RESPONSE(&blk_rings->v2, blk_rings->v2.rsp_prod_pvt), + &resp, sizeof(resp)); + break; + default: + BUG(); + } + blk_rings->co.rsp_prod_pvt++; + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blk_rings->co, notify); + if (blk_rings->co.rsp_prod_pvt == blk_rings->co.req_cons) { /* * Tail check for pending requests. Allows frontend to avoid * notifications if requests are already in flight (lower * overheads and promotes batching). */ - RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do); + RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->co, more_to_do); - } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) { + } else if (RING_HAS_UNCONSUMED_REQUESTS(&blk_rings->co)) { more_to_do = 1; - } + spin_unlock_irqrestore(&blkif->blk_ring_lock, flags); if (more_to_do) Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/common.h =================================================================== --- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blkback/common.h +++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/common.h @@ -40,8 +40,7 @@ #include <asm/pgalloc.h> #include <xen/evtchn.h> #include <asm/hypervisor.h> -#include <xen/interface/io/blkif.h> -#include <xen/interface/io/ring.h> +#include <xen/blkif.h> #include <xen/gnttab.h> #include <xen/driver_util.h> #include <xen/xenbus.h> @@ -68,7 +67,8 @@ typedef struct blkif_st { unsigned int evtchn; unsigned int irq; /* Comms information. */ - blkif_back_ring_t blk_ring; + int blk_protocol; + blkif_back_rings_t blk_rings; struct vm_struct *blk_ring_area; /* The VBD attached to this interface. 
*/ struct vbd vbd; Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c =================================================================== --- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c +++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c @@ -95,7 +95,6 @@ static void unmap_frontend_page(blkif_t int blkif_map(blkif_t *blkif, unsigned long shared_page, unsigned int evtchn) { - blkif_sring_t *sring; int err; struct evtchn_bind_interdomain bind_interdomain; @@ -125,8 +124,24 @@ int blkif_map(blkif_t *blkif, unsigned l blkif->evtchn = bind_interdomain.local_port; - sring = (blkif_sring_t *)blkif->blk_ring_area->addr; - BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE); + switch (blkif->blk_protocol) { + case 1: + { + blkif_v1_sring_t *sring_v1; + sring_v1 = (blkif_v1_sring_t *)blkif->blk_ring_area->addr; + BACK_RING_INIT(&blkif->blk_rings.v1, sring_v1, PAGE_SIZE); + break; + } + case 2: + { + blkif_v2_sring_t *sring_v2; + sring_v2 = (blkif_v2_sring_t *)blkif->blk_ring_area->addr; + BACK_RING_INIT(&blkif->blk_rings.v2, sring_v2, PAGE_SIZE); + break; + } + default: + BUG(); + } blkif->irq = bind_evtchn_to_irqhandler( blkif->evtchn, blkif_be_int, 0, "blkif-backend", blkif); @@ -150,10 +165,10 @@ void blkif_disconnect(blkif_t *blkif) blkif->irq = 0; } - if (blkif->blk_ring.sring) { + if (blkif->blk_rings.co.sring) { unmap_frontend_page(blkif); free_vm_area(blkif->blk_ring_area); - blkif->blk_ring.sring = NULL; + blkif->blk_rings.co.sring = NULL; } } Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c =================================================================== --- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c +++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c @@ -435,6 +435,7 @@ static int connect_ring(struct backend_i struct xenbus_device *dev = be->dev; unsigned long 
ring_ref; unsigned int evtchn; + unsigned int protocol; int err; DPRINTK("%s", dev->otherend); @@ -448,6 +449,19 @@ static int connect_ring(struct backend_i return err; } + err = xenbus_gather(XBT_NIL, dev->otherend, "protocol", + "%u", &protocol, NULL); + if (err) + protocol = BLKIF_NATIVE_PROTOCOL; + if (protocol < 1 || protocol > 2) { + xenbus_dev_fatal(dev, err, "unknown fe protocol %d", protocol); + return -1; + } + be->blkif->blk_protocol = protocol; + + printk("blkback: ring-ref %ld, event-channel %d, protocol %d\n", + ring_ref, evtchn, protocol); + /* Map the shared frame, irq etc. */ err = blkif_map(be->blkif, ring_ref, evtchn); if (err) { Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c =================================================================== --- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c +++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c @@ -45,6 +45,7 @@ #include <xen/xenbus.h> #include <xen/interface/grant_table.h> #include <xen/gnttab.h> +#include <xen/blkif.h> #include <asm/hypervisor.h> #include <asm/maddr.h> @@ -180,6 +181,12 @@ again: message = "writing event-channel"; goto abort_transaction; } + err = xenbus_printf(xbt, dev->nodename, + "protocol", "%u", BLKIF_NATIVE_PROTOCOL); + if (err) { + message = "writing protocol"; + goto abort_transaction; + } err = xenbus_transaction_end(xbt, 0); if (err) { Index: build-64-release304-13098/xen/include/public/io/blkif.h =================================================================== --- build-64-release304-13098.orig/xen/include/public/io/blkif.h +++ build-64-release304-13098/xen/include/public/io/blkif.h @@ -71,18 +71,20 @@ */ #define BLKIF_MAX_SEGMENTS_PER_REQUEST 11 +struct blkif_request_segment { + grant_ref_t gref; /* reference to I/O buffer frame */ + /* @first_sect: first sector in frame to transfer (inclusive). 
*/ + /* @last_sect: last sector in frame to transfer (inclusive). */ + uint8_t first_sect, last_sect; +}; + struct blkif_request { uint8_t operation; /* BLKIF_OP_??? */ uint8_t nr_segments; /* number of segments */ blkif_vdev_t handle; /* only for read/write requests */ uint64_t id; /* private guest value, echoed in resp */ blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ - struct blkif_request_segment { - grant_ref_t gref; /* reference to I/O buffer frame */ - /* @first_sect: first sector in frame to transfer (inclusive). */ - /* @last_sect: last sector in frame to transfer (inclusive). */ - uint8_t first_sect, last_sect; - } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; + struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; }; typedef struct blkif_request blkif_request_t; Index: build-64-release304-13098/tools/python/xen/xend/server/blkif.py =================================================================== --- build-64-release304-13098.orig/tools/python/xen/xend/server/blkif.py +++ build-64-release304-13098/tools/python/xen/xend/server/blkif.py @@ -38,6 +38,7 @@ class BlkifController(DevController): """@see DevController.getDeviceDetails""" uname = config.get('uname', '') dev = config.get('dev', '') + protocol = config.get('protocol') if 'ioemu:' in dev: (_, dev) = string.split(dev, ':', 1) @@ -85,6 +86,8 @@ class BlkifController(DevController): front = { 'virtual-device' : "%i" % devid, 'device-type' : dev_type } + if protocol: + front.update({ 'protocol' : protocol }); return (devid, back, front) Index: build-64-release304-13098/tools/python/xen/xm/create.py =================================================================== --- build-64-release304-13098.orig/tools/python/xen/xm/create.py +++ build-64-release304-13098/tools/python/xen/xm/create.py @@ -531,7 +531,7 @@ def configure_image(vals): def configure_disks(config_devs, vals): """Create the config for disks (virtual block devices). 
""" - for (uname, dev, mode, backend) in vals.disk: + for (uname, dev, mode, backend, protocol) in vals.disk: if uname.startswith('tap:'): cls = 'tap' else: @@ -543,6 +543,8 @@ def configure_disks(config_devs, vals): ['mode', mode ] ] if backend: config_vbd.append(['backend', backend]) + if protocol: + config_vbd.append(['protocol', protocol]) config_devs.append(['device', config_vbd]) def configure_pci(config_devs, vals): @@ -787,7 +789,10 @@ def preprocess_disk(vals): n = len(d) if n == 3: d.append(None) + d.append(None) elif n == 4: + d.append(None) + elif n == 5: pass else: err('Invalid disk specifier: ' + v) Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c =================================================================== --- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c +++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c @@ -1094,15 +1094,15 @@ irqreturn_t tap_blkif_be_int(int irq, vo static int print_dbug = 1; static int do_block_io_op(blkif_t *blkif) { - blkif_back_ring_t *blk_ring = &blkif->blk_ring; + blkif_back_rings_t *blk_rings = &blkif->blk_rings; blkif_request_t req; pending_req_t *pending_req; RING_IDX rc, rp; int more_to_do = 0; tap_blkif_t *info; - rc = blk_ring->req_cons; - rp = blk_ring->sring->req_prod; + rc = blk_rings->co.req_cons; + rp = blk_rings->co.sring->req_prod; rmb(); /* Ensure we see queued requests up to 'rp'. */ /*Check blkif has corresponding UE ring*/ @@ -1133,8 +1133,8 @@ static int do_block_io_op(blkif_t *blkif more_to_do = 1; break; } - - if (RING_REQUEST_CONS_OVERFLOW(blk_ring, rc)) { + + if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->co, rc)) { WPRINTK("RING_REQUEST_CONS_OVERFLOW!" 
" More to do\n"); more_to_do = 1; @@ -1148,8 +1148,17 @@ static int do_block_io_op(blkif_t *blkif break; } - memcpy(&req, RING_GET_REQUEST(blk_ring, rc), sizeof(req)); - blk_ring->req_cons = ++rc; /* before make_response() */ + switch (blkif->blk_protocol) { + case 1: + blkif_get_v1_req(&req, RING_GET_REQUEST(&blk_rings->v1, rc)); + break; + case 2: + blkif_get_v2_req(&req, RING_GET_REQUEST(&blk_rings->v2, rc)); + break; + default: + BUG(); + } + blk_rings->co.req_cons = ++rc; /* before make_response() */ switch (req.operation) { case BLKIF_OP_READ: @@ -1225,7 +1234,7 @@ static void dispatch_rw_block_io(blkif_t WPRINTK("blktap: fe_ring is full, can't add " "IO Request will be dropped. %d %d\n", RING_SIZE(&info->ufe_ring), - RING_SIZE(&blkif->blk_ring)); + RING_SIZE(&blkif->blk_rings.co)); goto fail_response; } @@ -1413,32 +1422,44 @@ static void dispatch_rw_block_io(blkif_t static void make_response(blkif_t *blkif, unsigned long id, unsigned short op, int st) { - blkif_response_t *resp; + blkif_response_t resp; unsigned long flags; - blkif_back_ring_t *blk_ring = &blkif->blk_ring; + blkif_back_rings_t *blk_rings = &blkif->blk_rings; int more_to_do = 0; int notify; + resp.id = id; + resp.operation = op; + resp.status = st; + spin_lock_irqsave(&blkif->blk_ring_lock, flags); - /* Place on the response ring for the relevant domain. */ - resp = RING_GET_RESPONSE(blk_ring, blk_ring->rsp_prod_pvt); - resp->id = id; - resp->operation = op; - resp->status = st; - blk_ring->rsp_prod_pvt++; - RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(blk_ring, notify); + /* Place on the response ring for the relevant domain. 
*/ + switch (blkif->blk_protocol) { + case 1: + memcpy(RING_GET_RESPONSE(&blk_rings->v1, blk_rings->v1.rsp_prod_pvt), + &resp, sizeof(resp)); + break; + case 2: + memcpy(RING_GET_RESPONSE(&blk_rings->v2, blk_rings->v2.rsp_prod_pvt), + &resp, sizeof(resp)); + break; + default: + BUG(); + } + blk_rings->co.rsp_prod_pvt++; + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blk_rings->co, notify); - if (blk_ring->rsp_prod_pvt == blk_ring->req_cons) { + if (blk_rings->co.rsp_prod_pvt == blk_rings->co.req_cons) { /* * Tail check for pending requests. Allows frontend to avoid * notifications if requests are already in flight (lower * overheads and promotes batching). */ - RING_FINAL_CHECK_FOR_REQUESTS(blk_ring, more_to_do); - } else if (RING_HAS_UNCONSUMED_REQUESTS(blk_ring)) { + RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->co, more_to_do); + } else if (RING_HAS_UNCONSUMED_REQUESTS(&blk_rings->co)) { more_to_do = 1; + } - } spin_unlock_irqrestore(&blkif->blk_ring_lock, flags); if (more_to_do) blkif_notify_work(blkif); Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/common.h =================================================================== --- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blktap/common.h +++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/common.h @@ -39,8 +39,7 @@ #include <asm/pgalloc.h> #include <xen/evtchn.h> #include <asm/hypervisor.h> -#include <xen/interface/io/blkif.h> -#include <xen/interface/io/ring.h> +#include <xen/blkif.h> #include <xen/gnttab.h> #include <xen/driver_util.h> @@ -59,7 +58,8 @@ typedef struct blkif_st { unsigned int evtchn; unsigned int irq; /* Comms information. */ - blkif_back_ring_t blk_ring; + int blk_protocol; + blkif_back_rings_t blk_rings; struct vm_struct *blk_ring_area; /* Back pointer to the backend_info. 
 */
 	struct backend_info *be;
Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c
===================================================================
--- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c
+++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/interface.c
@@ -96,7 +96,6 @@ static void unmap_frontend_page(blkif_t
 int tap_blkif_map(blkif_t *blkif, unsigned long shared_page,
 		  unsigned int evtchn)
 {
-	blkif_sring_t *sring;
 	int err;
 	struct evtchn_bind_interdomain bind_interdomain;
 
@@ -126,8 +125,24 @@ int tap_blkif_map(blkif_t *blkif, unsign
 
 	blkif->evtchn = bind_interdomain.local_port;
 
-	sring = (blkif_sring_t *)blkif->blk_ring_area->addr;
-	BACK_RING_INIT(&blkif->blk_ring, sring, PAGE_SIZE);
+	switch (blkif->blk_protocol) {
+	case 1:
+	{
+		blkif_v1_sring_t *sring_v1;
+		sring_v1 = (blkif_v1_sring_t *)blkif->blk_ring_area->addr;
+		BACK_RING_INIT(&blkif->blk_rings.v1, sring_v1, PAGE_SIZE);
+		break;
+	}
+	case 2:
+	{
+		blkif_v2_sring_t *sring_v2;
+		sring_v2 = (blkif_v2_sring_t *)blkif->blk_ring_area->addr;
+		BACK_RING_INIT(&blkif->blk_rings.v2, sring_v2, PAGE_SIZE);
+		break;
+	}
+	default:
+		BUG();
+	}
 
 	blkif->irq = bind_evtchn_to_irqhandler(
 		blkif->evtchn, tap_blkif_be_int, 0, "blkif-backend", blkif);
@@ -141,10 +156,10 @@ void tap_blkif_unmap(blkif_t *blkif)
 		unbind_from_irqhandler(blkif->irq, blkif);
 		blkif->irq = 0;
 	}
-	if (blkif->blk_ring.sring) {
+	if (blkif->blk_rings.co.sring) {
 		unmap_frontend_page(blkif);
 		free_vm_area(blkif->blk_ring_area);
-		blkif->blk_ring.sring = NULL;
+		blkif->blk_rings.co.sring = NULL;
 	}
 }
 
Index: build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c
===================================================================
--- build-64-release304-13098.orig/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c
+++ build-64-release304-13098/linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c
@@ -316,6 +316,7 @@ static int connect_ring(struct backend_i
 	struct xenbus_device *dev = be->dev;
 	unsigned long ring_ref;
 	unsigned int evtchn;
+	unsigned int protocol;
 	int err;
 
 	DPRINTK("%s\n", dev->otherend);
@@ -329,6 +330,19 @@ static int connect_ring(struct backend_i
 		return err;
 	}
 
+	err = xenbus_gather(XBT_NIL, dev->otherend, "protocol",
+			    "%u", &protocol, NULL);
+	if (err)
+		protocol = BLKIF_NATIVE_PROTOCOL;
+	if (protocol < 1 || protocol > 2) {
+		xenbus_dev_fatal(dev, err, "unknown fe protocol %d", protocol);
+		return -1;
+	}
+	be->blkif->blk_protocol = protocol;
+
+	printk("blktap: ring-ref %ld, event-channel %d, protocol %d\n",
+	       ring_ref, evtchn, protocol);
+
 	/* Map the shared frame, irq etc. */
 	err = tap_blkif_map(be->blkif, ring_ref, evtchn);
 	if (err) {
Index: build-64-release304-13098/linux-2.6-xen-sparse/include/xen/blkif.h
===================================================================
--- /dev/null
+++ build-64-release304-13098/linux-2.6-xen-sparse/include/xen/blkif.h
@@ -0,0 +1,99 @@
+#ifndef __XEN_BLKIF_H__
+#define __XEN_BLKIF_H__
+
+#include <xen/interface/io/ring.h>
+#include <xen/interface/io/blkif.h>
+
+/* Not a real protocol.  Used to generate ring structs which contain
+ * the elements common to all protocols only.  This way we get a
+ * compiler-checkable way to use common struct elements, so we can
+ * avoid using switch(protocol) in a number of places.  */
+struct blkif_co_request {
+	char dummy;
+};
+struct blkif_co_response {
+	char dummy;
+};
+
+/* i386 protocol version */
+#pragma pack(push, 4)
+struct blkif_v1_request {
+	uint8_t        operation;    /* BLKIF_OP_???                         */
+	uint8_t        nr_segments;  /* number of segments                   */
+	blkif_vdev_t   handle;       /* only for read/write requests         */
+	uint64_t       id;           /* private guest value, echoed in resp  */
+	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
+	struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+};
+struct blkif_v1_response {
+	uint64_t        id;              /* copied from request */
+	uint8_t         operation;       /* copied from request */
+	int16_t         status;          /* BLKIF_RSP_???       */
+};
+typedef struct blkif_v1_request blkif_v1_request_t;
+typedef struct blkif_v1_response blkif_v1_response_t;
+#pragma pack(pop)
+
+/* x86_64 protocol version */
+struct blkif_v2_request {
+	uint8_t        operation;    /* BLKIF_OP_???                         */
+	uint8_t        nr_segments;  /* number of segments                   */
+	blkif_vdev_t   handle;       /* only for read/write requests         */
+	uint64_t       __attribute__((__aligned__(8))) id;
+	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
+	struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+};
+struct blkif_v2_response {
+	uint64_t       __attribute__((__aligned__(8))) id;
+	uint8_t         operation;       /* copied from request */
+	int16_t         status;          /* BLKIF_RSP_???       */
+};
+typedef struct blkif_v2_request blkif_v2_request_t;
+typedef struct blkif_v2_response blkif_v2_response_t;
+
+DEFINE_RING_TYPES(blkif_co, struct blkif_co_request, struct blkif_co_response);
+DEFINE_RING_TYPES(blkif_v1, struct blkif_v1_request, struct blkif_v1_response);
+DEFINE_RING_TYPES(blkif_v2, struct blkif_v2_request, struct blkif_v2_response);
+
+union blkif_back_rings {
+	blkif_co_back_ring_t co;
+	blkif_v1_back_ring_t v1;
+	blkif_v2_back_ring_t v2;
+};
+typedef union blkif_back_rings blkif_back_rings_t;
+
+#if defined(__i386__)
+# define BLKIF_NATIVE_PROTOCOL 1
+#elif defined(__x86_64__) || defined(__ia64__)
+# define BLKIF_NATIVE_PROTOCOL 2
+#else
+# error arch fixup needed here
+#endif
+
+/* translate requests: v1/v2 to native */
+#if 1 == BLKIF_NATIVE_PROTOCOL
+static void inline blkif_get_v1_req(blkif_request_t *dst, blkif_v1_request_t *src)
+#else
+static void inline blkif_get_v2_req(blkif_request_t *dst, blkif_v2_request_t *src)
+#endif
+{
+	memcpy(dst, src, sizeof(*dst));
+}
+
+#if 1 == BLKIF_NATIVE_PROTOCOL
+static void inline blkif_get_v2_req(blkif_request_t *dst, blkif_v2_request_t *src)
+#else
+static void inline blkif_get_v1_req(blkif_request_t *dst, blkif_v1_request_t *src)
+#endif
+{
+	int i;
+	dst->operation = src->operation;
+	dst->nr_segments = src->nr_segments;
+	dst->handle = src->handle;
+	dst->id = src->id;
+	dst->sector_number = src->sector_number;
+	for (i = 0; i < src->nr_segments; i++)
+		dst->seg[i] = src->seg[i];
+}
+
+#endif /* __XEN_BLKIF_H__ */

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
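The reason the header above needs two request structs is that a 64-bit `id` field is aligned to 4 bytes in the i386 ABI but to 8 bytes on x86_64, so the same field list produces two different ring layouts. The following is only a rough sketch of that effect, assuming GCC or Clang: the `demo_*` names are hypothetical, `uint16_t` and `uint64_t` stand in for `blkif_vdev_t` and `blkif_sector_t`, and the segment array is left out to keep the offsets easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed stand-in for the i386 ("v1") request header.
 * pack(4) caps member alignment at 4 bytes, as on 32-bit x86. */
#pragma pack(push, 4)
struct demo_v1_request {
    uint8_t  operation;
    uint8_t  nr_segments;
    uint16_t handle;         /* assumption: stands in for blkif_vdev_t   */
    uint64_t id;             /* lands at offset 4                        */
    uint64_t sector_number;  /* assumption: stands in for blkif_sector_t */
};
#pragma pack(pop)

/* Hypothetical, trimmed stand-in for the x86_64 ("v2") request header.
 * The explicit attribute forces 8-byte alignment of id on any host. */
struct demo_v2_request {
    uint8_t  operation;
    uint8_t  nr_segments;
    uint16_t handle;
    uint64_t __attribute__((__aligned__(8))) id;  /* lands at offset 8 */
    uint64_t sector_number;
};

/* Field-by-field copy, in the spirit of the non-native branch of
 * blkif_get_v1_req()/blkif_get_v2_req(): a memcpy would be wrong here
 * because the two layouts disagree from the id field onwards. */
static inline void demo_v1_to_v2(struct demo_v2_request *dst,
                                 const struct demo_v1_request *src)
{
    dst->operation     = src->operation;
    dst->nr_segments   = src->nr_segments;
    dst->handle        = src->handle;
    dst->id            = src->id;
    dst->sector_number = src->sector_number;
}
```

Checking `offsetof(..., id)` on both structs shows the 4-byte shift directly. In the real header the shift also changes the size of each ring slot, which is presumably why the backend has to select the ring type before calling `BACK_RING_INIT`.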
* Re: Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-20 15:12 ` Gerd Hoffmann
@ 2006-12-20 15:47 ` Jan Beulich
2006-12-20 16:07 ` Keir Fraser
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2006-12-20 15:47 UTC (permalink / raw)
To: Gerd Hoffmann, Keir Fraser; +Cc: Xen devel list

Looks a lot nicer now, except that I dislike the replication of the
request/response structures, now in three places. Besides possibly being
a maintenance issue, this now seems worse in terms of scaling up to
eventual future protocol versions. I had understood Keir in a much
different way: adding a compiler abstraction header (which maybe could
even make use of Linux' native ones) to include/xen, and making use of
its abstraction directly in xen/include/public/io/blkif.h.

Jan

>>> Gerd Hoffmann <kraxel@suse.de> 20.12.06 16:12 >>>
Hi,

> I think it would be reasonable to put this stuff in a (new) Linux-specific
> header file that wraps the Xen-public blkif.h. We could put just enough
> support in blkif.h itself to allow it to be multiply-compiled. Then
> different OSes can wrap or rewrite blkif.h as they see fit to get the
> required layout for 32-bit and 64-bit ABIs. This conveniently sidesteps some
> of these issues and allows you to concentrate on Linux and GCC, while not
> constraining the implementation choices for anyone else.

Next iteration with exactly that implemented, this time even tested
with blkback in all four combinations of 32/64 dom0 and 32/64 guest.

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>
* Re: Re: [patch/rfc] multiprotocol blkback drivers (32-on-64)
2006-12-20 15:47 ` Jan Beulich
@ 2006-12-20 16:07 ` Keir Fraser
0 siblings, 0 replies; 12+ messages in thread
From: Keir Fraser @ 2006-12-20 16:07 UTC (permalink / raw)
To: Jan Beulich, Gerd Hoffmann, Keir Fraser; +Cc: Xen devel list

On 20/12/06 15:47, "Jan Beulich" <jbeulich@novell.com> wrote:

> Looks a lot nicer now, except that I dislike the replication of the
> request/response structures in now three places. Besides possibly being
> a maintenance issue, this now seems worse in terms of scaling up to
> eventual future protocol versions. I had understood Keir in a much
> different way - adding a compiler abstraction header (which maybe could
> even make use of Linux' native ones) to include/xen, and making use of
> its abstraction directly in xen/include/public/io/blkif.h.

I quite like the explicitness of this approach. The duplication is small
and the structures aren't going to change (without changing protocol
version too) so maintenance is not that much of an issue. It would be
nicer to put the structure definition inside a macro, instantiated
multiple times, or inside another header file, included multiple times,
though.

My other suggested approach is slightly hindered by the fact that all the
Xen public headers are include-only-once, although we could circumvent
that for this case I suppose. It's also questionable whether this can be
done really cleanly in a way that most compilers can adapt to -- do most
have the kind of flexible packing pragmas that GCC has?

Xen/include/public/io/blkif.h changes infrequently enough that we could
define linux/include/xen/blkif.h to be Linux's version and not include
the Xen-provided one directly at all. The Xen headers need Linux
formatting anyway, so it's not like the Linux blkif.h will ever be a
direct copy of the original Xen-provided blkif.h.

 -- Keir
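One way to read the "structure definition inside a macro, instantiated multiple times" idea is sketched below. This is an illustration only, not code from any posted patch: `DEMO_BLKIF_RESPONSE`, `DEMO_NO_ATTR` and the `demo_*` struct names are hypothetical, and the packing pragma and alignment attribute assume GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: the field list is written once, and each
 * instantiation supplies the struct name and the attribute (if any)
 * to apply to the 64-bit id field. */
#define DEMO_BLKIF_RESPONSE(name, id_attr)                       \
    struct name {                                                \
        uint64_t id_attr id;     /* copied from request */       \
        uint8_t  operation;      /* copied from request */       \
        int16_t  status;         /* BLKIF_RSP_???       */       \
    }

/* Expands to nothing: "no extra attribute on id". */
#define DEMO_NO_ATTR

/* i386-style layout: member alignment capped at 4 bytes. */
#pragma pack(push, 4)
DEMO_BLKIF_RESPONSE(demo_v1_response, DEMO_NO_ATTR);
#pragma pack(pop)

/* x86_64-style layout: id explicitly aligned to 8 bytes. */
DEMO_BLKIF_RESPONSE(demo_v2_response, __attribute__((__aligned__(8))));
```

With this shape a new protocol version would be one more instantiation rather than one more copy of the struct body. Every field keeps the same offset in both response variants; only the trailing padding, and therefore `sizeof`, differs. The portability question raised above still applies, since the sketch leans on GCC-style pragmas and attributes.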
Thread overview: 12+ messages
2006-12-18 16:39 [patch/rfc] multiprotocol blkback drivers (32-on-64) Gerd Hoffmann
2006-12-18 17:09 ` Jan Beulich
2006-12-18 17:58 ` Keir Fraser
2006-12-19 7:37 ` Jan Beulich
2006-12-19 8:20 ` Gerd Hoffmann
2006-12-19 7:55 ` Jan Beulich
2006-12-19 8:35 ` Gerd Hoffmann
2006-12-19 13:32 ` Gerd Hoffmann
2006-12-19 14:20 ` Keir Fraser
2006-12-20 15:12 ` Gerd Hoffmann
2006-12-20 15:47 ` Jan Beulich
2006-12-20 16:07 ` Keir Fraser