* [PATCH 1/3] Virtio draft IV
@ 2007-07-04 4:12 Rusty Russell
2007-07-04 4:19 ` [PATCH 2/3] Virtio draft IV: the block driver Rusty Russell
0 siblings, 1 reply; 21+ messages in thread
From: Rusty Russell @ 2007-07-04 4:12 UTC (permalink / raw)
To: virtualization; +Cc: Carsten Otte, Herbert Xu
In response to Avi's excellent analysis, I've updated virtio as promised
(apologies for the delay, travel got in the way).
===
This attempts to implement a "virtual I/O" layer which should allow
common drivers to be efficiently used across most virtual I/O
mechanisms. It will no doubt need further enhancement.
The details of probing the device are left to hypervisor-specific
code: it simply constructs the "struct virtio_device" and hands it to
the probe function (eg. virtnet_probe() or virtblk_probe()).
The virtio drivers add and get I/O buffers; as the buffers are consumed
the driver "interrupt" callbacks are invoked.
I have written two virtio device drivers (net and block) and two
virtio implementations (for lguest): a read-write socket-style
implementation, and a more efficient descriptor-based implementation.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
include/linux/virtio.h | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)
===================================================================
--- /dev/null
+++ b/include/linux/virtio.h
@@ -0,0 +1,64 @@
+#ifndef _LINUX_VIRTIO_H
+#define _LINUX_VIRTIO_H
+#include <linux/types.h>
+#include <linux/scatterlist.h>
+#include <linux/spinlock.h>
+
+/**
+ * virtqueue - queue for virtual I/O
+ * @ops: the operations for this virtqueue.
+ * @cb: set by the driver for callbacks.
+ * @priv: private pointer for the driver to use.
+ */
+struct virtqueue {
+ struct virtqueue_ops *ops;
+ bool (*cb)(struct virtqueue *vq);
+ void *priv;
+};
+
+/**
+ * virtqueue_ops - operations for virtqueue abstraction layer
+ * @add_buf: expose buffer to other end
+ * vq: the struct virtqueue we're talking about.
+ * sg: the description of the buffer(s).
+ * out_num: the number of sg readable by other side
+ * in_num: the number of sg which are writable (after readable ones)
+ * data: the token identifying the buffer.
+ * Returns 0 or an error.
+ * @sync: update after add_buf
+ * vq: the struct virtqueue
+ * After one or more add_buf calls, invoke this to kick the virtio layer.
+ * @get_buf: get the next used buffer
+ * vq: the struct virtqueue we're talking about.
+ * len: the length written into the buffer
+ * Returns NULL or the "data" token handed to add_buf.
+ * @detach_buf: "unadd" an unused buffer.
+ * vq: the struct virtqueue we're talking about.
+ * data: the buffer identifier.
+ * This is usually used for shutdown; returns 0 or -ENOENT.
+ * @restart: restart callbacks after the callback returned false.
+ * vq: the struct virtqueue we're talking about.
+ * This returns "false" (and doesn't re-enable) if there are pending
+ * buffers in the queue, to avoid a race.
+ *
+ * Locking rules are straightforward: the driver is responsible for
+ * locking. No two operations may be invoked simultaneously.
+ *
+ * All operations can be called in any context.
+ */
+struct virtqueue_ops {
+ int (*add_buf)(struct virtqueue *vq,
+ struct scatterlist sg[],
+ unsigned int out_num,
+ unsigned int in_num,
+ void *data);
+
+ void (*sync)(struct virtqueue *vq);
+
+ void *(*get_buf)(struct virtqueue *vq, unsigned int *len);
+
+ int (*detach_buf)(struct virtqueue *vq, void *data);
+
+ bool (*restart)(struct virtqueue *vq);
+};
+#endif /* _LINUX_VIRTIO_H */
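To make the contract above concrete, here is a minimal userspace mock of the virtqueue round trip (no kernel headers; `struct mock_vq`, `mock_host_consume()` and the fixed ring size are invented for illustration, not part of the patch). A driver exposes a token with `add_buf`, the other side consumes it, and `get_buf` later hands the same token back:

```c
#include <stddef.h>

#define MOCK_RING 8

/* Userspace mock of the virtqueue contract: tokens go in via add_buf()
 * and come back, once "consumed" by the other side, via get_buf(). */
struct mock_vq {
	void *pending[MOCK_RING];	/* tokens exposed, not yet consumed */
	unsigned int npending;
	void *used[MOCK_RING];		/* tokens the "host" has consumed */
	unsigned int nused;
};

/* Expose a buffer token; returns 0, or -1 if the ring is full. */
static int mock_add_buf(struct mock_vq *vq, void *data)
{
	if (vq->npending + vq->nused == MOCK_RING)
		return -1;
	vq->pending[vq->npending++] = data;
	return 0;
}

/* Stand-in for the hypervisor side: consume every pending buffer. */
static void mock_host_consume(struct mock_vq *vq)
{
	unsigned int i;

	for (i = 0; i < vq->npending; i++)
		vq->used[vq->nused++] = vq->pending[i];
	vq->npending = 0;
}

/* Return the next used token, or NULL if nothing has completed yet.
 * (This mock pops in LIFO order; real completion order is up to the
 * implementation.) */
static void *mock_get_buf(struct mock_vq *vq)
{
	if (vq->nused == 0)
		return NULL;
	return vq->used[--vq->nused];
}
```

Note the shape matches the locking rule above: nothing here is reentrant, so the caller must serialize all three operations, exactly as the real drivers do with their own locks.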
^ permalink raw reply [flat|nested] 21+ messages in thread* [PATCH 2/3] Virtio draft IV: the block driver 2007-07-04 4:12 [PATCH 1/3] Virtio draft IV Rusty Russell @ 2007-07-04 4:19 ` Rusty Russell 2007-07-04 4:40 ` [PATCH 3/3] Virtio draft IV: the net driver Rusty Russell ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Rusty Russell @ 2007-07-04 4:19 UTC (permalink / raw) To: virtualization; +Cc: Carsten Otte, Herbert Xu, Jens Axboe The block driver uses scatter-gather lists with sg[0] being the request information (struct virtio_blk_outhdr) with the type, sector and inbuf id. The next N sg entries are the bio itself, then the last sg is the status byte. Whether the N entries are in or out depends on whether it's a read or a write. We accept the normal (SCSI) ioctls: they get handed through to the other side which can then handle it or reply that it's unsupported. It's not clear that this actually works in general, since I don't know if blk_pc_request() requests have an accurate rq_data_dir(). Although we try to reply -ENOTTY on unsupported commands, the block layer in its infinite wisdom suppressed the error so ioctl(fd, CDROMEJECT) returns success to userspace. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> --- drivers/block/Makefile | 1 drivers/block/virtio_blk.c | 253 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/Kbuild | 1 include/linux/virtio_blk.h | 45 +++++++ 4 files changed, 300 insertions(+) =================================================================== --- a/drivers/block/Makefile +++ b/drivers/block/Makefile @@ -20,6 +20,7 @@ obj-$(CONFIG_BLK_CPQ_CISS_DA) += cciss. 
obj-$(CONFIG_BLK_CPQ_CISS_DA) += cciss.o obj-$(CONFIG_BLK_DEV_DAC960) += DAC960.o obj-$(CONFIG_CDROM_PKTCDVD) += pktcdvd.o +obj-y += virtio_blk.o obj-$(CONFIG_BLK_DEV_UMEM) += umem.o obj-$(CONFIG_BLK_DEV_NBD) += nbd.o =================================================================== --- /dev/null +++ b/drivers/block/virtio_blk.c @@ -0,0 +1,253 @@ +#define DEBUG +#include <linux/spinlock.h> +#include <linux/blkdev.h> +#include <linux/hdreg.h> +#include <linux/virtio.h> +#include <linux/virtio_blk.h> + +static unsigned char virtblk_index = 'a'; +struct virtio_blk +{ + spinlock_t lock; + + struct virtqueue *vq; + + /* The disk structure for the kernel. */ + struct gendisk *disk; + + /* Request tracking. */ + struct list_head reqs; + + mempool_t *pool; + + /* Scatterlist: can be too big for stack. */ + struct scatterlist sg[3+MAX_PHYS_SEGMENTS]; +}; + +struct virtblk_req +{ + struct list_head list; + struct request *req; + struct virtio_blk_outhdr out_hdr; + struct virtio_blk_inhdr in_hdr; +}; + +static void end_dequeued_request(struct request *req, + request_queue_t *q, int uptodate) +{ + /* And so the insanity of the block layer infects us here. 
*/ + int nsectors = req->hard_nr_sectors; + + if (blk_pc_request(req)) { + nsectors = (req->data_len + 511) >> 9; + if (!nsectors) + nsectors = 1; + } + if (end_that_request_first(req, uptodate, nsectors)) + BUG(); + add_disk_randomness(req->rq_disk); + end_that_request_last(req, uptodate); +} + +static bool blk_done(struct virtqueue *vq) +{ + struct virtio_blk *vblk = vq->priv; + struct virtblk_req *vbr; + unsigned int len; + unsigned long flags; + + spin_lock_irqsave(&vblk->lock, flags); + while ((vbr = vq->ops->get_buf(vq, &len)) != NULL) { + int uptodate; + switch (vbr->in_hdr.status) { + case VIRTIO_BLK_S_OK: + uptodate = 1; + break; + case VIRTIO_BLK_S_UNSUPP: + uptodate = -ENOTTY; + break; + default: + uptodate = 0; + break; + } + + end_dequeued_request(vbr->req, vblk->disk->queue, uptodate); + list_del(&vbr->list); + mempool_free(vbr, vblk->pool); + } + /* In case queue is stopped waiting for more buffers. */ + blk_start_queue(vblk->disk->queue); + spin_unlock_irqrestore(&vblk->lock, flags); + return true; +} + +static bool do_req(request_queue_t *q, struct virtio_blk *vblk, + struct virtblk_req *vbr) +{ + unsigned long num, out_num, in_num; + + if (blk_fs_request(vbr->req)) { + vbr->out_hdr.type = 0; + vbr->out_hdr.sector = vbr->req->sector; + vbr->out_hdr.ioprio = vbr->req->ioprio; + } else if (blk_pc_request(vbr->req)) { + vbr->out_hdr.type = VIRTIO_BLK_T_SCSI_CMD; + vbr->out_hdr.sector = 0; + vbr->out_hdr.ioprio = vbr->req->ioprio; + } else { + /* We don't put anything else in the queue. 
*/ + BUG(); + } + + if (blk_barrier_rq(vbr->req)) + vbr->out_hdr.type |= VIRTIO_BLK_T_BARRIER; + + vblk->sg[0].page = virt_to_page(&vbr->out_hdr); + vblk->sg[0].offset = offset_in_page(&vbr->out_hdr); + vblk->sg[0].length = sizeof(vbr->out_hdr); + num = blk_rq_map_sg(q, vbr->req, vblk->sg+1); + vblk->sg[num+1].page = virt_to_page(&vbr->in_hdr); + vblk->sg[num+1].offset = offset_in_page(&vbr->in_hdr); + vblk->sg[num+1].length = sizeof(vbr->in_hdr); + + if (rq_data_dir(vbr->req) == WRITE) { + vbr->out_hdr.type |= VIRTIO_BLK_T_OUT; + out_num = 1 + num; + in_num = 1; + } else { + vbr->out_hdr.type |= VIRTIO_BLK_T_IN; + out_num = 1; + in_num = 1 + num; + } + + if (vblk->vq->ops->add_buf(vblk->vq, vblk->sg, out_num, in_num, vbr)) + return false; + + list_add_tail(&vbr->list, &vblk->reqs); + return true; +} + +static void do_virtblk_request(request_queue_t *q) +{ + struct virtio_blk *vblk = NULL; + struct request *req; + struct virtblk_req *vbr; + + while ((req = elv_next_request(q)) != NULL) { + vblk = req->rq_disk->private_data; + + vbr = mempool_alloc(vblk->pool, GFP_ATOMIC); + if (!vbr) + goto stop; + + BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg)); + vbr->req = req; + if (!do_req(q, vblk, vbr)) + goto stop; + blkdev_dequeue_request(req); + } + +sync: + if (vblk) + vblk->vq->ops->sync(vblk->vq); + return; + +stop: + /* Queue full? Wait. 
*/ + blk_stop_queue(q); + mempool_free(vbr, vblk->pool); + goto sync; +} + +static int virtblk_ioctl(struct inode *inode, struct file *filp, + unsigned cmd, unsigned long data) +{ + return scsi_cmd_ioctl(filp, inode->i_bdev->bd_disk, cmd, + (void __user *)data); +} + +static struct block_device_operations virtblk_fops = { + .ioctl = virtblk_ioctl, + .owner = THIS_MODULE, +}; + +struct gendisk *virtblk_probe(struct virtqueue *vq) +{ + struct virtio_blk *vblk; + int err, major; + + vblk = kmalloc(sizeof(*vblk), GFP_KERNEL); + if (!vblk) { + err = -ENOMEM; + goto out; + } + + INIT_LIST_HEAD(&vblk->reqs); + spin_lock_init(&vblk->lock); + vblk->vq = vq; + vq->cb = blk_done; + vq->priv = vblk; + + vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req)); + if (!vblk->pool) { + err = -ENOMEM; + goto out_free_vblk; + } + + major = register_blkdev(0, "virtblk"); + if (major < 0) { + err = major; + goto out_mempool; + } + + /* FIXME: How many partitions? How long is a piece of string? */ + vblk->disk = alloc_disk(1 << 3); + if (!vblk->disk) { + err = -ENOMEM; + goto out_unregister_blkdev; + } + + vblk->disk->queue = blk_init_queue(do_virtblk_request, &vblk->lock); + if (!vblk->disk->queue) { + err = -ENOMEM; + goto out_put_disk; + } + + sprintf(vblk->disk->disk_name, "vd%c", virtblk_index++); + vblk->disk->major = major; + vblk->disk->first_minor = 0; + vblk->disk->private_data = vblk; + vblk->disk->fops = &virtblk_fops; + + blk_queue_ordered(vblk->disk->queue, QUEUE_ORDERED_TAG, NULL); + + /* Caller can do blk_queue_max_hw_segments(), set_capacity() + * etc then add_disk(). 
*/ + return vblk->disk; + +out_put_disk: + put_disk(vblk->disk); +out_unregister_blkdev: + unregister_blkdev(major, "virtblk"); +out_mempool: + mempool_destroy(vblk->pool); +out_free_vblk: + kfree(vblk); +out: + return ERR_PTR(err); +} +EXPORT_SYMBOL_GPL(virtblk_probe); + +void virtblk_remove(struct gendisk *disk) +{ + struct virtio_blk *vblk = disk->private_data; + int major = vblk->disk->major; + + BUG_ON(!list_empty(&vblk->reqs)); + blk_cleanup_queue(vblk->disk->queue); + put_disk(vblk->disk); + unregister_blkdev(major, "virtblk"); + mempool_destroy(vblk->pool); + kfree(vblk); +} +EXPORT_SYMBOL_GPL(virtblk_remove); =================================================================== --- a/include/linux/Kbuild +++ b/include/linux/Kbuild @@ -341,6 +341,7 @@ unifdef-y += utsname.h unifdef-y += utsname.h unifdef-y += videodev2.h unifdef-y += videodev.h +unifdef-y += virtio_blk.h unifdef-y += wait.h unifdef-y += wanrouter.h unifdef-y += watchdog.h =================================================================== --- /dev/null +++ b/include/linux/virtio_blk.h @@ -0,0 +1,45 @@ +#ifndef _LINUX_VIRTIO_BLK_H +#define _LINUX_VIRTIO_BLK_H +#include <linux/types.h> + +/* These two define direction. */ +#define VIRTIO_BLK_T_IN 0 +#define VIRTIO_BLK_T_OUT 1 + +/* This bit says it's a scsi command, not an actual read or write. */ +#define VIRTIO_BLK_T_SCSI_CMD 2 + +/* Barrier before this op. */ +#define VIRTIO_BLK_T_BARRIER 0x80000000 + +/* This is the first element of the read scatter-gather list. */ +struct virtio_blk_outhdr +{ + /* VIRTIO_BLK_T* */ + __u32 type; + /* io priority. */ + __u32 ioprio; + /* Sector (ie. 512 byte offset) */ + __u64 sector; + /* Where to put reply. 
*/ + __u64 id; +}; + +#define VIRTIO_BLK_S_OK 0 +#define VIRTIO_BLK_S_IOERR 1 +#define VIRTIO_BLK_S_UNSUPP 2 + +/* This is the first element of the write scatter-gather list */ +struct virtio_blk_inhdr +{ + unsigned char status; +}; + +#ifdef __KERNEL__ +struct gendisk; +struct virtio_device; + +struct gendisk *virtblk_probe(struct virtqueue *vq); +void virtblk_remove(struct gendisk *disk); +#endif /* __KERNEL__ */ +#endif /* _LINUX_VIRTIO_BLK_H */ ^ permalink raw reply [flat|nested] 21+ messages in thread
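The sg accounting described above (sg[0] is always the host-readable header, the last sg is always the host-writable status byte, and the N data segments swing to one side or the other with the request direction) can be sketched as a pure function. `blk_sg_counts()` is an invented helper mirroring the arithmetic in `do_req()`, not code from the patch:

```c
/* Counts of readable/writable sg entries for a virtio block request:
 * header (out) + N data segments + status byte (in). */
struct sg_counts {
	unsigned int out_num;	/* sgs the host may read */
	unsigned int in_num;	/* sgs the host may write */
};

static struct sg_counts blk_sg_counts(unsigned int data_segs, int is_write)
{
	struct sg_counts c;

	if (is_write) {
		c.out_num = 1 + data_segs;	/* outhdr + data */
		c.in_num = 1;			/* status byte */
	} else {
		c.out_num = 1;			/* outhdr only */
		c.in_num = 1 + data_segs;	/* data + status byte */
	}
	return c;
}
```

This is also why the sg array in `struct virtio_blk` is sized `3 + MAX_PHYS_SEGMENTS`: header, status, and worst-case data segments.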
* [PATCH 3/3] Virtio draft IV: the net driver 2007-07-04 4:19 ` [PATCH 2/3] Virtio draft IV: the block driver Rusty Russell @ 2007-07-04 4:40 ` Rusty Russell 2007-07-11 10:28 ` Christian Borntraeger 2007-07-11 10:45 ` Christian Borntraeger 2007-07-05 7:32 ` [PATCH 2/3] Virtio draft IV: the block driver Christian Borntraeger 2007-07-23 11:13 ` Christian Borntraeger 2 siblings, 2 replies; 21+ messages in thread From: Rusty Russell @ 2007-07-04 4:40 UTC (permalink / raw) To: virtualization; +Cc: Carsten Otte, Herbert Xu, David S. Miller The network driver uses *two* virtqueues: one for input packets and one for output packets. This has nice locking properties (ie. we don't do any for recv vs send). TODO: 1) GSO. 2) Checksum options. 3) Big packets. 4) Multi-client devices (maybe separate driver?). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> --- drivers/net/Makefile | 2 drivers/net/virtio_net.c | 276 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/virtio_net.h | 15 ++ 3 files changed, 292 insertions(+), 1 deletion(-) =================================================================== --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -37,7 +37,7 @@ obj-$(CONFIG_CASSINI) += cassini.o obj-$(CONFIG_MACE) += mace.o obj-$(CONFIG_BMAC) += bmac.o - +obj-y += virtio_net.o obj-$(CONFIG_DGRS) += dgrs.o obj-$(CONFIG_VORTEX) += 3c59x.o obj-$(CONFIG_TYPHOON) += typhoon.o =================================================================== --- /dev/null +++ b/drivers/net/virtio_net.c @@ -0,0 +1,276 @@ +/* A simple network driver using virtio. + * + * Copyright 2007 Rusty Russell <rusty@rustcorp.com.au> IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ +//#define DEBUG +#include <linux/netdevice.h> +#include <linux/etherdevice.h> +#include <linux/module.h> +#include <linux/virtio.h> +#include <linux/scatterlist.h> + +/* FIXME: Make dynamic */ +#define MAX_PACKET_LEN (ETH_HLEN+ETH_DATA_LEN) + +struct virtnet_info +{ + struct virtqueue *vq_recv; + struct virtqueue *vq_send; + struct net_device *ndev; + + /* Number of input buffers, and max we've ever had. */ + unsigned int num, max; + + /* Receive & send queues. */ + struct sk_buff_head recv; + struct sk_buff_head send; +}; + +static bool skb_xmit_done(struct virtqueue *vq) +{ + struct virtnet_info *vi = vq->priv; + + /* In case we were waiting for output buffers. 
*/ + netif_wake_queue(vi->ndev); + return true; +} + +static void receive_skb(struct net_device *dev, struct sk_buff *skb, + unsigned len) +{ + if (unlikely(len < ETH_HLEN)) { + pr_debug("%s: short packet %i\n", dev->name, len); + dev->stats.rx_length_errors++; + dev_kfree_skb(skb); + return; + } + BUG_ON(len > MAX_PACKET_LEN); + + skb_trim(skb, len); + skb->protocol = eth_type_trans(skb, dev); + pr_debug("Receiving skb proto 0x%04x len %i type %i\n", + ntohs(skb->protocol), skb->len, skb->pkt_type); + dev->stats.rx_bytes += skb->len; + dev->stats.rx_packets++; + netif_rx(skb); +} + +static void try_fill_recv(struct virtnet_info *vi) +{ + struct sk_buff *skb; + struct scatterlist sg[MAX_SKB_FRAGS]; + int num, err; + + for (;;) { + skb = netdev_alloc_skb(vi->ndev, MAX_PACKET_LEN); + if (unlikely(!skb)) + break; + + skb_put(skb, MAX_PACKET_LEN); + num = skb_to_sgvec(skb, sg, 0, skb->len); + skb_queue_head(&vi->recv, skb); + + err = vi->vq_recv->ops->add_buf(vi->vq_recv, sg, 0, num, skb); + if (err) { + skb_unlink(skb, &vi->recv); + kfree_skb(skb); + break; + } + vi->num++; + } + if (unlikely(vi->num > vi->max)) + vi->max = vi->num; + vi->vq_recv->ops->sync(vi->vq_recv); +} + +static bool skb_recv_done(struct virtqueue *vq) +{ + struct virtnet_info *vi = vq->priv; + + netif_rx_schedule(vi->ndev); + /* Suppress further interrupts. */ + return false; +} + +static int virtnet_poll(struct net_device *dev, int *budget) +{ + struct virtnet_info *vi = netdev_priv(dev); + struct sk_buff *skb = NULL; + unsigned int len, received = 0; + +again: + while (received < dev->quota && + (skb = vi->vq_recv->ops->get_buf(vi->vq_recv, &len)) != NULL) { + __skb_unlink(skb, &vi->recv); + receive_skb(vi->ndev, skb, len); + vi->num--; + received++; + } + + dev->quota -= received; + *budget -= received; + + /* FIXME: If we oom and completely run out of inbufs, we need + * to start a timer trying to fill more. */ + if (vi->num < vi->max / 2) + try_fill_recv(vi); + + /* Still more work to do? 
*/ + if (skb) + return 1; /* not done */ + + netif_rx_complete(dev); + if (unlikely(!vi->vq_recv->ops->restart(vi->vq_recv)) + && netif_rx_reschedule(dev, received)) + goto again; + + return 0; +} + +static void free_old_xmit_skbs(struct virtnet_info *vi) +{ + struct sk_buff *skb; + unsigned int len; + + while ((skb = vi->vq_send->ops->get_buf(vi->vq_send, &len)) != NULL) { + /* They cannot have written to the packet. */ + BUG_ON(len != 0); + pr_debug("Sent skb %p\n", skb); + __skb_unlink(skb, &vi->send); + vi->ndev->stats.tx_bytes += skb->len; + vi->ndev->stats.tx_packets++; + kfree_skb(skb); + } +} + +static int start_xmit(struct sk_buff *skb, struct net_device *dev) +{ + struct virtnet_info *vi = netdev_priv(dev); + int num, err; + struct scatterlist sg[MAX_SKB_FRAGS]; + const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest; + + pr_debug("%s: xmit %p %02x:%02x:%02x:%02x:%02x:%02x\n", + dev->name, skb, + dest[0], dest[1], dest[2], dest[3], dest[4], dest[5]); + + free_old_xmit_skbs(vi); + + num = skb_to_sgvec(skb, sg, 0, skb->len); + __skb_queue_head(&vi->send, skb); + err = vi->vq_send->ops->add_buf(vi->vq_send, sg, num, 0, skb); + if (err) { + pr_debug("%s: virtio not prepared to send\n", dev->name); + skb_unlink(skb, &vi->send); + netif_stop_queue(dev); + return NETDEV_TX_BUSY; + } + vi->vq_send->ops->sync(vi->vq_send); + + return 0; +} + +static int virtnet_open(struct net_device *dev) +{ + struct virtnet_info *vi = netdev_priv(dev); + + try_fill_recv(vi); + + /* If we didn't even get one input buffer, we're useless. */ + if (vi->num == 0) + return -ENOMEM; + + return 0; +} + +static int virtnet_close(struct net_device *dev) +{ + struct virtnet_info *vi = netdev_priv(dev); + struct sk_buff *skb; + + /* networking core has neutered skb_xmit_done/skb_recv_done, so don't + * worry about races vs. get(). 
*/ + while ((skb = __skb_dequeue(&vi->recv)) != NULL) { + vi->vq_recv->ops->detach_buf(vi->vq_recv, skb); + kfree_skb(skb); + vi->num--; + } + while ((skb = __skb_dequeue(&vi->send)) != NULL) { + vi->vq_send->ops->detach_buf(vi->vq_send, skb); + kfree_skb(skb); + } + BUG_ON(vi->num != 0); + return 0; +} + +struct net_device *virtnet_probe(struct virtqueue *vq_recv, + struct virtqueue *vq_send, + struct device *device, + const u8 mac[ETH_ALEN]) +{ + int err; + struct net_device *dev; + struct virtnet_info *vi; + + dev = alloc_etherdev(sizeof(struct virtnet_info)); + if (!dev) + return ERR_PTR(-ENOMEM); + + SET_MODULE_OWNER(dev); + + ether_setup(dev); + memcpy(dev->dev_addr, mac, ETH_ALEN); + dev->open = virtnet_open; + dev->stop = virtnet_close; + dev->poll = virtnet_poll; + dev->hard_start_xmit = start_xmit; + dev->weight = 16; + SET_NETDEV_DEV(dev, device); + + vi = netdev_priv(dev); + vi->ndev = dev; + vi->vq_recv = vq_recv; + vi->vq_send = vq_send; + vq_recv->cb = skb_recv_done; + vq_send->cb = skb_xmit_done; + vq_recv->priv = vq_send->priv = vi; + skb_queue_head_init(&vi->recv); + skb_queue_head_init(&vi->send); + + err = register_netdev(dev); + if (err) { + pr_debug("virtio_net: registering device failed\n"); + goto free; + } + pr_debug("virtnet: registered device %s\n", dev->name); + return dev; + +free: + free_netdev(dev); + return ERR_PTR(err); +} +EXPORT_SYMBOL_GPL(virtnet_probe); + +void virtnet_remove(struct net_device *dev) +{ + unregister_netdev(dev); + free_netdev(dev); +} +EXPORT_SYMBOL_GPL(virtnet_remove); + +MODULE_DESCRIPTION("Virtio network driver"); +MODULE_LICENSE("GPL"); =================================================================== --- /dev/null +++ b/include/linux/virtio_net.h @@ -0,0 +1,15 @@ +#ifndef _LINUX_VIRTIO_NET_H +#define _LINUX_VIRTIO_NET_H +#include <linux/types.h> +#include <linux/etherdevice.h> +struct device; +struct net_device; +struct virtqueue; + +struct net_device *virtnet_probe(struct virtqueue *vq_recv, + struct 
virtqueue *vq_send, + struct device *dev, + const u8 mac[ETH_ALEN]); +void virtnet_remove(struct net_device *dev); + +#endif /* _LINUX_VIRTIO_NET_H */ ^ permalink raw reply [flat|nested] 21+ messages in thread
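The poll loop in the net driver leans on the `restart` semantics documented in patch 1: re-enable callbacks, but return false if buffers completed while callbacks were off, so the driver polls again instead of sleeping through work. A hedged userspace mock of that race check (all names invented):

```c
/* Mock of the restart() race check: re-enable callbacks, but refuse
 * (return 0) if buffers completed while callbacks were disabled, so
 * the caller keeps polling rather than waiting for an interrupt that
 * may never come. */
struct mock_ring {
	unsigned int used;	/* buffers completed by the host */
	unsigned int last_seen;	/* buffers the driver has reaped */
	int cb_enabled;
};

static int mock_restart(struct mock_ring *r)
{
	r->cb_enabled = 1;
	if (r->used != r->last_seen) {
		/* Work arrived in the window: stay in polling mode. */
		r->cb_enabled = 0;
		return 0;
	}
	return 1;
}
```

In `virtnet_poll()` this is the `!restart() && netif_rx_reschedule()` pair: a false return sends the driver back around the `again:` loop.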
* Re: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-04 4:40 ` [PATCH 3/3] Virtio draft IV: the net driver Rusty Russell @ 2007-07-11 10:28 ` Christian Borntraeger 2007-07-11 11:26 ` Rusty Russell 2007-07-11 10:45 ` Christian Borntraeger 1 sibling, 1 reply; 21+ messages in thread From: Christian Borntraeger @ 2007-07-11 10:28 UTC (permalink / raw) To: virtualization; +Cc: Carsten Otte, Herbert Xu, David S. Miller Am Mittwoch, 4. Juli 2007 schrieb Rusty Russell: > +static void try_fill_recv(struct virtnet_info *vi) > +{ > + struct sk_buff *skb; > + struct scatterlist sg[MAX_SKB_FRAGS]; > + int num, err; > + > + for (;;) { > + skb = netdev_alloc_skb(vi->ndev, MAX_PACKET_LEN); > + if (unlikely(!skb)) > + break; > + > + skb_put(skb, MAX_PACKET_LEN); > + num = skb_to_sgvec(skb, sg, 0, skb->len); > + skb_queue_head(&vi->recv, skb); > + > + err = vi->vq_recv->ops->add_buf(vi->vq_recv, sg, 0, num, skb); > + if (err) { > + skb_unlink(skb, &vi->recv); > + kfree_skb(skb); > + break; > + } > + vi->num++; > + } Hmm, so it allocates skbs until oom or until add_buf fails, right? Do you expect the add_buf call to fail if we have enough buffers? Who defines the amount of buffers we can add via add_buf? -- IBM Deutschland Entwicklung GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Herbert Kircher Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-11 10:28 ` Christian Borntraeger @ 2007-07-11 11:26 ` Rusty Russell 2007-07-11 11:46 ` Christian Borntraeger 2007-07-11 19:27 ` Caitlin Bestler 0 siblings, 2 replies; 21+ messages in thread From: Rusty Russell @ 2007-07-11 11:26 UTC (permalink / raw) To: Christian Borntraeger Cc: Carsten Otte, David S. Miller, Herbert Xu, virtualization On Wed, 2007-07-11 at 12:28 +0200, Christian Borntraeger wrote: > Am Mittwoch, 4. Juli 2007 schrieb Rusty Russell: > > +static void try_fill_recv(struct virtnet_info *vi) > Hmm, so it allocates skbs until oom or until add_buf fails, right? Yep. > Do you expect the add_buf call to fail if we have enough buffers? Who defines > the amount of buffers we can add via add_buf? There will be some internal limit on how many buffers the virtio implementation supports, but depends on that implementation. It could be a number of buffers or a total number of descriptors. Cheers, Rusty. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-11 11:26 ` Rusty Russell @ 2007-07-11 11:46 ` Christian Borntraeger 2007-07-12 2:23 ` Rusty Russell 2007-07-11 19:27 ` Caitlin Bestler 1 sibling, 1 reply; 21+ messages in thread From: Christian Borntraeger @ 2007-07-11 11:46 UTC (permalink / raw) To: Rusty Russell; +Cc: Carsten Otte, David S. Miller, Herbert Xu, virtualization Am Mittwoch, 11. Juli 2007 schrieb Rusty Russell: > There will be some internal limit on how many buffers the virtio > implementation supports, but depends on that implementation. It could > be a number of buffers or a total number of descriptors. I would suggest to implement a limit in the device driver as well. Otherwise the network driver could allocate a huge amount of guest memory if the virtio implementation accepts a large amount of buffers. This memory is not swappable and reclaimable by the memory management, so we should be careful. So what about something like this: + for (;vi->num < MAX_BUFS;) { + skb = netdev_alloc_skb(vi->ndev, MAX_PACKET_LEN); + if (unlikely(!skb)) + break; + + skb_put(skb, MAX_PACKET_LEN); + num = skb_to_sgvec(skb, sg, 0, skb->len); + skb_queue_head(&vi->recv, skb); + + err = vi->vq_recv->ops->add_buf(vi->vq_recv, sg, 0, num, skb); + if (err) { + skb_unlink(skb, &vi->recv); + kfree_skb(skb); + break; + } + vi->num++ + } Christian ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-11 11:46 ` Christian Borntraeger @ 2007-07-12 2:23 ` Rusty Russell 0 siblings, 0 replies; 21+ messages in thread From: Rusty Russell @ 2007-07-12 2:23 UTC (permalink / raw) To: Christian Borntraeger Cc: Carsten Otte, David S. Miller, Herbert Xu, virtualization On Wed, 2007-07-11 at 13:46 +0200, Christian Borntraeger wrote: > Am Mittwoch, 11. Juli 2007 schrieb Rusty Russell: > > There will be some internal limit on how many buffers the virtio > > implementation supports, but depends on that implementation. It could > > be a number of buffers or a total number of descriptors. > > I would suggest to implement a limit in the device driver as well. Otherwise > the network driver could allocate a huge amount of guest memory if the virtio > implementation accepts a large amount of buffers. This memory is not swappable > and reclaimable by the memory management, so we should be careful. Hi Christian, I think this would be premature, but easy enough to do later if we have such a high-limit virtio implementation. If we did do it, it would make sense to have it a variable passed into the net driver. Thanks, Rusty. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-11 11:26 ` Rusty Russell 2007-07-11 11:46 ` Christian Borntraeger @ 2007-07-11 19:27 ` Caitlin Bestler 1 sibling, 0 replies; 21+ messages in thread From: Caitlin Bestler @ 2007-07-11 19:27 UTC (permalink / raw) To: Rusty Russell, Christian Borntraeger Cc: Carsten Otte, virtualization, David S. Miller, Herbert Xu virtualization-bounces@lists.linux-foundation.org wrote: > On Wed, 2007-07-11 at 12:28 +0200, Christian Borntraeger wrote: >> Am Mittwoch, 4. Juli 2007 schrieb Rusty Russell: >>> +static void try_fill_recv(struct virtnet_info *vi) > >> Hmm, so it allocates skbs until oom or until add_buf fails, right? > > Yep. > >> Do you expect the add_buf call to fail if we have enough buffers? Who >> defines the amount of buffers we can add via add_buf? > > There will be some internal limit on how many buffers the > virtio implementation supports, but depends on that > implementation. It could be a number of buffers or a total number of > descriptors. > I think one of the key tradeoffs here is simplicity versus flexibility. A QP style interface is very explicit about the size of queues (requested and actual) and the size of queue entries (request maximum and actual maximum), current depths, etc. My reading of the proposed interface is that is it much much simpler than all that. This takes flexibility away from the application, that in theory *could* have adapted its behavior to different queue sizes, and gives it to the specific implmentation (that can now do things like variable-length work queue entries if it wants to). The one thing I'd recommend avoiding is starting with a simple interface and then tacking on one bell or whistle at a time. We should either keep it simple or shift to a QP style interface and lean on all the research and negotiations on what attributes are needed that has already taken place for RDMA related networking interfaces. 
It is also a pain for implementations to have to deal with multiple *detailed* queue interface requirements that all try to accomplish the same thing but insist on doing it differently. If things are vague for one interface you have the flexibility to re-use what another interface forced you to do. But unless there are *specific* consumers who say they want specific bells and whistles my hunch would be to stick with the simple interface. Which translates as "if you need to add a buffer, add it, let us worry about it, if you've gone too far we'll tell you." ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-04 4:40 ` [PATCH 3/3] Virtio draft IV: the net driver Rusty Russell 2007-07-11 10:28 ` Christian Borntraeger @ 2007-07-11 10:45 ` Christian Borntraeger 2007-07-11 11:32 ` Rusty Russell 2007-07-11 20:44 ` David Miller 1 sibling, 2 replies; 21+ messages in thread From: Christian Borntraeger @ 2007-07-11 10:45 UTC (permalink / raw) To: virtualization; +Cc: Carsten Otte, Herbert Xu, David S. Miller Am Mittwoch, 4. Juli 2007 schrieb Rusty Russell: > +static void receive_skb(struct net_device *dev, struct sk_buff *skb, [...] > + netif_rx(skb); In the NAPI case, we should use netif_receive_skb, no? Christian ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-11 10:45 ` Christian Borntraeger @ 2007-07-11 11:32 ` Rusty Russell 2007-07-11 20:44 ` David Miller 1 sibling, 0 replies; 21+ messages in thread From: Rusty Russell @ 2007-07-11 11:32 UTC (permalink / raw) To: Christian Borntraeger Cc: Carsten Otte, David S. Miller, Herbert Xu, virtualization On Wed, 2007-07-11 at 12:45 +0200, Christian Borntraeger wrote: > Am Mittwoch, 4. Juli 2007 schrieb Rusty Russell: > > +static void receive_skb(struct net_device *dev, struct sk_buff *skb, > [...] > > + netif_rx(skb); > > In the NAPI case, we should use netif_receive_skb, no? Oops, well spotted! I'm working on GSO and checksum offload support at the moment, will roll that fix into the next release. Thanks! Rusty. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver 2007-07-11 10:45 ` Christian Borntraeger 2007-07-11 11:32 ` Rusty Russell @ 2007-07-11 20:44 ` David Miller 2007-07-12 2:21 ` Rusty Russell 1 sibling, 1 reply; 21+ messages in thread From: David Miller @ 2007-07-11 20:44 UTC (permalink / raw) To: borntraeger; +Cc: cotte, herbert, virtualization From: Christian Borntraeger <borntraeger@de.ibm.com> Date: Wed, 11 Jul 2007 12:45:40 +0200 > Am Mittwoch, 4. Juli 2007 schrieb Rusty Russell: > > +static void receive_skb(struct net_device *dev, struct sk_buff *skb, > [...] > > + netif_rx(skb); > > In the NAPI case, we should use netif_receive_skb, no? NAPI doesn't make sense for virtual devices, my Sun LDOM network driver won't use NAPI either. It's also too cumbersome to use NAPI with the way virtualized network drivers work (multiple ports, each with an interrupt source, not just one) until the NAPI split patches are ported and applied upstream and that won't be for a while. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver
2007-07-11 20:44 ` David Miller
@ 2007-07-12 2:21 ` Rusty Russell
2007-07-12 2:26 ` David Miller
0 siblings, 1 reply; 21+ messages in thread
From: Rusty Russell @ 2007-07-12 2:21 UTC (permalink / raw)
To: David Miller; +Cc: borntraeger, cotte, herbert, virtualization

On Wed, 2007-07-11 at 13:44 -0700, David Miller wrote:
> From: Christian Borntraeger <borntraeger@de.ibm.com>
> Date: Wed, 11 Jul 2007 12:45:40 +0200
>
> > On Wednesday, 4 July 2007, Rusty Russell wrote:
> > > +static void receive_skb(struct net_device *dev, struct sk_buff *skb,
> > [...]
> > > +	netif_rx(skb);
> >
> > In the NAPI case, we should use netif_receive_skb, no?
>
> NAPI doesn't make sense for virtual devices; my Sun LDOM network
> driver won't use NAPI either.
>
> It's also too cumbersome to use NAPI with the way virtualized
> network drivers work (multiple ports, each with an interrupt
> source, not just one) until the NAPI split patches are ported
> and applied upstream, and that won't be for a while.

Dave, I think you're the only one (so far?) with multiple irqs.

It's not clear that guest-controlled interrupt mitigation is the best
approach for virtual devices, but at the moment it doesn't hurt.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] Virtio draft IV: the net driver
2007-07-12 2:21 ` Rusty Russell
@ 2007-07-12 2:26 ` David Miller
0 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2007-07-12 2:26 UTC (permalink / raw)
To: rusty; +Cc: borntraeger, cotte, herbert, virtualization

From: Rusty Russell <rusty@rustcorp.com.au>
Date: Thu, 12 Jul 2007 12:21:33 +1000

> Dave, I think you're the only one (so far?) with multiple irqs.

Luckily there are known hw implementations with that issue, so I won't
be weird for long :)

> It's not clear that guest-controlled interrupt mitigation is the best
> approach for virtual devices, but at the moment it doesn't hurt.

It would be nice for consistency's sake, once it is easy to do so.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] Virtio draft IV: the block driver
2007-07-04 4:19 ` [PATCH 2/3] Virtio draft IV: the block driver Rusty Russell
2007-07-04 4:40 ` [PATCH 3/3] Virtio draft IV: the net driver Rusty Russell
@ 2007-07-05 7:32 ` Christian Borntraeger
2007-07-06 0:33 ` Rusty Russell
2007-07-23 11:13 ` Christian Borntraeger
2 siblings, 1 reply; 21+ messages in thread
From: Christian Borntraeger @ 2007-07-05 7:32 UTC (permalink / raw)
To: virtualization; +Cc: Carsten Otte, Herbert Xu, Jens Axboe

On Wednesday, 4 July 2007, Rusty Russell wrote:
> +	vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
> +	if (!vbr)
> +		goto stop;
[...]
> +	BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));
> +	vbr->req = req;
> +	if (!do_req(q, vblk, vbr))
> +		goto stop;
[...]
> +stop:
> +	/* Queue full? Wait. */
> +	blk_stop_queue(q);
> +	mempool_free(vbr, vblk->pool);

Hmm, can mempool_free really handle NULL as its first argument? (first
goto). As far as I can see we have a problem here. Our pool has
preallocated 1 element. So what about:

- first alloc in mempool_alloc fails -> we get the pre-allocated
  element, no more elements left
- second alloc in mempool_alloc fails -> we get NULL -> we return NULL
  to mempool_free, which adds NULL back to the pool as the pool is
  empty.

Now the pool has a NULL entry, no?

Christian

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] Virtio draft IV: the block driver
2007-07-05 7:32 ` [PATCH 2/3] Virtio draft IV: the block driver Christian Borntraeger
@ 2007-07-06 0:33 ` Rusty Russell
0 siblings, 0 replies; 21+ messages in thread
From: Rusty Russell @ 2007-07-06 0:33 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Andrew Morton, Ingo Molnar, lkml - Kernel Mailing List, virtualization

On Thu, 2007-07-05 at 09:32 +0200, Christian Borntraeger wrote:
> On Wednesday, 4 July 2007, Rusty Russell wrote:
> > +	vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
> > +	if (!vbr)
> > +		goto stop;
> [...]
> > +	BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));
> > +	vbr->req = req;
> > +	if (!do_req(q, vblk, vbr))
> > +		goto stop;
> [...]
> > +stop:
> > +	/* Queue full? Wait. */
> > +	blk_stop_queue(q);
> > +	mempool_free(vbr, vblk->pool);
>
> Hmm, can mempool_free really handle NULL as its first argument? (first goto).

Good point. Any objections to fixing that?

Cheers,
Rusty.
===
Christian Borntraeger points out that mempool_free() does not no-op
when handed NULL. This is inconsistent with the other free-like
functions in the kernel.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r a306f0a8de5e mm/mempool.c
--- a/mm/mempool.c	Fri Jul 06 10:28:39 2007 +1000
+++ b/mm/mempool.c	Fri Jul 06 10:29:40 2007 +1000
@@ -263,6 +263,9 @@ void mempool_free(void *element, mempool
 {
 	unsigned long flags;
 
+	if (unlikely(element == NULL))
+		return;
+
 	smp_mb();
 	if (pool->curr_nr < pool->min_nr) {
 		spin_lock_irqsave(&pool->lock, flags);

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] Virtio draft IV: the block driver
2007-07-04 4:19 ` [PATCH 2/3] Virtio draft IV: the block driver Rusty Russell
2007-07-04 4:40 ` [PATCH 3/3] Virtio draft IV: the net driver Rusty Russell
2007-07-05 7:32 ` [PATCH 2/3] Virtio draft IV: the block driver Christian Borntraeger
@ 2007-07-23 11:13 ` Christian Borntraeger
2007-07-24 3:02 ` Rusty Russell
2 siblings, 1 reply; 21+ messages in thread
From: Christian Borntraeger @ 2007-07-23 11:13 UTC (permalink / raw)
To: virtualization

> +#ifdef __KERNEL__
> +struct gendisk;
> +struct virtio_device;
> +
> +struct gendisk *virtblk_probe(struct virtqueue *vq);

Hello Rusty,

I am currently doing some prototyping with virtio IV and found this
small issue in the block device. virtblk_probe requires a declaration
of virtqueue, not virtio_device.

---
 include/linux/virtio_blk.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.22/include/linux/virtio_blk.h
===================================================================
--- linux-2.6.22.orig/include/linux/virtio_blk.h
+++ linux-2.6.22/include/linux/virtio_blk.h
@@ -37,7 +37,7 @@ struct virtio_blk_inhdr
 
 #ifdef __KERNEL__
 struct gendisk;
-struct virtio_device;
+struct virtqueue;
 
 struct gendisk *virtblk_probe(struct virtqueue *vq);
 void virtblk_remove(struct gendisk *disk);

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] Virtio draft IV: the block driver
2007-07-23 11:13 ` Christian Borntraeger
@ 2007-07-24 3:02 ` Rusty Russell
0 siblings, 0 replies; 21+ messages in thread
From: Rusty Russell @ 2007-07-24 3:02 UTC (permalink / raw)
To: Christian Borntraeger; +Cc: virtualization

On Mon, 2007-07-23 at 13:13 +0200, Christian Borntraeger wrote:
> > +#ifdef __KERNEL__
> > +struct gendisk;
> > +struct virtio_device;
> > +
> > +struct gendisk *virtblk_probe(struct virtqueue *vq);
>
> Hello Rusty,
>
> I am currently doing some prototyping with virtio IV and found this
> small issue in the block device. virtblk_probe requires a declaration
> of virtqueue, not virtio_device.

Oops, thanks Christian! I'll roll this in.

Thanks!
Rusty.

PS. 2.6.23-rc1 broke the block driver anyway, so I will publish new
versions soon.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] Virtio draft IV: the block driver
@ 2007-07-09 14:17 Martin Peschke
2007-07-09 15:54 ` Martin Peschke
0 siblings, 1 reply; 21+ messages in thread
From: Martin Peschke @ 2007-07-09 14:17 UTC (permalink / raw)
To: virtualization
> +static void do_virtblk_request(request_queue_t *q)
> +{
> +	struct virtio_blk *vblk = NULL;
> +	struct request *req;
> +	struct virtblk_req *vbr;
> +
> +	while ((req = elv_next_request(q)) != NULL) {
> +		vblk = req->rq_disk->private_data;
> +
> +		vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
> +		if (!vbr)
> +			goto stop;
> +
> +		BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));
> +		vbr->req = req;
> +		if (!do_req(q, vblk, vbr))
> +			goto stop;
> +		blkdev_dequeue_request(req);
> +	}
> +
> +sync:
> +	if (vblk)

this check looks bogus, as vblk->pool has been accessed unconditionally
above

> +		vblk->vq->ops->sync(vblk->vq);
> +	return;
> +
> +stop:
> +	/* Queue full? Wait. */
> +	blk_stop_queue(q);
> +	mempool_free(vbr, vblk->pool);
> +	goto sync;
> +}
^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] Virtio draft IV: the block driver
2007-07-09 14:17 Martin Peschke
@ 2007-07-09 15:54 ` Martin Peschke
2007-07-10 4:42 ` Rusty Russell
0 siblings, 1 reply; 21+ messages in thread
From: Martin Peschke @ 2007-07-09 15:54 UTC (permalink / raw)
To: virtualization

On Mon, 2007-07-09 at 16:17 +0200, Martin Peschke wrote:
> > +sync:
> > +	if (vblk)
>
> this check looks bogus, as vblk->pool has been accessed unconditionally
> above

Sorry, I was seduced by the code into misreading it. The check is
correct and needed. Looks like we can make it more readable, though:

- get rid of the convoluted goto's
- no need to call mempool_free() with a NULL pointer
- read private data from q->queuedata once instead of poking inside
  every request
- use a local variable to track issued requests and the need for a
  sync operation
- pass req to do_req() as it is used there throughout the function
  (nice symmetry in the parameter list, isn't it?)

It's just an untested and incomplete sketch done while I am acquainting
myself with the virtio code without having set up lguest yet.

static void do_virtblk_request(request_queue_t *q)
{
	struct virtio_blk *vblk = q->queuedata;
	struct request *req;
	struct virtblk_req *vbr;
	int issued = 0;

	while ((req = elv_next_request(q)) != NULL) {
		vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
		if (!vbr) {
			blk_stop_queue(q);
			break;
		}

		BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));
		vbr->req = req;
		if (!do_req(q, vblk, req, vbr)) {
			/* Queue full? Wait. */
			blk_stop_queue(q);
			mempool_free(vbr, vblk->pool);
			break;
		}
		blkdev_dequeue_request(req);
		issued++;
	}

	if (issued)
		vblk->vq->ops->sync(vblk->vq);
}

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] Virtio draft IV: the block driver
2007-07-09 15:54 ` Martin Peschke
@ 2007-07-10 4:42 ` Rusty Russell
2007-07-10 12:12 ` Martin Peschke
0 siblings, 1 reply; 21+ messages in thread
From: Rusty Russell @ 2007-07-10 4:42 UTC (permalink / raw)
To: Martin Peschke; +Cc: virtualization

On Mon, 2007-07-09 at 17:54 +0200, Martin Peschke wrote:
> static void do_virtblk_request(request_queue_t *q)
> {
> 	struct virtio_blk *vblk = q->queuedata;
> 	struct request *req;
> 	struct virtblk_req *vbr;
> 	int issued = 0;
>
> 	while ((req = elv_next_request(q)) != NULL) {
> 		vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
> 		if (!vbr) {
> 			blk_stop_queue(q);
> 			break;
> 		}
>
> 		BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));
> 		vbr->req = req;
> 		if (!do_req(q, vblk, req, vbr)) {
> 			/* Queue full? Wait. */
> 			blk_stop_queue(q);
> 			mempool_free(vbr, vblk->pool);
> 			break;
> 		}
> 		blkdev_dequeue_request(req);
> 		issued++;
> 	}
>
> 	if (issued)
> 		vblk->vq->ops->sync(vblk->vq);
> }

Hi Martin!

I like this: moving the allocation into do_req() makes it even simpler:

static void do_virtblk_request(request_queue_t *q)
{
	struct virtio_blk *vblk = NULL;
	struct request *req;
	unsigned int issued = 0;

	while ((req = elv_next_request(q)) != NULL) {
		vblk = req->rq_disk->private_data;
		BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));

		if (!do_req(q, vblk, req)) {
			/* Queue full? Wait. */
			blk_stop_queue(q);
			break;
		}
		blkdev_dequeue_request(req);
		issued++;
	}

	if (issued)
		vblk->vq->ops->sync(vblk->vq);
}

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] Virtio draft IV: the block driver
2007-07-10 4:42 ` Rusty Russell
@ 2007-07-10 12:12 ` Martin Peschke
0 siblings, 0 replies; 21+ messages in thread
From: Martin Peschke @ 2007-07-10 12:12 UTC (permalink / raw)
To: Rusty Russell; +Cc: virtualization

Rusty Russell wrote:
> Hi Martin!
>
> I like this: moving the allocation into do_req() makes it even simpler:
>
> static void do_virtblk_request(request_queue_t *q)
> {
> 	struct virtio_blk *vblk = NULL;
> 	struct request *req;
> 	unsigned int issued = 0;
>
> 	while ((req = elv_next_request(q)) != NULL) {
> 		vblk = req->rq_disk->private_data;
> 		BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));
>
> 		if (!do_req(q, vblk, req)) {
> 			/* Queue full? Wait. */
> 			blk_stop_queue(q);
> 			break;
> 		}
> 		blkdev_dequeue_request(req);
> 		issued++;
> 	}
>
> 	if (issued)
> 		vblk->vq->ops->sync(vblk->vq);
> }

ACK ;)

Thanks,
Martin

^ permalink raw reply	[flat|nested] 21+ messages in thread
end of thread, other threads:[~2007-07-24 3:02 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-04 4:12 [PATCH 1/3] Virtio draft IV Rusty Russell
2007-07-04 4:19 ` [PATCH 2/3] Virtio draft IV: the block driver Rusty Russell
2007-07-04 4:40 ` [PATCH 3/3] Virtio draft IV: the net driver Rusty Russell
2007-07-11 10:28 ` Christian Borntraeger
2007-07-11 11:26 ` Rusty Russell
2007-07-11 11:46 ` Christian Borntraeger
2007-07-12 2:23 ` Rusty Russell
2007-07-11 19:27 ` Caitlin Bestler
2007-07-11 10:45 ` Christian Borntraeger
2007-07-11 11:32 ` Rusty Russell
2007-07-11 20:44 ` David Miller
2007-07-12 2:21 ` Rusty Russell
2007-07-12 2:26 ` David Miller
2007-07-05 7:32 ` [PATCH 2/3] Virtio draft IV: the block driver Christian Borntraeger
2007-07-06 0:33 ` Rusty Russell
2007-07-23 11:13 ` Christian Borntraeger
2007-07-24 3:02 ` Rusty Russell
-- strict thread matches above, loose matches on Subject: below --
2007-07-09 14:17 Martin Peschke
2007-07-09 15:54 ` Martin Peschke
2007-07-10 4:42 ` Rusty Russell
2007-07-10 12:12 ` Martin Peschke