virtualization.lists.linux-foundation.org archive mirror
* More virtio users
@ 2007-06-10  7:33 Avi Kivity
  0 siblings, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2007-06-10  7:33 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, xen-devel, virtualization

It is worthwhile, when designing virtio, to keep in mind as many 
potential users as possible.  In addition to block and net, I see at
least the following:

- vmgl (paravirtualized 3D graphics) 
[http://www.cs.toronto.edu/~andreslc/xen-gl/]
- scsi (for tape, cd writer, etc.)
- framebuffer (with just one request to share the framebuffer?)

There are probably more.  Any ideas?

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Xen-devel] More virtio users
       [not found] <466BA965.6050208@qumranet.com>
@ 2007-06-10  8:06 ` Muli Ben-Yehuda
       [not found] ` <20070610080602.GD3738@rhun.haifa.ibm.com>
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Muli Ben-Yehuda @ 2007-06-10  8:06 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, xen-devel, virtualization

On Sun, Jun 10, 2007 at 10:33:57AM +0300, Avi Kivity wrote:
> It is worthwhile, when designing virtio, to keep in mind as many 
> potential users as possible.  In addition to block and net, I see at
> least the following:
> 
> - vmgl (paravirtualized 3D graphics) 
> [http://www.cs.toronto.edu/~andreslc/xen-gl/]
> - scsi (for tape, cd writer, etc.)
> - framebuffer (with just one request to share the framebuffer?)
> 
> There are probably more.  Any ideas?

- Fast inter-domain networking, a-la XenSocket.
- PCI (or your favorite HW bus) passthrough, for your favorite oddball
  device (e.g., crypto-accelerators).

Cheers,
Muli


* Re: [Xen-devel] More virtio users
       [not found] ` <20070610080602.GD3738@rhun.haifa.ibm.com>
@ 2007-06-10  8:09   ` Avi Kivity
       [not found]   ` <466BB1AF.1000601@qumranet.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2007-06-10  8:09 UTC (permalink / raw)
  To: Muli Ben-Yehuda; +Cc: kvm-devel, xen-devel, virtualization

Muli Ben-Yehuda wrote:
> On Sun, Jun 10, 2007 at 10:33:57AM +0300, Avi Kivity wrote:
>   
>> It is worthwhile, when designing virtio, to keep in mind as many 
>> potential users as possible.  In addition to block and net, I see at
>> least the following:
>>
>> - vmgl (paravirtualized 3D graphics) 
>> [http://www.cs.toronto.edu/~andreslc/xen-gl/]
>> - scsi (for tape, cd writer, etc.)
>> - framebuffer (with just one request to share the framebuffer?)
>>
>> There are probably more.  Any ideas?
>>     
>
> - Fast inter-domain networking, a-la XenSocket.
>   

Yup.

> - PCI (or your favorite HW bus) passthrough, for your favorite oddball
>   device (e.g., crypto-accelerators).
>   

Won't all high-bandwidth traffic be through dma, bypassing virtio?

-- 
error compiling committee.c: too many arguments to function


* Re: More virtio users
       [not found] <466BA965.6050208@qumranet.com>
  2007-06-10  8:06 ` [Xen-devel] More virtio users Muli Ben-Yehuda
       [not found] ` <20070610080602.GD3738@rhun.haifa.ibm.com>
@ 2007-06-10  8:13 ` Rusty Russell
  2007-06-11  3:04 ` [Xen-devel] " ron minnich
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Rusty Russell @ 2007-06-10  8:13 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, xen-devel, virtualization

On Sun, 2007-06-10 at 10:33 +0300, Avi Kivity wrote:
> It is worthwhile, when designing virtio, to keep in mind as many 
> potential users as possible.  In addition to block and net, I see at
> least the following:
> 
> - vmgl (paravirtualized 3D graphics) 
> [http://www.cs.toronto.edu/~andreslc/xen-gl/]
> - scsi (for tape, cd writer, etc.)
> - framebuffer (with just one request to share the framebuffer?)

Console, USB.  HPA suggested an entropy device.

Framebuffer is an interesting one.  Virtio doesn't assume shared memory,
so naively, for the fb, you would just send outbufs describing changed memory.
This would work, but describing rectangles is better.  A helper might be
the right approach here.
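
To make the rectangle idea concrete, here is a minimal userspace sketch of
packing one changed rectangle into a buffer that could be sent as an outbuf.
Everything in it is an assumption made for illustration: struct sk_fb_rect
and sk_fb_pack_rect() are hypothetical names, not part of virtio or lguest,
and the framebuffer is assumed to be linear with a fixed stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header for one changed rectangle (not a virtio type). */
struct sk_fb_rect {
	uint16_t x, y;          /* top-left corner, in pixels */
	uint16_t width, height; /* extent, in pixels */
};

/*
 * Copy the pixels of rectangle r out of a linear framebuffer into buf,
 * one row at a time, so only the changed area is sent in the outbuf.
 * stride is bytes per scanline, bpp is bytes per pixel.
 * Returns the payload size in bytes, or 0 if buf is too small.
 */
static size_t sk_fb_pack_rect(const uint8_t *fb, size_t stride, size_t bpp,
			      const struct sk_fb_rect *r,
			      uint8_t *buf, size_t buflen)
{
	size_t row_bytes, total, row;

	assert(fb && r && buf);
	row_bytes = (size_t)r->width * bpp;
	total = row_bytes * r->height;
	if (total > buflen)
		return 0;
	for (row = 0; row < r->height; row++)
		memcpy(buf + row * row_bytes,
		       fb + (r->y + row) * stride + (size_t)r->x * bpp,
		       row_bytes);
	return total;
}
```

The receiving side would need the same header to place the rows back into
its own copy of the framebuffer; that half is omitted here.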

Lguest doesn't have a framebuffer, so maybe this is a good thing for me
to hack on, but I promised myself I'd finish NAPI for the net device,
and tag for block device first.

Cheers,
Rusty.


* Re: More virtio users
       [not found] ` <1181463220.16428.24.camel@localhost.localdomain>
@ 2007-06-10  8:16   ` Avi Kivity
       [not found]   ` <466BB34B.9050105@qumranet.com>
  2007-06-11  8:16   ` [Xen-devel] " Gerd Hoffmann
  2 siblings, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2007-06-10  8:16 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, xen-devel, virtualization

Rusty Russell wrote:
> Lguest doesn't have a framebuffer, so maybe this is a good thing for me
> to hack on, but I promised myself I'd finish NAPI for the net device,
> and tag for block device first.
>   

If you're touching the block device, passing a request's io priority to 
the host can be useful.
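
As a rough illustration of what passing the priority through could involve,
the sketch below packs an I/O priority class and level into a 16-bit value.
It assumes the layout used by Linux's ioprio values (class in the top three
bits, level in the low thirteen); the sk_ names are hypothetical and exist
only in this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: class in the top 3 bits, per-class level below it. */
#define SK_IOPRIO_CLASS_SHIFT 13
#define SK_IOPRIO_DATA_MASK   ((1u << SK_IOPRIO_CLASS_SHIFT) - 1)

enum sk_ioclass {
	SK_IOCLASS_NONE,
	SK_IOCLASS_RT,   /* real-time */
	SK_IOCLASS_BE,   /* best-effort */
	SK_IOCLASS_IDLE
};

/* Pack a class and level into the 16-bit field carried in a request header. */
static uint16_t sk_ioprio_pack(unsigned int cls, unsigned int level)
{
	assert(cls <= SK_IOCLASS_IDLE);
	return (uint16_t)((cls << SK_IOPRIO_CLASS_SHIFT) |
			  (level & SK_IOPRIO_DATA_MASK));
}

/* Host side: recover the class from the packed value. */
static unsigned int sk_ioprio_class(uint16_t v)
{
	return v >> SK_IOPRIO_CLASS_SHIFT;
}

/* Host side: recover the level within the class. */
static unsigned int sk_ioprio_level(uint16_t v)
{
	return v & SK_IOPRIO_DATA_MASK;
}
```

A host could then map the decoded class and level onto whatever priority
mechanism it has for its own I/O; how it does so is outside this sketch.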

-- 
error compiling committee.c: too many arguments to function


* Re: More virtio users
       [not found]   ` <466BB34B.9050105@qumranet.com>
@ 2007-06-10 12:37     ` Rusty Russell
       [not found]     ` <1181479060.16428.37.camel@localhost.localdomain>
  1 sibling, 0 replies; 25+ messages in thread
From: Rusty Russell @ 2007-06-10 12:37 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, Jens Axboe, xen-devel, virtualization

On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote:
> Rusty Russell wrote:
> > Lguest doesn't have a framebuffer, so maybe this is a good thing for me
> > to hack on, but I promised myself I'd finish NAPI for the net device,
> > and tag for block device first.
> >   
> 
> If you're touching the block device, passing a request's io priority to 
> the host can be useful.

OK, here's the interdiff.  I still don't handle non-fs requests, but I
haven't seen any yet.  I should probably BUG_ON() there and wait for
Jens to scream...

Changes:
1) Make virtio_blk.h userspace-friendly.
2) /dev/vbN -> /dev/vdN
3) Ordered tags, handed thru to other end.
4) Hand ioprio to other end, too.

diff -u b/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
--- b/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c	Sun Jun 10 22:09:10 2007 +1000
@@ -33,18 +33,19 @@
 	struct virtio_blk_inhdr in_hdr;
 };
 
-/* Jens gave me this nice helper to end all chunks of a request. */
-static void end_dequeued_request(struct request *req, int uptodate)
+static void end_tagged_request(struct request *req,
+			       request_queue_t *q, int uptodate)
 {
 	if (end_that_request_first(req, uptodate, req->hard_nr_sectors))
 		BUG();
 	add_disk_randomness(req->rq_disk);
+	blk_queue_end_tag(q, req);
 	end_that_request_last(req, uptodate);
 }
 
 static void finish(struct virtio_blk *vblk, struct virtblk_req *vbr)
 {
-	end_dequeued_request(vbr->req, !vbr->failed);
+	end_tagged_request(vbr->req, vblk->disk->queue, !vbr->failed);
 	list_del(&vbr->list);
 	mempool_free(vbr, vblk->pool);
 	/* In case queue is stopped waiting for more buffers. */
@@ -120,7 +121,7 @@
 		goto detach_inbuf_full;
 
 	pr_debug("Write: %p in=%lu out=%lu\n", vbr,
-		 vbr->out_hdr.id, vbr->out_id);
+		 (long)vbr->out_hdr.id, (long)vbr->out_id);
 	list_add_tail(&vbr->list, &vblk->reqs);
 	return true;
 
@@ -157,7 +158,7 @@
 		goto detach_inbuf_full;
 
 	pr_debug("Read: %p in=%lu out=%lu\n", vbr,
-		 vbr->out_hdr.id, vbr->out_id);
+		 (long)vbr->out_hdr.id, (long)vbr->out_id);
 	list_add_tail(&vbr->list, &vblk->reqs);
 	return true;
 
@@ -178,10 +179,9 @@
 
 		/* FIXME: handle these iff capable. */
 		if (!blk_fs_request(req)) {
-			pr_debug("Got non-command 0x%08x\n", req->cmd_type);
+			printk("Got non-command 0x%08x\n", req->cmd_type);
 			req->errors++;
-			blkdev_dequeue_request(req);
-			end_dequeued_request(req, 0);
+			end_tagged_request(req, vblk->disk->queue, 0);
 			continue;
 		}
 
@@ -193,6 +193,8 @@
 		vbr->req = req;
 		vbr->out_hdr.type = rq_data_dir(req);
 		vbr->out_hdr.sector = req->sector;
+		vbr->out_hdr.tag = req->tag;
+		vbr->out_hdr.ioprio = req->ioprio;
 
 		if (rq_data_dir(req) == WRITE) {
 			if (!do_write(q, vblk, vbr))
@@ -201,7 +203,6 @@
 			if (!do_read(q, vblk, vbr))
 				goto stop;
 		}
-		blkdev_dequeue_request(req);
 	}
 
 sync:
@@ -261,16 +262,25 @@
 		goto out_put_disk;
 	}
 
-	sprintf(vblk->disk->disk_name, "vb%c", virtblk_index++);
+	sprintf(vblk->disk->disk_name, "vd%c", virtblk_index++);
 	vblk->disk->major = major;
 	vblk->disk->first_minor = 0;
 	vblk->disk->private_data = vblk;
 	vblk->disk->fops = &virtblk_fops;
 
+	err = blk_queue_init_tags(vblk->disk->queue, 100 /* FIXME */, NULL);
+	if (err)
+		goto out_cleanup_queue;
+
+	blk_queue_ordered(vblk->disk->queue, QUEUE_ORDERED_TAG, NULL);
+	blk_queue_prep_rq(vblk->disk->queue, blk_queue_start_tag);
+
 	/* Caller can do blk_queue_max_hw_segments(), set_capacity()
 	 * etc then add_disk(). */
 	return vblk->disk;
 
+out_cleanup_queue:
+	blk_cleanup_queue(vblk->disk->queue);
 out_put_disk:
 	put_disk(vblk->disk);
 out_unregister_blkdev:
diff -u b/include/linux/virtio_blk.h b/include/linux/virtio_blk.h
--- b/include/linux/virtio_blk.h
+++ b/include/linux/virtio_blk.h	Sun Jun 10 22:09:10 2007 +1000
@@ -3,26 +3,31 @@
 #include <linux/types.h>
-struct gendisk;
-struct virtio_device;
-struct hd_geometry;
 
 /* This is the first element of the scatter-gather list. */
 struct virtio_blk_outhdr
 {
 	/* 0 == read, 1 == write */
-	u32 type;
+	__u32 type;
+	/* Ordered tag. */
+	__u16 tag;
+	/* Linux's ioprio. */
+	__u16 ioprio;
 	/* Sector (ie. 512 byte offset) */
-	unsigned long sector;
+	__u64 sector;
 	/* Where to put reply. */
-	unsigned long id;
+	__u64 id;
 };
 
 struct virtio_blk_inhdr
 {
 	/* 1 = OK, 0 = not ok. */
-	unsigned long status;
+	unsigned char status;
 };
 
+#ifdef __KERNEL__
+struct gendisk;
+struct virtio_device;
+
 struct gendisk *virtblk_probe(struct virtio_device *vdev);
 void virtblk_remove(struct gendisk *disk);
-
+#endif /* __KERNEL__ */
 #endif /* _LINUX_VIRTIO_BLK_H */
only in patch2:
unchanged:
--- a/include/linux/Kbuild	Sun Jun 10 18:25:37 2007 +1000
+++ b/include/linux/Kbuild	Sun Jun 10 22:09:10 2007 +1000
@@ -341,6 +341,7 @@ unifdef-y += utsname.h
 unifdef-y += utsname.h
 unifdef-y += videodev2.h
 unifdef-y += videodev.h
+unifdef-y += virtio_blk.h
 unifdef-y += wait.h
 unifdef-y += wanrouter.h
 unifdef-y += watchdog.h
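
The point of switching the header from unsigned long to fixed-width fields
is that a 32-bit guest and a 64-bit userspace server then agree on the
layout. The following userspace sketch mirrors the fields from the patch
using <stdint.h> types; the sk_ names and the helper are hypothetical, and
byte order is not addressed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the request header fields from the patch. */
struct sk_blk_outhdr {
	uint32_t type;   /* 0 == read, 1 == write */
	uint16_t tag;    /* request tag */
	uint16_t ioprio; /* Linux's ioprio */
	uint64_t sector; /* sector (512-byte offset) */
	uint64_t id;     /* where to put reply */
};

/* Userspace mirror of the reply header. */
struct sk_blk_inhdr {
	unsigned char status; /* 1 = OK, 0 = not ok */
};

/* Fill a header for a request; returns the byte offset it addresses. */
static uint64_t sk_fill_outhdr(struct sk_blk_outhdr *h, int is_write,
			       uint16_t tag, uint16_t ioprio,
			       uint64_t sector, uint64_t id)
{
	assert(h);
	h->type = is_write ? 1 : 0;
	h->tag = tag;
	h->ioprio = ioprio;
	h->sector = sector;
	h->id = id;
	return sector * 512;
}
```

With these types the two 16-bit fields fill the gap after type, so the
64-bit fields land on 8-byte offsets without compiler-inserted padding.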


* Re: [Xen-devel] More virtio users
       [not found] <466BA965.6050208@qumranet.com>
                   ` (2 preceding siblings ...)
  2007-06-10  8:13 ` Rusty Russell
@ 2007-06-11  3:04 ` ron minnich
       [not found] ` <1181463220.16428.24.camel@localhost.localdomain>
  2007-06-12 22:01 ` [kvm-devel] " Arnd Bergmann
  5 siblings, 0 replies; 25+ messages in thread
From: ron minnich @ 2007-06-11  3:04 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, xen-devel, virtualization

On 6/10/07, Avi Kivity <avi@qumranet.com> wrote:

> There are probably more.  Any ideas?

sessions to 9p servers. But if we had a way to, first, do PV that
send/receive to an fd pair, that would be a start.

ron


* Re: More virtio users
       [not found]     ` <1181479060.16428.37.camel@localhost.localdomain>
@ 2007-06-11  6:41       ` Jens Axboe
  2007-06-11  7:29         ` Rusty Russell
       [not found]         ` <1181546953.16428.96.camel@localhost.localdomain>
  0 siblings, 2 replies; 25+ messages in thread
From: Jens Axboe @ 2007-06-11  6:41 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, xen-devel, virtualization

On Sun, Jun 10 2007, Rusty Russell wrote:
> On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote:
> > Rusty Russell wrote:
> > > Lguest doesn't have a framebuffer, so maybe this is a good thing for me
> > > to hack on, but I promised myself I'd finish NAPI for the net device,
> > > and tag for block device first.
> > >   
> > 
> > If you're touching the block device, passing a request's io priority to 
> > the host can be useful.
> 
> OK, here's the interdiff.  I still don't handle non-fs requests, but I
> haven't seen any yet.  I should probably BUG_ON() there and wait for
> Jens to scream...

Ehm no, that would certainly cause screaming :-)

Checking for blk_fs_request() and terminating the request if you don't
know how to handle it is the correct thing to do, a BUG() would
definitely not be.

Patch looks good to me, though:

> +	blk_queue_prep_rq(vblk->disk->queue, blk_queue_start_tag);
> +

is quite a novel way, I actually had to look that code up to check
whether it was correct. I'd much prefer a little wrapper around that,
ala:

static int virtio_block_prep(request_queue_t *q, struct request *rq)
{
        if (!blk_queue_start_tag(q, rq))
                return BLKPREP_OK;

        return BLKPREP_DEFER;
}

That is much easier to read, and as a bonus it won't eat your request
just because you run out of tags! The fact that blk_queue_start_tag()
just happens to share the prep_rq_fn definition is by coincidence only.
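
The difference between deferring and dropping a request can be modelled
outside the kernel. The sketch below is a toy tag allocator, not the block
layer: the sk_ names, the tag limit of 4, and the numeric values of the
prep results are all assumptions of the example. In it, the allocator's
"busy" return of 1 happens to equal SK_PREP_KILL, which is the kind of
accident the wrapper avoids by translating the result explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MAX_TAGS 4

/* Prep results; the numeric values are assumptions of this sketch. */
enum sk_prep { SK_PREP_OK = 0, SK_PREP_KILL = 1, SK_PREP_DEFER = 2 };

struct sk_tag_map {
	uint32_t busy; /* bit i set means tag i is in use */
};

/* Returns 0 and stores a free tag on success, 1 if every tag is busy. */
static int sk_start_tag(struct sk_tag_map *m, int *tag)
{
	int i;

	for (i = 0; i < SK_MAX_TAGS; i++) {
		if (!(m->busy & (1u << i))) {
			m->busy |= 1u << i;
			*tag = i;
			return 0;
		}
	}
	return 1;
}

/* Release a tag when its request completes. */
static void sk_end_tag(struct sk_tag_map *m, int tag)
{
	assert(tag >= 0 && tag < SK_MAX_TAGS);
	m->busy &= ~(1u << tag);
}

/* Wrapper: running out of tags defers the request instead of killing it. */
static enum sk_prep sk_prep_request(struct sk_tag_map *m, int *tag)
{
	if (!sk_start_tag(m, tag))
		return SK_PREP_OK;
	return SK_PREP_DEFER;
}
```

A deferred request stays queued and is retried once sk_end_tag() frees a
tag, which is the behaviour the wrapper is meant to guarantee.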

-- 
Jens Axboe


* Re: More virtio users
  2007-06-11  6:41       ` Jens Axboe
@ 2007-06-11  7:29         ` Rusty Russell
       [not found]         ` <1181546953.16428.96.camel@localhost.localdomain>
  1 sibling, 0 replies; 25+ messages in thread
From: Rusty Russell @ 2007-06-11  7:29 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo; +Cc: kvm-devel, xen-devel, virtualization

On Mon, 2007-06-11 at 08:41 +0200, Jens Axboe wrote:
> On Sun, Jun 10 2007, Rusty Russell wrote:
> > On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote:
> > > Rusty Russell wrote:
> > > > Lguest doesn't have a framebuffer, so maybe this is a good thing for me
> > > > to hack on, but I promised myself I'd finish NAPI for the net device,
> > > > and tag for block device first.
> > > >   
> > > 
> > > If you're touching the block device, passing a request's io priority to 
> > > the host can be useful.
> > 
> > OK, here's the interdiff.  I still don't handle non-fs requests, but I
> > haven't seen any yet.  I should probably BUG_ON() there and wait for
> > Jens to scream...
> 
> Ehm no, that would certainly cause screaming :-)
> 
> Checking for blk_fs_request() and terminating the request if you don't
> know how to handle it is the correct thing to do, a BUG() would
> definitely not be.

So the problem is that I'd like to handle all of them, but I'm not clear
what requests my device can get.  I can't see a source of any other type
of request.

> > +	blk_queue_prep_rq(vblk->disk->queue, blk_queue_start_tag);
> > +
> 
> is quite a novel way, I actually had to look that code up to check
> whether it was correct. I'd much prefer a little wrapper around that,

OK, but I got it from the blk_queue_start_tag documentation:

 *  Description:
 *    This can either be used as a stand-alone helper, or possibly be
 *    assigned as the queue &prep_rq_fn (in which case &struct request
 *    automagically gets a tag assigned). Note that this function
 *    assumes that any type of request can be queued! 

I'm unsure on the whole ordered tag idea, though.  It seems like there
will never be multiple requests in the queue with the same tag, so it
effectively forces my (userspace) virtio server to serialize and
fdatasync on every write request:

		if (outhdr->type && outhdr->tag != vblk->last_tag) {
			while (vblk->in_progress)
				handle_io_finish(fd, dev);
			fdatasync(fd);
			vblk->last_tag = outhdr->tag;
		}


In fact, AFAICT any implementation of ordered tags will be forced into
serial order.  Am I better off telling the block layer to drain and
flush instead of ordered tags?  I can then push that through to the
virtio server.

Thanks,
Rusty.


* Re: More virtio users
       [not found]         ` <1181546953.16428.96.camel@localhost.localdomain>
@ 2007-06-11  7:33           ` Jens Axboe
  2007-06-12  0:31             ` Rusty Russell
       [not found]             ` <1181608287.16428.127.camel@localhost.localdomain>
  0 siblings, 2 replies; 25+ messages in thread
From: Jens Axboe @ 2007-06-11  7:33 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, Tejun Heo, xen-devel, virtualization

On Mon, Jun 11 2007, Rusty Russell wrote:
> On Mon, 2007-06-11 at 08:41 +0200, Jens Axboe wrote:
> > On Sun, Jun 10 2007, Rusty Russell wrote:
> > > On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote:
> > > > Rusty Russell wrote:
> > > > > Lguest doesn't have a framebuffer, so maybe this is a good thing for me
> > > > > to hack on, but I promised myself I'd finish NAPI for the net device,
> > > > > and tag for block device first.
> > > > >   
> > > > 
> > > > If you're touching the block device, passing a request's io priority to 
> > > > the host can be useful.
> > > 
> > > OK, here's the interdiff.  I still don't handle non-fs requests, but I
> > > haven't seen any yet.  I should probably BUG_ON() there and wait for
> > > Jens to scream...
> > 
> > Ehm no, that would certainly cause screaming :-)
> > 
> > Checking for blk_fs_request() and terminating the request if you don't
> > know how to handle it is the correct thing to do, a BUG() would
> > definitely not be.
> 
> So the problem is that I'd like to handle all of them, but I'm not clear
> what requests my device can get.  I can't see a source of any other type
> of request.

The other main request type is blk_pc_request(). In the data setup it's
identical to blk_fs_request(), there's a bio chain off ->bio. It's a
byte granularity entity though, so you should check ->data_len for the
size of it. ->cmd[] holds a SCSI cdb, which is the command you are
supposed to handle.
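
As a rough userspace model of the two request kinds, the sketch below
computes the transfer size differently for each and names a few well-known
SCSI opcodes found in the first CDB byte. struct sk_request and the sk_
helpers are hypothetical stand-ins, not the kernel's struct request.

```c
#include <assert.h>
#include <stdint.h>

enum sk_req_type { SK_REQ_FS, SK_REQ_PC };

/* Simplified stand-in for a block request (not the kernel structure). */
struct sk_request {
	enum sk_req_type type;
	uint32_t nr_sectors; /* SK_REQ_FS: length in 512-byte sectors */
	uint32_t data_len;   /* SK_REQ_PC: length in bytes */
	uint8_t cmd[16];     /* SK_REQ_PC: SCSI CDB, opcode in cmd[0] */
};

/* fs requests are sector-granular, pc requests are byte-granular. */
static uint64_t sk_request_bytes(const struct sk_request *rq)
{
	assert(rq);
	if (rq->type == SK_REQ_FS)
		return (uint64_t)rq->nr_sectors * 512;
	return rq->data_len;
}

/* Name a handful of standard SCSI opcodes; everything else is "unknown". */
static const char *sk_cdb_name(uint8_t opcode)
{
	switch (opcode) {
	case 0x00: return "TEST UNIT READY";
	case 0x12: return "INQUIRY";
	case 0x28: return "READ(10)";
	case 0x2a: return "WRITE(10)";
	default:   return "unknown";
	}
}
```

A driver that cannot service a given opcode would fail that request rather
than guess, in line with the advice earlier in the thread.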

> > > +	blk_queue_prep_rq(vblk->disk->queue, blk_queue_start_tag);
> > > +
> > 
> > is quite a novel way, I actually had to look that code up to check
> > whether it was correct. I'd much prefer a little wrapper around that,
> 
> OK, but I got it from the blk_queue_start_tag documentation:
> 
>  *  Description:
>  *    This can either be used as a stand-alone helper, or possibly be
>  *    assigned as the queue &prep_rq_fn (in which case &struct request
>  *    automagically gets a tag assigned). Note that this function
>  *    assumes that any type of request can be queued! 

OK, bad documentation, I'll make a note to fix that! Sorry about that.

> I'm unsure on the whole ordered tag idea, though.  It seems like there
> will never be multiple requests in the queue with the same tag, so it
> effectively forces my (userspace) virtio server to serialize and
> fdatasync on every write request:
> 
> 		if (outhdr->type && outhdr->tag != vblk->last_tag) {
> 			while (vblk->in_progress)
> 				handle_io_finish(fd, dev);
> 			fdatasync(fd);
> 			vblk->last_tag = outhdr->tag;
> 		}
> 
> 
> In fact, AFAICT any implementation of ordered tags will be forced into
> serial order.  Am I better off telling the block layer to drain and
> flush instead of ordered tags?  I can then push that through to the
> virtio server.

Perhaps you are misunderstanding what the tag is? The tag is a unique
identifier for a pending request, so you will by definition never have
requests sharing a tag value. But the tag is not to be considered as
ordered, unless it has the barrier flag set as well. So you only need to
serialize on the device side when blk_barrier_rq() is true.
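
A simplified model of that rule for a userspace server: ordinary requests
are issued without waiting, and only a request carrying a barrier flag
forces a drain and flush first. The barrier field here is hypothetical (the
header in the patch has no such flag), and the drain and flush are reduced
to counters so the logic can be followed in isolation.

```c
#include <assert.h>

/* Toy server state; a real server would track actual in-flight I/O. */
struct sk_server {
	unsigned int in_progress; /* requests issued but not yet completed */
	unsigned int flushes;     /* number of drain-and-flush cycles */
};

/* Toy request; "barrier" is a hypothetical flag set by the guest. */
struct sk_req {
	int is_write;
	int barrier;
};

/* Issue a request, serializing only when it is marked as a barrier. */
static void sk_submit(struct sk_server *s, const struct sk_req *rq)
{
	assert(s && rq);
	if (rq->barrier) {
		/* Stand-in for: wait for all in-flight I/O, then fdatasync(). */
		s->in_progress = 0;
		s->flushes++;
	}
	s->in_progress++;
}
```

This model only orders earlier requests before the barrier; a real server
would also have to make sure the barrier request itself completes before
later requests are started.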

-- 
Jens Axboe


* Re: [Xen-devel] Re: More virtio users
       [not found] ` <1181463220.16428.24.camel@localhost.localdomain>
  2007-06-10  8:16   ` Avi Kivity
       [not found]   ` <466BB34B.9050105@qumranet.com>
@ 2007-06-11  8:16   ` Gerd Hoffmann
  2007-06-11  8:19     ` Avi Kivity
                       ` (3 more replies)
  2 siblings, 4 replies; 25+ messages in thread
From: Gerd Hoffmann @ 2007-06-11  8:16 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, xen-devel, virtualization

   Hi,

> Framebuffer is an interesting one.  Virtio doesn't assume shared memory,
> so naively, for the fb, you would just send outbufs describing changed memory.
> This would work, but describing rectangles is better.  A helper might be
> the right approach here.

Rectangles work just fine for a framebuffer console.  They stop working 
once you plan to run any graphical stuff such as an X-Server on top of 
the framebuffer.  Only way to get notified about changes is page faults, 
i.e. 4k granularity on the linear framebuffer memory.
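
To show what 4k granularity means in screen terms, this sketch maps a dirty
page index to the range of scanlines it touches. It assumes a linear
framebuffer that starts on a page boundary; the sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096u

/* Inclusive range of scanlines covered by one dirty page. */
struct sk_rows {
	uint32_t first;
	uint32_t last;
};

/*
 * page is the index of the faulting page within the framebuffer,
 * stride is bytes per scanline, height is the number of scanlines.
 */
static struct sk_rows sk_page_to_rows(uint32_t page, uint32_t stride,
				      uint32_t height)
{
	uint64_t start = (uint64_t)page * SK_PAGE_SIZE;
	uint64_t end = start + SK_PAGE_SIZE - 1;
	struct sk_rows r;

	assert(stride > 0 && height > 0);
	assert(start / stride < height); /* page must lie inside the fb */
	r.first = (uint32_t)(start / stride);
	r.last = (uint32_t)(end / stride);
	if (r.last >= height)
		r.last = height - 1; /* last page may run past the fb */
	return r;
}
```

At 1024 pixels wide and 4 bytes per pixel a page is exactly one scanline;
at other widths a single touched pixel can mark parts of two lines dirty.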

Related to Framebuffer is virtual keyboard and virtual mouse (or better 
touchscreen), which probably works perfectly fine with virtio.  I'd 
guess you can even reuse the input layer event struct for the virtio events.

Xen has virtual framebuffer, kbd & mouse, although not (yet?) in the 
paravirt_ops patch queue, so there is something you can look at ;)

cheers,
   Gerd


* Re: [Xen-devel] Re: More virtio users
  2007-06-11  8:16   ` [Xen-devel] " Gerd Hoffmann
@ 2007-06-11  8:19     ` Avi Kivity
  2007-06-11 19:24     ` Anthony Liguori
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2007-06-11  8:19 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: kvm-devel, xen-devel, virtualization

Gerd Hoffmann wrote:
>   Hi,
>
>> Framebuffer is an interesting one.  Virtio doesn't assume shared memory,
>> so naively, for the fb, you would just send outbufs describing changed memory.
>> This would work, but describing rectangles is better.  A helper might be
>> the right approach here.
>
> Rectangles work just fine for a framebuffer console.  They stop 
> working once you plan to run any graphical stuff such as an X-Server 
> on top of the framebuffer.  Only way to get notified about changes is 
> page faults, i.e. 4k granularity on the linear framebuffer memory.
>

For X an accelerated device is probably much better than a dumb framebuffer.

> Related to Framebuffer is virtual keyboard and virtual mouse (or 
> better touchscreen), which probably works perfectly fine with virtio.  
> I'd guess you can even reuse the input layer event struct for the 
> virtio events.

Need to be careful if we want to support non-Linux guests.
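
One way to stay guest-neutral is to keep only the type/code/value triple of
the Linux input event and give it a fixed size and byte order, since the
Linux structure also carries a timestamp whose size depends on the ABI. The
sketch below is an assumption about how that could look, not a defined
virtio format; the sk_ names are hypothetical and little-endian is an
arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size input event: 8 bytes on the wire. */
struct sk_input_event {
	uint16_t type;  /* e.g. 1 for a key event in Linux numbering */
	uint16_t code;  /* e.g. 30 for the A key in Linux numbering */
	int32_t value;  /* press/release state or relative movement */
};

/* Serialize in little-endian byte order, independent of host endianness. */
static void sk_input_encode(const struct sk_input_event *ev, uint8_t out[8])
{
	uint32_t v;

	assert(ev && out);
	memcpy(&v, &ev->value, sizeof(v));
	out[0] = (uint8_t)(ev->type & 0xff);
	out[1] = (uint8_t)(ev->type >> 8);
	out[2] = (uint8_t)(ev->code & 0xff);
	out[3] = (uint8_t)(ev->code >> 8);
	out[4] = (uint8_t)(v & 0xff);
	out[5] = (uint8_t)((v >> 8) & 0xff);
	out[6] = (uint8_t)((v >> 16) & 0xff);
	out[7] = (uint8_t)(v >> 24);
}

/* Inverse of sk_input_encode(). */
static struct sk_input_event sk_input_decode(const uint8_t in[8])
{
	struct sk_input_event ev;
	uint32_t v = (uint32_t)in[4] | ((uint32_t)in[5] << 8) |
		     ((uint32_t)in[6] << 16) | ((uint32_t)in[7] << 24);

	ev.type = (uint16_t)(in[0] | (in[1] << 8));
	ev.code = (uint16_t)(in[2] | (in[3] << 8));
	memcpy(&ev.value, &v, sizeof(v));
	return ev;
}
```

A non-Linux guest would still need a table translating its own key codes
into whatever numbering the two sides agree on.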


-- 
error compiling committee.c: too many arguments to function


* Re: [Xen-devel] Re: More virtio users
  2007-06-11  8:16   ` [Xen-devel] " Gerd Hoffmann
  2007-06-11  8:19     ` Avi Kivity
@ 2007-06-11 19:24     ` Anthony Liguori
  2007-06-11 23:19     ` Rusty Russell
       [not found]     ` <1181603983.16428.100.camel@localhost.localdomain>
  3 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2007-06-11 19:24 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: kvm-devel, Linux Kernel Mailing List, xen-devel, virtualization

Gerd Hoffmann wrote:
>   Hi,
> 
>> Framebuffer is an interesting one.  Virtio doesn't assume shared memory,
>> so naively, for the fb, you would just send outbufs describing changed memory.
>> This would work, but describing rectangles is better.  A helper might be
>> the right approach here.
> 
> Rectangles work just fine for a framebuffer console.  They stop working 
> once you plan to run any graphical stuff such as an X-Server on top of 
> the framebuffer.  Only way to get notified about changes is page faults, 
> i.e. 4k granularity on the linear framebuffer memory.
> 
> Related to Framebuffer is virtual keyboard and virtual mouse (or better 
> touchscreen), which probably works perfectly fine with virtio.  I'd 
> guess you can even reuse the input layer event struct for the virtio 
> events.

Virtio seems like overkill for either of those things.  It's necessary 
for pure PV but not at all necessary for something like KVM.

> Xen has virtual framebuffer, kbd & mouse, although not (yet?) in the 
> paravirt_ops patch queue, so there is something you can look at ;)

In retrospect, IMHO, a shared framebuffer was a bad idea for Xen.  It's 
easy enough for dealing with an unaccelerated display, but once you 
start getting into more advanced features like blitting (which is really 
important actually for decent VNC performance), the synchronization 
becomes a big problem.

If we were to do a PV display driver again, I really think something 
that is closer to a VNC protocol is the right way to go.  There simply 
isn't a significant performance overhead to copying the relatively small 
amount of memory.

So virtio in its current form would actually be pretty useful for that.
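
For a sense of why a message-based display works well for blits, the sketch
below models two update kinds in the style of the RFB (VNC) protocol: a raw
rectangle that carries pixels, and a copy rectangle that carries only a
source position. The encoding numbers follow RFB's Raw and CopyRect; the
struct and function names are hypothetical, and this is not a proposed
virtio format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encoding numbers as used by RFB for Raw and CopyRect. */
enum sk_encoding { SK_ENC_RAW = 0, SK_ENC_COPYRECT = 1 };

/* One display update message (hypothetical layout). */
struct sk_update {
	uint16_t x, y;          /* destination top-left corner */
	uint16_t width, height; /* extent, in pixels */
	int32_t encoding;       /* SK_ENC_RAW or SK_ENC_COPYRECT */
	uint16_t src_x, src_y;  /* source corner, SK_ENC_COPYRECT only */
};

/* Bytes of payload that follow the header; bpp is bytes per pixel. */
static size_t sk_update_payload(const struct sk_update *u, size_t bpp)
{
	assert(u);
	if (u->encoding == SK_ENC_COPYRECT)
		return 2 * sizeof(uint16_t); /* just the source position */
	return (size_t)u->width * u->height * bpp;
}
```

Scrolling a 640x480 region at 4 bytes per pixel costs about 1.2 MB as raw
pixels but only a few bytes as a copy, with no shared state to synchronize.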

Regards,

Anthony Liguori


> cheers,
>   Gerd


* Re: [Xen-devel] Re: More virtio users
  2007-06-11  8:16   ` [Xen-devel] " Gerd Hoffmann
  2007-06-11  8:19     ` Avi Kivity
  2007-06-11 19:24     ` Anthony Liguori
@ 2007-06-11 23:19     ` Rusty Russell
  2007-06-12  3:36       ` Anthony Liguori
       [not found]       ` <466E14A6.6020400@codemonkey.ws>
       [not found]     ` <1181603983.16428.100.camel@localhost.localdomain>
  3 siblings, 2 replies; 25+ messages in thread
From: Rusty Russell @ 2007-06-11 23:19 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: kvm-devel, Benjamin Herrenschmidt, xen-devel, virtualization

On Mon, 2007-06-11 at 10:16 +0200, Gerd Hoffmann wrote:
> Hi,
> 
> > Framebuffer is an interesting one.  Virtio doesn't assume shared memory,
> > so naively, for the fb, you would just send outbufs describing changed memory.
> > This would work, but describing rectangles is better.  A helper might be
> > the right approach here.
> 
> Rectangles work just fine for a framebuffer console.  They stop working 
> once you plan to run any graphical stuff such as an X-Server on top of 
> the framebuffer.  Only way to get notified about changes is page faults, 
> i.e. 4k granularity on the linear framebuffer memory.

Yes, I discussed this with Ben Herrenschmidt a couple of months ago.  It
would be better to provide a fb ioctl which X could use to describe
changed rectangles if available.  In the virtio case we could hand that
information through, and other virtualized framebuffers would be able to
use it similarly.

Cheers,
Rusty.


* Re: More virtio users
  2007-06-11  7:33           ` Jens Axboe
@ 2007-06-12  0:31             ` Rusty Russell
       [not found]             ` <1181608287.16428.127.camel@localhost.localdomain>
  1 sibling, 0 replies; 25+ messages in thread
From: Rusty Russell @ 2007-06-12  0:31 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kvm-devel, Tejun Heo, xen-devel, virtualization

On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote:
> On Mon, Jun 11 2007, Rusty Russell wrote:
> > So the problem is that I'd like to handle all of them, but I'm not clear
> > what requests my device can get.  I can't see a source of any other type
> > of request.
> 
> The other main request type is blk_pc_request(). In the data setup it's
> identical to blk_fs_request(), there's a bio chain off ->bio. It's a
> byte granularity entity though, so you should check ->data_len for the
> size of it. ->cmd[] holds a SCSI cdb, which is the command you are
> supposed to handle.

SCSI?  I'm even more lost now.

Q: So what *are* the commands?
Q: Who puts them in my queue?

I have *never* seen anything but an fs request come through to my
driver.

> Perhaps you are misunderstanding what the tag is? The tag is a unique
> identifier for a pending request, so you will by definition never have
> requests sharing a tag value.

Yes, I started to suspect that.

> But the tag is not to be considered as
> ordered, unless it has the barrier flag set as well. So you only need to
> serialize on the device side when blk_barrier_rq() is true.

blk_barrier_rq(req) is never true.  I put a BUG_ON(blk_barrier_rq(req))
in my code and booted.  This is using the device as a root ext3
filesystem.

I reverted all the tag changes, still no barriers.  I added
"blk_queue_ordered(vblk->disk->queue, QUEUE_ORDERED_DRAIN, NULL);",
still no barriers.

Dazed and confused,
Rusty.


* Re: [Xen-devel] Re: More virtio users
       [not found]     ` <1181603983.16428.100.camel@localhost.localdomain>
@ 2007-06-12  0:47       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 25+ messages in thread
From: Benjamin Herrenschmidt @ 2007-06-12  0:47 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, xen-devel, virtualization

On Tue, 2007-06-12 at 09:19 +1000, Rusty Russell wrote:
> > Rectangles work just fine for a framebuffer console.  They stop
> working 
> > once you plan to run any graphical stuff such as an X-Server on top
> of 
> > the framebuffer.  Only way to get notified about changes is page
> faults, 
> > i.e. 4k granularity on the linear framebuffer memory.
> 
> Yes, I discussed this with Ben Herrenschmidt a couple of months ago.
> It
> would be better to provide a fb ioctl which X could use to describe
> changed rectangles if available.  In the virtio case we could hand
> that
> information through, and other virtualized framebuffers would be able
> to
> use it similarly.

Yes, with the X damage extension, we can have precise notification of
changed areas.

Ben.


* Re: [Xen-devel] Re: More virtio users
  2007-06-11 23:19     ` Rusty Russell
@ 2007-06-12  3:36       ` Anthony Liguori
       [not found]       ` <466E14A6.6020400@codemonkey.ws>
  1 sibling, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2007-06-12  3:36 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm-devel, Benjamin Herrenschmidt, xen-devel, virtualization

Rusty Russell wrote:
> On Mon, 2007-06-11 at 10:16 +0200, Gerd Hoffmann wrote:
>> Hi,
>>
>>> Framebuffer is an interesting one.  Virtio doesn't assume shared memory,
>>> so naively, for the fb, you would just send outbufs describing changed memory.
>>> This would work, but describing rectangles is better.  A helper might be
>>> the right approach here.
>> Rectangles work just fine for a framebuffer console.  They stop working 
>> once you plan to run any graphical stuff such as an X-Server on top of 
>> the framebuffer.  Only way to get notified about changes is page faults, 
>> i.e. 4k granularity on the linear framebuffer memory.
> 
> Yes, I discussed this with Ben Herrenschmidt a couple of months ago.  It
> would be better to provide a fb ioctl which X could use to describe
> changed rectangles if available.  In the virtio case we could hand that
> information through, and other virtualized framebuffers would be able to
> use it similarly.

The X fbdev driver is going to make supporting a new fb ioctl pretty 
fun.  It currently doesn't even support the existing fb ioctls and has a 
strange abstraction layer.

I reckon writing a new X driver from scratch (or based on something like 
the vnc X driver) would be easier in the long run.

Regards,

Anthony Liguori


> Cheers,
> Rusty.


* Re: [Xen-devel] Re: More virtio users
       [not found]       ` <466E14A6.6020400@codemonkey.ws>
@ 2007-06-12  4:07         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 25+ messages in thread
From: Benjamin Herrenschmidt @ 2007-06-12  4:07 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: kvm-devel, xen-devel, virtualization


> The X fbdev driver is going to make supporting a new fb ioctl pretty 
> fun.  It currently doesn't even support the existing fb ioctls and has a 
> strange abstraction layer.
> 
> I reckon writing a new X driver from scratch (or based on something like 
> the vnc X driver) would be easier in the long run.

Hrm... adding something to X fbdev should be fairly trivial... in fact,
it already has some sort of ShadowFB support which we could use to
track damage, though I've been told it's less nice than using the damage
extension.
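
As a rough illustration of the granularity problem raised earlier in this
thread (page faults report changes in 4k units of the linear framebuffer),
the sketch below maps one dirty page to the full-width band of scanlines it
touches. It is a user-space model, not X or fbdev code: model_fb, model_rect
and model_page_to_rect are hypothetical names, and a 4096-byte page and a
fixed stride are assumptions.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size, matching the 4k above */

/* Hypothetical description of a linear framebuffer. */
struct model_fb {
    unsigned int width;    /* visible pixels per scanline */
    unsigned int height;   /* number of scanlines */
    unsigned int stride;   /* bytes per scanline */
};

/* Hypothetical damage rectangle, in pixels and scanlines. */
struct model_rect {
    unsigned int x, y, w, h;
};

/*
 * Convert a dirty page index into the band of scanlines it overlaps.
 * A page fault only says "something in this page changed", so the
 * rectangle is always full width. Returns 0 on success, -1 if the
 * page lies outside the framebuffer.
 */
static int model_page_to_rect(const struct model_fb *fb, size_t page,
                              struct model_rect *out)
{
    size_t fb_bytes = (size_t)fb->stride * fb->height;
    size_t start = page * MODEL_PAGE_SIZE;
    size_t end;

    if (fb->stride == 0 || start >= fb_bytes)
        return -1;

    end = start + MODEL_PAGE_SIZE;
    if (end > fb_bytes)
        end = fb_bytes;          /* last page may be partial */

    out->x = 0;
    out->y = (unsigned int)(start / fb->stride);
    out->w = fb->width;
    out->h = (unsigned int)((end - 1) / fb->stride) - out->y + 1;
    return 0;
}
```

With a 2560-byte stride (640 pixels at 4 bytes each, an assumed mode), one
dirty page covers parts of two or three scanlines, so a single changed pixel
still costs a multi-row, full-width update. A rectangle supplied by X itself
would avoid that overhead.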

Ben.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: More virtio users
       [not found]             ` <1181608287.16428.127.camel@localhost.localdomain>
@ 2007-06-12  6:24               ` Jens Axboe
  2007-06-12  7:52                 ` Rusty Russell
       [not found]                 ` <1181634747.6237.79.camel@localhost.localdomain>
  0 siblings, 2 replies; 25+ messages in thread
From: Jens Axboe @ 2007-06-12  6:24 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, Tejun Heo, xen-devel, virtualization

On Tue, Jun 12 2007, Rusty Russell wrote:
> On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote:
> > On Mon, Jun 11 2007, Rusty Russell wrote:
> > > So the problem is that I'd like to handle all of them, but I'm not clear
> > > what requests my device can get.  I can't see a source of any other type
> > > of request.
> > 
> > The other main request type is blk_pc_request(). In the data setup it's
> > > identical to blk_fs_request(), there's a bio chain off ->bio. It's a
> > byte granularity entity though, so you should check ->data_len for the
> > size of it. ->cmd[] holds a SCSI cdb, which is the command you are
> > supposed to handle.
> 
> SCSI?  I'm even more lost now.
> 
> Q: So what *are* the commands?

They are SCSI commands!

> Q: Who puts them in my queue?

If you want to support SG_IO for instance, you'd have to deal with SCSI
commands.

> I have *never* seen anything but an fs request come through to my
> driver.

You probably only did basic IO to it.

> > Perhaps you are misunderstanding what the tag is? The tag is a unique
> > identifier for a pending request, so you will by definition never have
> > requests sharing a tag value.
> 
> Yes, I started to suspect that.
> 
> > But the tag is not to be considered as
> > ordered, unless it has the barrier flag set as well. So you only need to
> > serialize on the device side when blk_barrier_rq() is true.
> 
> blk_barrier_rq(req) is never true.  I put a BUG_ON(blk_barrier_rq(req))
> in my code and booted.  This is using the device as a root ext3
> filesystem.
>
> I reverted all the tag changes, still no barriers.  I added
> "blk_queue_ordered(vblk->disk->queue, QUEUE_ORDERED_DRAIN, NULL);",
> still no barriers.

-o barrier=1 for ext3, it doesn't use barriers by default. Or
barrier=flush for reiserfs. XFS defaults to barriers on.
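
One way to read "serialize on the device side" is sketched below as a small
user-space model. It is not block-layer code: model_req, model_dev and the
model_* functions are hypothetical names, and the policy shown (wait for
everything in flight before a barrier, let nothing pass a barrier that is
still in flight) is an assumption about what a simple backend might do.

```c
#include <stdbool.h>

/* Hypothetical request: the tag identifies it, the flag orders it. */
struct model_req {
    int tag;        /* unique among pending requests, implies no ordering */
    bool barrier;   /* stands in for blk_barrier_rq() being true */
};

/* Hypothetical device state. */
struct model_dev {
    int in_flight;            /* issued but not yet completed */
    bool barrier_in_flight;   /* a barrier request is currently issued */
};

/* May this request be issued to the backend right now? */
static bool model_can_issue(const struct model_dev *dev,
                            const struct model_req *req)
{
    if (dev->barrier_in_flight)
        return false;                 /* nothing may pass a barrier */
    if (req->barrier)
        return dev->in_flight == 0;   /* drain earlier requests first */
    return true;                      /* ordinary requests: any order */
}

/* Record that a request has been handed to the backend. */
static void model_issue(struct model_dev *dev, const struct model_req *req)
{
    dev->in_flight++;
    if (req->barrier)
        dev->barrier_in_flight = true;
}

/* Record that the backend has finished a request. */
static void model_complete(struct model_dev *dev, const struct model_req *req)
{
    dev->in_flight--;
    if (req->barrier)
        dev->barrier_in_flight = false;
}
```

Tags never enter the ordering decision in this model; only the barrier flag
does.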

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: More virtio users
  2007-06-12  6:24               ` Jens Axboe
@ 2007-06-12  7:52                 ` Rusty Russell
       [not found]                 ` <1181634747.6237.79.camel@localhost.localdomain>
  1 sibling, 0 replies; 25+ messages in thread
From: Rusty Russell @ 2007-06-12  7:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kvm-devel, Tejun Heo, xen-devel, virtualization

On Tue, 2007-06-12 at 08:24 +0200, Jens Axboe wrote:
> On Tue, Jun 12 2007, Rusty Russell wrote:
> > On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote:
> > > The other main request type is blk_pc_request(). In the data setup it's
> > > > identical to blk_fs_request(), there's a bio chain off ->bio. It's a
> > > byte granularity entity though, so you should check ->data_len for the
> > > size of it. ->cmd[] holds a SCSI cdb, which is the command you are
> > > supposed to handle.
> > 
> > SCSI?  I'm even more lost now.
> > 
> > Q: So what *are* the commands?
> 
> They are SCSI commands!
> 
> > Q: Who puts them in my queue?
> 
> If you want to support SG_IO for instance, you'd have to deal with SCSI
> commands.

I do not.  If someone wants to implement a SCSI layer over virtio, I
think that's wonderful.  Fortunately, that's not the problem I'm trying
to solve.

> -o barrier=1 for ext3, it doesn't use barriers by default.

That's, um, a little disturbing.

But, it works.  Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: More virtio users
       [not found]                 ` <1181634747.6237.79.camel@localhost.localdomain>
@ 2007-06-12  7:56                   ` Jens Axboe
  0 siblings, 0 replies; 25+ messages in thread
From: Jens Axboe @ 2007-06-12  7:56 UTC (permalink / raw)
  To: Rusty Russell; +Cc: kvm-devel, Tejun Heo, xen-devel, virtualization

On Tue, Jun 12 2007, Rusty Russell wrote:
> On Tue, 2007-06-12 at 08:24 +0200, Jens Axboe wrote:
> > On Tue, Jun 12 2007, Rusty Russell wrote:
> > > On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote:
> > > > The other main request type is blk_pc_request(). In the data setup it's
> > > > > identical to blk_fs_request(), there's a bio chain off ->bio. It's a
> > > > byte granularity entity though, so you should check ->data_len for the
> > > > size of it. ->cmd[] holds a SCSI cdb, which is the command you are
> > > > supposed to handle.
> > > 
> > > SCSI?  I'm even more lost now.
> > > 
> > > Q: So what *are* the commands?
> > 
> > They are SCSI commands!
> > 
> > > Q: Who puts them in my queue?
> > 
> > If you want to support SG_IO for instance, you'd have to deal with SCSI
> > commands.
> 
> I do not.  If someone wants to implement a SCSI layer over virtio, I
> think that's wonderful.  Fortunately, that's not the problem I'm trying
> to solve.

Then you can blissfully ignore blk_pc_request() and just keep your
current code for rejecting !blk_fs_request().
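
For readers following along, the distinction can be sketched in user space
roughly as follows. This is a model, not kernel code: model_request and the
model_* helpers are hypothetical names, the field names only echo the ones
mentioned above, and 512-byte sectors are an assumption.

```c
#include <stdbool.h>

/* Hypothetical request kinds mirroring the two discussed above. */
enum model_req_type {
    MODEL_REQ_FS,   /* filesystem I/O, sector granularity */
    MODEL_REQ_PC    /* packet command carrying a SCSI CDB, byte granularity */
};

struct model_request {
    enum model_req_type type;
    unsigned int nr_sectors;   /* meaningful for MODEL_REQ_FS */
    unsigned int data_len;     /* meaningful for MODEL_REQ_PC, in bytes */
    unsigned char cmd[16];     /* meaningful for MODEL_REQ_PC: the CDB */
};

/* Stand-in for the blk_fs_request() check. */
static bool model_is_fs_request(const struct model_request *rq)
{
    return rq->type == MODEL_REQ_FS;
}

/*
 * Accept filesystem requests and reject everything else.
 * On success, stores the transfer size in bytes and returns 0;
 * returns -1 for any request that is not a filesystem request.
 */
static int model_handle(const struct model_request *rq,
                        unsigned int *bytes_out)
{
    if (!model_is_fs_request(rq))
        return -1;   /* e.g. an SG_IO packet command */

    *bytes_out = rq->nr_sectors * 512u;   /* assumed sector size */
    return 0;
}
```

A driver that does not want to speak SCSI only ever takes the first branch.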

> > -o barrier=1 for ext3, it doesn't use barriers by default.
> 
> That's, um, a little disturbing.
> 
> But, it works.  Thanks!

Well, feel free to send a patch making barrier=1 the default, then I'll
make sure that mails from users that are confused because performance is
suddenly much worse get redirected to you :-)

Kudos to XFS for making it the default, though!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [kvm-devel] More virtio users
       [not found] <466BA965.6050208@qumranet.com>
                   ` (4 preceding siblings ...)
       [not found] ` <1181463220.16428.24.camel@localhost.localdomain>
@ 2007-06-12 22:01 ` Arnd Bergmann
  2007-06-14 16:27   ` [Xen-devel] " Mark Williamson
  5 siblings, 1 reply; 25+ messages in thread
From: Arnd Bergmann @ 2007-06-12 22:01 UTC (permalink / raw)
  To: kvm-devel; +Cc: xen-devel, virtualization

On Sunday 10 June 2007, Avi Kivity wrote:
> There are probably more.  Any ideas?

* watchdog timer
* tty ports (not just console) to attach to via host socket
* alsa
* hostfs (UML like)

	Arnd <><

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [kvm-devel] [Xen-devel] More virtio users
       [not found]   ` <466BB1AF.1000601@qumranet.com>
@ 2007-06-12 22:07     ` Arnd Bergmann
  2007-06-12 23:40       ` Caitlin Bestler
  0 siblings, 1 reply; 25+ messages in thread
From: Arnd Bergmann @ 2007-06-12 22:07 UTC (permalink / raw)
  To: kvm-devel; +Cc: xen-devel, virtualization

On Sunday 10 June 2007, Avi Kivity wrote:
> > - PCI (or your favorite HW bus) passthrough, for your favorite oddball
> >   device (e.g., crypto-accelerators).
> >   
> Won't all high-bandwidth traffic be through dma, bypassing virtio?

It can be done, but you'd also need a passthrough for the IOMMU
in that case, and you get a potential security hole: if a malicious
guest is smart enough to figure out IOMMU mappings from the device
to memory owned by the host.

	Arnd <><

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [kvm-devel] [Xen-devel] More virtio users
  2007-06-12 22:07     ` [kvm-devel] " Arnd Bergmann
@ 2007-06-12 23:40       ` Caitlin Bestler
  0 siblings, 0 replies; 25+ messages in thread
From: Caitlin Bestler @ 2007-06-12 23:40 UTC (permalink / raw)
  To: Arnd Bergmann, kvm-devel; +Cc: xen-devel, virtualization

virtualization-bounces@lists.linux-foundation.org wrote:
> On Sunday 10 June 2007, Avi Kivity wrote:
>>> - PCI (or your favorite HW bus) passthrough, for your favorite
>>> oddball   device (e.g., crypto-accelerators).
>>> 
>> Won't all high-bandwidth traffic be through dma, bypassing virtio?
> 
> It can be done, but you'd also need a passthrough for the
> IOMMU in that case, and you get a potential security hole: if
> a malicious guest is smart enough to figure out IOMMU
> mappings from the device to memory owned by the host.
> 

If it is possible for a malicious guest to use the IOMMU
to access memory that was not assigned to it then either
the Hypervisor is not really a Hypervisor or the IOMMU
is not really an IOMMU.

The only real difference between enabling DMA and providing
IO buffers is the duration. The security implications are
identical.
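
The property being relied on here can be sketched as a toy translation
table: a device access succeeds only if it falls entirely inside a window
that was mapped for the guest. This is a user-space model under stated
assumptions, not any real IOMMU or hypervisor interface; model_iommu,
model_iommu_map and model_iommu_translate are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One window the hypervisor has mapped for a guest's device. */
struct model_iommu_map {
    uint64_t iova;   /* start address as seen by the device */
    uint64_t len;    /* window length in bytes */
    uint64_t phys;   /* start of the backing host memory */
};

/* The complete set of windows for one device. */
struct model_iommu {
    const struct model_iommu_map *maps;
    size_t nr_maps;
};

/*
 * Translate a device access of len bytes starting at iova.
 * Returns true and stores the host address only if the whole access
 * lies inside a single mapped window; anything else is a fault.
 */
static bool model_iommu_translate(const struct model_iommu *mmu,
                                  uint64_t iova, uint64_t len,
                                  uint64_t *phys_out)
{
    size_t i;

    if (len == 0)
        return false;

    for (i = 0; i < mmu->nr_maps; i++) {
        const struct model_iommu_map *m = &mmu->maps[i];

        /* Written to avoid overflow in iova + len. */
        if (iova >= m->iova && len <= m->len &&
            iova - m->iova <= m->len - len) {
            *phys_out = m->phys + (iova - m->iova);
            return true;
        }
    }
    return false;   /* not mapped for this guest: the access faults */
}
```

In this model the guest can pick any device address it likes, but it cannot
reach memory that no window describes.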

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Xen-devel] Re: [kvm-devel] More virtio users
  2007-06-12 22:01 ` [kvm-devel] " Arnd Bergmann
@ 2007-06-14 16:27   ` Mark Williamson
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Williamson @ 2007-06-14 16:27 UTC (permalink / raw)
  To: xen-devel; +Cc: kvm-devel, Arnd Bergmann, virtualization

I've not reviewed the virtio patches but think I've gathered the gist of what 
they're doing (puppies would probably help here)...

> On Sunday 10 June 2007, Avi Kivity wrote:
> > There are probably more.  Any ideas?
>
> * watchdog timer

I recently knocked up a watchdog timer for Xen.  The Linux-side code is based 
on the softdog implementation, but the actual timer implementation is within 
Xen itself.

It'd be fairly trivial to make something like this call out to a variety of 
potential paravirt implementations - much of the code was effectively 
boilerplate (setting up char device, dealing with magic close character, 
etc).

Taking this further, it'd be quite easy to make a "null" implementation that 
does what softdog did.
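
The boilerplate in question can be sketched as a small user-space model of
the pattern: writes re-arm the timer, and the device only disarms on close
if the magic character 'V' was written first. This is an illustration, not
the softdog or Xen code; model_wdt and the model_wdt_* functions are
hypothetical names, and time is passed in as a plain counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical watchdog state. */
struct model_wdt {
    bool armed;              /* timer is running */
    bool expect_close;       /* magic character seen in the last write */
    unsigned long deadline;  /* time at which the watchdog fires */
    unsigned long timeout;   /* interval added on every ping */
};

/* Opening the device arms the timer. */
static void model_wdt_open(struct model_wdt *w, unsigned long now)
{
    w->armed = true;
    w->expect_close = false;
    w->deadline = now + w->timeout;
}

/* Any non-empty write is a ping; a 'V' in it permits a clean close. */
static void model_wdt_write(struct model_wdt *w, const char *buf,
                            size_t len, unsigned long now)
{
    size_t i;

    if (len == 0)
        return;

    w->expect_close = false;
    for (i = 0; i < len; i++)
        if (buf[i] == 'V')
            w->expect_close = true;

    w->deadline = now + w->timeout;
}

/* Closing disarms only after the magic character was written. */
static void model_wdt_release(struct model_wdt *w)
{
    if (w->expect_close)
        w->armed = false;
    w->expect_close = false;
}

/* Would the watchdog have fired by time now? */
static bool model_wdt_expired(const struct model_wdt *w, unsigned long now)
{
    return w->armed && now >= w->deadline;
}
```

A paravirtual backend would replace only the part that decides what happens
on expiry; the open, write and release handling stays the same.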

Watchdogs grow from puppies, so I'd think this would suit Rusty quite well ;-)

> * tty ports (not just console) to attach to via host socket
> * alsa
> * hostfs (UML like)

My XenFS project is somewhat like hostfs but it's a) a bit hairy and b) not 
done yet :-)  A simpler hostfs-style filesystem would be useful and a better 
target for testing out virtio.

The main thing XenFS would want that I guess other virtio devices mightn't 
want is support for persistently sharing memory between virtual machines...  
In the Xen case this is less trivial than the UML hostfs case.  It's not 
*needed* yet, I just thought I'd throw it out there as a potential future 
thing.  Presumably virtual block devices that do memory sharing might need 
this sort of facility too.

Cheers,
Mark

-- 
Dave: Just a question. What use is a unicycle with no seat?  And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

^ permalink raw reply	[flat|nested] 25+ messages in thread

