* [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
@ 2015-03-11 5:59 Rusty Russell
2015-03-11 5:59 ` [Qemu-devel] [PATCH 1/2] virtio: make it clear that "len" for a used descriptor is len written Rusty Russell
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Rusty Russell @ 2015-03-11 5:59 UTC (permalink / raw)
To: QEMU Developers, Michael S. Tsirkin; +Cc: Rusty Russell
The virtio 'used' ring describes descriptors which have been used. It
also says how many bytes have been written to the ring. For some cases,
this value is ignored by Linux guests, thus errors have not been noticed.
I was working on increasing the checking in Linux when I noticed this
behaviour.
The first patch changes the 'len' formal parameter name to 'len_written' to
make the API clearer, and adds an assert(). The second fixes block writes.
Cheers,
Rusty.
PS. It's based on MST's virtio-1.0 tree, but should be easily ported.
Rusty Russell (2):
virtio: make it clear that "len" for a used descriptor is len written.
virtio-blk: fix length calculations for write operations.
hw/block/virtio-blk.c | 9 ++++++++-
hw/virtio/virtio.c | 19 ++++++++++++-------
include/hw/virtio/virtio.h | 4 ++--
3 files changed, 22 insertions(+), 10 deletions(-)
--
2.1.0
* [Qemu-devel] [PATCH 1/2] virtio: make it clear that "len" for a used descriptor is len written.
2015-03-11 5:59 [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Rusty Russell
@ 2015-03-11 5:59 ` Rusty Russell
2015-03-11 5:59 ` [Qemu-devel] [PATCH 2/2] virtio-blk: fix length calculations for write operations Rusty Russell
` (2 subsequent siblings)
3 siblings, 0 replies; 19+ messages in thread
From: Rusty Russell @ 2015-03-11 5:59 UTC (permalink / raw)
To: QEMU Developers, Michael S. Tsirkin; +Cc: Rusty Russell
And enforce this with a check that it's <= the writable length.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
hw/virtio/virtio.c | 19 ++++++++++++-------
include/hw/virtio/virtio.h | 4 ++--
2 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 882a31b..c944113 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -243,16 +243,21 @@ int virtio_queue_empty(VirtQueue *vq)
}
void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
- unsigned int len, unsigned int idx)
+ unsigned int len_written, unsigned int idx)
{
- unsigned int offset;
+ unsigned int offset, tot_wlen;
int i;
- trace_virtqueue_fill(vq, elem, len, idx);
+ trace_virtqueue_fill(vq, elem, len_written, idx);
+
+ for (tot_wlen = i = 0; i < elem->in_num; i++) {
+ tot_wlen += elem->in_sg[i].iov_len;
+ }
+ assert(len_written <= tot_wlen);
offset = 0;
for (i = 0; i < elem->in_num; i++) {
- size_t size = MIN(len - offset, elem->in_sg[i].iov_len);
+ size_t size = MIN(len_written - offset, elem->in_sg[i].iov_len);
cpu_physical_memory_unmap(elem->in_sg[i].iov_base,
elem->in_sg[i].iov_len,
@@ -270,7 +275,7 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
/* Get a pointer to the next entry in the used ring. */
vring_used_ring_id(vq, idx, elem->index);
- vring_used_ring_len(vq, idx, len);
+ vring_used_ring_len(vq, idx, len_written);
}
void virtqueue_flush(VirtQueue *vq, unsigned int count)
@@ -288,9 +293,9 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
}
void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
- unsigned int len)
+ unsigned int len_written)
{
- virtqueue_fill(vq, elem, len, 0);
+ virtqueue_fill(vq, elem, len_written, 0);
virtqueue_flush(vq, 1);
}
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index df09993..153374f 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -191,10 +191,10 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
void virtio_del_queue(VirtIODevice *vdev, int n);
void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
- unsigned int len);
+ unsigned int len_written);
void virtqueue_flush(VirtQueue *vq, unsigned int count);
void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
- unsigned int len, unsigned int idx);
+ unsigned int len_written, unsigned int idx);
void virtqueue_map_sg(struct iovec *sg, hwaddr *addr,
size_t num_sg, int is_write);
--
2.1.0
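
The check this patch adds to virtqueue_fill() can be read on its own as: the length reported in the used ring must never exceed the total size of the device-writable ("in") segments. The following is a minimal standalone sketch of that invariant, not QEMU code. The helper names total_writable_len() and used_len_is_valid() are made up for illustration, and a plain struct iovec array stands in for elem->in_sg.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical helper: sum the lengths of the device-writable segments. */
static size_t total_writable_len(const struct iovec *in_sg, unsigned int in_num)
{
    size_t total = 0;
    for (unsigned int i = 0; i < in_num; i++) {
        total += in_sg[i].iov_len;
    }
    return total;
}

/* Hypothetical helper: returns 1 if len_written is a legal used-ring
 * length for this element, 0 if it claims more than could be written. */
static int used_len_is_valid(const struct iovec *in_sg, unsigned int in_num,
                             size_t len_written)
{
    return len_written <= total_writable_len(in_sg, in_num);
}
```

With two writable segments of 16 and 8 bytes, any len_written up to 24 passes and 25 would trip the assert() in the patch.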
* [Qemu-devel] [PATCH 2/2] virtio-blk: fix length calculations for write operations.
2015-03-11 5:59 [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Rusty Russell
2015-03-11 5:59 ` [Qemu-devel] [PATCH 1/2] virtio: make it clear that "len" for a used descriptor is len written Rusty Russell
@ 2015-03-11 5:59 ` Rusty Russell
2015-03-11 6:48 ` Michael S. Tsirkin
2015-03-11 6:19 ` [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Michael S. Tsirkin
2015-03-18 12:32 ` Michael S. Tsirkin
3 siblings, 1 reply; 19+ messages in thread
From: Rusty Russell @ 2015-03-11 5:59 UTC (permalink / raw)
To: QEMU Developers, Michael S. Tsirkin; +Cc: Rusty Russell
We only fill in the 'req->qiov.size' bytes on a (successful) read,
not on a write.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
hw/block/virtio-blk.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 258bb4c..98d87a9 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -50,11 +50,19 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
{
VirtIOBlock *s = req->dev;
VirtIODevice *vdev = VIRTIO_DEVICE(s);
+ int type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
trace_virtio_blk_req_complete(req, status);
stb_p(&req->in->status, status);
- virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in));
+
+ /* If we didn't succeed, we *may* have written more, but don't
+ * count on it. */
+ if (type == VIRTIO_BLK_T_IN && status == VIRTIO_BLK_S_OK) {
+ virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in));
+ } else {
+ virtqueue_push(s->vq, &req->elem, sizeof(*req->in));
+ }
virtio_notify(vdev, s->vq);
}
--
2.1.0
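
The length rule in this patch can be sketched as a pure function: only a successful read counts the data bytes, and every other case reports just the status trailer. This is an illustrative sketch, not the QEMU implementation. blk_used_len() and STATUS_LEN are hypothetical names, the constants are redefined locally with the values from the virtio-blk specification, and the status trailer is assumed to be one byte (sizeof(*req->in) in the patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as given in the virtio-blk specification, redefined here so the
 * sketch is self-contained. */
#define VIRTIO_BLK_T_IN    0  /* read: device writes data to the guest */
#define VIRTIO_BLK_T_OUT   1  /* write: device only reads guest data */
#define VIRTIO_BLK_S_OK    0
#define VIRTIO_BLK_S_IOERR 1

/* Assumption: the device-written trailer is a single status byte. */
#define STATUS_LEN 1

/* Hypothetical helper: bytes the device may claim to have written. */
static size_t blk_used_len(uint32_t type, uint8_t status, size_t data_len)
{
    if (type == VIRTIO_BLK_T_IN && status == VIRTIO_BLK_S_OK) {
        /* Successful read: data plus the status byte were written. */
        return data_len + STATUS_LEN;
    }
    /* Writes and failed requests: only the status byte is guaranteed. */
    return STATUS_LEN;
}
```

For a 512-byte request this gives 513 for a successful read and 1 for a write or a failed read.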
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-11 5:59 [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Rusty Russell
2015-03-11 5:59 ` [Qemu-devel] [PATCH 1/2] virtio: make it clear that "len" for a used descriptor is len written Rusty Russell
2015-03-11 5:59 ` [Qemu-devel] [PATCH 2/2] virtio-blk: fix length calculations for write operations Rusty Russell
@ 2015-03-11 6:19 ` Michael S. Tsirkin
2015-03-11 6:47 ` Fam Zheng
2015-03-18 12:32 ` Michael S. Tsirkin
3 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-11 6:19 UTC (permalink / raw)
To: Rusty Russell; +Cc: QEMU Developers, stefanha
On Wed, Mar 11, 2015 at 04:29:30PM +1030, Rusty Russell wrote:
> The virtio 'used' ring describes descriptors which have been used. It
> also says how many bytes have been written to the ring. For some cases,
> this value is ignored by Linux guests, thus errors have not been noticed.
> I was working on increasing the checking in Linux when I noticed this
> behaviour.
>
> The first patch changes the 'len' formal parameter name to 'len_written' to
> make the API clearer, and adds an assert(). The second fixes block writes.
>
> Cheers,
> Rusty.
> PS. It's based on MST's virtio-1.0 tree, but should be easily ported.
Thanks, this applies to current master without issues.
However, I think it's best to apply patch 2, then patch 1,
to avoid triggering errors when bisecting.
> Rusty Russell (2):
> virtio: make it clear that "len" for a used descriptor is len written.
> virtio-blk: fix length calculations for write operations.
>
> hw/block/virtio-blk.c | 9 ++++++++-
> hw/virtio/virtio.c | 19 ++++++++++++-------
> include/hw/virtio/virtio.h | 4 ++--
> 3 files changed, 22 insertions(+), 10 deletions(-)
>
> --
> 2.1.0
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-11 6:19 ` [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Michael S. Tsirkin
@ 2015-03-11 6:47 ` Fam Zheng
2015-03-11 6:50 ` Michael S. Tsirkin
0 siblings, 1 reply; 19+ messages in thread
From: Fam Zheng @ 2015-03-11 6:47 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Rusty Russell, QEMU Developers, stefanha
On Wed, 03/11 07:19, Michael S. Tsirkin wrote:
> On Wed, Mar 11, 2015 at 04:29:30PM +1030, Rusty Russell wrote:
> > The virtio 'used' ring describes descriptors which have been used. It
> > also says how many bytes have been written to the ring. For some cases,
> > this value is ignored by Linux guests, thus errors have not been noticed.
> > I was working on increasing the checking in Linux when I noticed this
> > behaviour.
> >
> > The first patch changes the 'len' formal parameter name to 'len_written' to
> > make the API clearer, and adds an assert(). The second fixes block writes.
> >
> > Cheers,
> > Rusty.
> > PS. It's based on MST's virtio-1.0 tree, but should be easily ported.
>
> Thanks, this applies to current master without issues.
> However, I think it's best to apply patch 2, then patch 1,
> to avoid triggering errors when bisecting.
I'm seeing a make check failure. If this is a false alarm, the test should be
fixed too.
---
qemu-system-x86_64: /var/tmp/patchew-test/git/hw/virtio/virtio.c:254: virtqueue_fill: Assertion `len_written <= tot_wlen' failed.
Broken pipe
GTester: last random seed: R02Se642bf29179ebe0c4a92eb02cc488dd8
[vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio extension. Task offloads will be emulated.
make: *** [check-qtest-x86_64] Error 1
* Re: [Qemu-devel] [PATCH 2/2] virtio-blk: fix length calculations for write operations.
2015-03-11 5:59 ` [Qemu-devel] [PATCH 2/2] virtio-blk: fix length calculations for write operations Rusty Russell
@ 2015-03-11 6:48 ` Michael S. Tsirkin
2015-03-11 11:34 ` Rusty Russell
0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-11 6:48 UTC (permalink / raw)
To: Rusty Russell; +Cc: QEMU Developers
On Wed, Mar 11, 2015 at 04:29:32PM +1030, Rusty Russell wrote:
> We only fill in the 'req->qiov.size' bytes on a (successful) read,
> not on a write.
>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> ---
> hw/block/virtio-blk.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 258bb4c..98d87a9 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -50,11 +50,19 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
> {
> VirtIOBlock *s = req->dev;
> VirtIODevice *vdev = VIRTIO_DEVICE(s);
> + int type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
>
> trace_virtio_blk_req_complete(req, status);
>
> stb_p(&req->in->status, status);
> - virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in));
> +
> + /* If we didn't succeed, we *may* have written more, but don't
> + * count on it. */
I wonder about this.
So length as you specify it is <= actually written length.
What are the advantages of this approach?
How about we do the reverse, specify that the length in descriptor
is >= the size actually written?
If we do this, all these buggy hosts suddenly become correct,
which seems better.
> + if (type == VIRTIO_BLK_T_IN && status == VIRTIO_BLK_S_OK) {
> + virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in));
> + } else {
> + virtqueue_push(s->vq, &req->elem, sizeof(*req->in));
> + }
> virtio_notify(vdev, s->vq);
> }
>
> --
> 2.1.0
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-11 6:47 ` Fam Zheng
@ 2015-03-11 6:50 ` Michael S. Tsirkin
2015-03-11 11:36 ` Rusty Russell
0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-11 6:50 UTC (permalink / raw)
To: Fam Zheng; +Cc: Rusty Russell, QEMU Developers, stefanha
On Wed, Mar 11, 2015 at 02:47:47PM +0800, Fam Zheng wrote:
> On Wed, 03/11 07:19, Michael S. Tsirkin wrote:
> > On Wed, Mar 11, 2015 at 04:29:30PM +1030, Rusty Russell wrote:
> > > The virtio 'used' ring describes descriptors which have been used. It
> > > also says how many bytes have been written to the ring. For some cases,
> > > this value is ignored by Linux guests, thus errors have not been noticed.
> > > I was working on increasing the checking in Linux when I noticed this
> > > behaviour.
> > >
> > > The first patch changes the 'len' formal parameter name to 'len_written' to
> > > make the API clearer, and adds an assert(). The second fixes block writes.
> > >
> > > Cheers,
> > > Rusty.
> > > PS. It's based on MST's virtio-1.0 tree, but should be easily ported.
> >
> > Thanks, this applies to current master without issues.
> > However, I think it's best to apply patch 2, then patch 1,
> > to avoid triggering errors when bisecting.
>
> I'm seeing a make check failure. If this is a false alarm, the test should be
> fixed too.
Yeah, I'm now also thinking we need a spec clarification on this one, and
some testing with non-Linux drivers, before jumping to changing hosts and
guests.
> ---
>
> qemu-system-x86_64: /var/tmp/patchew-test/git/hw/virtio/virtio.c:254: virtqueue_fill: Assertion `len_written <= tot_wlen' failed.
> Broken pipe
> GTester: last random seed: R02Se642bf29179ebe0c4a92eb02cc488dd8
> [vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio extension. Task offloads will be emulated.
> make: *** [check-qtest-x86_64] Error 1
* Re: [Qemu-devel] [PATCH 2/2] virtio-blk: fix length calculations for write operations.
2015-03-11 6:48 ` Michael S. Tsirkin
@ 2015-03-11 11:34 ` Rusty Russell
0 siblings, 0 replies; 19+ messages in thread
From: Rusty Russell @ 2015-03-11 11:34 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: QEMU Developers
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Wed, Mar 11, 2015 at 04:29:32PM +1030, Rusty Russell wrote:
>> We only fill in the 'req->qiov.size' bytes on a (successful) read,
>> not on a write.
>>
>> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
>> ---
>> hw/block/virtio-blk.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> index 258bb4c..98d87a9 100644
>> --- a/hw/block/virtio-blk.c
>> +++ b/hw/block/virtio-blk.c
>> @@ -50,11 +50,19 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
>> {
>> VirtIOBlock *s = req->dev;
>> VirtIODevice *vdev = VIRTIO_DEVICE(s);
>> + int type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
>>
>> trace_virtio_blk_req_complete(req, status);
>>
>> stb_p(&req->in->status, status);
>> - virtqueue_push(s->vq, &req->elem, req->qiov.size + sizeof(*req->in));
>> +
>> + /* If we didn't succeed, we *may* have written more, but don't
>> + * count on it. */
>
> I wonder about this.
> So length as you specify it is <= actually written length.
> What are the advantages of this approach?
> How about we do the reverse, specify that the length in descriptor
> is >= the size actually written?
>
> If we do this, all these buggy hosts suddenly become correct,
> which seems better.
The point of telling the guest the amount written is that they don't
have to zero the receive buffer beforehand.
Cheers,
Rusty.
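
The point about not zeroing the receive buffer can be made concrete with a toy model. In the sketch below, nothing is taken from QEMU or Linux: device_fill(), stale_bytes_exposed() and the STALE marker byte are hypothetical, and the "device" is just a memset(). It shows that a driver which skips zeroing and trusts the reported length exposes leftover bytes exactly when the device overstates how much it wrote.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STALE 0xAA  /* stand-in for leftover data in an unzeroed buffer */

/* Hypothetical device model: writes `actual` payload bytes (0x11) into
 * buf, then reports `reported` as the used-ring length. */
static size_t device_fill(unsigned char *buf, size_t actual, size_t reported)
{
    memset(buf, 0x11, actual);
    return reported;
}

/* Count stale bytes within the first used_len bytes, i.e. the bytes a
 * driver would pass on if it trusted used_len without zeroing first. */
static size_t stale_bytes_exposed(const unsigned char *buf, size_t used_len)
{
    size_t n = 0;
    for (size_t i = 0; i < used_len; i++) {
        if (buf[i] == STALE) {
            n++;
        }
    }
    return n;
}
```

If the device writes 4 bytes and reports 4, nothing stale is exposed. If it writes 4 and reports 8, four stale bytes would be passed on, which is the case an underestimate avoids.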
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-11 6:50 ` Michael S. Tsirkin
@ 2015-03-11 11:36 ` Rusty Russell
2015-03-11 12:39 ` Michael S. Tsirkin
0 siblings, 1 reply; 19+ messages in thread
From: Rusty Russell @ 2015-03-11 11:36 UTC (permalink / raw)
To: Michael S. Tsirkin, Fam Zheng; +Cc: QEMU Developers, stefanha
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Wed, Mar 11, 2015 at 02:47:47PM +0800, Fam Zheng wrote:
>> On Wed, 03/11 07:19, Michael S. Tsirkin wrote:
>> > On Wed, Mar 11, 2015 at 04:29:30PM +1030, Rusty Russell wrote:
>> > > The virtio 'used' ring describes descriptors which have been used. It
>> > > also says how many bytes have been written to the ring. For some cases,
>> > > this value is ignored by Linux guests, thus errors have not been noticed.
>> > > I was working on increasing the checking in Linux when I noticed this
>> > > behaviour.
>> > >
>> > > The first patch changes the 'len' formal parameter name to 'len_written' to
>> > > make the API clearer, and adds an assert(). The second fixes block writes.
>> > >
>> > > Cheers,
>> > > Rusty.
>> > > PS. It's based on MST's virtio-1.0 tree, but should be easily ported.
>> >
>> > Thanks, this applies to current master without issues.
>> > However, I think it's best to apply patch 2, then patch 1,
>> > to avoid triggering errors when bisecting.
>>
>> I'm seeing a make check failure. If this is a false alarm, the test should be
>> fixed too.
>
> Yea, I'm also now thinking we need a spec clarification on this one, and
> some testing with non linux drivers before jumping to changing hosts and
> guests.
The spec is very clear. The implementation is crap; let's fix it before
1.0.
Quote:
Each entry in the ring is a pair: \field{id} indicates the head
entry of the descriptor chain describing the buffer (this
matches an entry placed in the available ring by the guest
earlier), and \field{len} the total of bytes written into the
buffer. The latter is extremely useful for drivers using
untrusted buffers: if you do not know exactly how much has been
written by the device, you usually have to zero the buffer to
ensure no data leakage occurs.
I have a patch for the Linux side, too, which warns once per device
and fixes it up. I will make the warning conditional on v1.0.
Cheers,
Rusty.
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-11 11:36 ` Rusty Russell
@ 2015-03-11 12:39 ` Michael S. Tsirkin
2015-03-12 1:04 ` Rusty Russell
0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-11 12:39 UTC (permalink / raw)
To: Rusty Russell; +Cc: Fam Zheng, QEMU Developers, stefanha
On Wed, Mar 11, 2015 at 10:06:40PM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Wed, Mar 11, 2015 at 02:47:47PM +0800, Fam Zheng wrote:
> >> On Wed, 03/11 07:19, Michael S. Tsirkin wrote:
> >> > On Wed, Mar 11, 2015 at 04:29:30PM +1030, Rusty Russell wrote:
> >> > > The virtio 'used' ring describes descriptors which have been used. It
> >> > > also says how many bytes have been written to the ring. For some cases,
> >> > > this value is ignored by Linux guests, thus errors have not been noticed.
> >> > > I was working on increasing the checking in Linux when I noticed this
> >> > > behaviour.
> >> > >
> >> > > The first patch changes the 'len' formal parameter name to 'len_written' to
> >> > > make the API clearer, and adds an assert(). The second fixes block writes.
> >> > >
> >> > > Cheers,
> >> > > Rusty.
> >> > > PS. It's based on MST's virtio-1.0 tree, but should be easily ported.
> >> >
> >> > Thanks, this applies to current master without issues.
> >> > However, I think it's best to apply patch 2, then patch 1,
> >> > to avoid triggering errors when bisecting.
> >>
> >> I'm seeing a make check failure. If this is a false alarm, the test should be
> >> fixed too.
> >
> > Yea, I'm also now thinking we need a spec clarification on this one, and
> > some testing with non linux drivers before jumping to changing hosts and
> > guests.
>
> The spec is very clear. The implementation is crap; let's fix it before
> 1.0.
>
> Quote:
>
> Each entry in the ring is a pair: \field{id} indicates the head
> entry of the descriptor chain describing the buffer (this
> matches an entry placed in the available ring by the guest
> earlier), and \field{len} the total of bytes written into the
> buffer. The latter is extremely useful for drivers using
> untrusted buffers: if you do not know exactly how much has been
> written by the device, you usually have to zero the buffer to
> ensure no data leakage occurs.
Right so what does this "if you do not know exactly how much has been
written by the device" mean?
> I have a patch for the Linux side, too, which warns once per device
> and fixes it up. I will make the warning conditional on v1.0.
>
> Cheers,
> Rusty.
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-11 12:39 ` Michael S. Tsirkin
@ 2015-03-12 1:04 ` Rusty Russell
2015-03-12 6:35 ` Michael S. Tsirkin
0 siblings, 1 reply; 19+ messages in thread
From: Rusty Russell @ 2015-03-12 1:04 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Fam Zheng, QEMU Developers, stefanha
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Wed, Mar 11, 2015 at 10:06:40PM +1030, Rusty Russell wrote:
>> Each entry in the ring is a pair: \field{id} indicates the head
>> entry of the descriptor chain describing the buffer (this
>> matches an entry placed in the available ring by the guest
>> earlier), and \field{len} the total of bytes written into the
>> buffer. The latter is extremely useful for drivers using
>> untrusted buffers: if you do not know exactly how much has been
>> written by the device, you usually have to zero the buffer to
>> ensure no data leakage occurs.
>
> Right so what does this "if you do not know exactly how much has been
> written by the device" mean?
It means "without this feature, you would not know how much has been
written by the device"...
Should probably become a Note:
Cheers,
Rusty.
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-12 1:04 ` Rusty Russell
@ 2015-03-12 6:35 ` Michael S. Tsirkin
2015-03-13 1:17 ` Rusty Russell
0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-12 6:35 UTC (permalink / raw)
To: Rusty Russell; +Cc: Fam Zheng, QEMU Developers, stefanha
On Thu, Mar 12, 2015 at 11:34:35AM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Wed, Mar 11, 2015 at 10:06:40PM +1030, Rusty Russell wrote:
> >> Each entry in the ring is a pair: \field{id} indicates the head
> >> entry of the descriptor chain describing the buffer (this
> >> matches an entry placed in the available ring by the guest
> >> earlier), and \field{len} the total of bytes written into the
> >> buffer. The latter is extremely useful for drivers using
> >> untrusted buffers: if you do not know exactly how much has been
> >> written by the device, you usually have to zero the buffer to
> >> ensure no data leakage occurs.
> >
> > Right so what does this "if you do not know exactly how much has been
> > written by the device" mean?
>
> It means "without this feature, you would not know how much has been
> written by the device"...
So imagine a situation where the device does not know for sure
how much was written, like here.
Should it set len to the value that was written for sure?
Or to the value that was possibly written?
Also, e.g. RX in virtio-net really depends on len for correctness,
not just as an optimization as the above text implies.
Looks like we need to define it specifically, per device.
> Should probably become a Note:
>
> Cheers,
> Rusty.
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-12 6:35 ` Michael S. Tsirkin
@ 2015-03-13 1:17 ` Rusty Russell
2015-03-13 13:49 ` Michael S. Tsirkin
0 siblings, 1 reply; 19+ messages in thread
From: Rusty Russell @ 2015-03-13 1:17 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Fam Zheng, QEMU Developers, stefanha
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Thu, Mar 12, 2015 at 11:34:35AM +1030, Rusty Russell wrote:
>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> > On Wed, Mar 11, 2015 at 10:06:40PM +1030, Rusty Russell wrote:
>> >> Each entry in the ring is a pair: \field{id} indicates the head
>> >> entry of the descriptor chain describing the buffer (this
>> >> matches an entry placed in the available ring by the guest
>> >> earlier), and \field{len} the total of bytes written into the
>> >> buffer. The latter is extremely useful for drivers using
>> >> untrusted buffers: if you do not know exactly how much has been
>> >> written by the device, you usually have to zero the buffer to
>> >> ensure no data leakage occurs.
>> >
>> > Right so what does this "if you do not know exactly how much has been
>> > written by the device" mean?
>>
>> It means "without this feature, you would not know how much has been
>> written by the device"...
>
> So imagine a situation where device does not know for sure
> how much was written, like here.
> Should it set len to value that was written for sure?
> Or to value that was possibly written?
In this particular case, it doesn't matter since the failure is marked.
In general, as the stated purpose of 'len' is to avoid guest
receive-buffer zeroing, it is implied that it must not overestimate.
Imagine the case of a guest user process receiving network packets. If
the net device says it's written 1000 bytes (but it hasn't) we will hand
1000 bytes of uninitialized kernel memory to that process.
Here's my proposed spec patch, which spells this out:
diff --git a/content.tex b/content.tex
index 6ba079d..b6345a8 100644
--- a/content.tex
+++ b/content.tex
@@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
Each entry in the ring is a pair: \field{id} indicates the head entry of the
descriptor chain describing the buffer (this matches an entry
placed in the available ring by the guest earlier), and \field{len} the total
-of bytes written into the buffer. The latter is extremely useful
+of bytes written into the buffer.
+
+\begin{note}
+\field{len} is extremely useful
for drivers using untrusted buffers: if you do not know exactly
-how much has been written by the device, you usually have to zero
-the buffer to ensure no data leakage occurs.
+how much has been written by the device, a driver would have to zero
+the buffer in advance to ensure no data leakage occurs.
+
+For example, a network driver may hand a received buffer directly to
+an unprivileged userspace application. If the network device has not
+overwritten the bytes which were in that buffer, this may leak the
+contents of freed memory from other processes to the application.
+\end{note}
\begin{note}
The legacy \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]}
@@ -612,6 +621,19 @@ the constant as VRING_USED_F_NO_NOTIFY, but the layout and value were
identical.
\end{note}
+\devicenormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
+
+The device MUST set \field{len} to the number of bytes known to be
+written to the descriptor, beginning at the first device-writable
+buffer.
+
+\begin{note}
+There are potential error cases where a device might not know what
+parts of the buffers have been written. In this case \field{len} may
+be an underestimate, but that's preferable to the driver believing
+that uninitialized memory has been overwritten when it has not.
+\end{note}
+
\subsection{Virtqueue Notification Suppression}\label{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Notification Suppression}
The device can suppress notifications in a manner analogous to the way
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-13 1:17 ` Rusty Russell
@ 2015-03-13 13:49 ` Michael S. Tsirkin
2015-03-16 3:14 ` Rusty Russell
0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-13 13:49 UTC (permalink / raw)
To: Rusty Russell; +Cc: Fam Zheng, QEMU Developers, stefanha
On Fri, Mar 13, 2015 at 11:47:18AM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Thu, Mar 12, 2015 at 11:34:35AM +1030, Rusty Russell wrote:
> >> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >> > On Wed, Mar 11, 2015 at 10:06:40PM +1030, Rusty Russell wrote:
> >> >> Each entry in the ring is a pair: \field{id} indicates the head
> >> >> entry of the descriptor chain describing the buffer (this
> >> >> matches an entry placed in the available ring by the guest
> >> >> earlier), and \field{len} the total of bytes written into the
> >> >> buffer. The latter is extremely useful for drivers using
> >> >> untrusted buffers: if you do not know exactly how much has been
> >> >> written by the device, you usually have to zero the buffer to
> >> >> ensure no data leakage occurs.
> >> >
> >> > Right so what does this "if you do not know exactly how much has been
> >> > written by the device" mean?
> >>
> >> It means "without this feature, you would not know how much has been
> >> written by the device"...
> >
> > So imagine a situation where device does not know for sure
> > how much was written, like here.
> > Should it set len to value that was written for sure?
> > Or to value that was possibly written?
>
> In this particular case, it doesn't matter since the failure is marked.
>
> In general, as the stated purpose of 'len' is to avoid guest
> receive-buffer zeroing, it is implied that it must not overestimate.
>
> Imagine the case of a guest user process receiving network packets. If
> the net device says it's written 1000 bytes (but it hasn't) we will hand
> 1000 bytes of uninitialized kernel memory to that process.
Finally, I think I understand. Thanks for your patience.
> Here's my proposed spec patch, which spells this out:
>
> diff --git a/content.tex b/content.tex
> index 6ba079d..b6345a8 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
> Each entry in the ring is a pair: \field{id} indicates the head entry of the
> descriptor chain describing the buffer (this matches an entry
> placed in the available ring by the guest earlier), and \field{len} the total
> -of bytes written into the buffer. The latter is extremely useful
> +of bytes written into the buffer.
> +
> +\begin{note}
> +\field{len} is extremely useful
just "useful" maybe?
> for drivers using untrusted buffers: if you do not know exactly
replace "you" with "driver" here?
> -how much has been written by the device, you usually have to zero
> -the buffer to ensure no data leakage occurs.
> +how much has been written by the device, a driver would have to zero
> +the buffer in advance to ensure no data leakage occurs.
> +
> +For example, a network driver
any driver really, right?
> may hand a received buffer directly to
> +an unprivileged userspace application. If the network device has not
> +overwritten the bytes which were in that buffer, this may leak the
> +contents of freed memory from other processes to the application.
> +\end{note}
>
> \begin{note}
> The legacy \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]}
> @@ -612,6 +621,19 @@ the constant as VRING_USED_F_NO_NOTIFY, but the layout and value were
> identical.
> \end{note}
>
> +\devicenormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
> +
> +The device MUST set \field{len} to the number of bytes known to be
> +written to the descriptor, beginning at the first device-writable
> +buffer.
I think "known to be written" is still too indeterminate for my taste.
Reminds me of Schrödinger's cat for some reason.
How about something like this:
+The device MUST write at least \field{len} bytes to descriptor,
+beginning at the first device-writable buffer,
+prior to updating the used index field.
+The device MAY write more than \field{len} bytes to descriptor.
+The driver MUST NOT make assumptions about data in the buffer pointed to
+by the descriptor with WRITE flag
+beyond the first \field{len} bytes: the data
+might be unchanged by the device, or it might be
+overwritten by the device.
+The driver SHOULD ignore data beyond the first \field{len} bytes.
> +
> +\begin{note}
> +There are potential error cases where a device might not know what
> +parts of the buffers have been written. In this case \field{len} may
> +be an underestimate, but that's preferable to the driver believing
> +that uninitialized memory has been overwritten when it has not.
> +\end{note}
> +
> \subsection{Virtqueue Notification Suppression}\label{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Notification Suppression}
>
> The device can suppress notifications in a manner analogous to the way
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-13 13:49 ` Michael S. Tsirkin
@ 2015-03-16 3:14 ` Rusty Russell
2015-03-16 5:03 ` Michael S. Tsirkin
0 siblings, 1 reply; 19+ messages in thread
From: Rusty Russell @ 2015-03-16 3:14 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Fam Zheng, QEMU Developers, stefanha
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Fri, Mar 13, 2015 at 11:47:18AM +1030, Rusty Russell wrote:
>> Here's my proposed spec patch, which spells this out:
>>
>> diff --git a/content.tex b/content.tex
>> index 6ba079d..b6345a8 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
>> Each entry in the ring is a pair: \field{id} indicates the head entry of the
>> descriptor chain describing the buffer (this matches an entry
>> placed in the available ring by the guest earlier), and \field{len} the total
>> -of bytes written into the buffer. The latter is extremely useful
>> +of bytes written into the buffer.
>> +
>> +\begin{note}
>> +\field{len} is extremely useful
>
> just "useful" maybe?
OK.
>> for drivers using untrusted buffers: if you do not know exactly
>
> replace "you" with "driver" here?
Yep.
>> -how much has been written by the device, you usually have to zero
>> -the buffer to ensure no data leakage occurs.
>> +how much has been written by the device, a driver would have to zero
>> +the buffer in advance to ensure no data leakage occurs.
>> +
>> +For example, a network driver
>
> any driver really, right?
Well, the block device has an explicit status byte, and a fixed length.
But there's a subtler detail I was considering when I designed this.
Imagine a Xen-style "driver domain" which is actually your device; it's
*untrusted*. This is possible if the (trusted) host does the actual
data transfer, *and* reports the length; and such a mechanism is
generic, so the host doesn't need to know whether this is a block, net,
or other device.
(Imagine the device-guest has a R/O mapping of the avail ring and
descriptor table. Ignoring indirect descriptors, you only need a "copy
this data to/from this avail entry" helper to make this work.)
> How about something like this:
>
> +The device MUST write at least \field{len} bytes to descriptor,
> +beginning at the first device-writable buffer,
> +prior to updating the used index field.
> +The device MAY write more than \field{len} bytes to descriptor.
> +The driver MUST NOT make assumptions about data in the buffer pointed to
> +by the descriptor with WRITE flag
> +beyond the first \field{len} bytes: the data
> +might be unchanged by the device, or it might be
> +overwritten by the device.
> +The driver SHOULD ignore data beyond the first \field{len} bytes.
I like these, as long as we note that this MAY is to allow error cases,
otherwise people might think they should just set len to zero.
Here it is, using the device-writable terminology, and explicitly
requiring that the device must set len (otherwise the requirements
about the device obeying len make it look like it's set by the driver):
diff --git a/content.tex b/content.tex
index 6ba079d..2c946a5 100644
--- a/content.tex
+++ b/content.tex
@@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
Each entry in the ring is a pair: \field{id} indicates the head entry of the
descriptor chain describing the buffer (this matches an entry
placed in the available ring by the guest earlier), and \field{len} the total
-of bytes written into the buffer. The latter is extremely useful
-for drivers using untrusted buffers: if you do not know exactly
-how much has been written by the device, you usually have to zero
-the buffer to ensure no data leakage occurs.
+of bytes written into the buffer.
+
+\begin{note}
+\field{len} is useful
+for drivers using untrusted buffers: if a driver does not know exactly
+how much has been written by the device, the driver would have to zero
+the buffer in advance to ensure no data leakage occurs.
+
+For example, a network driver may hand a received buffer directly to
+an unprivileged userspace application. If the network device has not
+overwritten the bytes which were in that buffer, this may leak the
+contents of freed memory from other processes to the application.
+\end{note}
\begin{note}
The legacy \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]}
@@ -612,6 +621,28 @@ the constant as VRING_USED_F_NO_NOTIFY, but the layout and value were
identical.
\end{note}
+\devicenormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
+
+The device MUST set \field{len} prior to updating the used \field{idx}.
+
+The device MUST write at least \field{len} bytes to descriptor,
+beginning at the first device-writable buffer,
+prior to updating the used \field{idx}.
+
+The device MAY write more than \field{len} bytes to descriptor.
+
+\begin{note}
+There are potential error cases where a device might not know what
+parts of the buffers have been written. This is why \field{len} may
+be an underestimate, but that's preferable to the driver believing
+that uninitialized memory has been overwritten when it has not.
+\end{note}
+
+\drivernormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
+
+The driver MUST NOT make assumptions about data in device-writable buffers
+beyond the first \field{len} bytes, and SHOULD ignore it.
+
\subsection{Virtqueue Notification Suppression}\label{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Notification Suppression}
The device can suppress notifications in a manner analogous to the way
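
The ordering that the device requirements in this patch describe can be
sketched in C roughly as follows. This is only an illustration under
stated assumptions: struct used_ring, struct used_elem, device_complete()
and the queue size of 8 are hypothetical names and values, not QEMU code
or spec text, and a C11 release store stands in for whatever barrier a
real device implementation would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified used ring; field names follow the spec text. */
struct used_elem {
    uint32_t id;   /* head of the descriptor chain */
    uint32_t len;  /* bytes written into the device-writable part */
};

struct used_ring {
    _Atomic uint16_t idx;      /* published last */
    struct used_elem ring[8];  /* queue size of 8 is an arbitrary choice */
};

/*
 * Copy at most `writable` bytes of `src` into a device-writable buffer,
 * then publish the used element. Returns the len that was reported.
 */
static uint32_t device_complete(struct used_ring *u, uint32_t head,
                                uint8_t *buf, uint32_t writable,
                                const uint8_t *src, uint32_t avail)
{
    uint32_t len_written = avail < writable ? avail : writable;
    uint16_t slot;

    /* Never claim more than the writable length. */
    assert(len_written <= writable);

    /* 1. Write the data. */
    memcpy(buf, src, len_written);

    /* 2. Fill in id and len for the next used slot. */
    slot = atomic_load_explicit(&u->idx, memory_order_relaxed);
    u->ring[slot % 8].id = head;
    u->ring[slot % 8].len = len_written;

    /* 3. Only now make the entry visible by updating the used idx. */
    atomic_store_explicit(&u->idx, (uint16_t)(slot + 1),
                          memory_order_release);
    return len_written;
}
```

In an error case the sketch could report a smaller len_written than the
number of bytes actually touched; the requirement is only that it never
reports more.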
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-16 3:14 ` Rusty Russell
@ 2015-03-16 5:03 ` Michael S. Tsirkin
2015-03-16 15:37 ` Cornelia Huck
2015-03-20 0:59 ` Rusty Russell
0 siblings, 2 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-16 5:03 UTC (permalink / raw)
To: Rusty Russell; +Cc: Fam Zheng, QEMU Developers, stefanha
On Mon, Mar 16, 2015 at 01:44:22PM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Fri, Mar 13, 2015 at 11:47:18AM +1030, Rusty Russell wrote:
> >> Here's my proposed spec patch, which spells this out:
> >>
> >> diff --git a/content.tex b/content.tex
> >> index 6ba079d..b6345a8 100644
> >> --- a/content.tex
> >> +++ b/content.tex
> >> @@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
> >> Each entry in the ring is a pair: \field{id} indicates the head entry of the
> >> descriptor chain describing the buffer (this matches an entry
> >> placed in the available ring by the guest earlier), and \field{len} the total
> >> -of bytes written into the buffer. The latter is extremely useful
> >> +of bytes written into the buffer.
> >> +
> >> +\begin{note}
> >> +\field{len} is extremely useful
> >
> > just "useful" maybe?
>
> OK.
>
> >> for drivers using untrusted buffers: if you do not know exactly
> >
> > replace "you" with "driver" here?
>
> Yep.
>
> >> -how much has been written by the device, you usually have to zero
> >> -the buffer to ensure no data leakage occurs.
> >> +how much has been written by the device, a driver would have to zero
> >> +the buffer in advance to ensure no data leakage occurs.
> >> +
> >> +For example, a network driver
> >
> > any driver really, right?
>
> Well, the block device has an explicit status byte, and a fixed length.
>
> But there's a subtler detail I was considering when I designed this.
>
> Imagine a Xen-style "driver domain" which is actually your device; it's
> *untrusted*. This is possible if the (trusted) host does the actual
> data transfer, *and* reports the length; and such a mechanism is
> generic, so the host doesn't need to know whether this is a block, net,
> or other device.
>
> (Imagine the device-guest has a R/O mapping of the avail ring and
> descriptor table. Ignoring indirect descriptors, you only need a "copy
> this data to/from this avail entry" helper to make this work.)
>
> > How about something like this:
> >
> > +The device MUST write at least \field{len} bytes to descriptor,
> > +beginning at the first device-writable buffer,
> > +prior to updating the used index field.
> > +The device MAY write more than \field{len} bytes to descriptor.
> > +The driver MUST NOT make assumptions about data in the buffer pointed to
> > +by the descriptor with WRITE flag
> > +beyond the first \field{len} bytes: the data
> > +might be unchanged by the device, or it might be
> > +overwritten by the device.
> > +The driver SHOULD ignore data beyond the first \field{len} bytes.
>
> I like these, as long as we note that this MAY is to allow error cases,
> otherwise people might think they should just set len to zero.
>
> Here it is, using the device-writable terminology, and explicitly
> requiring that the device must set len (otherwise the requirements
> about the device obeying len make it look like it's set by the driver):
>
> diff --git a/content.tex b/content.tex
> index 6ba079d..2c946a5 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
> Each entry in the ring is a pair: \field{id} indicates the head entry of the
> descriptor chain describing the buffer (this matches an entry
> placed in the available ring by the guest earlier), and \field{len} the total
> -of bytes written into the buffer. The latter is extremely useful
> -for drivers using untrusted buffers: if you do not know exactly
> -how much has been written by the device, you usually have to zero
> -the buffer to ensure no data leakage occurs.
> +of bytes written into the buffer.
> +
> +\begin{note}
> +\field{len} is useful
> +for drivers using untrusted buffers: if a driver does not know exactly
> +how much has been written by the device, the driver would have to zero
> +the buffer in advance to ensure no data leakage occurs.
> +
> +For example, a network driver may hand a received buffer directly to
> +an unprivileged userspace application. If the network device has not
> +overwritten the bytes which were in that buffer, this may leak the
> +contents of freed memory from other processes to the application.
> +\end{note}
>
> \begin{note}
> The legacy \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]}
> @@ -612,6 +621,28 @@ the constant as VRING_USED_F_NO_NOTIFY, but the layout and value were
> identical.
> \end{note}
>
> +\devicenormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
> +
> +The device MUST set \field{len} prior to updating the used \field{idx}.
> +
> +The device MUST write at least \field{len} bytes to descriptor,
> +beginning at the first device-writable buffer,
> +prior to updating the used \field{idx}.
> +
> +The device MAY write more than \field{len} bytes to descriptor.
> +
> +\begin{note}
> +There are potential error cases where a device might not know what
> +parts of the buffers have been written. This is why \field{len} may
> +be an underestimate, but that's preferable to the driver believing
> +that uninitialized memory has been overwritten when it has not.
> +\end{note}
> +
> +\drivernormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
> +
> +The driver MUST NOT make assumptions about data in device-writable buffers
> +beyond the first \field{len} bytes, and SHOULD ignore it.
it -> this data.
Otherwise on first reading I thought "it" refers to len field.
> +
> \subsection{Virtqueue Notification Suppression}\label{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Notification Suppression}
>
> The device can suppress notifications in a manner analogous to the way
Sounds good, let's move discussion to virtio/virtio-dev now?
I think it's 1.1 material - agree?
--
MST
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-16 5:03 ` Michael S. Tsirkin
@ 2015-03-16 15:37 ` Cornelia Huck
2015-03-20 0:59 ` Rusty Russell
1 sibling, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2015-03-16 15:37 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Rusty Russell, Fam Zheng, QEMU Developers, stefanha
On Mon, 16 Mar 2015 06:03:24 +0100
"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Mar 16, 2015 at 01:44:22PM +1030, Rusty Russell wrote:
> > diff --git a/content.tex b/content.tex
> > index 6ba079d..2c946a5 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
> > Each entry in the ring is a pair: \field{id} indicates the head entry of the
> > descriptor chain describing the buffer (this matches an entry
> > placed in the available ring by the guest earlier), and \field{len} the total
> > -of bytes written into the buffer. The latter is extremely useful
> > -for drivers using untrusted buffers: if you do not know exactly
> > -how much has been written by the device, you usually have to zero
> > -the buffer to ensure no data leakage occurs.
> > +of bytes written into the buffer.
> > +
> > +\begin{note}
> > +\field{len} is useful
> > +for drivers using untrusted buffers: if a driver does not know exactly
> > +how much has been written by the device, the driver would have to zero
> > +the buffer in advance to ensure no data leakage occurs.
> > +
> > +For example, a network driver may hand a received buffer directly to
> > +an unprivileged userspace application. If the network device has not
> > +overwritten the bytes which were in that buffer, this may leak the
s/may/might/ in both cases, so it doesn't get confused with MAY?
> > +contents of freed memory from other processes to the application.
> > +\end{note}
> >
> > \begin{note}
> > The legacy \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]}
> > @@ -612,6 +621,28 @@ the constant as VRING_USED_F_NO_NOTIFY, but the layout and value were
> > identical.
> > \end{note}
> >
> > +\devicenormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
> > +
> > +The device MUST set \field{len} prior to updating the used \field{idx}.
> > +
> > +The device MUST write at least \field{len} bytes to descriptor,
> > +beginning at the first device-writable buffer,
> > +prior to updating the used \field{idx}.
> > +
> > +The device MAY write more than \field{len} bytes to descriptor.
> > +
> > +\begin{note}
> > +There are potential error cases where a device might not know what
> > +parts of the buffers have been written. This is why \field{len} may
> > +be an underestimate, but that's preferable to the driver believing
> > +that uninitialized memory has been overwritten when it has not.
> > +\end{note}
> > +
> > +\drivernormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
> > +
> > +The driver MUST NOT make assumptions about data in device-writable buffers
> > +beyond the first \field{len} bytes, and SHOULD ignore it.
>
> it -> this data.
>
> Otherwise on first reading I thought "it" refers to len field.
>
> > +
> > \subsection{Virtqueue Notification Suppression}\label{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Notification Suppression}
> >
> > The device can suppress notifications in a manner analogous to the way
>
> Sounds good, let's move discussion to virtio/virtio-dev now?
FWIW, this sounds good to me as well.
> I think it's 1.1 material - agree?
Given that nobody seems to have cared about this before, I agree.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-11 5:59 [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Rusty Russell
` (2 preceding siblings ...)
2015-03-11 6:19 ` [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Michael S. Tsirkin
@ 2015-03-18 12:32 ` Michael S. Tsirkin
3 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-03-18 12:32 UTC (permalink / raw)
To: Rusty Russell; +Cc: QEMU Developers
On Wed, Mar 11, 2015 at 04:29:30PM +1030, Rusty Russell wrote:
> The virtio 'used' ring describes descriptors which have been used. It
> also says how many bytes have been written to the ring. For some cases,
> this value is ignored by Linux guests, thus errors have not been noticed.
> I was working on increasing the checking in Linux when I noticed this
> behaviour.
>
> The first patch changes the 'len' formal parameter name to 'len_written' to
> make the API clearer, and adds an assert(). The second fixes block writes.
>
> Cheers,
> Rusty.
> PS. It's based on MST's virtio-1.0 tree, but should be easily ported.
After going back and forth on this, I decided it's
best to defer this change to 2.4.
Guests can't depend on this behaviour without checking virtio-1 anyway.
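
A guest-side check along those lines might look like the sketch below.
It assumes the driver keeps its negotiated feature bits in a 64-bit
word; can_trust_used_len() is a hypothetical helper name, and only the
VIRTIO_F_VERSION_1 bit number (32) comes from the virtio 1.0 spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit number defined by the virtio 1.0 spec. */
#define VIRTIO_F_VERSION_1 32

/*
 * Hypothetical helper: rely on the used-ring len only when the device
 * negotiated virtio 1.0; a legacy device might report anything.
 */
static bool can_trust_used_len(uint64_t negotiated_features)
{
    return ((negotiated_features >> VIRTIO_F_VERSION_1) & 1) != 0;
}
```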
> Rusty Russell (2):
> virtio: make it clear that "len" for a used descriptor is len written.
> virtio-blk: fix length calculations for write operations.
>
> hw/block/virtio-blk.c | 9 ++++++++-
> hw/virtio/virtio.c | 19 ++++++++++++-------
> include/hw/virtio/virtio.h | 4 ++--
> 3 files changed, 22 insertions(+), 10 deletions(-)
>
> --
> 2.1.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.
2015-03-16 5:03 ` Michael S. Tsirkin
2015-03-16 15:37 ` Cornelia Huck
@ 2015-03-20 0:59 ` Rusty Russell
1 sibling, 0 replies; 19+ messages in thread
From: Rusty Russell @ 2015-03-20 0:59 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Fam Zheng, QEMU Developers, stefanha
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Mon, Mar 16, 2015 at 01:44:22PM +1030, Rusty Russell wrote:
>> diff --git a/content.tex b/content.tex
>> index 6ba079d..2c946a5 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -600,10 +600,19 @@ them: it is only written to by the device, and read by the driver.
>> Each entry in the ring is a pair: \field{id} indicates the head entry of the
>> descriptor chain describing the buffer (this matches an entry
>> placed in the available ring by the guest earlier), and \field{len} the total
>> -of bytes written into the buffer. The latter is extremely useful
>> -for drivers using untrusted buffers: if you do not know exactly
>> -how much has been written by the device, you usually have to zero
>> -the buffer to ensure no data leakage occurs.
>> +of bytes written into the buffer.
>> +
>> +\begin{note}
>> +\field{len} is useful
>> +for drivers using untrusted buffers: if a driver does not know exactly
>> +how much has been written by the device, the driver would have to zero
>> +the buffer in advance to ensure no data leakage occurs.
>> +
>> +For example, a network driver may hand a received buffer directly to
>> +an unprivileged userspace application. If the network device has not
>> +overwritten the bytes which were in that buffer, this may leak the
>> +contents of freed memory from other processes to the application.
>> +\end{note}
>>
>> \begin{note}
>> The legacy \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]}
>> @@ -612,6 +621,28 @@ the constant as VRING_USED_F_NO_NOTIFY, but the layout and value were
>> identical.
>> \end{note}
>>
>> +\devicenormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
>> +
>> +The device MUST set \field{len} prior to updating the used \field{idx}.
>> +
>> +The device MUST write at least \field{len} bytes to descriptor,
>> +beginning at the first device-writable buffer,
>> +prior to updating the used \field{idx}.
>> +
>> +The device MAY write more than \field{len} bytes to descriptor.
>> +
>> +\begin{note}
>> +There are potential error cases where a device might not know what
>> +parts of the buffers have been written. This is why \field{len} may
>> +be an underestimate, but that's preferable to the driver believing
>> +that uninitialized memory has been overwritten when it has not.
>> +\end{note}
>> +
>> +\drivernormative{\subsubsection}{Virtqueue Notification Suppression}{Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring}
>> +
>> +The driver MUST NOT make assumptions about data in device-writable buffers
>> +beyond the first \field{len} bytes, and SHOULD ignore it.
>
> it -> this data.
>
> Otherwise on first reading I thought "it" refers to len field.
Thanks, fixed.
>> +
>> \subsection{Virtqueue Notification Suppression}\label{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Notification Suppression}
>>
>> The device can suppress notifications in a manner analogous to the way
>
> Sounds good, let's move discussion to virtio/virtio-dev now?
> I think it's 1.1 material - agree?
Moving. My intent was not to change the spec, but to clarify it. It's a
bug in the spec that we didn't spell out what len means, but left it by
implication.
I'm OK with leaving it for 1.1 if you think this fix is too much, as
long as implementations don't leave fixing it until 1.1 :)
Thanks,
Rusty.
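
On the driver side, the rule discussed in this thread amounts to
treating only the first len bytes of a device-writable buffer as device
data. A minimal sketch follows, assuming a single contiguous receive
buffer; driver_accept() is a hypothetical helper, not a Linux or QEMU
function, and zeroing the tail is just one way to avoid the leak from
the network-driver example.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return how many bytes of a receive buffer the
 * driver may treat as device data, and scrub the rest before the buffer
 * is handed to an unprivileged consumer.
 */
static uint32_t driver_accept(uint8_t *buf, uint32_t capacity,
                              uint32_t used_len)
{
    /* A buggy device could report more than the buffer can hold. */
    uint32_t valid = used_len < capacity ? used_len : capacity;

    /*
     * Bytes past `valid` might be unchanged or might be overwritten:
     * make no assumptions about them, and zero them so that old
     * contents cannot leak.
     */
    memset(buf + valid, 0, capacity - valid);
    return valid;
}
```

Scrubbing only the tail after completion is cheaper than zeroing the
whole buffer in advance, which is the cost the note in the patch says a
driver would otherwise have to pay.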
^ permalink raw reply [flat|nested] 19+ messages in thread
Thread overview: 19+ messages
2015-03-11 5:59 [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Rusty Russell
2015-03-11 5:59 ` [Qemu-devel] [PATCH 1/2] virtio: make it clear that "len" for a used descriptor is len written Rusty Russell
2015-03-11 5:59 ` [Qemu-devel] [PATCH 2/2] virtio-blk: fix length calculations for write operations Rusty Russell
2015-03-11 6:48 ` Michael S. Tsirkin
2015-03-11 11:34 ` Rusty Russell
2015-03-11 6:19 ` [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu Michael S. Tsirkin
2015-03-11 6:47 ` Fam Zheng
2015-03-11 6:50 ` Michael S. Tsirkin
2015-03-11 11:36 ` Rusty Russell
2015-03-11 12:39 ` Michael S. Tsirkin
2015-03-12 1:04 ` Rusty Russell
2015-03-12 6:35 ` Michael S. Tsirkin
2015-03-13 1:17 ` Rusty Russell
2015-03-13 13:49 ` Michael S. Tsirkin
2015-03-16 3:14 ` Rusty Russell
2015-03-16 5:03 ` Michael S. Tsirkin
2015-03-16 15:37 ` Cornelia Huck
2015-03-20 0:59 ` Rusty Russell
2015-03-18 12:32 ` Michael S. Tsirkin