qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] block: Fix unaligned bdrv_aio_write_zeroes
@ 2015-04-24  8:33 Fam Zheng
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX Fam Zheng
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Fam Zheng @ 2015-04-24  8:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi,
	qemu-stable

An unaligned zero write causes NULL deferencing in bdrv_co_do_pwritev. That
path is reachable from bdrv_co_write_zeroes and bdrv_aio_write_zeroes.

You can easily trigger through the former with qemu-io, as the test case added
by 61815d6e0aa. For bdrv_aio_write_zeroes, in common cases there's always a
format driver (which uses 512 alignment), so it would be much rarer to have
unaligned requests (only concerning top level here, when the request goes down
to bs->file, where for example the alignment is 4k, it would then be calling
bdrv_co_write_zeroes because it's in a coroutine).

fc3959e4669a1c fixed bdrv_co_write_zeroes but not bdrv_aio_write_zeroes.  The
lattern is the actually used one by device model. Revert the previous fix, do
it in bdrv_co_do_pwritev, to cover both paths.

Fam


Fam Zheng (3):
  scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX
  block: Fix NULL deference for unaligned write if qiov is NULL
  Revert "block: Fix unaligned zero write"

 block.c             | 69 ++++++++++++++++++++---------------------------------
 hw/scsi/scsi-disk.c |  7 +++++-
 2 files changed, 32 insertions(+), 44 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX
  2015-04-24  8:33 [Qemu-devel] [PATCH 0/3] block: Fix unaligned bdrv_aio_write_zeroes Fam Zheng
@ 2015-04-24  8:33 ` Fam Zheng
  2015-04-24  8:50   ` Paolo Bonzini
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 2/3] block: Fix NULL deference for unaligned write if qiov is NULL Fam Zheng
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 3/3] Revert "block: Fix unaligned zero write" Fam Zheng
  2 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2015-04-24  8:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi,
	qemu-stable

SBC-4 says:

    If the number of logical blocks specified to be unmapped or written
    exceeds the value indicated in the MAXIMUM WRITE SAME LENGTH field
    in the Block Limits VPD page (see 6.6.4), then the device server
    shall terminate the command with CHECK CONDITION status with the
    sense key set to ILLEGAL REQUEST and the additional sense code set
    to INVALID FIELD IN CDB.

Check the request size to match the spec.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/scsi/scsi-disk.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 54d71f4..b748982 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -1707,6 +1707,11 @@ static void scsi_disk_emulate_write_same(SCSIDiskReq *r, uint8_t *inbuf)
         return;
     }
 
+    if (nb_sectors * (s->qdev.blocksize / 512) * 512 > SCSI_WRITE_SAME_MAX) {
+        scsi_check_condition(r, SENSE_CODE(INVALID_FIELD));
+        return;
+    }
+
     if (buffer_is_zero(inbuf, s->qdev.blocksize)) {
         int flags = (req->cmd.buf[1] & 0x8) ? BDRV_REQ_MAY_UNMAP : 0;
 
@@ -1726,7 +1731,7 @@ static void scsi_disk_emulate_write_same(SCSIDiskReq *r, uint8_t *inbuf)
     data->r = r;
     data->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
     data->nb_sectors = nb_sectors * (s->qdev.blocksize / 512);
-    data->iov.iov_len = MIN(data->nb_sectors * 512, SCSI_WRITE_SAME_MAX);
+    data->iov.iov_len = data->nb_sectors * 512;
     data->iov.iov_base = buf = blk_blockalign(s->qdev.conf.blk,
                                               data->iov.iov_len);
     qemu_iovec_init_external(&data->qiov, &data->iov, 1);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 2/3] block: Fix NULL deference for unaligned write if qiov is NULL
  2015-04-24  8:33 [Qemu-devel] [PATCH 0/3] block: Fix unaligned bdrv_aio_write_zeroes Fam Zheng
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX Fam Zheng
@ 2015-04-24  8:33 ` Fam Zheng
  2015-04-24  9:01   ` Paolo Bonzini
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 3/3] Revert "block: Fix unaligned zero write" Fam Zheng
  2 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2015-04-24  8:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi,
	qemu-stable

For zero write, qiov passed by callers (qemu-io "write -z" and
scsi-disk "write same") is NULL.

Commit fc3959e466 fixed bdrv_co_write_zeroes which is the common case
for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler
fix would be in bdrv_co_do_pwritev which is the NULL dereference point
and covers both cases.

So don't access it in bdrv_co_do_pwritev, use a zeroed buffer instead.
Device model who calls into block layer should check the request size.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index f2f8ae7..878a72d 100644
--- a/block.c
+++ b/block.c
@@ -3386,6 +3386,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
     uint64_t align = bdrv_get_align(bs);
     uint8_t *head_buf = NULL;
     uint8_t *tail_buf = NULL;
+    uint8_t *qiov_buf = NULL;
     QEMUIOVector local_qiov;
     bool use_local_qiov = false;
     int ret;
@@ -3436,9 +3437,15 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
         }
         BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_HEAD);
 
-        qemu_iovec_init(&local_qiov, qiov->niov + 2);
+        qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 2 : 3);
         qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
-        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
+        if (qiov) {
+            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
+        } else {
+            qiov_buf = qemu_blockalign(bs, bytes);
+            memset(qiov_buf, 0, bytes);
+            qemu_iovec_add(&local_qiov, qiov_buf, bytes);
+        }
         use_local_qiov = true;
 
         bytes += offset & (align - 1);
@@ -3471,8 +3478,16 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
         BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_TAIL);
 
         if (!use_local_qiov) {
-            qemu_iovec_init(&local_qiov, qiov->niov + 1);
-            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
+            qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 1 : 2);
+            if (qiov) {
+                qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
+            } else {
+                if (!qiov_buf) {
+                    qiov_buf = qemu_blockalign(bs, bytes);
+                    memset(qiov_buf, 0, bytes);
+                }
+                qemu_iovec_add(&local_qiov, qiov_buf, bytes);
+            }
             use_local_qiov = true;
         }
 
@@ -3498,6 +3513,7 @@ fail:
     }
     qemu_vfree(head_buf);
     qemu_vfree(tail_buf);
+    qemu_vfree(qiov_buf);
 
     return ret;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 3/3] Revert "block: Fix unaligned zero write"
  2015-04-24  8:33 [Qemu-devel] [PATCH 0/3] block: Fix unaligned bdrv_aio_write_zeroes Fam Zheng
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX Fam Zheng
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 2/3] block: Fix NULL deference for unaligned write if qiov is NULL Fam Zheng
@ 2015-04-24  8:33 ` Fam Zheng
  2 siblings, 0 replies; 8+ messages in thread
From: Fam Zheng @ 2015-04-24  8:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Stefan Hajnoczi,
	qemu-stable

This reverts commit fc3959e4669a1c2149b91ccb05101cfc7ae1fc05.

The core write code already handles the case, so remove this
duplication.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 45 ++++++---------------------------------------
 1 file changed, 6 insertions(+), 39 deletions(-)

diff --git a/block.c b/block.c
index 878a72d..995f1bd 100644
--- a/block.c
+++ b/block.c
@@ -3118,19 +3118,6 @@ out:
     return ret;
 }
 
-static inline uint64_t bdrv_get_align(BlockDriverState *bs)
-{
-    /* TODO Lift BDRV_SECTOR_SIZE restriction in BlockDriver interface */
-    return MAX(BDRV_SECTOR_SIZE, bs->request_alignment);
-}
-
-static inline bool bdrv_req_is_aligned(BlockDriverState *bs,
-                                       int64_t offset, size_t bytes)
-{
-    int64_t align = bdrv_get_align(bs);
-    return !(offset & (align - 1) || (bytes & (align - 1)));
-}
-
 /*
  * Handle a read request in coroutine context
  */
@@ -3141,7 +3128,8 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
     BlockDriver *drv = bs->drv;
     BdrvTrackedRequest req;
 
-    uint64_t align = bdrv_get_align(bs);
+    /* TODO Lift BDRV_SECTOR_SIZE restriction in BlockDriver interface */
+    uint64_t align = MAX(BDRV_SECTOR_SIZE, bs->request_alignment);
     uint8_t *head_buf = NULL;
     uint8_t *tail_buf = NULL;
     QEMUIOVector local_qiov;
@@ -3383,7 +3371,8 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
     BdrvRequestFlags flags)
 {
     BdrvTrackedRequest req;
-    uint64_t align = bdrv_get_align(bs);
+    /* TODO Lift BDRV_SECTOR_SIZE restriction in BlockDriver interface */
+    uint64_t align = MAX(BDRV_SECTOR_SIZE, bs->request_alignment);
     uint8_t *head_buf = NULL;
     uint8_t *tail_buf = NULL;
     uint8_t *qiov_buf = NULL;
@@ -3497,10 +3486,6 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
         bytes = ROUND_UP(bytes, align);
     }
 
-    if (use_local_qiov) {
-        /* Local buffer may have non-zero data. */
-        flags &= ~BDRV_REQ_ZERO_WRITE;
-    }
     ret = bdrv_aligned_pwritev(bs, &req, offset, bytes,
                                use_local_qiov ? &local_qiov : qiov,
                                flags);
@@ -3542,32 +3527,14 @@ int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
                                       int64_t sector_num, int nb_sectors,
                                       BdrvRequestFlags flags)
 {
-    int ret;
-
     trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors, flags);
 
     if (!(bs->open_flags & BDRV_O_UNMAP)) {
         flags &= ~BDRV_REQ_MAY_UNMAP;
     }
-    if (bdrv_req_is_aligned(bs, sector_num << BDRV_SECTOR_BITS,
-                            nb_sectors << BDRV_SECTOR_BITS)) {
-        ret = bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
-                                BDRV_REQ_ZERO_WRITE | flags);
-    } else {
-        uint8_t *buf;
-        QEMUIOVector local_qiov;
-        size_t bytes = nb_sectors << BDRV_SECTOR_BITS;
 
-        buf = qemu_memalign(bdrv_opt_mem_align(bs), bytes);
-        memset(buf, 0, bytes);
-        qemu_iovec_init(&local_qiov, 1);
-        qemu_iovec_add(&local_qiov, buf, bytes);
-
-        ret = bdrv_co_do_writev(bs, sector_num, nb_sectors, &local_qiov,
-                                BDRV_REQ_ZERO_WRITE | flags);
-        qemu_vfree(buf);
-    }
-    return ret;
+    return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
+                             BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /**
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX Fam Zheng
@ 2015-04-24  8:50   ` Paolo Bonzini
  2015-04-24  9:02     ` Fam Zheng
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2015-04-24  8:50 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi, qemu-stable



On 24/04/2015 10:33, Fam Zheng wrote:
> SBC-4 says:
> 
>     If the number of logical blocks specified to be unmapped or written
>     exceeds the value indicated in the MAXIMUM WRITE SAME LENGTH field
>     in the Block Limits VPD page (see 6.6.4), then the device server
>     shall terminate the command with CHECK CONDITION status with the
>     sense key set to ILLEGAL REQUEST and the additional sense code set
>     to INVALID FIELD IN CDB.
> 
> Check the request size to match the spec.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  hw/scsi/scsi-disk.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 54d71f4..b748982 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -1707,6 +1707,11 @@ static void scsi_disk_emulate_write_same(SCSIDiskReq *r, uint8_t *inbuf)
>          return;
>      }
>  
> +    if (nb_sectors * (s->qdev.blocksize / 512) * 512 > SCSI_WRITE_SAME_MAX) {
> +        scsi_check_condition(r, SENSE_CODE(INVALID_FIELD));
> +        return;
> +    }
> +
>      if (buffer_is_zero(inbuf, s->qdev.blocksize)) {
>          int flags = (req->cmd.buf[1] & 0x8) ? BDRV_REQ_MAY_UNMAP : 0;
>  
> @@ -1726,7 +1731,7 @@ static void scsi_disk_emulate_write_same(SCSIDiskReq *r, uint8_t *inbuf)
>      data->r = r;
>      data->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
>      data->nb_sectors = nb_sectors * (s->qdev.blocksize / 512);
> -    data->iov.iov_len = MIN(data->nb_sectors * 512, SCSI_WRITE_SAME_MAX);
> +    data->iov.iov_len = data->nb_sectors * 512;
>      data->iov.iov_base = buf = blk_blockalign(s->qdev.conf.blk,
>                                                data->iov.iov_len);
>      qemu_iovec_init_external(&data->qiov, &data->iov, 1);
> 

SCSI_WRITE_SAME_MAX is not exported as the MAXIMUM WRITE SAME LENGTH in
the VPD.  In fact, we don't export anything and prefer to get very large
requests, because then we can choose ourselves how to split them and
serialize them.  By contrast, if you have a low limit, some guests
including Linux will send a lot of concurrent WRITE SAME requests which
will have a huge latency.

In addition, SCSI_WRITE_SAME_MAX is 512 *kilo*bytes.  That's really
little :) as some disks have a multi-*mega*byte unmap granularity.  So
what is SCSI_WRITE_SAME_MAX?

Simply, when we're asked to do a WRITE SAME with non-zero payload, we
build a 512 KiB buffer and fill it repeatedly with the pattern.  Then
the I/O is split in multiple writes, each covering 512 KiB except
possibly the last.  Note that non-zero WRITE SAME is not a fast path.

So this patch is not correct.  However, it shouldn't be a problem for
the rest of the series.

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block: Fix NULL deference for unaligned write if qiov is NULL
  2015-04-24  8:33 ` [Qemu-devel] [PATCH 2/3] block: Fix NULL deference for unaligned write if qiov is NULL Fam Zheng
@ 2015-04-24  9:01   ` Paolo Bonzini
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2015-04-24  9:01 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi, qemu-stable



On 24/04/2015 10:33, Fam Zheng wrote:
> For zero write, qiov passed by callers (qemu-io "write -z" and
> scsi-disk "write same") is NULL.
> 
> Commit fc3959e466 fixed bdrv_co_write_zeroes which is the common case
> for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler
> fix would be in bdrv_co_do_pwritev which is the NULL dereference point
> and covers both cases.
> 
> So don't access it in bdrv_co_do_pwritev, use a zeroed buffer instead.
> Device model who calls into block layer should check the request size.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/block.c b/block.c
> index f2f8ae7..878a72d 100644
> --- a/block.c
> +++ b/block.c
> @@ -3386,6 +3386,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
>      uint64_t align = bdrv_get_align(bs);
>      uint8_t *head_buf = NULL;
>      uint8_t *tail_buf = NULL;
> +    uint8_t *qiov_buf = NULL;
>      QEMUIOVector local_qiov;
>      bool use_local_qiov = false;
>      int ret;
> @@ -3436,9 +3437,15 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
>          }
>          BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_HEAD);
>  
> -        qemu_iovec_init(&local_qiov, qiov->niov + 2);
> +        qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 2 : 3);
>          qemu_iovec_add(&local_qiov, head_buf, offset & (align - 1));
> -        qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> +        if (qiov) {
> +            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> +        } else {

If we keep this approach, I would add an assertion that the flags
include BDRV_REQ_ZERO_WRITE.  Same below.

However, the problem here is that you drop the BDRV_REQ_MAY_UNMAP flag
for the central part of the write.

I think if BDRV_REQ_ZERO_WRITE is set, you should do three
bdrv_aligned_pwritev rather than just one.  The first without
BDRV_REQ_ZERO_WRITE and inside the if (offset & (align - 1)); the second
with BDRV_REQ_ZERO_WRITE and with NULL qiov, right after that if; and
the third is the one that exists already.  After each write you modify
offset and bytes.

leaving
> +            qiov_buf = qemu_blockalign(bs, bytes);
> +            memset(qiov_buf, 0, bytes);
> +            qemu_iovec_add(&local_qiov, qiov_buf, bytes);
> +        }
>          use_local_qiov = true;
>  
>          bytes += offset & (align - 1);
> @@ -3471,8 +3478,16 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
>          BLKDBG_EVENT(bs, BLKDBG_PWRITEV_RMW_AFTER_TAIL);
>  
>          if (!use_local_qiov) {
> -            qemu_iovec_init(&local_qiov, qiov->niov + 1);
> -            qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> +            qemu_iovec_init(&local_qiov, qiov ? qiov->niov + 1 : 2);
> +            if (qiov) {
> +                qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
> +            } else {
> +                if (!qiov_buf) {
> +                    qiov_buf = qemu_blockalign(bs, bytes);
> +                    memset(qiov_buf, 0, bytes);
> +                }
> +                qemu_iovec_add(&local_qiov, qiov_buf, bytes);
> +            }

Same here.

>              use_local_qiov = true;
>          }
>  
> @@ -3498,6 +3513,7 @@ fail:
>      }
>      qemu_vfree(head_buf);
>      qemu_vfree(tail_buf);
> +    qemu_vfree(qiov_buf);
>  
>      return ret;
>  }
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX
  2015-04-24  8:50   ` Paolo Bonzini
@ 2015-04-24  9:02     ` Fam Zheng
  2015-04-24  9:03       ` Paolo Bonzini
  0 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2015-04-24  9:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu-block, qemu-stable

On Fri, 04/24 10:50, Paolo Bonzini wrote:
> 
> 
> On 24/04/2015 10:33, Fam Zheng wrote:
> > SBC-4 says:
> > 
> >     If the number of logical blocks specified to be unmapped or written
> >     exceeds the value indicated in the MAXIMUM WRITE SAME LENGTH field
> >     in the Block Limits VPD page (see 6.6.4), then the device server
> >     shall terminate the command with CHECK CONDITION status with the
> >     sense key set to ILLEGAL REQUEST and the additional sense code set
> >     to INVALID FIELD IN CDB.
> > 
> > Check the request size to match the spec.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  hw/scsi/scsi-disk.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> > index 54d71f4..b748982 100644
> > --- a/hw/scsi/scsi-disk.c
> > +++ b/hw/scsi/scsi-disk.c
> > @@ -1707,6 +1707,11 @@ static void scsi_disk_emulate_write_same(SCSIDiskReq *r, uint8_t *inbuf)
> >          return;
> >      }
> >  
> > +    if (nb_sectors * (s->qdev.blocksize / 512) * 512 > SCSI_WRITE_SAME_MAX) {
> > +        scsi_check_condition(r, SENSE_CODE(INVALID_FIELD));
> > +        return;
> > +    }
> > +
> >      if (buffer_is_zero(inbuf, s->qdev.blocksize)) {
> >          int flags = (req->cmd.buf[1] & 0x8) ? BDRV_REQ_MAY_UNMAP : 0;
> >  
> > @@ -1726,7 +1731,7 @@ static void scsi_disk_emulate_write_same(SCSIDiskReq *r, uint8_t *inbuf)
> >      data->r = r;
> >      data->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
> >      data->nb_sectors = nb_sectors * (s->qdev.blocksize / 512);
> > -    data->iov.iov_len = MIN(data->nb_sectors * 512, SCSI_WRITE_SAME_MAX);
> > +    data->iov.iov_len = data->nb_sectors * 512;
> >      data->iov.iov_base = buf = blk_blockalign(s->qdev.conf.blk,
> >                                                data->iov.iov_len);
> >      qemu_iovec_init_external(&data->qiov, &data->iov, 1);
> > 
> 
> SCSI_WRITE_SAME_MAX is not exported as the MAXIMUM WRITE SAME LENGTH in
> the VPD.  In fact, we don't export anything and prefer to get very large
> requests, because then we can choose ourselves how to split them and
> serialize them.  By contrast, if you have a low limit, some guests
> including Linux will send a lot of concurrent WRITE SAME requests which
> will have a huge latency.
> 
> In addition, SCSI_WRITE_SAME_MAX is 512 *kilo*bytes.  That's really
> little :) as some disks have a multi-*mega*byte unmap granularity.  So
> what is SCSI_WRITE_SAME_MAX?
> 
> Simply, when we're asked to do a WRITE SAME with non-zero payload, we
> build a 512 KiB buffer and fill it repeatedly with the pattern.  Then
> the I/O is split in multiple writes, each covering 512 KiB except
> possibly the last.  Note that non-zero WRITE SAME is not a fast path.
> 
> So this patch is not correct.

OK, I get it. Thanks for explanation.

> However, it shouldn't be a problem for
> the rest of the series.

It is. The request has to be splitted to aligned part and unaligned part
because the latter requires a buffer, as we don't like unbounded allocation.

Fam

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX
  2015-04-24  9:02     ` Fam Zheng
@ 2015-04-24  9:03       ` Paolo Bonzini
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2015-04-24  9:03 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu-block, qemu-stable



On 24/04/2015 11:02, Fam Zheng wrote:
> 
>> > However, it shouldn't be a problem for
>> > the rest of the series.
> It is. The request has to be splitted to aligned part and unaligned part
> because the latter requires a buffer, as we don't like unbounded allocation.

Yes, I found that out after reading patch 2. :(

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-04-24  9:04 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-24  8:33 [Qemu-devel] [PATCH 0/3] block: Fix unaligned bdrv_aio_write_zeroes Fam Zheng
2015-04-24  8:33 ` [Qemu-devel] [PATCH 1/3] scsi-disk: Limit zero write request to SCSI_WRITE_SAME_MAX Fam Zheng
2015-04-24  8:50   ` Paolo Bonzini
2015-04-24  9:02     ` Fam Zheng
2015-04-24  9:03       ` Paolo Bonzini
2015-04-24  8:33 ` [Qemu-devel] [PATCH 2/3] block: Fix NULL deference for unaligned write if qiov is NULL Fam Zheng
2015-04-24  9:01   ` Paolo Bonzini
2015-04-24  8:33 ` [Qemu-devel] [PATCH 3/3] Revert "block: Fix unaligned zero write" Fam Zheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).