qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH for-2.5 0/4] block: replace IOV_MAX with BlockLimits.max_iov
@ 2015-07-08 15:30 Stefan Hajnoczi
  2015-07-08 15:30 ` [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field Stefan Hajnoczi
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2015-07-08 15:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block

This series depends on "[PATCH] block/mirror: limit qiov to IOV_MAX elements".

IOV_MAX has been hardcoded in several places since preadv()/pwritev()/etc
refuse to take more iovecs per call.  That's actually an implementation detail
of raw-posix and we shouldn't spread that across the codebase.

This series adds BlockLimits.max_iov (similar to BlockLimits.max_transfer_len
and other fields).  By default it is INT_MAX but raw-posix sets it to IOV_MAX.

Current IOV_MAX users are converted to blk_get_max_iov().

By the way, the IOV_MAX vs BlockLimits.max_iov naming is slightly confusing but
follows the BlockLimits field naming convention.  Once IOV_MAX is banished no
one will be bothered by the reverse naming anymore.

Suggested-by: Kevin Wolf <kwolf@redhat.com>

Stefan Hajnoczi (4):
  block: add BlockLimits.max_iov field
  block-backend: add blk_get_max_iov()
  block: replace IOV_MAX with BlockLimits.max_iov
  block/mirror: replace IOV_MAX with blk_get_max_iov()

 block/block-backend.c          | 5 +++++
 block/io.c                     | 6 +++++-
 block/mirror.c                 | 6 ++++--
 block/raw-posix.c              | 1 +
 hw/block/virtio-blk.c          | 2 +-
 include/block/block_int.h      | 3 +++
 include/sysemu/block-backend.h | 1 +
 7 files changed, 20 insertions(+), 4 deletions(-)

-- 
2.4.3

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field
  2015-07-08 15:30 [Qemu-devel] [PATCH for-2.5 0/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
@ 2015-07-08 15:30 ` Stefan Hajnoczi
  2015-07-09  4:39   ` Peter Lieven
  2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 2/4] block-backend: add blk_get_max_iov() Stefan Hajnoczi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2015-07-08 15:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Peter Lieven, Stefan Hajnoczi, qemu-block

The maximum number of struct iovec elements depends on the
BlockDriverState.  The raw-posix protocol has a maximum of IOV_MAX but
others could have different values.

Instead of assuming raw-posix and hardcoding IOV_MAX in several places,
put the limit into BlockLimits.

Cc: Peter Lieven <pl@kamp.de>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
Peter Lieven: I think the SCSI LUN level does not have a maximum
scatter-gather segments constraint.  That is probably only at the HBA
level.  CCed you anyway in case you think block/iscsi.c should set the
max_iov field.

Kevin: The default is now INT_MAX.  This means non-raw-posix users will
now be able to merge more requests than before.  They were limited to
IOV_MAX previously.  This could expose limits in other BlockDrivers
which we weren't aware of...
---
 block/io.c                | 3 +++
 block/raw-posix.c         | 1 +
 include/block/block_int.h | 3 +++
 3 files changed, 7 insertions(+)

diff --git a/block/io.c b/block/io.c
index e295992..6750de6 100644
--- a/block/io.c
+++ b/block/io.c
@@ -165,9 +165,11 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
         bs->bl.max_transfer_length = bs->file->bl.max_transfer_length;
         bs->bl.min_mem_alignment = bs->file->bl.min_mem_alignment;
         bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
+        bs->bl.max_iov = bs->file->bl.max_iov;
     } else {
         bs->bl.min_mem_alignment = 512;
         bs->bl.opt_mem_alignment = getpagesize();
+        bs->bl.max_iov = INT_MAX;
     }
 
     if (bs->backing_hd) {
@@ -188,6 +190,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
         bs->bl.min_mem_alignment =
             MAX(bs->bl.min_mem_alignment,
                 bs->backing_hd->bl.min_mem_alignment);
+        bs->bl.max_iov = MIN(bs->bl.max_iov, bs->backing_hd->bl.max_iov);
     }
 
     /* Then let the driver override it */
diff --git a/block/raw-posix.c b/block/raw-posix.c
index cbe6574..faa6ae0 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -735,6 +735,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
     raw_probe_alignment(bs, s->fd, errp);
     bs->bl.min_mem_alignment = s->buf_align;
     bs->bl.opt_mem_alignment = MAX(s->buf_align, getpagesize());
+    bs->bl.max_iov = IOV_MAX; /* limit comes from preadv()/pwritev()/etc */
 }
 
 static int check_for_dasd(int fd)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index b0476fc..767e83d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -315,6 +315,9 @@ typedef struct BlockLimits {
 
     /* memory alignment for bounce buffer */
     size_t opt_mem_alignment;
+
+    /* maximum number of iovec elements */
+    int max_iov;
 } BlockLimits;
 
 typedef struct BdrvOpBlocker BdrvOpBlocker;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH for-2.5 2/4] block-backend: add blk_get_max_iov()
  2015-07-08 15:30 [Qemu-devel] [PATCH for-2.5 0/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
  2015-07-08 15:30 ` [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field Stefan Hajnoczi
@ 2015-07-08 15:31 ` Stefan Hajnoczi
  2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 3/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
  2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 4/4] block/mirror: replace IOV_MAX with blk_get_max_iov() Stefan Hajnoczi
  3 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2015-07-08 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block

Add a function to query BlockLimits.max_iov.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/block-backend.c          | 5 +++++
 include/sysemu/block-backend.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index aee8a12..5108aec 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -777,6 +777,11 @@ int blk_get_max_transfer_length(BlockBackend *blk)
     return blk->bs->bl.max_transfer_length;
 }
 
+int blk_get_max_iov(BlockBackend *blk)
+{
+    return blk->bs->bl.max_iov;
+}
+
 void blk_set_guest_block_size(BlockBackend *blk, int align)
 {
     bdrv_set_guest_block_size(blk->bs, align);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 8fc960f..c114440 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -135,6 +135,7 @@ void blk_lock_medium(BlockBackend *blk, bool locked);
 void blk_eject(BlockBackend *blk, bool eject_flag);
 int blk_get_flags(BlockBackend *blk);
 int blk_get_max_transfer_length(BlockBackend *blk);
+int blk_get_max_iov(BlockBackend *blk);
 void blk_set_guest_block_size(BlockBackend *blk, int align);
 void *blk_blockalign(BlockBackend *blk, size_t size);
 bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH for-2.5 3/4] block: replace IOV_MAX with BlockLimits.max_iov
  2015-07-08 15:30 [Qemu-devel] [PATCH for-2.5 0/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
  2015-07-08 15:30 ` [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field Stefan Hajnoczi
  2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 2/4] block-backend: add blk_get_max_iov() Stefan Hajnoczi
@ 2015-07-08 15:31 ` Stefan Hajnoczi
  2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 4/4] block/mirror: replace IOV_MAX with blk_get_max_iov() Stefan Hajnoczi
  3 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2015-07-08 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block

Request merging must not result in a huge request that exceeds the
maximum number of iovec elements.  Use BlockLimits.max_iov instead of
hardcoding IOV_MAX.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io.c            | 3 ++-
 hw/block/virtio-blk.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 6750de6..baffd02 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1818,7 +1818,8 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
             merge = 1;
         }
 
-        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1 > IOV_MAX) {
+        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1 >
+            bs->bl.max_iov) {
             merge = 0;
         }
 
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 6aefda4..a186956 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -407,7 +407,7 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
             bool merge = true;
 
             /* merge would exceed maximum number of IOVs */
-            if (niov + req->qiov.niov > IOV_MAX) {
+            if (niov + req->qiov.niov > blk_get_max_iov(blk)) {
                 merge = false;
             }
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH for-2.5 4/4] block/mirror: replace IOV_MAX with blk_get_max_iov()
  2015-07-08 15:30 [Qemu-devel] [PATCH for-2.5 0/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 3/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
@ 2015-07-08 15:31 ` Stefan Hajnoczi
  3 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2015-07-08 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block

Use blk_get_max_iov() instead of hardcoding IOV_MAX, which may not apply
to all BlockDrivers.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/mirror.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 985ad00..0f4445c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -160,11 +160,13 @@ static void mirror_read_complete(void *opaque, int ret)
 static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
     BlockDriverState *source = s->common.bs;
-    int nb_sectors, sectors_per_chunk, nb_chunks;
+    int nb_sectors, sectors_per_chunk, nb_chunks, max_iov;
     int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
     uint64_t delay_ns = 0;
     MirrorOp *op;
 
+    max_iov = MIN(source->bl.max_iov, s->target->bl.max_iov);
+
     s->sector_num = hbitmap_iter_next(&s->hbi);
     if (s->sector_num < 0) {
         bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
@@ -241,7 +243,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
             trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
             break;
         }
-        if (IOV_MAX < nb_chunks + added_chunks) {
+        if (max_iov < nb_chunks + added_chunks) {
             trace_mirror_break_iov_max(s, nb_chunks, added_chunks);
             break;
         }
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field
  2015-07-08 15:30 ` [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field Stefan Hajnoczi
@ 2015-07-09  4:39   ` Peter Lieven
  2015-07-09  9:46     ` Stefan Hajnoczi
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Lieven @ 2015-07-09  4:39 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Kevin Wolf, qemu-block

Am 08.07.2015 um 17:30 schrieb Stefan Hajnoczi:
> The maximum number of struct iovec elements depends on the
> BlockDriverState.  The raw-posix protocol has a maximum of IOV_MAX but
> others could have different values.
>
> Instead of assuming raw-posix and hardcoding IOV_MAX in several places,
> put the limit into BlockLimits.
>
> Cc: Peter Lieven <pl@kamp.de>
> Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> Peter Lieven: I think the SCSI LUN level does not have a maximum
> scatter-gather segments constraint.  That is probably only at the HBA
> level.  CCed you anyway in case you think block/iscsi.c should set the
> max_iov field.

libiscsi will send the iovec array straight to readv and writev to
read/write from the TCP socket. So we need IOV_MAX here as well.

>
> Kevin: The default is now INT_MAX.  This means non-raw-posix users will
> now be able to merge more requests than before.  They were limited to
> IOV_MAX previously.  This could expose limits in other BlockDrivers
> which we weren't aware of...

Why rise the default to INT_MAX and not leave it at IOV_MAX?
Is there any case where we except that much iovectors coming in?

Peter

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field
  2015-07-09  4:39   ` Peter Lieven
@ 2015-07-09  9:46     ` Stefan Hajnoczi
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2015-07-09  9:46 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Kevin Wolf, qemu-devel, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1682 bytes --]

On Thu, Jul 09, 2015 at 06:39:02AM +0200, Peter Lieven wrote:
> Am 08.07.2015 um 17:30 schrieb Stefan Hajnoczi:
> >The maximum number of struct iovec elements depends on the
> >BlockDriverState.  The raw-posix protocol has a maximum of IOV_MAX but
> >others could have different values.
> >
> >Instead of assuming raw-posix and hardcoding IOV_MAX in several places,
> >put the limit into BlockLimits.
> >
> >Cc: Peter Lieven <pl@kamp.de>
> >Suggested-by: Kevin Wolf <kwolf@redhat.com>
> >Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> >---
> >Peter Lieven: I think the SCSI LUN level does not have a maximum
> >scatter-gather segments constraint.  That is probably only at the HBA
> >level.  CCed you anyway in case you think block/iscsi.c should set the
> >max_iov field.
> 
> libiscsi will send the iovec array straight to readv and writev to
> read/write from the TCP socket. So we need IOV_MAX here as well.

Thanks, will fix in v2!

> >
> >Kevin: The default is now INT_MAX.  This means non-raw-posix users will
> >now be able to merge more requests than before.  They were limited to
> >IOV_MAX previously.  This could expose limits in other BlockDrivers
> >which we weren't aware of...
> 
> Why rise the default to INT_MAX and not leave it at IOV_MAX?
> Is there any case where we except that much iovectors coming in?

Both block/mirror.c and request merge have resulted in >1024 iovecs in
the past.  That means they can take advantage of a higher limit.

On the other hand, I think you are right that the default should be
IOV_MAX and only drivers with higher/lower limits should set max_iov
themselves.  It's safer.

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-07-09  9:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-08 15:30 [Qemu-devel] [PATCH for-2.5 0/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
2015-07-08 15:30 ` [Qemu-devel] [PATCH for-2.5 1/4] block: add BlockLimits.max_iov field Stefan Hajnoczi
2015-07-09  4:39   ` Peter Lieven
2015-07-09  9:46     ` Stefan Hajnoczi
2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 2/4] block-backend: add blk_get_max_iov() Stefan Hajnoczi
2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 3/4] block: replace IOV_MAX with BlockLimits.max_iov Stefan Hajnoczi
2015-07-08 15:31 ` [Qemu-devel] [PATCH for-2.5 4/4] block/mirror: replace IOV_MAX with blk_get_max_iov() Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).