qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX
@ 2015-02-02 14:48 Peter Lieven
  2015-02-02 16:13 ` Kevin Wolf
  2015-02-02 16:23 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
  0 siblings, 2 replies; 21+ messages in thread
From: Peter Lieven @ 2015-02-02 14:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, den, Peter Lieven

do not trim requests if the driver does not supply a limit
through BlockLimits. For write zeroes we still keep a limit
for the unsupported path to avoid allocating a big bounce buffer.
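
For context, a driver opts in to its own limit through BlockLimits via a
bdrv_refresh_limits callback, roughly like this -- a minimal sketch, the
driver name and the values are made up:

    static void example_refresh_limits(BlockDriverState *bs, Error **errp)
    {
        /* hypothetical driver-specific ceilings, in 512-byte sectors */
        bs->bl.max_discard      = 32768;
        bs->bl.max_write_zeroes = 32768;
    }

Without such a callback, the defaults below apply.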

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index d45e4dd..ee7ff2c 100644
--- a/block.c
+++ b/block.c
@@ -3192,10 +3192,7 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
                             BDRV_REQ_COPY_ON_READ);
 }
 
-/* if no limit is specified in the BlockLimits use a default
- * of 32768 512-byte sectors (16 MiB) per request.
- */
-#define MAX_WRITE_ZEROES_DEFAULT 32768
+#define MAX_WRITE_ZEROES_BOUNCE_BUFFER 32768
 
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
@@ -3206,7 +3203,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int ret = 0;
 
     int max_write_zeroes = bs->bl.max_write_zeroes ?
-                           bs->bl.max_write_zeroes : MAX_WRITE_ZEROES_DEFAULT;
+                           bs->bl.max_write_zeroes : INT_MAX;
 
     while (nb_sectors > 0 && !ret) {
         int num = nb_sectors;
@@ -3242,7 +3239,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
         if (ret == -ENOTSUP) {
             /* Fall back to bounce buffer if write zeroes is unsupported */
             int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
-                                            MAX_WRITE_ZEROES_DEFAULT);
+                                            MAX_WRITE_ZEROES_BOUNCE_BUFFER);
             num = MIN(num, max_xfer_len);
             iov.iov_len = num * BDRV_SECTOR_SIZE;
             if (iov.iov_base == NULL) {
@@ -5097,11 +5094,6 @@ static void coroutine_fn bdrv_discard_co_entry(void *opaque)
     rwco->ret = bdrv_co_discard(rwco->bs, rwco->sector_num, rwco->nb_sectors);
 }
 
-/* if no limit is specified in the BlockLimits use a default
- * of 32768 512-byte sectors (16 MiB) per request.
- */
-#define MAX_DISCARD_DEFAULT 32768
-
 int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
                                  int nb_sectors)
 {
@@ -5126,7 +5118,7 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
         return 0;
     }
 
-    max_discard = bs->bl.max_discard ?  bs->bl.max_discard : MAX_DISCARD_DEFAULT;
+    max_discard = bs->bl.max_discard ?  bs->bl.max_discard : INT_MAX;
     while (nb_sectors > 0) {
         int ret;
         int num = nb_sectors;
-- 
1.9.1


* Re: [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX
  2015-02-02 14:48 [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX Peter Lieven
@ 2015-02-02 16:13 ` Kevin Wolf
  2015-02-02 16:25   ` Denis V. Lunev
  2015-02-02 16:23 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
  1 sibling, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2015-02-02 16:13 UTC (permalink / raw)
  To: Peter Lieven; +Cc: den, qemu-devel

On 02.02.2015 at 15:48, Peter Lieven wrote:
> do not trim requests if the driver does not supply a limit
> through BlockLimits. For write zeroes we still keep a limit
> for the unsupported path to avoid allocating a big bounce buffer.
> 
> Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Suggested-by: Denis V. Lunev <den@openvz.org>
> Signed-off-by: Peter Lieven <pl@kamp.de>

Thanks, applied to the block branch (and removed 'block/raw-posix: set
max_write_zeroes to INT_MAX for regular files' from the queue).

Kevin


* [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 14:48 [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX Peter Lieven
  2015-02-02 16:13 ` Kevin Wolf
@ 2015-02-02 16:23 ` Denis V. Lunev
  2015-02-02 16:23   ` [Qemu-devel] [PATCH 2/2] nbd: " Denis V. Lunev
  1 sibling, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 16:23 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, Peter Lieven, qemu-devel

qemu_gluster_co_discard calculates size to discard as follows
    size_t size = nb_sectors * BDRV_SECTOR_SIZE;
    ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);

glfs_discard_async is declared as follows:
  int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
                          glfs_io_cbk fn, void *data) __THROW
This is problematic on i686 as sizeof(size_t) == 4.

Set bl_max_discard to INT_MAX >> BDRV_SECTOR_BITS to avoid overflow.
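
For illustration -- with a made-up request size, and a 32-bit size_t as on i686:

    int nb_sectors = 16777216;                    /* 2^24 sectors == 8 GiB      */
    size_t size = nb_sectors * BDRV_SECTOR_SIZE;  /* 2^33 bytes, truncated to 0 */

so glfs_discard_async() would be asked to discard 0 bytes instead of 8 GiB.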

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Peter Lieven <pl@kamp.de>
---
 block/gluster.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 1eb3a8c..ed09691 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -622,6 +622,11 @@ out:
     return ret;
 }
 
+static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->bl.max_discard = INT_MAX >> BDRV_SECTOR_BITS;
+}
+
 #ifdef CONFIG_GLUSTERFS_DISCARD
 static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors)
@@ -735,6 +740,7 @@ static BlockDriver bdrv_gluster = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -762,6 +768,7 @@ static BlockDriver bdrv_gluster_tcp = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -789,6 +796,7 @@ static BlockDriver bdrv_gluster_unix = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -816,6 +824,7 @@ static BlockDriver bdrv_gluster_rdma = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
-- 
1.9.1


* [Qemu-devel] [PATCH 2/2] nbd: fix max_discard
  2015-02-02 16:23 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
@ 2015-02-02 16:23   ` Denis V. Lunev
  0 siblings, 0 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 16:23 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, Peter Lieven, qemu-devel

nbd_co_discard calls nbd_client_session_co_discard which uses uint32_t
as the length in bytes of the data to discard due to the following
definition:

struct nbd_request {
    uint32_t magic;
    uint32_t type;
    uint64_t handle;
    uint64_t from;
    uint32_t len; <-- the length of data to be discarded, in bytes
} QEMU_PACKED;

Thus we should limit bl_max_discard to INT_MAX >> BDRV_SECTOR_BITS to avoid
overflow.
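
For illustration, the two candidate ceilings work out as follows (BDRV_SECTOR_BITS
is 9):

    UINT32_MAX >> BDRV_SECTOR_BITS == 8388607 sectors   /* ~4 GiB, what len can express */
    INT_MAX    >> BDRV_SECTOR_BITS == 4194303 sectors   /* ~2 GiB, the value set here   */

Either value keeps nb_sectors * 512 within the 32-bit len field.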

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Peter Lieven <pl@kamp.de>
---
 block/nbd.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 04cc845..008d52e 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -301,6 +301,11 @@ static int nbd_co_flush(BlockDriverState *bs)
     return nbd_client_session_co_flush(&s->client);
 }
 
+static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->bl.max_discard = INT_MAX >> BDRV_SECTOR_BITS;
+}
+
 static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
                           int nb_sectors)
 {
@@ -396,6 +401,7 @@ static BlockDriver bdrv_nbd = {
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_discard            = nbd_co_discard,
+    .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_attach_aio_context,
@@ -413,6 +419,7 @@ static BlockDriver bdrv_nbd_tcp = {
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_discard            = nbd_co_discard,
+    .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_attach_aio_context,
@@ -430,6 +437,7 @@ static BlockDriver bdrv_nbd_unix = {
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_discard            = nbd_co_discard,
+    .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_attach_aio_context,
-- 
1.9.1


* Re: [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX
  2015-02-02 16:13 ` Kevin Wolf
@ 2015-02-02 16:25   ` Denis V. Lunev
  2015-02-02 16:45     ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 16:25 UTC (permalink / raw)
  To: Kevin Wolf, Peter Lieven; +Cc: qemu-devel

On 02/02/15 19:13, Kevin Wolf wrote:
> On 02.02.2015 at 15:48, Peter Lieven wrote:
>> do not trim requests if the driver does not supply a limit
>> through BlockLimits. For write zeroes we still keep a limit
>> for the unsupported path to avoid allocating a big bounce buffer.
>>
>> Suggested-by: Kevin Wolf <kwolf@redhat.com>
>> Suggested-by: Denis V. Lunev <den@openvz.org>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
> Thanks, applied to the block branch (and removed 'block/raw-posix: set
> max_write_zeroes to INT_MAX for regular files' from the queue).
>
> Kevin
Double-checked the code.

There are two things to patch for discard; write_zeroes is OK for me.
Sorry for not paying attention to the discard branch :(

Den


* Re: [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX
  2015-02-02 16:25   ` Denis V. Lunev
@ 2015-02-02 16:45     ` Kevin Wolf
  2015-02-02 17:00       ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2015-02-02 16:45 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Peter Lieven, qemu-devel

On 02.02.2015 at 17:25, Denis V. Lunev wrote:
> On 02/02/15 19:13, Kevin Wolf wrote:
> >On 02.02.2015 at 15:48, Peter Lieven wrote:
> >>do not trim requests if the driver does not supply a limit
> >>through BlockLimits. For write zeroes we still keep a limit
> >>for the unsupported path to avoid allocating a big bounce buffer.
> >>
> >>Suggested-by: Kevin Wolf <kwolf@redhat.com>
> >>Suggested-by: Denis V. Lunev <den@openvz.org>
> >>Signed-off-by: Peter Lieven <pl@kamp.de>
> >Thanks, applied to the block branch (and removed 'block/raw-posix: set
> >max_write_zeroes to INT_MAX for regular files' from the queue).
> >
> >Kevin
> double checked the code.
> 
> There are 2 things to patch for discard, write_zeroes is OK for me.
> Sorry, for not paying attention for discard branch :(

Good catch, thanks!

But shouldn't we use the actual limits instead of INT_MAX, i.e. SIZE_MAX
for gluster and UINT32_MAX for nbd?

Kevin


* Re: [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX
  2015-02-02 16:45     ` Kevin Wolf
@ 2015-02-02 17:00       ` Denis V. Lunev
  2015-02-02 17:34         ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 17:00 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Peter Lieven, qemu-devel

On 02/02/15 19:45, Kevin Wolf wrote:
> On 02.02.2015 at 17:25, Denis V. Lunev wrote:
>> On 02/02/15 19:13, Kevin Wolf wrote:
>>> On 02.02.2015 at 15:48, Peter Lieven wrote:
>>>> do not trim requests if the driver does not supply a limit
>>>> through BlockLimits. For write zeroes we still keep a limit
>>>> for the unsupported path to avoid allocating a big bounce buffer.
>>>>
>>>> Suggested-by: Kevin Wolf <kwolf@redhat.com>
>>>> Suggested-by: Denis V. Lunev <den@openvz.org>
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> Thanks, applied to the block branch (and removed 'block/raw-posix: set
>>> max_write_zeroes to INT_MAX for regular files' from the queue).
>>>
>>> Kevin
>> double checked the code.
>>
>> There are 2 things to patch for discard, write_zeroes is OK for me.
>> Sorry, for not paying attention for discard branch :(
> Good catch, thanks!
>
> But shouldn't we use the actual limits instead of INT_MAX, i.e. SIZE_MAX
> for gluster and UINT32_MAX for nbd?
>
> Kevin
You are absolutely correct for the NBD case, but I do not get the
point about SIZE_MAX for gluster. There is no such definition
in their git at git://github.com/gluster/glusterfs nor in
the public API headers in Ubuntu :(

Regards,
     Den


* Re: [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX
  2015-02-02 17:00       ` Denis V. Lunev
@ 2015-02-02 17:34         ` Paolo Bonzini
  2015-02-02 17:40           ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2015-02-02 17:34 UTC (permalink / raw)
  To: Denis V. Lunev, Kevin Wolf; +Cc: Peter Lieven, qemu-devel



On 02/02/2015 18:00, Denis V. Lunev wrote:
>>
> You are absolutely correct for NBD case but I do not get the
> point about SIZE_MAX for gluster. There is no such definition
> in their git at git://github.com/gluster/glusterfs nor in
> public API headers in Ubuntu :(

SIZE_MAX is defined in stdint.h (provided by the C library).

Paolo


* Re: [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX
  2015-02-02 17:34         ` Paolo Bonzini
@ 2015-02-02 17:40           ` Denis V. Lunev
  0 siblings, 0 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 17:40 UTC (permalink / raw)
  To: Paolo Bonzini, Denis V. Lunev, Kevin Wolf; +Cc: Peter Lieven, qemu-devel

On 02/02/15 20:34, Paolo Bonzini wrote:
>
>
> On 02/02/2015 18:00, Denis V. Lunev wrote:
>>>
>> You are absolutely correct for NBD case but I do not get the
>> point about SIZE_MAX for gluster. There is no such definition
>> in their git at git://github.com/gluster/glusterfs nor in
>> public API headers in Ubuntu :(
>
> SIZE_MAX is defined in stdint.h (provided by the C library).
>
> Paolo
>
In this case the code should avoid overflow on 64bit arch.

OK, re-spinning.
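
Presumably the 64-bit concern is that bs->bl.max_discard is an int: on a 64-bit
host SIZE_MAX >> BDRV_SECTOR_BITS is 2^55 - 1, which does not fit in an int --
hence the #if guard in v2 and the MIN(..., INT_MAX) form in v3 below.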


* [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 18:29 [Qemu-devel] [PATCH v2 0/2] fix max_discard for NBD/gluster block drivers Denis V. Lunev
@ 2015-02-02 18:29 ` Denis V. Lunev
  2015-02-02 19:58   ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 18:29 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, Peter Lieven, qemu-devel

qemu_gluster_co_discard calculates size to discard as follows
    size_t size = nb_sectors * BDRV_SECTOR_SIZE;
    ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);

glfs_discard_async is declared as follows:
  int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
                          glfs_io_cbk fn, void *data) __THROW
This is problematic on i686 as sizeof(size_t) == 4.

Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
on i386.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Peter Lieven <pl@kamp.de>
---
 block/gluster.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 1eb3a8c..47bf92d 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -622,6 +622,13 @@ out:
     return ret;
 }
 
+static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+#if SIZE_MAX == UINT_MAX
+    bs->bl.max_discard = SIZE_MAX >> BDRV_SECTOR_BITS;
+#endif
+}
+
 #ifdef CONFIG_GLUSTERFS_DISCARD
 static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors)
@@ -735,6 +742,7 @@ static BlockDriver bdrv_gluster = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -762,6 +770,7 @@ static BlockDriver bdrv_gluster_tcp = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -789,6 +798,7 @@ static BlockDriver bdrv_gluster_unix = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -816,6 +826,7 @@ static BlockDriver bdrv_gluster_rdma = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
-- 
1.9.1


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 18:29 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
@ 2015-02-02 19:58   ` Peter Lieven
  2015-02-02 20:09     ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2015-02-02 19:58 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Kevin Wolf, qemu-devel

On 02.02.2015 at 19:29, Denis V. Lunev wrote:
> qemu_gluster_co_discard calculates size to discard as follows
>     size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>     ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>
> glfs_discard_async is declared as follows:
>   int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>                           glfs_io_cbk fn, void *data) __THROW
> This is problematic on i686 as sizeof(size_t) == 4.
>
> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
> on i386.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Peter Lieven <pl@kamp.de>
> ---
>  block/gluster.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/block/gluster.c b/block/gluster.c
> index 1eb3a8c..47bf92d 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -622,6 +622,13 @@ out:
>      return ret;
>  }
>  
> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +#if SIZE_MAX == UINT_MAX
> +    bs->bl.max_discard = SIZE_MAX >> BDRV_SECTOR_BITS;
> +#endif

I would write:

bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);

without the condition.
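
For illustration, the expression evaluates differently per word size
(BDRV_SECTOR_BITS is 9), so the #if is no longer needed:

    64-bit host: MIN((2^64 - 1) >> 9, INT_MAX) == INT_MAX            /* no extra limit */
    32-bit host: MIN((2^32 - 1) >> 9, INT_MAX) == 8388607 sectors    /* ~4 GiB         */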


Peter



> +}
> +
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
>          int64_t sector_num, int nb_sectors)
> @@ -735,6 +742,7 @@ static BlockDriver bdrv_gluster = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -762,6 +770,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -789,6 +798,7 @@ static BlockDriver bdrv_gluster_unix = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -816,6 +826,7 @@ static BlockDriver bdrv_gluster_rdma = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif


* [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 20:09 [Qemu-devel] [PATCH v3 0/2] fix max_discard for NBD/gluster block drivers Denis V. Lunev
@ 2015-02-02 20:09 ` Denis V. Lunev
  2015-02-02 20:40   ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 20:09 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, Peter Lieven, qemu-devel

qemu_gluster_co_discard calculates size to discard as follows
    size_t size = nb_sectors * BDRV_SECTOR_SIZE;
    ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);

glfs_discard_async is declared as follows:
  int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
                          glfs_io_cbk fn, void *data) __THROW
This is problematic on i686 as sizeof(size_t) == 4.

Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
on i386.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Peter Lieven <pl@kamp.de>
---
 block/gluster.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 1eb3a8c..8a8c153 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -622,6 +622,11 @@ out:
     return ret;
 }
 
+static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
+}
+
 #ifdef CONFIG_GLUSTERFS_DISCARD
 static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors)
@@ -735,6 +740,7 @@ static BlockDriver bdrv_gluster = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -762,6 +768,7 @@ static BlockDriver bdrv_gluster_tcp = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -789,6 +796,7 @@ static BlockDriver bdrv_gluster_unix = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
@@ -816,6 +824,7 @@ static BlockDriver bdrv_gluster_rdma = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
+    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
 #endif
-- 
1.9.1


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 19:58   ` Peter Lieven
@ 2015-02-02 20:09     ` Denis V. Lunev
  0 siblings, 0 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 20:09 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Kevin Wolf, qemu-devel

On 02/02/15 22:58, Peter Lieven wrote:
> On 02.02.2015 at 19:29, Denis V. Lunev wrote:
>> qemu_gluster_co_discard calculates size to discard as follows
>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>
>> glfs_discard_async is declared as follows:
>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>                            glfs_io_cbk fn, void *data) __THROW
>> This is problematic on i686 as sizeof(size_t) == 4.
>>
>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>> on i386.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Peter Lieven <pl@kamp.de>
>> ---
>>   block/gluster.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/block/gluster.c b/block/gluster.c
>> index 1eb3a8c..47bf92d 100644
>> --- a/block/gluster.c
>> +++ b/block/gluster.c
>> @@ -622,6 +622,13 @@ out:
>>       return ret;
>>   }
>>   
>> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>> +{
>> +#if SIZE_MAX == UINT_MAX
>> +    bs->bl.max_discard = SIZE_MAX >> BDRV_SECTOR_BITS;
>> +#endif
> I would write:
>
> bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>
> without the condition.
>
>
> Peter
>
ok, respinned

Den


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 20:09 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
@ 2015-02-02 20:40   ` Peter Lieven
  2015-02-02 20:46     ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2015-02-02 20:40 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Kevin Wolf, qemu-devel

On 02.02.2015 at 21:09, Denis V. Lunev wrote:
> qemu_gluster_co_discard calculates size to discard as follows
>     size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>     ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>
> glfs_discard_async is declared as follows:
>   int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>                           glfs_io_cbk fn, void *data) __THROW
> This is problematic on i686 as sizeof(size_t) == 4.
>
> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
> on i386.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Peter Lieven <pl@kamp.de>
> ---
>  block/gluster.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/block/gluster.c b/block/gluster.c
> index 1eb3a8c..8a8c153 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -622,6 +622,11 @@ out:
>      return ret;
>  }
>  
> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
> +}
> +

Looking at the gluster code, bl.max_transfer_length should have the same limit, but that's a different patch.

>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
>          int64_t sector_num, int nb_sectors)
> @@ -735,6 +740,7 @@ static BlockDriver bdrv_gluster = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -762,6 +768,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -789,6 +796,7 @@ static BlockDriver bdrv_gluster_unix = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif
> @@ -816,6 +824,7 @@ static BlockDriver bdrv_gluster_rdma = {
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>      .bdrv_co_discard              = qemu_gluster_co_discard,
>  #endif
> +    .bdrv_refresh_limits          = qemu_gluster_refresh_limits,
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
>  #endif

Reviewed-by: Peter Lieven <pl@kamp.de>


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 20:40   ` Peter Lieven
@ 2015-02-02 20:46     ` Denis V. Lunev
  2015-02-03  7:31       ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-02 20:46 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Kevin Wolf, qemu-devel

On 02/02/15 23:40, Peter Lieven wrote:
> On 02.02.2015 at 21:09, Denis V. Lunev wrote:
>> qemu_gluster_co_discard calculates size to discard as follows
>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>
>> glfs_discard_async is declared as follows:
>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>                            glfs_io_cbk fn, void *data) __THROW
>> This is problematic on i686 as sizeof(size_t) == 4.
>>
>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>> on i386.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Peter Lieven <pl@kamp.de>
>> ---
>>   block/gluster.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/block/gluster.c b/block/gluster.c
>> index 1eb3a8c..8a8c153 100644
>> --- a/block/gluster.c
>> +++ b/block/gluster.c
>> @@ -622,6 +622,11 @@ out:
>>       return ret;
>>   }
>>   
>> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>> +{
>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>> +}
>> +
> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
ha, the same applies to nbd code too.

I'll do this stuff tomorrow, and I also think that some
auditing of the other drivers could reveal something interesting.

Den


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-02 20:46     ` Denis V. Lunev
@ 2015-02-03  7:31       ` Denis V. Lunev
  2015-02-03 11:30         ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-03  7:31 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel

On 02/02/15 23:46, Denis V. Lunev wrote:
> On 02/02/15 23:40, Peter Lieven wrote:
>> On 02.02.2015 at 21:09, Denis V. Lunev wrote:
>>> qemu_gluster_co_discard calculates size to discard as follows
>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>      ret = glfs_discard_async(s->fd, offset, size, 
>>> &gluster_finish_aiocb, acb);
>>>
>>> glfs_discard_async is declared as follows:
>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>                            glfs_io_cbk fn, void *data) __THROW
>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>
>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>> on i386.
>>>
>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>> CC: Kevin Wolf <kwolf@redhat.com>
>>> CC: Peter Lieven <pl@kamp.de>
>>> ---
>>>   block/gluster.c | 9 +++++++++
>>>   1 file changed, 9 insertions(+)
>>>
>>> diff --git a/block/gluster.c b/block/gluster.c
>>> index 1eb3a8c..8a8c153 100644
>>> --- a/block/gluster.c
>>> +++ b/block/gluster.c
>>> @@ -622,6 +622,11 @@ out:
>>>       return ret;
>>>   }
>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, 
>>> Error **errp)
>>> +{
>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>> +}
>>> +
>> Looking at the gluster code bl.max_transfer_length should have the 
>> same limit, but thats a different patch.
> ha, the same applies to nbd code too.
>
> I'll do this stuff tomorrow and also I think that some
> audit in other drivers could reveal something interesting.
>
> Den
OK, the situation is pretty rotten here on i686.

The problem comes from the fact that QEMUIOVector
and iovec use size_t as the length. All API calls use
this abstraction, so any conversion from nb_sectors to
a byte size could overflow at any moment.
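
A minimal sketch of the kind of conversion that can silently wrap, assuming a
32-bit size_t and a made-up request size:

    struct iovec iov;
    int nb_sectors = 9437184;                        /* 4.5 GiB request          */
    iov.iov_len = nb_sectors * BDRV_SECTOR_SIZE;     /* 0x120000000 truncates to */
                                                     /* 0x20000000 == 512 MiB    */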

Getting our hands dirty here is problematic from my point
of view. Should we really care about this? 32-bit
applications are quickly becoming history...
Den


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-03  7:31       ` Denis V. Lunev
@ 2015-02-03 11:30         ` Peter Lieven
  2015-02-03 11:37           ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2015-02-03 11:30 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel

On 03.02.2015 at 08:31, Denis V. Lunev wrote:
> On 02/02/15 23:46, Denis V. Lunev wrote:
>> On 02/02/15 23:40, Peter Lieven wrote:
>>> On 02.02.2015 at 21:09, Denis V. Lunev wrote:
>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>
>>>> glfs_discard_async is declared as follows:
>>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>                            glfs_io_cbk fn, void *data) __THROW
>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>
>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>> on i386.
>>>>
>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>> CC: Peter Lieven <pl@kamp.de>
>>>> ---
>>>>   block/gluster.c | 9 +++++++++
>>>>   1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>> index 1eb3a8c..8a8c153 100644
>>>> --- a/block/gluster.c
>>>> +++ b/block/gluster.c
>>>> @@ -622,6 +622,11 @@ out:
>>>>       return ret;
>>>>   }
>>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>> +{
>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>> +}
>>>> +
>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>> ha, the same applies to nbd code too.
>>
>> I'll do this stuff tomorrow and also I think that some
>> audit in other drivers could reveal something interesting.
>>
>> Den
> ok. The situation is well rotten here on i686.
>
> The problem comes from the fact that QEMUIOVector
> and iovec uses size_t as length. All API calls use
> this abstraction. Thus all conversion operations
> from nr_sectors to size could bang at any moment.
>
> Putting dirty hands here is problematic from my point
> of view. Should we really care about this? 32bit
> applications are becoming old good history of IT...

The host has to be 32-bit to be in trouble, and at least with KVM the host
has to support long mode.

I have it on my todo list to add generic code for honouring bl.max_transfer_length
in block.c. We could change the default maximum for bl.max_transfer_length from
INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS.
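
A rough sketch of what that generic handling could look like, modelled on the
bdrv_co_discard loop from the patch at the top of this thread -- hypothetical,
not what block.c does today:

    int max_xfer = bs->bl.max_transfer_length ?
                   bs->bl.max_transfer_length : MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
    while (nb_sectors > 0) {
        int num = MIN(nb_sectors, max_xfer);
        /* issue one bounded read/write of 'num' sectors here */
        sector_num += num;
        nb_sectors -= num;
    }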

Peter


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-03 11:30         ` Peter Lieven
@ 2015-02-03 11:37           ` Kevin Wolf
  2015-02-03 11:47             ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2015-02-03 11:37 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Paolo Bonzini, Denis V. Lunev, qemu-devel

On 03.02.2015 at 12:30, Peter Lieven wrote:
> On 03.02.2015 at 08:31, Denis V. Lunev wrote:
> > On 02/02/15 23:46, Denis V. Lunev wrote:
> >> On 02/02/15 23:40, Peter Lieven wrote:
> >>> On 02.02.2015 at 21:09, Denis V. Lunev wrote:
> >>>> qemu_gluster_co_discard calculates size to discard as follows
> >>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
> >>>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
> >>>>
> >>>> glfs_discard_async is declared as follows:
> >>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
> >>>>                            glfs_io_cbk fn, void *data) __THROW
> >>>> This is problematic on i686 as sizeof(size_t) == 4.
> >>>>
> >>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
> >>>> on i386.
> >>>>
> >>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
> >>>> CC: Kevin Wolf <kwolf@redhat.com>
> >>>> CC: Peter Lieven <pl@kamp.de>
> >>>> ---
> >>>>   block/gluster.c | 9 +++++++++
> >>>>   1 file changed, 9 insertions(+)
> >>>>
> >>>> diff --git a/block/gluster.c b/block/gluster.c
> >>>> index 1eb3a8c..8a8c153 100644
> >>>> --- a/block/gluster.c
> >>>> +++ b/block/gluster.c
> >>>> @@ -622,6 +622,11 @@ out:
> >>>>       return ret;
> >>>>   }
> >>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>> +{
> >>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
> >>>> +}
> >>>> +
> >>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
> >> ha, the same applies to nbd code too.
> >>
> >> I'll do this stuff tomorrow and also I think that some
> >> audit in other drivers could reveal something interesting.
> >>
> >> Den
> > ok. The situation is well rotten here on i686.
> >
> > The problem comes from the fact that QEMUIOVector
> > and iovec uses size_t as length. All API calls use
> > this abstraction. Thus all conversion operations
> > from nr_sectors to size could bang at any moment.
> >
> > Putting dirty hands here is problematic from my point
> > of view. Should we really care about this? 32bit
> > applications are becoming old good history of IT...
> 
> The host has to be 32bit to be in trouble. And at least if we have KVM the host
> has to support long mode.
> 
> I have on my todo to add generic code for honouring bl.max_transfer_length
> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
> for bl.max_transfer_length.

So the conclusion is that we'll apply this series as it is and you'll
take care of the rest later?

Kevin


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-03 11:37           ` Kevin Wolf
@ 2015-02-03 11:47             ` Peter Lieven
  2015-02-03 11:51               ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2015-02-03 11:47 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, Denis V. Lunev, qemu-devel

On 03.02.2015 at 12:37, Kevin Wolf wrote:
> On 03.02.2015 at 12:30, Peter Lieven wrote:
>> On 03.02.2015 at 08:31, Denis V. Lunev wrote:
>>> On 02/02/15 23:46, Denis V. Lunev wrote:
>>>> On 02/02/15 23:40, Peter Lieven wrote:
>>>>> On 02.02.2015 at 21:09, Denis V. Lunev wrote:
>>>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>>>      size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>>>      ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>>>
>>>>>> glfs_discard_async is declared as follows:
>>>>>>    int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>>>                            glfs_io_cbk fn, void *data) __THROW
>>>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>>>
>>>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>>>> on i386.
>>>>>>
>>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>>>> CC: Peter Lieven <pl@kamp.de>
>>>>>> ---
>>>>>>   block/gluster.c | 9 +++++++++
>>>>>>   1 file changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>>>> index 1eb3a8c..8a8c153 100644
>>>>>> --- a/block/gluster.c
>>>>>> +++ b/block/gluster.c
>>>>>> @@ -622,6 +622,11 @@ out:
>>>>>>       return ret;
>>>>>>   }
>>>>>>   +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>> +{
>>>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>>>> +}
>>>>>> +
>>>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>>>> ha, the same applies to nbd code too.
>>>>
>>>> I'll do this stuff tomorrow and also I think that some
>>>> audit in other drivers could reveal something interesting.
>>>>
>>>> Den
>>> ok. The situation is well rotten here on i686.
>>>
>>> The problem comes from the fact that QEMUIOVector
>>> and iovec uses size_t as length. All API calls use
>>> this abstraction. Thus all conversion operations
>>> from nr_sectors to size could bang at any moment.
>>>
>>> Putting dirty hands here is problematic from my point
>>> of view. Should we really care about this? 32bit
>>> applications are becoming old good history of IT...
>> The host has to be 32bit to be in trouble. And at least if we have KVM the host
>> has to support long mode.
>>
>> I have on my todo to add generic code for honouring bl.max_transfer_length
>> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
>> for bl.max_transfer_length.
> So the conclusion is that we'll apply this series as it is and you'll
> take care of the rest later?

Yes, and actually we need a macro like

#define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)

as a limit for everything, because bdrv_check_byte_request already takes a size_t
argument. So we could already create an overflow in bdrv_check_request when we
convert nb_sectors to size_t.
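
A minimal sketch of how the proposed macro might be applied in that check --
hypothetical; the current bdrv_check_request() is quoted further down the thread:

    #define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)

    static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num,
                                  int nb_sectors)
    {
        if (nb_sectors < 0 || nb_sectors > BDRV_MAX_REQUEST_SECTORS) {
            return -EIO;
        }
        return bdrv_check_byte_request(bs, sector_num * BDRV_SECTOR_SIZE,
                                       nb_sectors * BDRV_SECTOR_SIZE);
    }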

I will create a patch to catch at least this overflow shortly.

Peter


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-03 11:47             ` Peter Lieven
@ 2015-02-03 11:51               ` Denis V. Lunev
  2015-02-03 12:13                 ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-02-03 11:51 UTC (permalink / raw)
  To: Peter Lieven, Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel

On 03/02/15 14:47, Peter Lieven wrote:
> On 03.02.2015 at 12:37, Kevin Wolf wrote:
>> On 03.02.2015 at 12:30, Peter Lieven wrote:
>>> On 03.02.2015 at 08:31, Denis V. Lunev wrote:
>>>> On 02/02/15 23:46, Denis V. Lunev wrote:
>>>>> On 02/02/15 23:40, Peter Lieven wrote:
>>>>>> On 02.02.2015 at 21:09, Denis V. Lunev wrote:
>>>>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>>>>       size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>>>>       ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>>>>
>>>>>>> glfs_discard_async is declared as follows:
>>>>>>>     int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>>>>                             glfs_io_cbk fn, void *data) __THROW
>>>>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>>>>
>>>>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>>>>> on i386.
>>>>>>>
>>>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>>>>> CC: Peter Lieven <pl@kamp.de>
>>>>>>> ---
>>>>>>>    block/gluster.c | 9 +++++++++
>>>>>>>    1 file changed, 9 insertions(+)
>>>>>>>
>>>>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>>>>> index 1eb3a8c..8a8c153 100644
>>>>>>> --- a/block/gluster.c
>>>>>>> +++ b/block/gluster.c
>>>>>>> @@ -622,6 +622,11 @@ out:
>>>>>>>        return ret;
>>>>>>>    }
>>>>>>>    +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>> +{
>>>>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>>>>> +}
>>>>>>> +
>>>>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>>>>> ha, the same applies to nbd code too.
>>>>>
>>>>> I'll do this stuff tomorrow and also I think that some
>>>>> audit in other drivers could reveal something interesting.
>>>>>
>>>>> Den
>>>> ok. The situation is well rotten here on i686.
>>>>
>>>> The problem comes from the fact that QEMUIOVector
>>>> and iovec uses size_t as length. All API calls use
>>>> this abstraction. Thus all conversion operations
>>>> from nr_sectors to size could bang at any moment.
>>>>
>>>> Putting dirty hands here is problematic from my point
>>>> of view. Should we really care about this? 32bit
>>>> applications are becoming old good history of IT...
>>> The host has to be 32bit to be in trouble. And at least if we have KVM the host
>>> has to support long mode.
>>>
>>> I have on my todo to add generic code for honouring bl.max_transfer_length
>>> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
>>> for bl.max_transfer_length.
>> So the conclusion is that we'll apply this series as it is and you'll
>> take care of the rest later?
> Yes, and actually we need a macro like
>
> #define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)
>
> as limit for everything. Because bdrv_check_byte_request already has a size_t argument.
> So we could already create an overflow in bdrv_check_request when we convert
> nb_sectors to size_t.
>
> I will create a patch to catch at least this overflow shortly.
>
> Peter
>
I like this macro :)

I vote to move MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX) into the generic code
on the discard/write_zeroes paths immediately and drop this exact patch.

Patch 2 of this set would be better with an additional
+bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;

I'll wait for Peter's patch and respin on top of it to avoid unnecessary
commits.

Den


* Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard
  2015-02-03 11:51               ` Denis V. Lunev
@ 2015-02-03 12:13                 ` Peter Lieven
  0 siblings, 0 replies; 21+ messages in thread
From: Peter Lieven @ 2015-02-03 12:13 UTC (permalink / raw)
  To: Denis V. Lunev, Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel

On 03.02.2015 at 12:51, Denis V. Lunev wrote:
> On 03/02/15 14:47, Peter Lieven wrote:
>> On 03.02.2015 at 12:37, Kevin Wolf wrote:
>>> On 03.02.2015 at 12:30, Peter Lieven wrote:
>>>> On 03.02.2015 at 08:31, Denis V. Lunev wrote:
>>>>> On 02/02/15 23:46, Denis V. Lunev wrote:
>>>>>> On 02/02/15 23:40, Peter Lieven wrote:
>>>>>>> On 02.02.2015 at 21:09, Denis V. Lunev wrote:
>>>>>>>> qemu_gluster_co_discard calculates size to discard as follows
>>>>>>>>       size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>>>>>>>>       ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
>>>>>>>>
>>>>>>>> glfs_discard_async is declared as follows:
>>>>>>>>     int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent,
>>>>>>>>                             glfs_io_cbk fn, void *data) __THROW
>>>>>>>> This is problematic on i686 as sizeof(size_t) == 4.
>>>>>>>>
>>>>>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow
>>>>>>>> on i386.
>>>>>>>>
>>>>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>>>>>> CC: Peter Lieven <pl@kamp.de>
>>>>>>>> ---
>>>>>>>>    block/gluster.c | 9 +++++++++
>>>>>>>>    1 file changed, 9 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/block/gluster.c b/block/gluster.c
>>>>>>>> index 1eb3a8c..8a8c153 100644
>>>>>>>> --- a/block/gluster.c
>>>>>>>> +++ b/block/gluster.c
>>>>>>>> @@ -622,6 +622,11 @@ out:
>>>>>>>>        return ret;
>>>>>>>>    }
>>>>>>>>    +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>>> +{
>>>>>>>> +    bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX);
>>>>>>>> +}
>>>>>>>> +
>>>>>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch.
>>>>>> ha, the same applies to nbd code too.
>>>>>>
>>>>>> I'll do this stuff tomorrow and also I think that some
>>>>>> audit in other drivers could reveal something interesting.
>>>>>>
>>>>>> Den
>>>>> ok. The situation is well rotten here on i686.
>>>>>
>>>>> The problem comes from the fact that QEMUIOVector
>>>>> and iovec uses size_t as length. All API calls use
>>>>> this abstraction. Thus all conversion operations
>>>>> from nr_sectors to size could bang at any moment.
>>>>>
>>>>> Putting dirty hands here is problematic from my point
>>>>> of view. Should we really care about this? 32bit
>>>>> applications are becoming old good history of IT...
>>>> The host has to be 32bit to be in trouble. And at least if we have KVM the host
>>>> has to support long mode.
>>>>
>>>> I have on my todo to add generic code for honouring bl.max_transfer_length
>>>> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS
>>>> for bl.max_transfer_length.
>>> So the conclusion is that we'll apply this series as it is and you'll
>>> take care of the rest later?
>> Yes, and actually we need a macro like
>>
>> #define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX)
>>
>> as limit for everything. Because bdrv_check_byte_request already has a size_t argument.
>> So we could already create an overflow in bdrv_check_request when we convert
>> nb_sectors to size_t.
>>
>> I will create a patch to catch at least this overflow shortly.
>>
>> Peter
>>
> I like this macro :)

I see, I was looking at an old checkout; meanwhile there is already such a check in bdrv_check_request:

static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num,
                              int nb_sectors)
{
    if (nb_sectors < 0 || nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) {
        return -EIO;
    }

    return bdrv_check_byte_request(bs, sector_num * BDRV_SECTOR_SIZE,
                                   nb_sectors * BDRV_SECTOR_SIZE);
}

>
> I vote to move MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX) into generic code
> on discard/write_zero paths immediately and drop this exact patch.
>
> Patch 2 of this set would be better to have additional
> +bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;
>
> I'll wait Peter's patch and respin on top of it to avoid unnecessary commits.
>
> Den


end of thread

Thread overview: 21+ messages
2015-02-02 14:48 [Qemu-devel] [PATCH] block: change default for discard and write zeroes to INT_MAX Peter Lieven
2015-02-02 16:13 ` Kevin Wolf
2015-02-02 16:25   ` Denis V. Lunev
2015-02-02 16:45     ` Kevin Wolf
2015-02-02 17:00       ` Denis V. Lunev
2015-02-02 17:34         ` Paolo Bonzini
2015-02-02 17:40           ` Denis V. Lunev
2015-02-02 16:23 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
2015-02-02 16:23   ` [Qemu-devel] [PATCH 2/2] nbd: " Denis V. Lunev
  -- strict thread matches above, loose matches on Subject: below --
2015-02-02 18:29 [Qemu-devel] [PATCH v2 0/2] fix max_discard for NBD/gluster block drivers Denis V. Lunev
2015-02-02 18:29 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
2015-02-02 19:58   ` Peter Lieven
2015-02-02 20:09     ` Denis V. Lunev
2015-02-02 20:09 [Qemu-devel] [PATCH v3 0/2] fix max_discard for NBD/gluster block drivers Denis V. Lunev
2015-02-02 20:09 ` [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard Denis V. Lunev
2015-02-02 20:40   ` Peter Lieven
2015-02-02 20:46     ` Denis V. Lunev
2015-02-03  7:31       ` Denis V. Lunev
2015-02-03 11:30         ` Peter Lieven
2015-02-03 11:37           ` Kevin Wolf
2015-02-03 11:47             ` Peter Lieven
2015-02-03 11:51               ` Denis V. Lunev
2015-02-03 12:13                 ` Peter Lieven
