[Qemu-devel] [PATCH] block: fix big write

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH] block: fix big write
@ 2014-12-05 16:15 Ming Lei
  2014-12-05 16:33 ` Paolo Bonzini
  2014-12-05 17:03 ` Max Reitz
  0 siblings, 2 replies; 14+ messages in thread
From: Ming Lei @ 2014-12-05 16:15 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf
  Cc: Ming Lei, qemu-stable

From: Ming Lei <ming.lei@caonical.com>

QEMU block should have supported to read/write at most
0x7fffff * 512 bytes, unfortunately INT_MAX is used to check
bytes in both bdrv_co_do_writev() and bdrv_check_byte_request(),
so cause write failure if nr_sectors is equal or more
than 0x400000.

There are still other INT_MAX usages in block.c, and they might
need to change to UINT_MAX too in future, but at least
this patch's change can make SCSI WRITE SAME 16 workable.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ming Lei <ming.lei@caonical.com>
---
 block.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index a612594..ddc18c2 100644
--- a/block.c
+++ b/block.c
@@ -2607,7 +2607,7 @@ static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
 {
     int64_t len;
 
-    if (size > INT_MAX) {
+    if (size > UINT_MAX) {
         return -EIO;
     }
 
@@ -3420,7 +3420,7 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
     BdrvRequestFlags flags)
 {
-    if (nb_sectors < 0 || nb_sectors > (INT_MAX >> BDRV_SECTOR_BITS)) {
+    if (nb_sectors < 0 || nb_sectors > (UINT_MAX >> BDRV_SECTOR_BITS)) {
         return -EINVAL;
     }
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-05 16:15 [Qemu-devel] [PATCH] block: fix big write Ming Lei
@ 2014-12-05 16:33 ` Paolo Bonzini
  2014-12-08  7:19   ` Ming Lei
  2014-12-05 17:03 ` Max Reitz
  1 sibling, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-05 16:33 UTC (permalink / raw)
  To: Ming Lei, qemu-devel, Stefan Hajnoczi, Kevin Wolf; +Cc: Ming Lei, qemu-stable



On 05/12/2014 17:15, Ming Lei wrote:
> From: Ming Lei <ming.lei@caonical.com>
> 
> QEMU block should have supported to read/write at most
> 0x7fffff * 512 bytes, unfortunately INT_MAX is used to check
> bytes in both bdrv_co_do_writev() and bdrv_check_byte_request(),
> so cause write failure if nr_sectors is equal or more
> than 0x400000.
> 
> There are still other INT_MAX usages in block.c, and they might
> need to change to UINT_MAX too in future, but at least
> this patch's change can make SCSI WRITE SAME 16 workable.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Ming Lei <ming.lei@caonical.com>

Alternatively, I'd accept a SCSI patch setting max_ws_blocks and friends
to 2GB - 1 block.

Paolo

> ---
>  block.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index a612594..ddc18c2 100644
> --- a/block.c
> +++ b/block.c
> @@ -2607,7 +2607,7 @@ static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
>  {
>      int64_t len;
>  
> -    if (size > INT_MAX) {
> +    if (size > UINT_MAX) {
>          return -EIO;
>      }
>  
> @@ -3420,7 +3420,7 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
>      int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
>      BdrvRequestFlags flags)
>  {
> -    if (nb_sectors < 0 || nb_sectors > (INT_MAX >> BDRV_SECTOR_BITS)) {
> +    if (nb_sectors < 0 || nb_sectors > (UINT_MAX >> BDRV_SECTOR_BITS)) {
>          return -EINVAL;
>      }
>  
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-05 16:33 ` Paolo Bonzini
@ 2014-12-08  7:19   ` Ming Lei
  2014-12-09 17:45     ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2014-12-08  7:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-stable, qemu-devel, Stefan Hajnoczi, Ming Lei

On Sat, Dec 6, 2014 at 12:33 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 05/12/2014 17:15, Ming Lei wrote:
>> From: Ming Lei <ming.lei@caonical.com>
>>
>> QEMU block should have supported to read/write at most
>> 0x7fffff * 512 bytes, unfortunately INT_MAX is used to check
>> bytes in both bdrv_co_do_writev() and bdrv_check_byte_request(),
>> so cause write failure if nr_sectors is equal or more
>> than 0x400000.
>>
>> There are still other INT_MAX usages in block.c, and they might
>> need to change to UINT_MAX too in future, but at least
>> this patch's change can make SCSI WRITE SAME 16 workable.
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Ming Lei <ming.lei@caonical.com>
>
> Alternatively, I'd accept a SCSI patch setting max_ws_blocks and friends
> to 2GB - 1 block.

It should be better to not introduce the limit and split the writes
into size of 2GB - 1 block since there is only the limit for write zero.

Thanks,

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-08  7:19   ` Ming Lei
@ 2014-12-09 17:45     ` Paolo Bonzini
  2014-12-10  1:41       ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-09 17:45 UTC (permalink / raw)
  To: Ming Lei; +Cc: Kevin Wolf, qemu-stable, qemu-devel, Stefan Hajnoczi, Ming Lei



On 08/12/2014 08:19, Ming Lei wrote:
>> >
>> > Alternatively, I'd accept a SCSI patch setting max_ws_blocks and friends
>> > to 2GB - 1 block.
> It should be better to not introduce the limit and split the writes
> into size of 2GB - 1 block since there is only the limit for write zero.

Why? That's exactly what the max_ws_blocks is for, and there's code in
the guest already to do the split.  We're talking about 2GB, not 1MB.

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-09 17:45     ` Paolo Bonzini
@ 2014-12-10  1:41       ` Ming Lei
  2014-12-10  9:56         ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2014-12-10  1:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On Wed, Dec 10, 2014 at 1:45 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 08/12/2014 08:19, Ming Lei wrote:
>>> >
>>> > Alternatively, I'd accept a SCSI patch setting max_ws_blocks and friends
>>> > to 2GB - 1 block.
>> It should be better to not introduce the limit and split the writes
>> into size of 2GB - 1 block since there is only the limit for write zero.
>
> Why? That's exactly what the max_ws_blocks is for, and there's code in
> the guest already to do the split.  We're talking about 2GB, not 1MB.

The split in write same does not cover write zero, and that is the problem.
Otherwise write same just works fine. That said write same of QEMU SCSI
can work well without max write same sectors limit.

If we introduce the limit of max write same sectors, this limit will be put
on both write zero and write non-zero path.

Also "MAXIMUM WRITE SAME LENGTH" is just introduced on sbc3r35
in Jan, 2013, and some old host drivers may not support it, and not using
the limit should have better compatibility.

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-10  1:41       ` Ming Lei
@ 2014-12-10  9:56         ` Paolo Bonzini
  2014-12-10 12:23           ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-10  9:56 UTC (permalink / raw)
  To: Ming Lei; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel



On 10/12/2014 02:41, Ming Lei wrote:
> On Wed, Dec 10, 2014 at 1:45 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>
>> On 08/12/2014 08:19, Ming Lei wrote:
>>>>>
>>>>> Alternatively, I'd accept a SCSI patch setting max_ws_blocks and friends
>>>>> to 2GB - 1 block.
>>> It should be better to not introduce the limit and split the writes
>>> into size of 2GB - 1 block since there is only the limit for write zero.
>>
>> Why? That's exactly what the max_ws_blocks is for, and there's code in
>> the guest already to do the split.  We're talking about 2GB, not 1MB.
> 
> The split in write same does not cover write zero, and that is the problem.
> Otherwise write same just works fine. That said write same of QEMU SCSI
> can work well without max write same sectors limit.
> 
> If we introduce the limit of max write same sectors, this limit will be put
> on both write zero and write non-zero path.

Yeah, but the 2GB limit happens also for the regular I/O path.  The
quirk is that it doesn't happen for non-zero WRITE SAME, not the other
way round.

> Also "MAXIMUM WRITE SAME LENGTH" is just introduced on sbc3r35
> in Jan, 2013, and some old host drivers may not support it, and not using
> the limit should have better compatibility.

Again, we're talking of 2GB and this is something that should never
happen in practice.

I'll write the patch myself.

Paolo

> Thanks,
> Ming Lei
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-10  9:56         ` Paolo Bonzini
@ 2014-12-10 12:23           ` Ming Lei
  2014-12-10 12:55             ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2014-12-10 12:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On Wed, Dec 10, 2014 at 5:56 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 10/12/2014 02:41, Ming Lei wrote:
>> On Wed, Dec 10, 2014 at 1:45 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>>
>>> On 08/12/2014 08:19, Ming Lei wrote:
>>>>>>
>>>>>> Alternatively, I'd accept a SCSI patch setting max_ws_blocks and friends
>>>>>> to 2GB - 1 block.
>>>> It should be better to not introduce the limit and split the writes
>>>> into size of 2GB - 1 block since there is only the limit for write zero.
>>>
>>> Why? That's exactly what the max_ws_blocks is for, and there's code in
>>> the guest already to do the split.  We're talking about 2GB, not 1MB.
>>
>> The split in write same does not cover write zero, and that is the problem.
>> Otherwise write same just works fine. That said write same of QEMU SCSI
>> can work well without max write same sectors limit.
>>
>> If we introduce the limit of max write same sectors, this limit will be put
>> on both write zero and write non-zero path.
>
> Yeah, but the 2GB limit happens also for the regular I/O path.  The
> quirk is that it doesn't happen for non-zero WRITE SAME, not the other
> way round.
>
>> Also "MAXIMUM WRITE SAME LENGTH" is just introduced on sbc3r35
>> in Jan, 2013, and some old host drivers may not support it, and not using
>> the limit should have better compatibility.
>
> Again, we're talking of 2GB and this is something that should never
> happen in practice.

Again, the 2GB limit can be avoided if I/O is splitted, isn't it?

It is _not_ never happen at all, and easy to be triggered when using
mkfs.

Thanks,

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-10 12:23           ` Ming Lei
@ 2014-12-10 12:55             ` Paolo Bonzini
  2014-12-10 14:35               ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-10 12:55 UTC (permalink / raw)
  To: Ming Lei; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On 10/12/2014 13:23, Ming Lei wrote:
>> >
>> > Again, we're talking of 2GB and this is something that should never
>> > happen in practice.
> Again, the 2GB limit can be avoided if I/O is splitted, isn't it?

Sure.  The guest kernel is doing the split.

> It is _not_ never happen at all, and easy to be triggered when using
> mkfs.

mkfs is not something to optimize for, it's just something that should
work.  (Also, some hardware may time out if you do write same with too
high a block count).

Both Linux and Windows will always use UNMAP on QEMU, except for the
small time period where Linux used WRITE SAME and this bug was
discovered.  And all versions of Linux that used WRITE SAME honored the
max_ws_blocks field.

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-10 12:55             ` Paolo Bonzini
@ 2014-12-10 14:35               ` Ming Lei
  2014-12-10 15:02                 ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2014-12-10 14:35 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On Wed, Dec 10, 2014 at 8:55 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 10/12/2014 13:23, Ming Lei wrote:
>>> >
>>> > Again, we're talking of 2GB and this is something that should never
>>> > happen in practice.
>> Again, the 2GB limit can be avoided if I/O is splitted, isn't it?
>
> Sure.  The guest kernel is doing the split.

QEMU SCSI can do it transparently.

>
>> It is _not_ never happen at all, and easy to be triggered when using
>> mkfs.
>
> mkfs is not something to optimize for, it's just something that should
> work.  (Also, some hardware may time out if you do write same with too
> high a block count).

I don't think it is related with the hardware time out issue since your
patch still splits the block count into 2G - 1, and both are same wrt.
block count.

>
> Both Linux and Windows will always use UNMAP on QEMU, except for the
> small time period where Linux used WRITE SAME and this bug was
> discovered.  And all versions of Linux that used WRITE SAME honored the
> max_ws_blocks field.

Not sure how you get the conclusion.

Firstly some old host scsi drivers can't respect max_ws_blocks
block limits since it is just introduced in last year.

Secondly SBC-3 draft doesn't describe the priority explicitly among
UNMAP, WRITE SAME 10, and WRITE SAME 16, so it is driver's
freedom to take anyone in theory.

Finally blkdev_issue_zeroout() can send WRITE SAME(10/16) directly
and it can be from user space, fs, and block drivers.

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-10 14:35               ` Ming Lei
@ 2014-12-10 15:02                 ` Paolo Bonzini
  2014-12-10 15:47                   ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-10 15:02 UTC (permalink / raw)
  To: Ming Lei; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel



On 10/12/2014 15:35, Ming Lei wrote:
>>> It is _not_ never happen at all, and easy to be triggered when using
>>> mkfs.
>>
>> mkfs is not something to optimize for, it's just something that should
>> work.  (Also, some hardware may time out if you do write same with too
>> high a block count).
> 
> I don't think it is related with the hardware time out issue since your
> patch still splits the block count into 2G - 1, and both are same wrt.
> block count.

If the guest sends a 1TB WRITE SAME, it's more likely to time out.

>> Both Linux and Windows will always use UNMAP on QEMU, except for the
>> small time period where Linux used WRITE SAME and this bug was
>> discovered.  And all versions of Linux that used WRITE SAME honored the
>> max_ws_blocks field.
> 
> Not sure how you get the conclusion.

Because the WRITE SAME patch was submitted ~1 month ago.

Windows uses UNMAP because Microsoft says so.

> Secondly SBC-3 draft doesn't describe the priority explicitly among
> UNMAP, WRITE SAME 10, and WRITE SAME 16, so it is driver's
> freedom to take anyone in theory.

Sure, but WRITE SAME with UNMAP doesn't make sense if you do not have
LBPRZ, which QEMU does not set.  In fact the only sensible things to do are:

- use WRITE SAME if LBPRZ

- use UNMAP if !LBPRZ

So any sensible guest will use UNMAP.

> Finally blkdev_issue_zeroout() can send WRITE SAME(10/16) directly
> and it can be from user space, fs, and block drivers.

That is WRITE SAME without UNMAP, it is not used by mkfs, and Linux has
always honored max_write_same_blocks for it (defaulting to a 65535 block
limit for older devices that did not report a limit).

So what *concrete* case would be fixed by adding extra little-used code
in QEMU to do the split?

Paolo

> Thanks,
> Ming Lei
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-10 15:02                 ` Paolo Bonzini
@ 2014-12-10 15:47                   ` Ming Lei
  2014-12-10 16:44                     ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2014-12-10 15:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On Wed, Dec 10, 2014 at 11:02 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 10/12/2014 15:35, Ming Lei wrote:
>>>> It is _not_ never happen at all, and easy to be triggered when using
>>>> mkfs.
>>>
>>> mkfs is not something to optimize for, it's just something that should
>>> work.  (Also, some hardware may time out if you do write same with too
>>> high a block count).
>>
>> I don't think it is related with the hardware time out issue since your
>> patch still splits the block count into 2G - 1, and both are same wrt.
>> block count.
>
> If the guest sends a 1TB WRITE SAME, it's more likely to time out.

Guest implementation should have a top limit for the corresponding
time out, like SD_MAX_WS16_BLOCKS in linux.

>
>>> Both Linux and Windows will always use UNMAP on QEMU, except for the
>>> small time period where Linux used WRITE SAME and this bug was
>>> discovered.  And all versions of Linux that used WRITE SAME honored the
>>> max_ws_blocks field.
>>
>> Not sure how you get the conclusion.
>
> Because the WRITE SAME patch was submitted ~1 month ago.
>
> Windows uses UNMAP because Microsoft says so.
>
>> Secondly SBC-3 draft doesn't describe the priority explicitly among
>> UNMAP, WRITE SAME 10, and WRITE SAME 16, so it is driver's
>> freedom to take anyone in theory.
>
> Sure, but WRITE SAME with UNMAP doesn't make sense if you do not have
> LBPRZ, which QEMU does not set.  In fact the only sensible things to do are:
>
> - use WRITE SAME if LBPRZ
>
> - use UNMAP if !LBPRZ
>
> So any sensible guest will use UNMAP.
>
>> Finally blkdev_issue_zeroout() can send WRITE SAME(10/16) directly
>> and it can be from user space, fs, and block drivers.
>
> That is WRITE SAME without UNMAP, it is not used by mkfs, and Linux has
> always honored max_write_same_blocks for it (defaulting to a 65535 block
> limit for older devices that did not report a limit).

>From QEMU view, blk_aio_write_zeroes() still need to handle
case without UNMAP, and the default 65535 is just linux's current
implementation, and even the recent patch tries to increase
the default setting. Also the default limit might be bigger on other OS.

>
> So what *concrete* case would be fixed by adding extra little-used code
> in QEMU to do the split?
>
> Paolo
>
>> Thanks,
>> Ming Lei
>>
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-10 15:47                   ` Ming Lei
@ 2014-12-10 16:44                     ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-10 16:44 UTC (permalink / raw)
  To: Ming Lei; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On 10/12/2014 16:47, Ming Lei wrote:
> > > Finally blkdev_issue_zeroout() can send WRITE SAME(10/16) directly
> > > and it can be from user space, fs, and block drivers.
> >
> > That is WRITE SAME without UNMAP, it is not used by mkfs, and Linux has
> > always honored max_write_same_blocks for it (defaulting to a 65535 block
> > limit for older devices that did not report a limit).
> 
> From QEMU view, blk_aio_write_zeroes() still need to handle
> case without UNMAP, and the default 65535 is just linux's current
> implementation, and even the recent patch tries to increase
> the default setting. Also the default limit might be bigger on other OS.

What is "another OS" that produces WRITE SAME with >2GB of data?
http://blogs.vmware.com/vsphere/2012/06/low-level-vaai-behaviour.html#more-3129
says ESX's default write same size is 1MB (2048 blocks).

Windows does not use WRITE SAME at all, according to Microsoft at
http://msdn.microsoft.com/en-us/library/windows/hardware/dn265487%28v=vs.85%29.aspx.

65535 is a sensible default for a host that doesn't know about
max_write_same_sectors.  Anything else would break physical drives as well.

Please produce a concrete case that is broken, with a released OS.

Paolo

> > So what *concrete* case would be fixed by adding extra little-used code
> > in QEMU to do the split?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-05 16:15 [Qemu-devel] [PATCH] block: fix big write Ming Lei
  2014-12-05 16:33 ` Paolo Bonzini
@ 2014-12-05 17:03 ` Max Reitz
  2014-12-05 17:04   ` Paolo Bonzini
  1 sibling, 1 reply; 14+ messages in thread
From: Max Reitz @ 2014-12-05 17:03 UTC (permalink / raw)
  To: Ming Lei, qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf
  Cc: Ming Lei, qemu-stable

On 2014-12-05 at 17:15, Ming Lei wrote:
> From: Ming Lei <ming.lei@caonical.com>
>
> QEMU block should have supported to read/write at most
> 0x7fffff * 512 bytes, unfortunately INT_MAX is used to check
> bytes in both bdrv_co_do_writev() and bdrv_check_byte_request(),
> so cause write failure if nr_sectors is equal or more
> than 0x400000.
>
> There are still other INT_MAX usages in block.c, and they might
> need to change to UINT_MAX too in future, but at least
> this patch's change can make SCSI WRITE SAME 16 workable.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Ming Lei <ming.lei@caonical.com>
> ---
>   block.c |    4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block.c b/block.c
> index a612594..ddc18c2 100644
> --- a/block.c
> +++ b/block.c
> @@ -2607,7 +2607,7 @@ static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
>   {
>       int64_t len;
>   
> -    if (size > INT_MAX) {
> +    if (size > UINT_MAX) {
>           return -EIO;
>       }
>   
> @@ -3420,7 +3420,7 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
>       int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
>       BdrvRequestFlags flags)
>   {
> -    if (nb_sectors < 0 || nb_sectors > (INT_MAX >> BDRV_SECTOR_BITS)) {
> +    if (nb_sectors < 0 || nb_sectors > (UINT_MAX >> BDRV_SECTOR_BITS)) {
>           return -EINVAL;
>       }
>   

This is intentional so a byte length can be stored in an integer. This 
is a pretty bad design decision, but we have to live with it until we 
really fix the block layer regarding the type lengths are stored in.

Max

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH] block: fix big write
  2014-12-05 17:03 ` Max Reitz
@ 2014-12-05 17:04   ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-05 17:04 UTC (permalink / raw)
  To: Max Reitz, Ming Lei, qemu-devel, Stefan Hajnoczi, Kevin Wolf
  Cc: Ming Lei, qemu-stable



On 05/12/2014 18:03, Max Reitz wrote:
> On 2014-12-05 at 17:15, Ming Lei wrote:
>> From: Ming Lei <ming.lei@caonical.com>
>>
>> QEMU block should have supported to read/write at most
>> 0x7fffff * 512 bytes, unfortunately INT_MAX is used to check
>> bytes in both bdrv_co_do_writev() and bdrv_check_byte_request(),
>> so cause write failure if nr_sectors is equal or more
>> than 0x400000.
>>
>> There are still other INT_MAX usages in block.c, and they might
>> need to change to UINT_MAX too in future, but at least
>> this patch's change can make SCSI WRITE SAME 16 workable.
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Ming Lei <ming.lei@caonical.com>
>> ---
>>   block.c |    4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index a612594..ddc18c2 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2607,7 +2607,7 @@ static int
>> bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
>>   {
>>       int64_t len;
>>   -    if (size > INT_MAX) {
>> +    if (size > UINT_MAX) {
>>           return -EIO;
>>       }
>>   @@ -3420,7 +3420,7 @@ static int coroutine_fn
>> bdrv_co_do_writev(BlockDriverState *bs,
>>       int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
>>       BdrvRequestFlags flags)
>>   {
>> -    if (nb_sectors < 0 || nb_sectors > (INT_MAX >> BDRV_SECTOR_BITS)) {
>> +    if (nb_sectors < 0 || nb_sectors > (UINT_MAX >> BDRV_SECTOR_BITS)) {
>>           return -EINVAL;
>>       }
>>   
> 
> This is intentional so a byte length can be stored in an integer. This
> is a pretty bad design decision, but we have to live with it until we
> really fix the block layer regarding the type lengths are stored in.

No problem, let's fix SCSI (the correct way, which is not the patch
posted so far :)).

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-12-10 16:44 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-05 16:15 [Qemu-devel] [PATCH] block: fix big write Ming Lei
2014-12-05 16:33 ` Paolo Bonzini
2014-12-08  7:19   ` Ming Lei
2014-12-09 17:45     ` Paolo Bonzini
2014-12-10  1:41       ` Ming Lei
2014-12-10  9:56         ` Paolo Bonzini
2014-12-10 12:23           ` Ming Lei
2014-12-10 12:55             ` Paolo Bonzini
2014-12-10 14:35               ` Ming Lei
2014-12-10 15:02                 ` Paolo Bonzini
2014-12-10 15:47                   ` Ming Lei
2014-12-10 16:44                     ` Paolo Bonzini
2014-12-05 17:03 ` Max Reitz
2014-12-05 17:04   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).