qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
@ 2015-02-27 14:05 Max Reitz
  2015-02-27 16:57 ` Stefan Hajnoczi
  2015-02-27 17:42 ` Paolo Bonzini
  0 siblings, 2 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 14:05 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, qemu-stable, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini

Concurrently modifying the bmap does not seem to be a good idea; this patch adds
a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
can go wrong without.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
v2:
- Make the mutex cover vdi_co_write() completely [Kevin]
- Add a TODO comment [Kevin]
---
 block/vdi.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/vdi.c b/block/vdi.c
index 74030c6..f5f42ef 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -51,6 +51,7 @@
 
 #include "qemu-common.h"
 #include "block/block_int.h"
+#include "block/coroutine.h"
 #include "qemu/module.h"
 #include "migration/migration.h"
 
@@ -196,6 +197,8 @@ typedef struct {
     /* VDI header (converted to host endianness). */
     VdiHeader header;
 
+    CoMutex bmap_lock;
+
     Error *migration_blocker;
 } BDRVVdiState;
 
@@ -498,6 +501,8 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail_free_bmap;
     }
 
+    qemu_co_mutex_init(&s->bmap_lock);
+
     /* Disable migration when vdi images are used */
     error_set(&s->migration_blocker,
               QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
@@ -607,6 +612,9 @@ static int vdi_co_write(BlockDriverState *bs,
 
     logout("\n");
 
+    /* TODO: Figure out why this is necessary */
+    qemu_co_mutex_lock(&s->bmap_lock);
+
     while (ret >= 0 && nb_sectors > 0) {
         block_index = sector_num / s->block_sectors;
         sector_in_block = sector_num % s->block_sectors;
@@ -656,6 +664,7 @@ static int vdi_co_write(BlockDriverState *bs,
 
     logout("finished data write\n");
     if (ret < 0) {
+        qemu_co_mutex_unlock(&s->bmap_lock);
         return ret;
     }
 
@@ -674,6 +683,7 @@ static int vdi_co_write(BlockDriverState *bs,
         block = NULL;
 
         if (ret < 0) {
+            qemu_co_mutex_unlock(&s->bmap_lock);
             return ret;
         }
 
@@ -690,6 +700,7 @@ static int vdi_co_write(BlockDriverState *bs,
         ret = bdrv_write(bs->file, offset, base, n_sectors);
     }
 
+    qemu_co_mutex_unlock(&s->bmap_lock);
     return ret;
 }
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 14:05 [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests Max Reitz
@ 2015-02-27 16:57 ` Stefan Hajnoczi
  2015-02-27 16:57   ` Max Reitz
  2015-02-27 17:25   ` Stefan Weil
  2015-02-27 17:42 ` Paolo Bonzini
  1 sibling, 2 replies; 20+ messages in thread
From: Stefan Hajnoczi @ 2015-02-27 16:57 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 2031 bytes --]

On Fri, Feb 27, 2015 at 09:05:47AM -0500, Max Reitz wrote:
> Concurrently modifying the bmap does not seem to be a good idea; this patch adds
> a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
> can go wrong without.
> 
> Cc: qemu-stable <qemu-stable@nongnu.org>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
> v2:
> - Make the mutex cover vdi_co_write() completely [Kevin]
> - Add a TODO comment [Kevin]
> ---
>  block/vdi.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/block/vdi.c b/block/vdi.c
> index 74030c6..f5f42ef 100644
> --- a/block/vdi.c
> +++ b/block/vdi.c
> @@ -51,6 +51,7 @@
>  
>  #include "qemu-common.h"
>  #include "block/block_int.h"
> +#include "block/coroutine.h"
>  #include "qemu/module.h"
>  #include "migration/migration.h"
>  
> @@ -196,6 +197,8 @@ typedef struct {
>      /* VDI header (converted to host endianness). */
>      VdiHeader header;
>  
> +    CoMutex bmap_lock;
> +
>      Error *migration_blocker;
>  } BDRVVdiState;
>  
> @@ -498,6 +501,8 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
>          goto fail_free_bmap;
>      }
>  
> +    qemu_co_mutex_init(&s->bmap_lock);
> +
>      /* Disable migration when vdi images are used */
>      error_set(&s->migration_blocker,
>                QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
> @@ -607,6 +612,9 @@ static int vdi_co_write(BlockDriverState *bs,
>  
>      logout("\n");
>  
> +    /* TODO: Figure out why this is necessary */
> +    qemu_co_mutex_lock(&s->bmap_lock);

If we don't know why bmap_lock works, it would be more approprate to
take the same approach as VMDK and VHDX where there is a simply s->lock
that protects all reads and writes.  That way we know for sure there is
no parallel I/O going on.

(Since the problem is not understood, maybe reads in parallel with
writes could also cause problems.  Better to really do a coarse lock
instead of just bmap_lock in write.)

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 16:57 ` Stefan Hajnoczi
@ 2015-02-27 16:57   ` Max Reitz
  2015-02-27 17:25   ` Stefan Weil
  1 sibling, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:57 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

On 2015-02-27 at 11:57, Stefan Hajnoczi wrote:
> On Fri, Feb 27, 2015 at 09:05:47AM -0500, Max Reitz wrote:
>> Concurrently modifying the bmap does not seem to be a good idea; this patch adds
>> a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
>> can go wrong without.
>>
>> Cc: qemu-stable <qemu-stable@nongnu.org>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>> v2:
>> - Make the mutex cover vdi_co_write() completely [Kevin]
>> - Add a TODO comment [Kevin]
>> ---
>>   block/vdi.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/block/vdi.c b/block/vdi.c
>> index 74030c6..f5f42ef 100644
>> --- a/block/vdi.c
>> +++ b/block/vdi.c
>> @@ -51,6 +51,7 @@
>>   
>>   #include "qemu-common.h"
>>   #include "block/block_int.h"
>> +#include "block/coroutine.h"
>>   #include "qemu/module.h"
>>   #include "migration/migration.h"
>>   
>> @@ -196,6 +197,8 @@ typedef struct {
>>       /* VDI header (converted to host endianness). */
>>       VdiHeader header;
>>   
>> +    CoMutex bmap_lock;
>> +
>>       Error *migration_blocker;
>>   } BDRVVdiState;
>>   
>> @@ -498,6 +501,8 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
>>           goto fail_free_bmap;
>>       }
>>   
>> +    qemu_co_mutex_init(&s->bmap_lock);
>> +
>>       /* Disable migration when vdi images are used */
>>       error_set(&s->migration_blocker,
>>                 QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
>> @@ -607,6 +612,9 @@ static int vdi_co_write(BlockDriverState *bs,
>>   
>>       logout("\n");
>>   
>> +    /* TODO: Figure out why this is necessary */
>> +    qemu_co_mutex_lock(&s->bmap_lock);
> If we don't know why bmap_lock works, it would be more approprate to
> take the same approach as VMDK and VHDX where there is a simply s->lock
> that protects all reads and writes.  That way we know for sure there is
> no parallel I/O going on.
>
> (Since the problem is not understood, maybe reads in parallel with
> writes could also cause problems.  Better to really do a coarse lock
> instead of just bmap_lock in write.)

OK, will do.

Max

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 16:57 ` Stefan Hajnoczi
  2015-02-27 16:57   ` Max Reitz
@ 2015-02-27 17:25   ` Stefan Weil
  2015-02-27 17:28     ` Max Reitz
  2015-02-27 17:35     ` Paolo Bonzini
  1 sibling, 2 replies; 20+ messages in thread
From: Stefan Weil @ 2015-02-27 17:25 UTC (permalink / raw)
  To: Stefan Hajnoczi, Max Reitz
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

Am 27.02.2015 um 17:57 schrieb Stefan Hajnoczi:
> On Fri, Feb 27, 2015 at 09:05:47AM -0500, Max Reitz wrote:
>> Concurrently modifying the bmap does not seem to be a good idea; this patch adds
>> a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
>> can go wrong without.
>>
>> Cc: qemu-stable <qemu-stable@nongnu.org>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>> v2:
>> - Make the mutex cover vdi_co_write() completely [Kevin]
>> - Add a TODO comment [Kevin]
[...]
>> If we don't know why bmap_lock works, it would be more approprate to
>> take the same approach as VMDK and VHDX where there is a simply s->lock
>> that protects all reads and writes.  That way we know for sure there is
>> no parallel I/O going on.
>>
>> (Since the problem is not understood, maybe reads in parallel with
>> writes could also cause problems.  Better to really do a coarse lock
>> instead of just bmap_lock in write.)
>>
>> Stefan

block/vdi.c was never written for multi-threaded access, see my comment
in the header of block/vdi.c:

 * The code is not thread safe (missing locks for changes in header and
 * block table, no problem with current QEMU).

This was true in the past, but obviously later multi-threaded access was
introduced for QEMU. Locking was added for qcow2 and other drivers in
2012 and 2013, but never for vdi.

I must admit that I don't know which parts of the block filesystem
drivers potentially run in parallel threads. Ideally there would be one
or more test cases which test multi-threaded operations and which
trigger a failure with the current vdi code.

If I had a simple test scenario, I could have a look on the problem.

The VMDK approach is fine as an intermediate work around, but please use
conditional compilation to allow easy tests without coarse locks (and
update the comments :-)).

Regards
Stefan (Weil)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 17:25   ` Stefan Weil
@ 2015-02-27 17:28     ` Max Reitz
  2015-02-27 17:34       ` Stefan Weil
  2015-02-27 18:07       ` Stefan Weil
  2015-02-27 17:35     ` Paolo Bonzini
  1 sibling, 2 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 17:28 UTC (permalink / raw)
  To: Stefan Weil, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 2120 bytes --]

On 2015-02-27 at 12:25, Stefan Weil wrote:
> Am 27.02.2015 um 17:57 schrieb Stefan Hajnoczi:
>> On Fri, Feb 27, 2015 at 09:05:47AM -0500, Max Reitz wrote:
>>> Concurrently modifying the bmap does not seem to be a good idea; this patch adds
>>> a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
>>> can go wrong without.
>>>
>>> Cc: qemu-stable <qemu-stable@nongnu.org>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>> v2:
>>> - Make the mutex cover vdi_co_write() completely [Kevin]
>>> - Add a TODO comment [Kevin]
> [...]
>>> If we don't know why bmap_lock works, it would be more approprate to
>>> take the same approach as VMDK and VHDX where there is a simply s->lock
>>> that protects all reads and writes.  That way we know for sure there is
>>> no parallel I/O going on.
>>>
>>> (Since the problem is not understood, maybe reads in parallel with
>>> writes could also cause problems.  Better to really do a coarse lock
>>> instead of just bmap_lock in write.)
>>>
>>> Stefan
> block/vdi.c was never written for multi-threaded access, see my comment
> in the header of block/vdi.c:
>
>   * The code is not thread safe (missing locks for changes in header and
>   * block table, no problem with current QEMU).
>
> This was true in the past, but obviously later multi-threaded access was
> introduced for QEMU. Locking was added for qcow2 and other drivers in
> 2012 and 2013, but never for vdi.
>
> I must admit that I don't know which parts of the block filesystem
> drivers potentially run in parallel threads.Ideally there would be one
> or more test cases which test multi-threaded operations and which
> trigger a failure with the current vdi code.
>
> If I had a simple test scenario, I could have a look on the problem.

I have one for you. See the attached ruby script.

(If there are no "Pattern verification failed" messages, everything is good)

> The VMDK approach is fine as an intermediate work around, but please use
> conditional compilation to allow easy tests without coarse locks (and
> update the comments :-)).

Will a macro defined in vdi.c be enough?

Max

[-- Attachment #2: test.rb --]
[-- Type: application/x-ruby, Size: 677 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 17:28     ` Max Reitz
@ 2015-02-27 17:34       ` Stefan Weil
  2015-02-27 18:07       ` Stefan Weil
  1 sibling, 0 replies; 20+ messages in thread
From: Stefan Weil @ 2015-02-27 17:34 UTC (permalink / raw)
  To: Max Reitz, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

Am 27.02.2015 um 18:28 schrieb Max Reitz:
> On 2015-02-27 at 12:25, Stefan Weil wrote:
>> block/vdi.c was never written for multi-threaded access, see my comment
>> in the header of block/vdi.c:
>>
>>   * The code is not thread safe (missing locks for changes in header and
>>   * block table, no problem with current QEMU).
>>
>> This was true in the past, but obviously later multi-threaded access was
>> introduced for QEMU. Locking was added for qcow2 and other drivers in
>> 2012 and 2013, but never for vdi.
>>
>> I must admit that I don't know which parts of the block filesystem
>> drivers potentially run in parallel threads.Ideally there would be one
>> or more test cases which test multi-threaded operations and which
>> trigger a failure with the current vdi code.
>>
>> If I had a simple test scenario, I could have a look on the problem.
>
> I have one for you. See the attached ruby script.
>
> (If there are no "Pattern verification failed" messages, everything is
> good)
>
>> The VMDK approach is fine as an intermediate work around, but please use
>> conditional compilation to allow easy tests without coarse locks (and
>> update the comments :-)).
>
> Will a macro defined in vdi.c be enough?

Yes, that would be fine. vdi.c already has several locally defined
CONFIG_VDI_... macros.

Stefan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 17:25   ` Stefan Weil
  2015-02-27 17:28     ` Max Reitz
@ 2015-02-27 17:35     ` Paolo Bonzini
  1 sibling, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2015-02-27 17:35 UTC (permalink / raw)
  To: Stefan Weil, Stefan Hajnoczi, Max Reitz
  Cc: Kevin Wolf, qemu-devel, qemu-block, qemu-stable



On 27/02/2015 18:25, Stefan Weil wrote:
> block/vdi.c was never written for multi-threaded access, see my comment
> in the header of block/vdi.c:

It is not using threads, only coroutines.  Preemption points of
coroutines are well defined, and I think that the bug could be present
even in the initial AIO-based version.

>  * The code is not thread safe (missing locks for changes in header and
>  * block table, no problem with current QEMU).
> 
> This was true in the past, but obviously later multi-threaded access was
> introduced for QEMU. Locking was added for qcow2 and other drivers in
> 2012 and 2013, but never for vdi.

qcow2 already had locking (based on AsyncContexts) before the conversion
to coroutines.  Other drivers implicitly had locking because they were
synchronous; locking was added because the conversion to coroutines made
them asynchronous.

vdi never got its locking because it was already asynchronous.

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 14:05 [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests Max Reitz
  2015-02-27 16:57 ` Stefan Hajnoczi
@ 2015-02-27 17:42 ` Paolo Bonzini
  2015-02-27 18:09   ` Max Reitz
  2015-02-27 21:44   ` Stefan Weil
  1 sibling, 2 replies; 20+ messages in thread
From: Paolo Bonzini @ 2015-02-27 17:42 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel



On 27/02/2015 15:05, Max Reitz wrote:
> Concurrently modifying the bmap does not seem to be a good idea; this patch adds
> a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
> can go wrong without.
> 
> Cc: qemu-stable <qemu-stable@nongnu.org>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
> v2:
> - Make the mutex cover vdi_co_write() completely [Kevin]
> - Add a TODO comment [Kevin]

I think I know what the bug is.  Suppose you have two concurrent writes
to a non-allocated block, one at 16K...32K (in bytes) and one at
32K...48K.  The first write is enlarged to contain zeros, the second is
not.  Then you have two writes in flight:

      0       zeros
      ...     zeros
      16K     data1
      ...     data1
      32K     zeros      data2
      ...     zeros      data2
      48K     zeros
      ...     zeros
      64K

And the contents of 32K...48K are undefined.  If the above diagnosis is
correct, I'm not even sure why Max's v1 patch worked...

An optimized fix could be to use a CoRwLock, then:

- take it shared (read) around the write in the
"VDI_IS_ALLOCATED(bmap_entry)" path

- take it exclusive (write) around the write in the
"!VDI_IS_ALLOCATED(bmap_entry)" path

Paolo

> ---
>  block/vdi.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/block/vdi.c b/block/vdi.c
> index 74030c6..f5f42ef 100644
> --- a/block/vdi.c
> +++ b/block/vdi.c
> @@ -51,6 +51,7 @@
>  
>  #include "qemu-common.h"
>  #include "block/block_int.h"
> +#include "block/coroutine.h"
>  #include "qemu/module.h"
>  #include "migration/migration.h"
>  
> @@ -196,6 +197,8 @@ typedef struct {
>      /* VDI header (converted to host endianness). */
>      VdiHeader header;
>  
> +    CoMutex bmap_lock;
> +
>      Error *migration_blocker;
>  } BDRVVdiState;
>  
> @@ -498,6 +501,8 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
>          goto fail_free_bmap;
>      }
>  
> +    qemu_co_mutex_init(&s->bmap_lock);
> +
>      /* Disable migration when vdi images are used */
>      error_set(&s->migration_blocker,
>                QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
> @@ -607,6 +612,9 @@ static int vdi_co_write(BlockDriverState *bs,
>  
>      logout("\n");
>  
> +    /* TODO: Figure out why this is necessary */
> +    qemu_co_mutex_lock(&s->bmap_lock);
> +
>      while (ret >= 0 && nb_sectors > 0) {
>          block_index = sector_num / s->block_sectors;
>          sector_in_block = sector_num % s->block_sectors;
> @@ -656,6 +664,7 @@ static int vdi_co_write(BlockDriverState *bs,
>  
>      logout("finished data write\n");
>      if (ret < 0) {
> +        qemu_co_mutex_unlock(&s->bmap_lock);
>          return ret;
>      }
>  
> @@ -674,6 +683,7 @@ static int vdi_co_write(BlockDriverState *bs,
>          block = NULL;
>  
>          if (ret < 0) {
> +            qemu_co_mutex_unlock(&s->bmap_lock);
>              return ret;
>          }
>  
> @@ -690,6 +700,7 @@ static int vdi_co_write(BlockDriverState *bs,
>          ret = bdrv_write(bs->file, offset, base, n_sectors);
>      }
>  
> +    qemu_co_mutex_unlock(&s->bmap_lock);
>      return ret;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 17:28     ` Max Reitz
  2015-02-27 17:34       ` Stefan Weil
@ 2015-02-27 18:07       ` Stefan Weil
  2015-02-27 18:09         ` Max Reitz
  1 sibling, 1 reply; 20+ messages in thread
From: Stefan Weil @ 2015-02-27 18:07 UTC (permalink / raw)
  To: Max Reitz, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

Am 27.02.2015 um 18:28 schrieb Max Reitz:
> On 2015-02-27 at 12:25, Stefan Weil wrote:
>> If I had a simple test scenario, I could have a look on the problem.
>
> I have one for you. See the attached ruby script.
>
> (If there are no "Pattern verification failed" messages, everything is
> good)

Your test runs without any failure message in my four test scenarios:

* test on x86_64 virtualized host, Debian Wheezy, with and without debug
enabled

* test on x86_64 Intel Core i7, Debian Jessie, with and without debug
enabled

I repeated the test several times. Do I have to run it in an endless loop?

Stefan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 17:42 ` Paolo Bonzini
@ 2015-02-27 18:09   ` Max Reitz
  2015-02-27 18:27     ` Max Reitz
  2015-02-27 21:44   ` Stefan Weil
  1 sibling, 1 reply; 20+ messages in thread
From: Max Reitz @ 2015-02-27 18:09 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-block
  Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On 2015-02-27 at 12:42, Paolo Bonzini wrote:
>
> On 27/02/2015 15:05, Max Reitz wrote:
>> Concurrently modifying the bmap does not seem to be a good idea; this patch adds
>> a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
>> can go wrong without.
>>
>> Cc: qemu-stable <qemu-stable@nongnu.org>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>> v2:
>> - Make the mutex cover vdi_co_write() completely [Kevin]
>> - Add a TODO comment [Kevin]
> I think I know what the bug is.  Suppose you have two concurrent writes
> to a non-allocated block, one at 16K...32K (in bytes) and one at
> 32K...48K.  The first write is enlarged to contain zeros, the second is
> not.  Then you have two writes in flight:
>
>        0       zeros
>        ...     zeros
>        16K     data1
>        ...     data1
>        32K     zeros      data2
>        ...     zeros      data2
>        48K     zeros
>        ...     zeros
>        64K
>
> And the contents of 32K...48K are undefined.  If the above diagnosis is
> correct, I'm not even sure why Max's v1 patch worked...

Maybe that's an issue, too; but the test case I sent out does 1 MB 
requests (and it fails), so this shouldn't matter there.

> An optimized fix could be to use a CoRwLock, then:

Yes, I'm actually already working on that.

Max

> - take it shared (read) around the write in the
> "VDI_IS_ALLOCATED(bmap_entry)" path
>
> - take it exclusive (write) around the write in the
> "!VDI_IS_ALLOCATED(bmap_entry)" path
>
> Paolo
>
>> ---
>>   block/vdi.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/block/vdi.c b/block/vdi.c
>> index 74030c6..f5f42ef 100644
>> --- a/block/vdi.c
>> +++ b/block/vdi.c
>> @@ -51,6 +51,7 @@
>>   
>>   #include "qemu-common.h"
>>   #include "block/block_int.h"
>> +#include "block/coroutine.h"
>>   #include "qemu/module.h"
>>   #include "migration/migration.h"
>>   
>> @@ -196,6 +197,8 @@ typedef struct {
>>       /* VDI header (converted to host endianness). */
>>       VdiHeader header;
>>   
>> +    CoMutex bmap_lock;
>> +
>>       Error *migration_blocker;
>>   } BDRVVdiState;
>>   
>> @@ -498,6 +501,8 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
>>           goto fail_free_bmap;
>>       }
>>   
>> +    qemu_co_mutex_init(&s->bmap_lock);
>> +
>>       /* Disable migration when vdi images are used */
>>       error_set(&s->migration_blocker,
>>                 QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
>> @@ -607,6 +612,9 @@ static int vdi_co_write(BlockDriverState *bs,
>>   
>>       logout("\n");
>>   
>> +    /* TODO: Figure out why this is necessary */
>> +    qemu_co_mutex_lock(&s->bmap_lock);
>> +
>>       while (ret >= 0 && nb_sectors > 0) {
>>           block_index = sector_num / s->block_sectors;
>>           sector_in_block = sector_num % s->block_sectors;
>> @@ -656,6 +664,7 @@ static int vdi_co_write(BlockDriverState *bs,
>>   
>>       logout("finished data write\n");
>>       if (ret < 0) {
>> +        qemu_co_mutex_unlock(&s->bmap_lock);
>>           return ret;
>>       }
>>   
>> @@ -674,6 +683,7 @@ static int vdi_co_write(BlockDriverState *bs,
>>           block = NULL;
>>   
>>           if (ret < 0) {
>> +            qemu_co_mutex_unlock(&s->bmap_lock);
>>               return ret;
>>           }
>>   
>> @@ -690,6 +700,7 @@ static int vdi_co_write(BlockDriverState *bs,
>>           ret = bdrv_write(bs->file, offset, base, n_sectors);
>>       }
>>   
>> +    qemu_co_mutex_unlock(&s->bmap_lock);
>>       return ret;
>>   }
>>   
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 18:07       ` Stefan Weil
@ 2015-02-27 18:09         ` Max Reitz
  2015-02-27 18:12           ` Stefan Weil
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2015-02-27 18:09 UTC (permalink / raw)
  To: Stefan Weil, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

On 2015-02-27 at 13:07, Stefan Weil wrote:
> Am 27.02.2015 um 18:28 schrieb Max Reitz:
>> On 2015-02-27 at 12:25, Stefan Weil wrote:
>>> If I had a simple test scenario, I could have a look on the problem.
>> I have one for you. See the attached ruby script.
>>
>> (If there are no "Pattern verification failed" messages, everything is
>> good)
> Your test runs without any failure message in my four test scenarios:
>
> * test on x86_64 virtualized host, Debian Wheezy, with and without debug
> enabled
>
> * test on x86_64 Intel Core i7, Debian Jessie, with and without debug
> enabled
>
> I repeated the test several times. Do I have to run it in an endless loop?

It always fails for me. Do you have an SSD?

Max

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 18:09         ` Max Reitz
@ 2015-02-27 18:12           ` Stefan Weil
  2015-02-27 18:15             ` Max Reitz
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Weil @ 2015-02-27 18:12 UTC (permalink / raw)
  To: Max Reitz, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

Am 27.02.2015 um 19:09 schrieb Max Reitz:
> It always fails for me. Do you have an SSD?

On the real machine: yes. On the virtual machine: maybe.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 18:12           ` Stefan Weil
@ 2015-02-27 18:15             ` Max Reitz
  2015-02-27 18:55               ` Max Reitz
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2015-02-27 18:15 UTC (permalink / raw)
  To: Stefan Weil, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

On 2015-02-27 at 13:12, Stefan Weil wrote:
> Am 27.02.2015 um 19:09 schrieb Max Reitz:
>> It always fails for me. Do you have an SSD?
> On the real machine: yes. On the virtual machine: maybe.

I don't, so maybe that's the issue. But running it in tmpfs doesn't 
change anything for me, still fails (every time).

Max

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 18:09   ` Max Reitz
@ 2015-02-27 18:27     ` Max Reitz
  0 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 18:27 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-block
  Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On 2015-02-27 at 13:09, Max Reitz wrote:
> On 2015-02-27 at 12:42, Paolo Bonzini wrote:
>>
>> On 27/02/2015 15:05, Max Reitz wrote:
>>> Concurrently modifying the bmap does not seem to be a good idea; 
>>> this patch adds
>>> a lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for 
>>> what
>>> can go wrong without.
>>>
>>> Cc: qemu-stable <qemu-stable@nongnu.org>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>> v2:
>>> - Make the mutex cover vdi_co_write() completely [Kevin]
>>> - Add a TODO comment [Kevin]
>> I think I know what the bug is.  Suppose you have two concurrent writes
>> to a non-allocated block, one at 16K...32K (in bytes) and one at
>> 32K...48K.  The first write is enlarged to contain zeros, the second is
>> not.  Then you have two writes in flight:
>>
>>        0       zeros
>>        ...     zeros
>>        16K     data1
>>        ...     data1
>>        32K     zeros      data2
>>        ...     zeros      data2
>>        48K     zeros
>>        ...     zeros
>>        64K
>>
>> And the contents of 32K...48K are undefined.  If the above diagnosis is
>> correct, I'm not even sure why Max's v1 patch worked...
>
> Maybe that's an issue, too; but the test case I sent out does 1 MB 
> requests (and it fails), so this shouldn't matter there.

Considering that test case didn't work for Stefan (Weil), and it fails 
in a pretty strange way for me (no output from the qemu-io command at 
all, and while most reads from raw were successful, all reads from vdi 
failed (the pattern verification failed), maybe that's something 
completely different.

Indeed, when I do sub-MB writes, I get sporadic errors which seem much 
more related to the original bug report, so it's probably the issue you 
found that's the real problem.

Also, my test case suddenly stopped reproducing the issue on my HDD and 
only does it on tmpfs. Weird.

Max

>> An optimized fix could be to use a CoRwLock, then:
>
> Yes, I'm actually already working on that.
>
> Max
>
>> - take it shared (read) around the write in the
>> "VDI_IS_ALLOCATED(bmap_entry)" path
>>
>> - take it exclusive (write) around the write in the
>> "!VDI_IS_ALLOCATED(bmap_entry)" path
>>
>> Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 18:15             ` Max Reitz
@ 2015-02-27 18:55               ` Max Reitz
  2015-02-27 20:21                 ` Stefan Weil
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2015-02-27 18:55 UTC (permalink / raw)
  To: Stefan Weil, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 545 bytes --]

On 2015-02-27 at 13:15, Max Reitz wrote:
> On 2015-02-27 at 13:12, Stefan Weil wrote:
>> Am 27.02.2015 um 19:09 schrieb Max Reitz:
>>> It always fails for me. Do you have an SSD?
>> On the real machine: yes. On the virtual machine: maybe.
>
> I don't, so maybe that's the issue. But running it in tmpfs doesn't 
> change anything for me, still fails (every time).

And now it doesn't fail any more on my HDD (but still in tmpfs). Great.

How about this test case (which is in line with what Paolo identified is 
the potential problem)? :-)

Max

[-- Attachment #2: test.rb --]
[-- Type: application/x-ruby, Size: 788 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 18:55               ` Max Reitz
@ 2015-02-27 20:21                 ` Stefan Weil
  2015-02-27 20:23                   ` Max Reitz
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Weil @ 2015-02-27 20:21 UTC (permalink / raw)
  To: Max Reitz, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

Am 27.02.2015 um 19:55 schrieb Max Reitz:
> On 2015-02-27 at 13:15, Max Reitz wrote:
>> On 2015-02-27 at 13:12, Stefan Weil wrote:
>>> Am 27.02.2015 um 19:09 schrieb Max Reitz:
>>>> It always fails for me. Do you have an SSD?
>>> On the real machine: yes. On the virtual machine: maybe.
>>
>> I don't, so maybe that's the issue. But running it in tmpfs doesn't 
>> change anything for me, still fails (every time).
>
> And now it doesn't fail any more on my HDD (but still in tmpfs). Great.
>
> How about this test case (which is in line with what Paolo identified 
> is the potential problem)? :-)
>
> Max


No failure with the new one, too.

Stefan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 20:21                 ` Stefan Weil
@ 2015-02-27 20:23                   ` Max Reitz
  2015-02-27 20:37                     ` Stefan Weil
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2015-02-27 20:23 UTC (permalink / raw)
  To: Stefan Weil, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

On 2015-02-27 at 15:21, Stefan Weil wrote:
> Am 27.02.2015 um 19:55 schrieb Max Reitz:
>> On 2015-02-27 at 13:15, Max Reitz wrote:
>>> On 2015-02-27 at 13:12, Stefan Weil wrote:
>>>> Am 27.02.2015 um 19:09 schrieb Max Reitz:
>>>>> It always fails for me. Do you have an SSD?
>>>> On the real machine: yes. On the virtual machine: maybe.
>>>
>>> I don't, so maybe that's the issue. But running it in tmpfs doesn't 
>>> change anything for me, still fails (every time).
>>
>> And now it doesn't fail any more on my HDD (but still in tmpfs). Great.
>>
>> How about this test case (which is in line with what Paolo identified 
>> is the potential problem)? :-)
>>
>> Max
>
>
> No failure with the new one, too.

Hm, have you tried it multiple times?

Max

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 20:23                   ` Max Reitz
@ 2015-02-27 20:37                     ` Stefan Weil
  0 siblings, 0 replies; 20+ messages in thread
From: Stefan Weil @ 2015-02-27 20:37 UTC (permalink / raw)
  To: Max Reitz, Stefan Hajnoczi
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block, qemu-stable

Am 27.02.2015 um 21:23 schrieb Max Reitz:
> On 2015-02-27 at 15:21, Stefan Weil wrote:
>> Am 27.02.2015 um 19:55 schrieb Max Reitz:
>>> On 2015-02-27 at 13:15, Max Reitz wrote:
>>>> On 2015-02-27 at 13:12, Stefan Weil wrote:
>>>>> Am 27.02.2015 um 19:09 schrieb Max Reitz:
>>>>>> It always fails for me. Do you have an SSD?
>>>>> On the real machine: yes. On the virtual machine: maybe.
>>>>
>>>> I don't, so maybe that's the issue. But running it in tmpfs doesn't 
>>>> change anything for me, still fails (every time).
>>>
>>> And now it doesn't fail any more on my HDD (but still in tmpfs). Great.
>>>
>>> How about this test case (which is in line with what Paolo 
>>> identified is the potential problem)? :-)
>>>
>>> Max
>>
>>
>> No failure with the new one, too.
>
> Hm, have you tried it multiple times?
>
> Max

I tried it only a few times. While running it in a loop I now get failures.

Stefan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 17:42 ` Paolo Bonzini
  2015-02-27 18:09   ` Max Reitz
@ 2015-02-27 21:44   ` Stefan Weil
  2015-02-27 21:46     ` Max Reitz
  1 sibling, 1 reply; 20+ messages in thread
From: Stefan Weil @ 2015-02-27 21:44 UTC (permalink / raw)
  To: Paolo Bonzini, Max Reitz, qemu-block
  Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

Am 27.02.2015 um 18:42 schrieb Paolo Bonzini:
>
> An optimized fix could be to use a CoRwLock, then:


Note related to our previous discussion: why does the code use CoRwlock 
instead of CoRwLock? Should it be renamed before more code uses it?

Stefan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests
  2015-02-27 21:44   ` Stefan Weil
@ 2015-02-27 21:46     ` Max Reitz
  0 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 21:46 UTC (permalink / raw)
  To: Stefan Weil, Paolo Bonzini, qemu-block
  Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi, qemu-devel

On 2015-02-27 at 16:44, Stefan Weil wrote:
> Am 27.02.2015 um 18:42 schrieb Paolo Bonzini:
>>
>> An optimized fix could be to use a CoRwLock, then:
>
>
> Note related to our previous discussion: why does the code use 
> CoRwlock instead of CoRwLock? Should it be renamed before more code 
> uses it?

Well, I'm simply not using it (in v3 at least...), so I hope I'll get 
away without touching that. :-)

Max

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-02-27 21:46 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-02-27 14:05 [Qemu-devel] [PATCH v2] block/vdi: Add locking for parallel requests Max Reitz
2015-02-27 16:57 ` Stefan Hajnoczi
2015-02-27 16:57   ` Max Reitz
2015-02-27 17:25   ` Stefan Weil
2015-02-27 17:28     ` Max Reitz
2015-02-27 17:34       ` Stefan Weil
2015-02-27 18:07       ` Stefan Weil
2015-02-27 18:09         ` Max Reitz
2015-02-27 18:12           ` Stefan Weil
2015-02-27 18:15             ` Max Reitz
2015-02-27 18:55               ` Max Reitz
2015-02-27 20:21                 ` Stefan Weil
2015-02-27 20:23                   ` Max Reitz
2015-02-27 20:37                     ` Stefan Weil
2015-02-27 17:35     ` Paolo Bonzini
2015-02-27 17:42 ` Paolo Bonzini
2015-02-27 18:09   ` Max Reitz
2015-02-27 18:27     ` Max Reitz
2015-02-27 21:44   ` Stefan Weil
2015-02-27 21:46     ` Max Reitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).