From mboxrd@z Thu Jan  1 00:00:00 1970
From: Max Reitz
Date: Fri, 27 Feb 2015 14:54:39 -0500
Message-Id: <1425066879-27326-1-git-send-email-mreitz@redhat.com>
Subject: [Qemu-devel] [PATCH v3] block/vdi: Add locking for parallel requests
To: qemu-block@nongnu.org
Cc: Kevin Wolf, Stefan Weil, qemu-devel@nongnu.org, Max Reitz,
    Stefan Hajnoczi, Paolo Bonzini

When allocating a new cluster, the first write to that cluster must be
the one doing the allocation, because that write pads its request to the
full cluster size; if another write to the same cluster is submitted
before it, the other write's data will be overwritten by the padding.

See https://bugs.launchpad.net/qemu/+bug/1422307 for what can go wrong
without this patch.  (A standalone sketch of the failure mode follows
after the patch.)

Cc: qemu-stable
Signed-off-by: Max Reitz
---
v3: Hopefully finally found the real issue which causes the problems
    described in the bug report; at least it sounds very reasonable and
    I can no longer reproduce any of the issues described there.
    Thank you, Paolo and Stefan!
---
 block/vdi.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/block/vdi.c b/block/vdi.c
index 74030c6..53bd02f 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -53,6 +53,7 @@
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include "migration/migration.h"
+#include "block/coroutine.h"
 
 #if defined(CONFIG_UUID)
 #include <uuid/uuid.h>
@@ -196,6 +197,8 @@ typedef struct {
     /* VDI header (converted to host endianness). */
     VdiHeader header;
 
+    CoMutex write_lock;
+
     Error *migration_blocker;
 } BDRVVdiState;
 
@@ -504,6 +507,8 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
               "vdi", bdrv_get_device_name(bs), "live migration");
     migrate_add_blocker(s->migration_blocker);
 
+    qemu_co_mutex_init(&s->write_lock);
+
     return 0;
 
  fail_free_bmap:
@@ -639,11 +644,31 @@ static int vdi_co_write(BlockDriverState *bs,
                    buf, n_sectors * SECTOR_SIZE);
             memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
                    (s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
+
+            /* Note that this coroutine does not yield anywhere from reading the
+             * bmap entry until here, so in regards to all the coroutines trying
+             * to write to this cluster, the one doing the allocation will
+             * always be the first to try to acquire the lock.
+             * Therefore, it is also the first that will actually be able to
+             * acquire the lock and thus the padded cluster is written before
+             * the other coroutines can write to the affected area. */
+            qemu_co_mutex_lock(&s->write_lock);
             ret = bdrv_write(bs->file, offset, block, s->block_sectors);
+            qemu_co_mutex_unlock(&s->write_lock);
         } else {
             uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                               (uint64_t)bmap_entry * s->block_sectors +
                               sector_in_block;
+            qemu_co_mutex_lock(&s->write_lock);
+            /* This lock is only used to make sure the following write operation
+             * is executed after the write issued by the coroutine allocating
+             * this cluster, therefore we do not need to keep it locked.
+             * As stated above, the allocating coroutine will always try to lock
+             * the mutex before all the other concurrent accesses to that
+             * cluster, therefore at this point we can be absolutely certain
+             * that that write operation has returned (there may be other writes
+             * in flight, but they do not concern this very operation). */
+            qemu_co_mutex_unlock(&s->write_lock);
             ret = bdrv_write(bs->file, offset, buf, n_sectors);
         }
 
-- 
2.1.0
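
For anyone following the bug report, here is a minimal standalone C sketch of
the failure mode described in the commit message.  This is not QEMU code; the
toy sector/cluster geometry and all identifiers (allocating_write(),
plain_write(), Cluster, ...) are invented purely for illustration.  It shows
why the padded, allocating write must reach the image file before any other
write to the same cluster:

    #include <stdio.h>
    #include <string.h>

    #define SECTOR_SIZE    8    /* toy sector size in bytes */
    #define BLOCK_SECTORS  4    /* sectors per cluster */

    typedef struct {
        char data[BLOCK_SECTORS * SECTOR_SIZE];
    } Cluster;

    /* Allocating write: one sector of payload in sector 0, padded with
     * zeroes to the full cluster size and issued as a whole-cluster write. */
    static void allocating_write(Cluster *c)
    {
        char block[BLOCK_SECTORS * SECTOR_SIZE];

        memset(block, 0, sizeof(block));        /* the padding */
        memset(block, 'A', SECTOR_SIZE);        /* the payload (sector 0) */
        memcpy(c->data, block, sizeof(block));  /* whole-cluster write */
    }

    /* Plain write: one sector of payload into sector 2 of the same cluster. */
    static void plain_write(Cluster *c)
    {
        memset(c->data + 2 * SECTOR_SIZE, 'B', SECTOR_SIZE);
    }

    /* Print the first byte of each sector so the result is easy to read. */
    static void dump(const char *label, const Cluster *c)
    {
        printf("%-24s", label);
        for (int i = 0; i < BLOCK_SECTORS; i++) {
            char ch = c->data[i * SECTOR_SIZE];
            printf(" [%c]", ch ? ch : '.');
        }
        printf("\n");
    }

    int main(void)
    {
        Cluster bad  = {{0}};
        Cluster good = {{0}};

        /* Broken ordering: the plain write lands first and is then wiped
         * out by the allocating write's zero padding. */
        plain_write(&bad);
        allocating_write(&bad);
        dump("B before A (data lost):", &bad);

        /* Ordering enforced by the patch: the padded allocating write hits
         * the file first, so the plain write's data survives. */
        allocating_write(&good);
        plain_write(&good);
        dump("A before B (correct):", &good);

        return 0;
    }

In the patch itself this ordering is enforced with s->write_lock rather than
by sequencing calls: the allocating coroutine does not yield between reading
the bmap entry and taking the lock, so it is always the first to acquire it,
while a competing writer merely locks and immediately unlocks the mutex to
wait until the allocation write has returned.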