* [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration
@ 2014-10-02  8:52 Alexey Kardashevskiy
  2014-10-02  9:45 ` Paolo Bonzini
  2014-10-02 14:52 ` Stefan Hajnoczi
  0 siblings, 2 replies; 7+ messages in thread
From: Alexey Kardashevskiy @ 2014-10-02  8:52 UTC (permalink / raw)
To: qemu-devel
Cc: Alexey Kardashevskiy, Kevin Wolf, Stefan Hajnoczi, Paolo Bonzini

When migrated using libvirt with "--copy-storage-all", at the end of
migration there is a race between the NBD mirroring task trying to do a
flush and migration completion; both end up invalidating the cache. Since
the qcow2 driver does not handle this situation very well, random crashes
happen.

This disables the BDRV_O_INCOMING flag for the block device being migrated
and restores it when the NBD task is done.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

The commit log is not complete and most likely incorrect, as is
the patch :) Please help. Thanks!

The patch seems to fix the initial problem though.

btw is there any easy way to migrate one QEMU to another
using NBD (i.e. not using "migrate -b") and not using libvirt?
What would the command line be? Debugging with libvirt is a real
pain :(

---
 block.c     | 17 ++++-------------
 migration.c |  1 -
 nbd.c       | 11 +++++++++++
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/block.c b/block.c
index c5a251c..ed72e0a 100644
--- a/block.c
+++ b/block.c
@@ -5073,6 +5073,10 @@ void bdrv_invalidate_cache_all(Error **errp)
     QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
 
+        if (!(bs->open_flags & BDRV_O_INCOMING)) {
+            continue;
+        }
+
         aio_context_acquire(aio_context);
         bdrv_invalidate_cache(bs, &local_err);
         aio_context_release(aio_context);
@@ -5083,19 +5087,6 @@ void bdrv_invalidate_cache_all(Error **errp)
     }
 }
 
-void bdrv_clear_incoming_migration_all(void)
-{
-    BlockDriverState *bs;
-
-    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
-        AioContext *aio_context = bdrv_get_aio_context(bs);
-
-        aio_context_acquire(aio_context);
-        bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING);
-        aio_context_release(aio_context);
-    }
-}
-
 int bdrv_flush(BlockDriverState *bs)
 {
     Coroutine *co;
diff --git a/migration.c b/migration.c
index 8d675b3..c49a05a 100644
--- a/migration.c
+++ b/migration.c
@@ -103,7 +103,6 @@ static void process_incoming_migration_co(void *opaque)
     }
     qemu_announce_self();
 
-    bdrv_clear_incoming_migration_all();
     /* Make sure all file formats flush their mutable metadata */
     bdrv_invalidate_cache_all(&local_err);
     if (local_err) {
diff --git a/nbd.c b/nbd.c
index e9b539b..7b479c0 100644
--- a/nbd.c
+++ b/nbd.c
@@ -106,6 +106,7 @@ struct NBDExport {
     off_t dev_offset;
     off_t size;
     uint32_t nbdflags;
+    bool restore_incoming;
     QTAILQ_HEAD(, NBDClient) clients;
     QTAILQ_ENTRY(NBDExport) next;
 
@@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
     exp->ctx = bdrv_get_aio_context(bs);
     bdrv_ref(bs);
     bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp);
+
+    if (bs->open_flags & BDRV_O_INCOMING) {
+        bdrv_invalidate_cache(bs, NULL);
+        exp->restore_incoming = !!(bs->open_flags & BDRV_O_INCOMING);
+        bs->open_flags &= ~(BDRV_O_INCOMING);
+    }
+
     return exp;
 }
 
@@ -1021,6 +1029,9 @@ void nbd_export_close(NBDExport *exp)
     if (exp->bs) {
         bdrv_remove_aio_context_notifier(exp->bs, bs_aio_attached,
                                          bs_aio_detach, exp);
+        if (exp->restore_incoming) {
+            exp->bs->open_flags |= BDRV_O_INCOMING;
+        }
         bdrv_unref(exp->bs);
         exp->bs = NULL;
     }
-- 
2.0.0

^ permalink raw reply related [flat|nested] 7+ messages in thread
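As context for the question above: driving NBD-based (non-shared storage) migration by hand, without libvirt, roughly means issuing the same QMP sequence libvirt issues. This is only a sketch — host names, ports, and device IDs here are invented, and the exact option spelling may differ between QEMU versions of this era:

```
# Destination host: start QEMU paused, waiting for incoming migration
qemu-system-x86_64 ... -drive file=dst.qcow2,id=drive0 -incoming tcp:0:4444

# Destination QMP: export the target disk over NBD, writable
{ "execute": "nbd-server-start",
  "arguments": { "addr": { "type": "inet",
                           "data": { "host": "0.0.0.0", "port": "10809" } } } }
{ "execute": "nbd-server-add",
  "arguments": { "device": "drive0", "writable": true } }

# Source QMP: mirror the disk into the destination's NBD export
{ "execute": "drive-mirror",
  "arguments": { "device": "drive0", "sync": "full", "mode": "existing",
                 "target": "nbd:DST_HOST:10809:exportname=drive0" } }

# Source QMP: once BLOCK_JOB_READY arrives, migrate RAM/device state
{ "execute": "migrate", "arguments": { "uri": "tcp:DST_HOST:4444" } }
```

Note that the writable NBD export on the destination is exactly what races with migration completion in the bug being discussed: both paths end up invalidating the cache.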
* Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration 2014-10-02 8:52 [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration Alexey Kardashevskiy @ 2014-10-02 9:45 ` Paolo Bonzini 2014-10-02 10:19 ` Alexey Kardashevskiy 2014-10-02 14:52 ` Stefan Hajnoczi 1 sibling, 1 reply; 7+ messages in thread From: Paolo Bonzini @ 2014-10-02 9:45 UTC (permalink / raw) To: Alexey Kardashevskiy, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi Il 02/10/2014 10:52, Alexey Kardashevskiy ha scritto: > When migrated using libvirt with "--copy-storage-all", at the end of > migration there is race between NBD mirroring task trying to do flush > and migration completion, both end up invalidating cache. Since qcow2 > driver does not handle this situation very well, random crashes happen. > > This disables the BDRV_O_INCOMING flag for the block device being migrated > and restores it when NBD task is done. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > > > The commit log is not full and most likely incorrect as well > as the patch :) Please, help. Thanks! > > The patch seems to fix the initial problem though. > > > btw is there any easy way to migrate one QEMU to another > using NBD (i.e. not using "migrate -b") and not using libvirt? > What would the command line be? 
Debugging with libvirt is real > pain :( > > > --- > block.c | 17 ++++------------- > migration.c | 1 - > nbd.c | 11 +++++++++++ > 3 files changed, 15 insertions(+), 14 deletions(-) > > diff --git a/block.c b/block.c > index c5a251c..ed72e0a 100644 > --- a/block.c > +++ b/block.c > @@ -5073,6 +5073,10 @@ void bdrv_invalidate_cache_all(Error **errp) > QTAILQ_FOREACH(bs, &bdrv_states, device_list) { > AioContext *aio_context = bdrv_get_aio_context(bs); > > + if (!(bs->open_flags & BDRV_O_INCOMING)) { > + continue; > + } > + > aio_context_acquire(aio_context); > bdrv_invalidate_cache(bs, &local_err); > aio_context_release(aio_context); This part is okay, though perhaps we should add it to bdrv_invalidate_cache instead? > @@ -5083,19 +5087,6 @@ void bdrv_invalidate_cache_all(Error **errp) > } > } > > -void bdrv_clear_incoming_migration_all(void) > -{ > - BlockDriverState *bs; > - > - QTAILQ_FOREACH(bs, &bdrv_states, device_list) { > - AioContext *aio_context = bdrv_get_aio_context(bs); > - > - aio_context_acquire(aio_context); > - bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING); > - aio_context_release(aio_context); > - } > -} > - > int bdrv_flush(BlockDriverState *bs) > { > Coroutine *co; > diff --git a/migration.c b/migration.c > index 8d675b3..c49a05a 100644 > --- a/migration.c > +++ b/migration.c > @@ -103,7 +103,6 @@ static void process_incoming_migration_co(void *opaque) > } > qemu_announce_self(); > > - bdrv_clear_incoming_migration_all(); > /* Make sure all file formats flush their mutable metadata */ > bdrv_invalidate_cache_all(&local_err); > if (local_err) { This part I don't understand. Shouldn't you at least be adding bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING); to bdrv_invalidate_cache? 
> diff --git a/nbd.c b/nbd.c > index e9b539b..7b479c0 100644 > --- a/nbd.c > +++ b/nbd.c > @@ -106,6 +106,7 @@ struct NBDExport { > off_t dev_offset; > off_t size; > uint32_t nbdflags; > + bool restore_incoming; > QTAILQ_HEAD(, NBDClient) clients; > QTAILQ_ENTRY(NBDExport) next; > > @@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, > exp->ctx = bdrv_get_aio_context(bs); > bdrv_ref(bs); > bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp); > + > + if (bs->open_flags & BDRV_O_INCOMING) { > + bdrv_invalidate_cache(bs, NULL); > + exp->restore_incoming = !!(bs->open_flags & BDRV_O_INCOMING); > + bs->open_flags &= ~(BDRV_O_INCOMING); > + } > + > return exp; > } > > @@ -1021,6 +1029,9 @@ void nbd_export_close(NBDExport *exp) > if (exp->bs) { > bdrv_remove_aio_context_notifier(exp->bs, bs_aio_attached, > bs_aio_detach, exp); > + if (exp->restore_incoming) { > + exp->bs->open_flags |= BDRV_O_INCOMING; > + } > bdrv_unref(exp->bs); > exp->bs = NULL; > } > For this, I don't think you even need exp->restore_incoming, and then it can simply be a one-liner + bdrv_invalidate_cache(bs, NULL); if you modify bdrv_invalidate_cache instead of bdrv_invalidate_cache_all. Paolo ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration 2014-10-02 9:45 ` Paolo Bonzini @ 2014-10-02 10:19 ` Alexey Kardashevskiy 0 siblings, 0 replies; 7+ messages in thread From: Alexey Kardashevskiy @ 2014-10-02 10:19 UTC (permalink / raw) To: Paolo Bonzini, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/02/2014 07:45 PM, Paolo Bonzini wrote: > Il 02/10/2014 10:52, Alexey Kardashevskiy ha scritto: >> When migrated using libvirt with "--copy-storage-all", at the end >> of migration there is race between NBD mirroring task trying to do >> flush and migration completion, both end up invalidating cache. >> Since qcow2 driver does not handle this situation very well, random >> crashes happen. >> >> This disables the BDRV_O_INCOMING flag for the block device being >> migrated and restores it when NBD task is done. >> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> --- >> >> >> The commit log is not full and most likely incorrect as well as the >> patch :) Please, help. Thanks! >> >> The patch seems to fix the initial problem though. >> >> >> btw is there any easy way to migrate one QEMU to another using NBD >> (i.e. not using "migrate -b") and not using libvirt? What would the >> command line be? 
Debugging with libvirt is real pain :( >> >> >> --- block.c | 17 ++++------------- migration.c | 1 - nbd.c >> | 11 +++++++++++ 3 files changed, 15 insertions(+), 14 deletions(-) >> >> diff --git a/block.c b/block.c index c5a251c..ed72e0a 100644 --- >> a/block.c +++ b/block.c @@ -5073,6 +5073,10 @@ void >> bdrv_invalidate_cache_all(Error **errp) QTAILQ_FOREACH(bs, >> &bdrv_states, device_list) { AioContext *aio_context = >> bdrv_get_aio_context(bs); >> >> + if (!(bs->open_flags & BDRV_O_INCOMING)) { + >> continue; + } + aio_context_acquire(aio_context); >> bdrv_invalidate_cache(bs, &local_err); >> aio_context_release(aio_context); > > This part is okay, though perhaps we should add it to > bdrv_invalidate_cache instead? Yes, makes perfect sense. > >> @@ -5083,19 +5087,6 @@ void bdrv_invalidate_cache_all(Error **errp) >> } } >> >> -void bdrv_clear_incoming_migration_all(void) -{ - >> BlockDriverState *bs; - - QTAILQ_FOREACH(bs, &bdrv_states, >> device_list) { - AioContext *aio_context = >> bdrv_get_aio_context(bs); - - >> aio_context_acquire(aio_context); - bs->open_flags = >> bs->open_flags & ~(BDRV_O_INCOMING); - >> aio_context_release(aio_context); - } -} - int >> bdrv_flush(BlockDriverState *bs) { Coroutine *co; diff --git >> a/migration.c b/migration.c index 8d675b3..c49a05a 100644 --- >> a/migration.c +++ b/migration.c @@ -103,7 +103,6 @@ static void >> process_incoming_migration_co(void *opaque) } qemu_announce_self(); >> >> - bdrv_clear_incoming_migration_all(); /* Make sure all file >> formats flush their mutable metadata */ >> bdrv_invalidate_cache_all(&local_err); if (local_err) { > > This part I don't understand. > > Shouldn't you at least be adding > > bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING); > > to bdrv_invalidate_cache? Reset the flag after caches has been invalidated? What is the exact semantic of this BDRV_O_INCOMING? blockdev_init() sets it, we reset it on the first bdrv_invalidate_cache() and then we never set it again? 
I am still missing the bigger picture... >> diff --git a/nbd.c b/nbd.c index e9b539b..7b479c0 100644 --- >> a/nbd.c +++ b/nbd.c @@ -106,6 +106,7 @@ struct NBDExport { off_t >> dev_offset; off_t size; uint32_t nbdflags; + bool >> restore_incoming; QTAILQ_HEAD(, NBDClient) clients; >> QTAILQ_ENTRY(NBDExport) next; >> >> @@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, >> off_t dev_offset, exp->ctx = bdrv_get_aio_context(bs); >> bdrv_ref(bs); bdrv_add_aio_context_notifier(bs, bs_aio_attached, >> bs_aio_detach, exp); + + if (bs->open_flags & BDRV_O_INCOMING) { >> + bdrv_invalidate_cache(bs, NULL); + >> exp->restore_incoming = !!(bs->open_flags & BDRV_O_INCOMING); + >> bs->open_flags &= ~(BDRV_O_INCOMING); + } + return exp; } >> >> @@ -1021,6 +1029,9 @@ void nbd_export_close(NBDExport *exp) if >> (exp->bs) { bdrv_remove_aio_context_notifier(exp->bs, >> bs_aio_attached, bs_aio_detach, exp); + if >> (exp->restore_incoming) { + exp->bs->open_flags |= >> BDRV_O_INCOMING; + } bdrv_unref(exp->bs); exp->bs = NULL; } >> > > For this, I don't think you even need exp->restore_incoming, and then > it can simply be a one-liner > > + bdrv_invalidate_cache(bs, NULL); > > if you modify bdrv_invalidate_cache instead of > bdrv_invalidate_cache_all. I did not understand that modification but if I do not restore BDRV_O_INCOMING, then changes to the disk I made on the source side before migration - they are lost after rebooting the destination guest. 
-- Alexey
* Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration 2014-10-02 8:52 [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration Alexey Kardashevskiy 2014-10-02 9:45 ` Paolo Bonzini @ 2014-10-02 14:52 ` Stefan Hajnoczi 2014-10-03 4:12 ` Alexey Kardashevskiy 1 sibling, 1 reply; 7+ messages in thread From: Stefan Hajnoczi @ 2014-10-02 14:52 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, Stefan Hajnoczi [-- Attachment #1: Type: text/plain, Size: 4227 bytes --] On Thu, Oct 02, 2014 at 06:52:52PM +1000, Alexey Kardashevskiy wrote: > When migrated using libvirt with "--copy-storage-all", at the end of > migration there is race between NBD mirroring task trying to do flush > and migration completion, both end up invalidating cache. Since qcow2 > driver does not handle this situation very well, random crashes happen. > > This disables the BDRV_O_INCOMING flag for the block device being migrated > and restores it when NBD task is done. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > > > The commit log is not full and most likely incorrect as well > as the patch :) Please, help. Thanks! > > The patch seems to fix the initial problem though. > > > btw is there any easy way to migrate one QEMU to another > using NBD (i.e. not using "migrate -b") and not using libvirt? > What would the command line be? 
Debugging with libvirt is real > pain :( > > > --- > block.c | 17 ++++------------- > migration.c | 1 - > nbd.c | 11 +++++++++++ > 3 files changed, 15 insertions(+), 14 deletions(-) > > diff --git a/block.c b/block.c > index c5a251c..ed72e0a 100644 > --- a/block.c > +++ b/block.c > @@ -5073,6 +5073,10 @@ void bdrv_invalidate_cache_all(Error **errp) > QTAILQ_FOREACH(bs, &bdrv_states, device_list) { > AioContext *aio_context = bdrv_get_aio_context(bs); > > + if (!(bs->open_flags & BDRV_O_INCOMING)) { > + continue; > + } > + We shouldn't touch bs before acquiring the AioContext. Acquiring the AioContext is basically the "lock" for the BDS. It needs to be moved... > aio_context_acquire(aio_context); ...in here. > bdrv_invalidate_cache(bs, &local_err); > aio_context_release(aio_context); > @@ -5083,19 +5087,6 @@ void bdrv_invalidate_cache_all(Error **errp) > } > } > > -void bdrv_clear_incoming_migration_all(void) > -{ > - BlockDriverState *bs; > - > - QTAILQ_FOREACH(bs, &bdrv_states, device_list) { > - AioContext *aio_context = bdrv_get_aio_context(bs); > - > - aio_context_acquire(aio_context); > - bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING); > - aio_context_release(aio_context); > - } > -} > - > int bdrv_flush(BlockDriverState *bs) > { > Coroutine *co; > diff --git a/migration.c b/migration.c > index 8d675b3..c49a05a 100644 > --- a/migration.c > +++ b/migration.c > @@ -103,7 +103,6 @@ static void process_incoming_migration_co(void *opaque) > } > qemu_announce_self(); > > - bdrv_clear_incoming_migration_all(); > /* Make sure all file formats flush their mutable metadata */ > bdrv_invalidate_cache_all(&local_err); BDRV_O_INCOMING needs to be cleared, otherwise the block drivers will think that incoming migration is still taking place and treat the file as effectively read-only during open. On IRC I suggested doing away with the bdrv_invalidate_cache_all() name since it no longer calls bdrv_invalidate_cache() on all BDSes. 
Combine both clearing BDRV_O_INCOMING and calling bdrv_invalidate_cache() if BDRV_O_INCOMING was previously set into one function - you could reuse bdrv_clear_incoming_migration_all() for that. > if (local_err) { > diff --git a/nbd.c b/nbd.c > index e9b539b..7b479c0 100644 > --- a/nbd.c > +++ b/nbd.c > @@ -106,6 +106,7 @@ struct NBDExport { > off_t dev_offset; > off_t size; > uint32_t nbdflags; > + bool restore_incoming; Not needed, it does not make sense to restore BDRV_O_INCOMING because once we've written to a file it cannot be in use by another host at the same time. > QTAILQ_HEAD(, NBDClient) clients; > QTAILQ_ENTRY(NBDExport) next; > > @@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, > exp->ctx = bdrv_get_aio_context(bs); > bdrv_ref(bs); > bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp); > + > + if (bs->open_flags & BDRV_O_INCOMING) { I think the flag has to be cleared before calling bdrv_invalidate_cache() because the .bdrv_open() function looks at the flag. [-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration 2014-10-02 14:52 ` Stefan Hajnoczi @ 2014-10-03 4:12 ` Alexey Kardashevskiy 2014-10-06 10:03 ` Stefan Hajnoczi 0 siblings, 1 reply; 7+ messages in thread From: Alexey Kardashevskiy @ 2014-10-03 4:12 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, Stefan Hajnoczi -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/03/2014 12:52 AM, Stefan Hajnoczi wrote: > On Thu, Oct 02, 2014 at 06:52:52PM +1000, Alexey Kardashevskiy wrote: >> When migrated using libvirt with "--copy-storage-all", at the end >> of migration there is race between NBD mirroring task trying to do >> flush and migration completion, both end up invalidating cache. >> Since qcow2 driver does not handle this situation very well, random >> crashes happen. >> >> This disables the BDRV_O_INCOMING flag for the block device being >> migrated and restores it when NBD task is done. >> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> --- >> >> >> The commit log is not full and most likely incorrect as well as the >> patch :) Please, help. Thanks! >> >> The patch seems to fix the initial problem though. >> >> >> btw is there any easy way to migrate one QEMU to another using NBD >> (i.e. not using "migrate -b") and not using libvirt? What would the >> command line be? Debugging with libvirt is real pain :( >> >> >> --- block.c | 17 ++++------------- migration.c | 1 - nbd.c >> | 11 +++++++++++ 3 files changed, 15 insertions(+), 14 deletions(-) >> >> diff --git a/block.c b/block.c index c5a251c..ed72e0a 100644 --- >> a/block.c +++ b/block.c @@ -5073,6 +5073,10 @@ void >> bdrv_invalidate_cache_all(Error **errp) QTAILQ_FOREACH(bs, >> &bdrv_states, device_list) { AioContext *aio_context = >> bdrv_get_aio_context(bs); >> >> + if (!(bs->open_flags & BDRV_O_INCOMING)) { + >> continue; + } + > > We shouldn't touch bs before acquiring the AioContext. 
Acquiring the > AioContext is basically the "lock" for the BDS. > > It needs to be moved... > >> aio_context_acquire(aio_context); > > ...in here. > >> bdrv_invalidate_cache(bs, &local_err); >> aio_context_release(aio_context); @@ -5083,19 +5087,6 @@ void >> bdrv_invalidate_cache_all(Error **errp) } } >> >> -void bdrv_clear_incoming_migration_all(void) -{ - >> BlockDriverState *bs; - - QTAILQ_FOREACH(bs, &bdrv_states, >> device_list) { - AioContext *aio_context = >> bdrv_get_aio_context(bs); - - >> aio_context_acquire(aio_context); - bs->open_flags = >> bs->open_flags & ~(BDRV_O_INCOMING); - >> aio_context_release(aio_context); - } -} - int >> bdrv_flush(BlockDriverState *bs) { Coroutine *co; diff --git >> a/migration.c b/migration.c index 8d675b3..c49a05a 100644 --- >> a/migration.c +++ b/migration.c @@ -103,7 +103,6 @@ static void >> process_incoming_migration_co(void *opaque) } qemu_announce_self(); >> >> - bdrv_clear_incoming_migration_all(); /* Make sure all file >> formats flush their mutable metadata */ >> bdrv_invalidate_cache_all(&local_err); > > BDRV_O_INCOMING needs to be cleared, otherwise the block drivers will > think that incoming migration is still taking place and treat the > file as effectively read-only during open. > > On IRC I suggested doing away with the bdrv_invalidate_cache_all() > name since it no longer calls bdrv_invalidate_cache() on all BDSes. > > Combine both clearing BDRV_O_INCOMING and calling > bdrv_invalidate_cache() if BDRV_O_INCOMING was previously set into > one function - you could reuse bdrv_clear_incoming_migration_all() for > that. I reread why bdrv_invalidate_cache() was added in the first place and what BDRV_O_INCOMING is, every time I look at the code it plays in new colors (not sure if it is a correct figure of speech :) ). BDRV_O_INCOMING is only set when QEMU is about to receive migration and we do not want QEMU to check the file at opening time as there is likely garbage. Is there any other use of BDRV_O_INCOMING? 
There must be some as bdrv_clear_incoming_migration_all() is called at the end of live migration and I believe there must be a reason for it (cache does not migrate, does it?). bdrv_invalidate_cache() flushes cache as it could be already initialized even if QEMU is receiving migration - QEMU could have cached some of real disk data. Is that correct? I do not really understand why it would happen if there is BDRV_O_INCOMING set but ok. I also thought that bdrv_invalidate_cache() is called more often as its name suggests, not during migration only, I was wrong here. The patch below then should solve the initial problem. bdrv_clear_incoming_migration_all() is unclear though... diff --git a/block.c b/block.c index c5a251c..6314af7 100644 - --- a/block.c +++ b/block.c @@ -5048,6 +5048,11 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp) return; } + if (!(bs->open_flags & BDRV_O_INCOMING)) { + return; + } + bs->open_flags &= ~(BDRV_O_INCOMING); + if (bs->drv->bdrv_invalidate_cache) { bs->drv->bdrv_invalidate_cache(bs, &local_err); } else if (bs->file) { @@ -5083,19 +5088,6 @@ void bdrv_invalidate_cache_all(Error **errp) } } - -void bdrv_clear_incoming_migration_all(void) - -{ - - BlockDriverState *bs; - - - - QTAILQ_FOREACH(bs, &bdrv_states, device_list) { - - AioContext *aio_context = bdrv_get_aio_context(bs); - - - - aio_context_acquire(aio_context); - - bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING); - - aio_context_release(aio_context); - - } - -} - - int bdrv_flush(BlockDriverState *bs) { Coroutine *co; diff --git a/migration.c b/migration.c index 8d675b3..c49a05a 100644 - --- a/migration.c +++ b/migration.c @@ -103,7 +103,6 @@ static void process_incoming_migration_co(void *opaque) } qemu_announce_self(); - - bdrv_clear_incoming_migration_all(); /* Make sure all file formats flush their mutable metadata */ bdrv_invalidate_cache_all(&local_err); if (local_err) { diff --git a/nbd.c b/nbd.c index e9b539b..953c378 100644 - --- a/nbd.c +++ b/nbd.c 
@@ -972,6 +972,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, exp->ctx = bdrv_get_aio_context(bs); bdrv_ref(bs); bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp); + bdrv_invalidate_cache(bs, NULL); return exp; } >> if (local_err) { diff --git a/nbd.c b/nbd.c index e9b539b..7b479c0 >> 100644 --- a/nbd.c +++ b/nbd.c @@ -106,6 +106,7 @@ struct NBDExport >> { off_t dev_offset; off_t size; uint32_t nbdflags; + bool >> restore_incoming; > > Not needed, it does not make sense to restore BDRV_O_INCOMING because > once we've written to a file it cannot be in use by another host at > the same time. > >> QTAILQ_HEAD(, NBDClient) clients; QTAILQ_ENTRY(NBDExport) next; >> >> @@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, >> off_t dev_offset, exp->ctx = bdrv_get_aio_context(bs); >> bdrv_ref(bs); bdrv_add_aio_context_notifier(bs, bs_aio_attached, >> bs_aio_detach, exp); + + if (bs->open_flags & BDRV_O_INCOMING) { > > I think the flag has to be cleared before calling > bdrv_invalidate_cache() because the .bdrv_open() function looks at > the flag. 
-- Alexey
* Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration 2014-10-03 4:12 ` Alexey Kardashevskiy @ 2014-10-06 10:03 ` Stefan Hajnoczi 2014-10-06 22:47 ` Alexey Kardashevskiy 0 siblings, 1 reply; 7+ messages in thread From: Stefan Hajnoczi @ 2014-10-06 10:03 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, Paolo Bonzini [-- Attachment #1: Type: text/plain, Size: 2080 bytes --] On Fri, Oct 03, 2014 at 02:12:06PM +1000, Alexey Kardashevskiy wrote: > BDRV_O_INCOMING is only set when QEMU is about to receive migration and > we do not want QEMU to check the file at opening time as there is likely > garbage. Is there any other use of BDRV_O_INCOMING? There must be some as > bdrv_clear_incoming_migration_all() is called at the end of live > migration and I believe there must be a reason for it (cache does not > migrate, does it?). BDRV_O_INCOMING is just for live migration. The cached data is not migrated, this is why it must be refreshed upon migration handover. > bdrv_invalidate_cache() flushes cache as it could be already initialized > even if QEMU is receiving migration - QEMU could have cached some of real > disk data. Is that correct? I do not really understand why it would > happen if there is BDRV_O_INCOMING set but ok. .bdrv_open() can load metadata from image files (such as the qcow2 L1 tables) and it does this even when BDRV_O_INCOMING is set. That data needs to be re-read at migration handover to prevent the destination QEMU from running with stale image metadata. 
> diff --git a/nbd.c b/nbd.c > index e9b539b..953c378 100644 > - --- a/nbd.c > +++ b/nbd.c > @@ -972,6 +972,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t > dev_offset, > exp->ctx = bdrv_get_aio_context(bs); > bdrv_ref(bs); > bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp); > + bdrv_invalidate_cache(bs, NULL); > return exp; > } Please add a comment to explain why this call is necessary: /* NBD exports are used for non-shared storage migration. Make sure * that BDRV_O_INCOMING is cleared and the image is ready for write * access since the export could be available before migration handover. */ If someone can come up with a 2- or 3-line comment that explains it better, great. The rest of the patch looks like it will work. I'm not thrilled about putting BDRV_O_INCOMING logical inside bdrv_invalidate_cache() because there was no coupling there before, but the code seems correct now. [-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration 2014-10-06 10:03 ` Stefan Hajnoczi @ 2014-10-06 22:47 ` Alexey Kardashevskiy 0 siblings, 0 replies; 7+ messages in thread From: Alexey Kardashevskiy @ 2014-10-06 22:47 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, Paolo Bonzini -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/06/2014 09:03 PM, Stefan Hajnoczi wrote: > On Fri, Oct 03, 2014 at 02:12:06PM +1000, Alexey Kardashevskiy wrote: >> BDRV_O_INCOMING is only set when QEMU is about to receive migration >> and we do not want QEMU to check the file at opening time as there >> is likely garbage. Is there any other use of BDRV_O_INCOMING? There >> must be some as bdrv_clear_incoming_migration_all() is called at the >> end of live migration and I believe there must be a reason for it >> (cache does not migrate, does it?). > > BDRV_O_INCOMING is just for live migration. The cached data is not > migrated, this is why it must be refreshed upon migration handover. > >> bdrv_invalidate_cache() flushes cache as it could be already >> initialized even if QEMU is receiving migration - QEMU could have >> cached some of real disk data. Is that correct? I do not really >> understand why it would happen if there is BDRV_O_INCOMING set but >> ok. > > .bdrv_open() can load metadata from image files (such as the qcow2 L1 > tables) and it does this even when BDRV_O_INCOMING is set. That data > needs to be re-read at migration handover to prevent the destination > QEMU from running with stale image metadata. Would not it be easier/more correct if bdrv_open would not load this metadata if BDRV_O_INCOMING is set? 
>> diff --git a/nbd.c b/nbd.c
>> index e9b539b..953c378 100644
>> --- a/nbd.c
>> +++ b/nbd.c
>> @@ -972,6 +972,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
>>      exp->ctx = bdrv_get_aio_context(bs);
>>      bdrv_ref(bs);
>>      bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp);
>> +    bdrv_invalidate_cache(bs, NULL);
>>      return exp;
>>  }
>
> Please add a comment to explain why this call is necessary:
>
>     /* NBD exports are used for non-shared storage migration.  Make sure
>      * that BDRV_O_INCOMING is cleared and the image is ready for write
>      * access since the export could be available before migration handover.
>      */
>
> If someone can come up with a 2- or 3-line comment that explains it
> better, great.
>
> The rest of the patch looks like it will work.  I'm not thrilled
> about putting BDRV_O_INCOMING logic inside bdrv_invalidate_cache()
> because there was no coupling there before, but the code seems correct
> now.

Ok, will repost today.

-- Alexey