qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
@ 2017-04-04 15:35 Kevin Wolf
  2017-04-05 13:22 ` Max Reitz
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Kevin Wolf @ 2017-04-04 15:35 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, mreitz, pbonzini, jcody, Ciprian.Barbu, Alexandru.Avadanii,
	armband, eblake, qemu-devel

Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.

This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.

Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/block-backend.c | 40 +++++++++++++++++++++++++++++++++++++++-
 include/block/block.h |  2 ++
 migration/migration.c |  8 ++++++++
 qmp.c                 |  6 ++++++
 4 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 0b63773..f817040 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -61,6 +61,7 @@ struct BlockBackend {
 
     uint64_t perm;
     uint64_t shared_perm;
+    bool disable_perm;
 
     bool allow_write_beyond_eof;
 
@@ -578,7 +579,7 @@ int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm,
 {
     int ret;
 
-    if (blk->root) {
+    if (blk->root && !blk->disable_perm) {
         ret = bdrv_child_try_set_perm(blk->root, perm, shared_perm, errp);
         if (ret < 0) {
             return ret;
@@ -597,15 +598,52 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
     *shared_perm = blk->shared_perm;
 }
 
+/*
+ * Notifies the user of all BlockBackends that migration has completed. qdev
+ * devices can tighten their permissions in response (specifically revoke
+ * shared write permissions that we needed for storage migration).
+ *
+ * If an error is returned, the VM cannot be allowed to be resumed.
+ */
+void blk_resume_after_migration(Error **errp)
+{
+    BlockBackend *blk;
+    Error *local_err = NULL;
+
+    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
+        if (!blk->disable_perm) {
+            continue;
+        }
+
+        blk->disable_perm = false;
+
+        blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            blk->disable_perm = true;
+            return;
+        }
+    }
+}
+
 static int blk_do_attach_dev(BlockBackend *blk, void *dev)
 {
     if (blk->dev) {
         return -EBUSY;
     }
+
+    /* While migration is still incoming, we don't need to apply the
+     * permissions of guest device BlockBackends. We might still have a block
+     * job or NBD server writing to the image for storage migration. */
+    if (runstate_check(RUN_STATE_INMIGRATE)) {
+        blk->disable_perm = true;
+    }
+
     blk_ref(blk);
     blk->dev = dev;
     blk->legacy_dev = false;
     blk_iostatus_reset(blk);
+
     return 0;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 5149260..3e09222 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -366,6 +366,8 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
 void bdrv_invalidate_cache_all(Error **errp);
 int bdrv_inactivate_all(void);
 
+void blk_resume_after_migration(Error **errp);
+
 /* Ensure contents are flushed to disk.  */
 int bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
diff --git a/migration/migration.c b/migration/migration.c
index 54060f7..ad4036f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -349,6 +349,14 @@ static void process_incoming_migration_bh(void *opaque)
         exit(EXIT_FAILURE);
     }
 
+    /* If we get an error here, just don't restart the VM yet. */
+    blk_resume_after_migration(&local_err);
+    if (local_err) {
+        error_free(local_err);
+        local_err = NULL;
+        autostart = false;
+    }
+
     /*
      * This must happen after all error conditions are dealt with and
      * we're sure the VM is going to be running on this host.
diff --git a/qmp.c b/qmp.c
index fa82b59..a744e44 100644
--- a/qmp.c
+++ b/qmp.c
@@ -207,6 +207,12 @@ void qmp_cont(Error **errp)
         }
     }
 
+    blk_resume_after_migration(&local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
     if (runstate_check(RUN_STATE_INMIGRATE)) {
         autostart = 1;
     } else {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
  2017-04-04 15:35 [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration Kevin Wolf
@ 2017-04-05 13:22 ` Max Reitz
  2017-04-06 11:22   ` Kevin Wolf
  2017-04-06 11:15 ` Kevin Wolf
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Max Reitz @ 2017-04-05 13:22 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block
  Cc: pbonzini, jcody, Ciprian.Barbu, Alexandru.Avadanii, armband,
	eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2852 bytes --]

On 04.04.2017 17:35, Kevin Wolf wrote:
> Usually guest devices don't like other writers to the same image, so
> they use blk_set_perm() to prevent this from happening. In the migration
> phase before the VM is actually running, though, they don't have a
> problem with writes to the image. On the other hand, storage migration
> needs to be able to write to the image in this phase, so the restrictive
> blk_set_perm() call of qdev devices breaks it.
> 
> This patch flags all BlockBackends with a qdev device as
> blk->disable_perm during incoming migration, which means that the
> requested permissions are stored in the BlockBackend, but not actually
> applied to its root node yet.
> 
> Once migration has finished and the VM should be resumed, the
> permissions are applied. If they cannot be applied (e.g. because the NBD
> server used for block migration hasn't been shut down), resuming the VM
> fails.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/block-backend.c | 40 +++++++++++++++++++++++++++++++++++++++-
>  include/block/block.h |  2 ++
>  migration/migration.c |  8 ++++++++
>  qmp.c                 |  6 ++++++
>  4 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 0b63773..f817040 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c

[...]

> @@ -597,15 +598,52 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
>      *shared_perm = blk->shared_perm;
>  }
>  
> +/*
> + * Notifies the user of all BlockBackends that migration has completed. qdev
> + * devices can tighten their permissions in response (specifically revoke
> + * shared write permissions that we needed for storage migration).
> + *
> + * If an error is returned, the VM cannot be allowed to be resumed.
> + */
> +void blk_resume_after_migration(Error **errp)
> +{
> +    BlockBackend *blk;
> +    Error *local_err = NULL;
> +
> +    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {

Shouldn't we use blk_all_next()?

Trusting you that silently disabling autostart is something the upper
layers can deal with, the rest looks good to me.

(The only other runtime changes of autostart apart from stop/cont appear
to be in blockdev_init() (if (bdrv_key_required()), but I don't think
that can happen anymore) and in migration/colo.c (which enables it and
emits an error message).)

Max

> +        if (!blk->disable_perm) {
> +            continue;
> +        }
> +
> +        blk->disable_perm = false;
> +
> +        blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
> +        if (local_err) {
> +            error_propagate(errp, local_err);
> +            blk->disable_perm = true;
> +            return;
> +        }
> +    }
> +}
> +


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
  2017-04-04 15:35 [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration Kevin Wolf
  2017-04-05 13:22 ` Max Reitz
@ 2017-04-06 11:15 ` Kevin Wolf
  2017-04-06 17:16 ` [Qemu-devel] [Qemu-block] " Kashyap Chamarthy
  2017-04-06 17:26 ` Kashyap Chamarthy
  3 siblings, 0 replies; 8+ messages in thread
From: Kevin Wolf @ 2017-04-06 11:15 UTC (permalink / raw)
  To: qemu-block
  Cc: mreitz, pbonzini, jcody, Ciprian.Barbu, Alexandru.Avadanii,
	armband, eblake, qemu-devel

Am 04.04.2017 um 17:35 hat Kevin Wolf geschrieben:
> Usually guest devices don't like other writers to the same image, so
> they use blk_set_perm() to prevent this from happening. In the migration
> phase before the VM is actually running, though, they don't have a
> problem with writes to the image. On the other hand, storage migration
> needs to be able to write to the image in this phase, so the restrictive
> blk_set_perm() call of qdev devices breaks it.
> 
> This patch flags all BlockBackends with a qdev device as
> blk->disable_perm during incoming migration, which means that the
> requested permissions are stored in the BlockBackend, but not actually
> applied to its root node yet.
> 
> Once migration has finished and the VM should be resumed, the
> permissions are applied. If they cannot be applied (e.g. because the NBD
> server used for block migration hasn't been shut down), resuming the VM
> fails.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Ciprian, can you give this patch a try and report back whether it fixes
the storage migration bug you encountered?

Kevin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
  2017-04-05 13:22 ` Max Reitz
@ 2017-04-06 11:22   ` Kevin Wolf
  2017-04-06 11:31     ` Daniel P. Berrange
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2017-04-06 11:22 UTC (permalink / raw)
  To: Max Reitz
  Cc: qemu-block, pbonzini, jcody, Ciprian.Barbu, Alexandru.Avadanii,
	armband, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3094 bytes --]

Am 05.04.2017 um 15:22 hat Max Reitz geschrieben:
> On 04.04.2017 17:35, Kevin Wolf wrote:
> > Usually guest devices don't like other writers to the same image, so
> > they use blk_set_perm() to prevent this from happening. In the migration
> > phase before the VM is actually running, though, they don't have a
> > problem with writes to the image. On the other hand, storage migration
> > needs to be able to write to the image in this phase, so the restrictive
> > blk_set_perm() call of qdev devices breaks it.
> > 
> > This patch flags all BlockBackends with a qdev device as
> > blk->disable_perm during incoming migration, which means that the
> > requested permissions are stored in the BlockBackend, but not actually
> > applied to its root node yet.
> > 
> > Once migration has finished and the VM should be resumed, the
> > permissions are applied. If they cannot be applied (e.g. because the NBD
> > server used for block migration hasn't been shut down), resuming the VM
> > fails.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/block-backend.c | 40 +++++++++++++++++++++++++++++++++++++++-
> >  include/block/block.h |  2 ++
> >  migration/migration.c |  8 ++++++++
> >  qmp.c                 |  6 ++++++
> >  4 files changed, 55 insertions(+), 1 deletion(-)
> > 
> > diff --git a/block/block-backend.c b/block/block-backend.c
> > index 0b63773..f817040 100644
> > --- a/block/block-backend.c
> > +++ b/block/block-backend.c
> 
> [...]
> 
> > @@ -597,15 +598,52 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
> >      *shared_perm = blk->shared_perm;
> >  }
> >  
> > +/*
> > + * Notifies the user of all BlockBackends that migration has completed. qdev
> > + * devices can tighten their permissions in response (specifically revoke
> > + * shared write permissions that we needed for storage migration).
> > + *
> > + * If an error is returned, the VM cannot be allowed to be resumed.
> > + */
> > +void blk_resume_after_migration(Error **errp)
> > +{
> > +    BlockBackend *blk;
> > +    Error *local_err = NULL;
> > +
> > +    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
> 
> Shouldn't we use blk_all_next()?

Good catch, thanks.

At first I added it into the loop in qmp_cont() and then copied it here
without noticing the resetting the iostatus is really only needed for
monitor-owned BBs at the moment, but this one is different.

Of course, as soon as we improve query-block to work reasonably well
with -device drive=<node-name>, qmp_cont() needs to use blk_all_next(),
too.

> Trusting you that silently disabling autostart is something the upper
> layers can deal with, the rest looks good to me.
> 
> (The only other runtime changes of autostart apart from stop/cont appear
> to be in blockdev_init() (if (bdrv_key_required()), but I don't think
> that can happen anymore) and in migration/colo.c (which enables it and
> emits an error message).)

I think in practice libvirt always sets -S on the destination anyway.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
  2017-04-06 11:22   ` Kevin Wolf
@ 2017-04-06 11:31     ` Daniel P. Berrange
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel P. Berrange @ 2017-04-06 11:31 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Max Reitz, qemu-block, armband, jcody, Ciprian.Barbu, qemu-devel,
	Alexandru.Avadanii, pbonzini

On Thu, Apr 06, 2017 at 01:22:56PM +0200, Kevin Wolf wrote:
> Am 05.04.2017 um 15:22 hat Max Reitz geschrieben:
> > On 04.04.2017 17:35, Kevin Wolf wrote:
> > > Usually guest devices don't like other writers to the same image, so
> > > they use blk_set_perm() to prevent this from happening. In the migration
> > > phase before the VM is actually running, though, they don't have a
> > > problem with writes to the image. On the other hand, storage migration
> > > needs to be able to write to the image in this phase, so the restrictive
> > > blk_set_perm() call of qdev devices breaks it.
> > > 
> > > This patch flags all BlockBackends with a qdev device as
> > > blk->disable_perm during incoming migration, which means that the
> > > requested permissions are stored in the BlockBackend, but not actually
> > > applied to its root node yet.
> > > 
> > > Once migration has finished and the VM should be resumed, the
> > > permissions are applied. If they cannot be applied (e.g. because the NBD
> > > server used for block migration hasn't been shut down), resuming the VM
> > > fails.
> > > 
> > > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > > ---
> > >  block/block-backend.c | 40 +++++++++++++++++++++++++++++++++++++++-
> > >  include/block/block.h |  2 ++
> > >  migration/migration.c |  8 ++++++++
> > >  qmp.c                 |  6 ++++++
> > >  4 files changed, 55 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/block-backend.c b/block/block-backend.c
> > > index 0b63773..f817040 100644
> > > --- a/block/block-backend.c
> > > +++ b/block/block-backend.c
> > 
> > [...]
> > 
> > > @@ -597,15 +598,52 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
> > >      *shared_perm = blk->shared_perm;
> > >  }
> > >  
> > > +/*
> > > + * Notifies the user of all BlockBackends that migration has completed. qdev
> > > + * devices can tighten their permissions in response (specifically revoke
> > > + * shared write permissions that we needed for storage migration).
> > > + *
> > > + * If an error is returned, the VM cannot be allowed to be resumed.
> > > + */
> > > +void blk_resume_after_migration(Error **errp)
> > > +{
> > > +    BlockBackend *blk;
> > > +    Error *local_err = NULL;
> > > +
> > > +    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
> > 
> > Shouldn't we use blk_all_next()?
> 
> Good catch, thanks.
> 
> At first I added it into the loop in qmp_cont() and then copied it here
> without noticing the resetting the iostatus is really only needed for
> monitor-owned BBs at the moment, but this one is different.
> 
> Of course, as soon as we improve query-block to work reasonably well
> with -device drive=<node-name>, qmp_cont() needs to use blk_all_next(),
> too.
> 
> > Trusting you that silently disabling autostart is something the upper
> > layers can deal with, the rest looks good to me.
> > 
> > (The only other runtime changes of autostart apart from stop/cont appear
> > to be in blockdev_init() (if (bdrv_key_required()), but I don't think
> > that can happen anymore) and in migration/colo.c (which enables it and
> > emits an error message).)
> 
> I think in practice libvirt always sets -S on the destination anyway.

Libvirt uses -S for *all* QEMU instances it starts, regardless of use of
migration, since we need todo things after qemu starts, but before CPUs
are run.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
  2017-04-04 15:35 [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration Kevin Wolf
  2017-04-05 13:22 ` Max Reitz
  2017-04-06 11:15 ` Kevin Wolf
@ 2017-04-06 17:16 ` Kashyap Chamarthy
  2017-04-06 17:26 ` Kashyap Chamarthy
  3 siblings, 0 replies; 8+ messages in thread
From: Kashyap Chamarthy @ 2017-04-06 17:16 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, armband, Ciprian.Barbu, qemu-devel, mreitz,
	Alexandru.Avadanii, pbonzini

On Tue, Apr 04, 2017 at 05:35:56PM +0200, Kevin Wolf wrote:
> Usually guest devices don't like other writers to the same image, so
> they use blk_set_perm() to prevent this from happening. In the migration
> phase before the VM is actually running, though, they don't have a
> problem with writes to the image. On the other hand, storage migration
> needs to be able to write to the image in this phase, so the restrictive
> blk_set_perm() call of qdev devices breaks it.
> 
> This patch flags all BlockBackends with a qdev device as
> blk->disable_perm during incoming migration, which means that the
> requested permissions are stored in the BlockBackend, but not actually
> applied to its root node yet.
> 
> Once migration has finished and the VM should be resumed, the
> permissions are applied. If they cannot be applied (e.g. because the NBD
> server used for block migration hasn't been shut down), resuming the VM
> fails.

So I have an environment with a patched QEMU built with your fix to test
it with libvirt APIs, however, there's a libvirt bug that I just
discovered which fails the NBD-based live storage migration:

    https://www.redhat.com/archives/libvir-list/2017-April/msg00350.html
    -- NBD-based storage migration fails with "error: invalid argument:
    monitor must not be NULL"

Meanwhile, I'm about it test with plain QMP.

> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/block-backend.c | 40 +++++++++++++++++++++++++++++++++++++++-
>  include/block/block.h |  2 ++
>  migration/migration.c |  8 ++++++++
>  qmp.c                 |  6 ++++++
>  4 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 0b63773..f817040 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -61,6 +61,7 @@ struct BlockBackend {
>  
>      uint64_t perm;
>      uint64_t shared_perm;
> +    bool disable_perm;
>  
>      bool allow_write_beyond_eof;
>  
> @@ -578,7 +579,7 @@ int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm,
>  {
>      int ret;
>  
> -    if (blk->root) {
> +    if (blk->root && !blk->disable_perm) {
>          ret = bdrv_child_try_set_perm(blk->root, perm, shared_perm, errp);
>          if (ret < 0) {
>              return ret;
> @@ -597,15 +598,52 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
>      *shared_perm = blk->shared_perm;
>  }
>  
> +/*
> + * Notifies the user of all BlockBackends that migration has completed. qdev
> + * devices can tighten their permissions in response (specifically revoke
> + * shared write permissions that we needed for storage migration).
> + *
> + * If an error is returned, the VM cannot be allowed to be resumed.
> + */
> +void blk_resume_after_migration(Error **errp)
> +{
> +    BlockBackend *blk;
> +    Error *local_err = NULL;
> +
> +    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
> +        if (!blk->disable_perm) {
> +            continue;
> +        }
> +
> +        blk->disable_perm = false;
> +
> +        blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
> +        if (local_err) {
> +            error_propagate(errp, local_err);
> +            blk->disable_perm = true;
> +            return;
> +        }
> +    }
> +}
> +
>  static int blk_do_attach_dev(BlockBackend *blk, void *dev)
>  {
>      if (blk->dev) {
>          return -EBUSY;
>      }
> +
> +    /* While migration is still incoming, we don't need to apply the
> +     * permissions of guest device BlockBackends. We might still have a block
> +     * job or NBD server writing to the image for storage migration. */
> +    if (runstate_check(RUN_STATE_INMIGRATE)) {
> +        blk->disable_perm = true;
> +    }
> +
>      blk_ref(blk);
>      blk->dev = dev;
>      blk->legacy_dev = false;
>      blk_iostatus_reset(blk);
> +
>      return 0;
>  }
>  
> diff --git a/include/block/block.h b/include/block/block.h
> index 5149260..3e09222 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -366,6 +366,8 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
>  void bdrv_invalidate_cache_all(Error **errp);
>  int bdrv_inactivate_all(void);
>  
> +void blk_resume_after_migration(Error **errp);
> +
>  /* Ensure contents are flushed to disk.  */
>  int bdrv_flush(BlockDriverState *bs);
>  int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
> diff --git a/migration/migration.c b/migration/migration.c
> index 54060f7..ad4036f 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -349,6 +349,14 @@ static void process_incoming_migration_bh(void *opaque)
>          exit(EXIT_FAILURE);
>      }
>  
> +    /* If we get an error here, just don't restart the VM yet. */
> +    blk_resume_after_migration(&local_err);
> +    if (local_err) {
> +        error_free(local_err);
> +        local_err = NULL;
> +        autostart = false;
> +    }
> +
>      /*
>       * This must happen after all error conditions are dealt with and
>       * we're sure the VM is going to be running on this host.
> diff --git a/qmp.c b/qmp.c
> index fa82b59..a744e44 100644
> --- a/qmp.c
> +++ b/qmp.c
> @@ -207,6 +207,12 @@ void qmp_cont(Error **errp)
>          }
>      }
>  
> +    blk_resume_after_migration(&local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
>      if (runstate_check(RUN_STATE_INMIGRATE)) {
>          autostart = 1;
>      } else {
> -- 
> 1.8.3.1
> 
> 

-- 
/kashyap

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
  2017-04-04 15:35 [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration Kevin Wolf
                   ` (2 preceding siblings ...)
  2017-04-06 17:16 ` [Qemu-devel] [Qemu-block] " Kashyap Chamarthy
@ 2017-04-06 17:26 ` Kashyap Chamarthy
  2017-04-06 18:46   ` Kashyap Chamarthy
  3 siblings, 1 reply; 8+ messages in thread
From: Kashyap Chamarthy @ 2017-04-06 17:26 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, armband, Ciprian.Barbu, qemu-devel, mreitz,
	Alexandru.Avadanii, pbonzini

On Tue, Apr 04, 2017 at 05:35:56PM +0200, Kevin Wolf wrote:
> Usually guest devices don't like other writers to the same image, so
> they use blk_set_perm() to prevent this from happening. In the migration
> phase before the VM is actually running, though, they don't have a
> problem with writes to the image. On the other hand, storage migration
> needs to be able to write to the image in this phase, so the restrictive
> blk_set_perm() call of qdev devices breaks it.
> 
> This patch flags all BlockBackends with a qdev device as
> blk->disable_perm during incoming migration, which means that the
> requested permissions are stored in the BlockBackend, but not actually
> applied to its root node yet.
> 
> Once migration has finished and the VM should be resumed, the
> permissions are applied. If they cannot be applied (e.g. because the NBD
> server used for block migration hasn't been shut down), resuming the VM
> fails.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/block-backend.c | 40 +++++++++++++++++++++++++++++++++++++++-
>  include/block/block.h |  2 ++
>  migration/migration.c |  8 ++++++++
>  qmp.c                 |  6 ++++++
>  4 files changed, 55 insertions(+), 1 deletion(-)

With your fix applied, now I don't see the original error ("error:
internal error: unable to execute QEMU command 'nbd-server-add':
Conflicts with use by drive-virtio-disk0 as 'root', which does not allow
'write' on #block163"), and I can export a disk via `nbd-server-add`
with 'writeable' flag.  

However, with this fix, running `drive-mirror` seg-faults source QEMU.

Reproducer:

(1) On destination QEMU

$ /home/stack/build/build-qemu/x86_64-softmmu/qemu-system-x86_64 \
    -display none -nodefconfig -nodefaults -m 512 \
    -blockdev node-name=bar,driver=qcow2,file.driver=file,file.filename=./dst-disk.qcow2 \
    -serial unix:/tmp/monitor,server,nowait \
    -incoming tcp:localhost:6666 -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 93, "minor": 8, "major": 2}, "package": " (v2.9.0-rc3-3-g3a8624b)"}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet","data": { "host": "localhost", "port": "3333" } } } }
{"return": {}}
{ "execute": "nbd-server-add", "arguments": { "device": "bar","writable": true } }
{"return": {}}
/home/stack/src/qemu/nbd/server.c:nbd_receive_request():L706: read failed


(2) On source QEMU:

$  /home/stack/build/build-qemu/x86_64-softmmu/qemu-system-x86_64 \
    -display none -nodefconfig -nodefaults -m 512 \
    -device virtio-scsi-pci,id=scsi -device virtio-serial-pci \
    -blockdev node-name=foo,driver=qcow2,file.driver=file,file.filename=./cirros-0.3.5.qcow2
    -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 93, "minor": 8, "major": 2}, "package": " (v2.9.0-rc3-3-g3a8624b)"}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "drive-mirror", "arguments": { "device": "foo", "target": "nbd:localhost:3333:exportname=bar", "sync": "full","format": "raw", "mode": "existing" } }
Segmentation fault (core dumped)


[...]

-- 
/kashyap

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration
  2017-04-06 17:26 ` Kashyap Chamarthy
@ 2017-04-06 18:46   ` Kashyap Chamarthy
  0 siblings, 0 replies; 8+ messages in thread
From: Kashyap Chamarthy @ 2017-04-06 18:46 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, armband, Ciprian.Barbu, qemu-devel, mreitz,
	Alexandru.Avadanii, pbonzini

On Thu, Apr 06, 2017 at 07:26:15PM +0200, Kashyap Chamarthy wrote:
> On Tue, Apr 04, 2017 at 05:35:56PM +0200, Kevin Wolf wrote:
> > Usually guest devices don't like other writers to the same image, so
> > they use blk_set_perm() to prevent this from happening. In the migration
> > phase before the VM is actually running, though, they don't have a
> > problem with writes to the image. On the other hand, storage migration
> > needs to be able to write to the image in this phase, so the restrictive
> > blk_set_perm() call of qdev devices breaks it.
> > 
> > This patch flags all BlockBackends with a qdev device as
> > blk->disable_perm during incoming migration, which means that the
> > requested permissions are stored in the BlockBackend, but not actually
> > applied to its root node yet.
> > 
> > Once migration has finished and the VM should be resumed, the
> > permissions are applied. If they cannot be applied (e.g. because the NBD
> > server used for block migration hasn't been shut down), resuming the VM
> > fails.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/block-backend.c | 40 +++++++++++++++++++++++++++++++++++++++-
> >  include/block/block.h |  2 ++
> >  migration/migration.c |  8 ++++++++
> >  qmp.c                 |  6 ++++++
> >  4 files changed, 55 insertions(+), 1 deletion(-)
> 
> With your fix applied, now I don't see the original error ("error:
> internal error: unable to execute QEMU command 'nbd-server-add':
> Conflicts with use by drive-virtio-disk0 as 'root', which does not allow
> 'write' on #block163"), and I can export a disk via `nbd-server-add`
> with 'writeable' flag.  
> 
> However, with this fix, running `drive-mirror` seg-faults source QEMU.

TL;DR: Kevin / Max pointed out that segmentation fault should be fixed
by supplying a 'job-id' parameter to `drive-mirror`.  (Longer version,
refer below.)

So:

	Tested-by: Kashyap Chamarthy <kchamart@redhat.com>

> Reproducer:
> 
> (1) On destination QEMU
> 
> $ /home/stack/build/build-qemu/x86_64-softmmu/qemu-system-x86_64 \
>     -display none -nodefconfig -nodefaults -m 512 \
>     -blockdev node-name=bar,driver=qcow2,file.driver=file,file.filename=./dst-disk.qcow2 \
>     -serial unix:/tmp/monitor,server,nowait \
>     -incoming tcp:localhost:6666 -qmp stdio
> {"QMP": {"version": {"qemu": {"micro": 93, "minor": 8, "major": 2}, "package": " (v2.9.0-rc3-3-g3a8624b)"}, "capabilities": []}}
> { "execute": "qmp_capabilities" }
> {"return": {}}
> { "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet","data": { "host": "localhost", "port": "3333" } } } }
> {"return": {}}
> { "execute": "nbd-server-add", "arguments": { "device": "bar","writable": true } }
> {"return": {}}
> /home/stack/src/qemu/nbd/server.c:nbd_receive_request():L706: read failed
> 
> 
> (2) On source QEMU:
> 
> $  /home/stack/build/build-qemu/x86_64-softmmu/qemu-system-x86_64 \
>     -display none -nodefconfig -nodefaults -m 512 \
>     -device virtio-scsi-pci,id=scsi -device virtio-serial-pci \
>     -blockdev node-name=foo,driver=qcow2,file.driver=file,file.filename=./cirros-0.3.5.qcow2
>     -qmp stdio
> {"QMP": {"version": {"qemu": {"micro": 93, "minor": 8, "major": 2}, "package": " (v2.9.0-rc3-3-g3a8624b)"}, "capabilities": []}}
> { "execute": "qmp_capabilities" }
> {"return": {}}
> { "execute": "drive-mirror", "arguments": { "device": "foo", "target": "nbd:localhost:3333:exportname=bar", "sync": "full","format": "raw", "mode": "existing" } }
> Segmentation fault (core dumped)

Okay, there was a missing piece:  I was pointed out on IRC that the
`drive-mirror` QMP command was missing the 'job-id' (introduced by
commit: 71aa986 - "mirror: Add 'job-id' parameter to 'blockdev-mirror'
and 'drive-mirror'") parameter.  Kevin tells me that this is mandatory
_if_ you're using 'node-name' (which I was, in my '-blockdev'
command-line above).

And Max points out, the above should be caught by his:

	https://lists.nongnu.org/archive/html/qemu-block/2017-04/msg00059.html --
	[PATCH for-2.9 0/2] block/mirror: Fix use-after-free

And sure enough, adding the 'job-id' to `drive-mirror` on source will
let `drive-mirror` proceed succesfully: 

[...]
{ "execute": "drive-mirror", "arguments": { "device": "foo", "job-id": "job-0", "target": "nbd:localhost:3333:exportname=bar", "sync": "full","format": "raw", "mode": "existing" } }
{"return": {}}
{"timestamp": {"seconds": 1491498318, "microseconds": 435816}, "event": "BLOCK_JOB_READY", "data": {"device": "job-0", "len": 41126400, "offset": 41126400, "speed": 0, "type": "mirror"}}


Issue a `block-job-complete` (or `block-job-cancel) to end the running
mirror job:

[...]
{"execute": "block-job-complete", "arguments": {"device": "job-0"} }
{"return": {}}
{"timestamp": {"seconds": 1491500183, "microseconds": 768746}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job-0", "len": 41126400, "offset": 41126400, "speed": 0, "type": "mirror"}}

Then, stop the NBD server on the destination QEMU:

[...]
{"execute":"nbd-server-stop"}
{"return": {}}


* * *

Just writing for completeness' sake, if you try to stop the NBD server
on the destination, _without_ gracefully completing or cancelling the
`drive-mirror` job on the source, you see the below:

[...]
{"execute":"nbd-server-stop"}
{"return": {}}
/home/stack/src/qemu/nbd/server.c:nbd_receive_request():L706: read failed
[...]

Kevin and Max (thanks!) are aware of this on IRC, and are investigating
this.


-- 
/kashyap

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-04-06 18:47 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-04 15:35 [Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration Kevin Wolf
2017-04-05 13:22 ` Max Reitz
2017-04-06 11:22   ` Kevin Wolf
2017-04-06 11:31     ` Daniel P. Berrange
2017-04-06 11:15 ` Kevin Wolf
2017-04-06 17:16 ` [Qemu-devel] [Qemu-block] " Kashyap Chamarthy
2017-04-06 17:26 ` Kashyap Chamarthy
2017-04-06 18:46   ` Kashyap Chamarthy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).