qemu-devel.nongnu.org archive mirror
* [PATCH] migration: Don't free the reason after calling migrate_add_blocker
@ 2025-10-24  9:28 Bin Guo
  2025-10-24 11:15 ` Markus Armbruster
  0 siblings, 1 reply; 13+ messages in thread
From: Bin Guo @ 2025-10-24  9:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas

Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
---
 hw/intc/arm_gicv3_kvm.c | 1 -
 target/i386/sev.c       | 1 -
 2 files changed, 2 deletions(-)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 66b0dddfd4..6f311e37ef 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -841,7 +841,6 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
         error_setg(&kvm_nv_migration_blocker,
                    "Live migration disabled because KVM nested virt is enabled");
         if (migrate_add_blocker(&kvm_nv_migration_blocker, errp)) {
-            error_free(kvm_nv_migration_blocker);
             return;
         }
 
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 1057b8ab2c..fd2dada013 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1661,7 +1661,6 @@ sev_snp_launch_finish(SevCommonState *sev_common)
     ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
     if (local_err) {
         error_report_err(local_err);
-        error_free(sev_mig_blocker);
         exit(1);
     }
 }
-- 
2.39.5 (Apple Git-154)




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24  9:28 [PATCH] migration: Don't free the reason after calling migrate_add_blocker Bin Guo
@ 2025-10-24 11:15 ` Markus Armbruster
  2025-10-24 11:27   ` Daniel P. Berrangé
  2025-10-24 11:53   ` Bin Guo
  0 siblings, 2 replies; 13+ messages in thread
From: Markus Armbruster @ 2025-10-24 11:15 UTC (permalink / raw)
  To: Bin Guo; +Cc: qemu-devel, peterx, farosas

Bin Guo <guobin@linux.alibaba.com> writes:

> Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
> ---
>  hw/intc/arm_gicv3_kvm.c | 1 -
>  target/i386/sev.c       | 1 -
>  2 files changed, 2 deletions(-)
>
> diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
> index 66b0dddfd4..6f311e37ef 100644
> --- a/hw/intc/arm_gicv3_kvm.c
> +++ b/hw/intc/arm_gicv3_kvm.c
> @@ -841,7 +841,6 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>          error_setg(&kvm_nv_migration_blocker,
>                     "Live migration disabled because KVM nested virt is enabled");
>          if (migrate_add_blocker(&kvm_nv_migration_blocker, errp)) {
> -            error_free(kvm_nv_migration_blocker);
>              return;
>          }
>  
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 1057b8ab2c..fd2dada013 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -1661,7 +1661,6 @@ sev_snp_launch_finish(SevCommonState *sev_common)
>      ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
>      if (local_err) {
>          error_report_err(local_err);
> -        error_free(sev_mig_blocker);
>          exit(1);
>      }
>  }

Does this fix use-after-free bugs?




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 11:15 ` Markus Armbruster
@ 2025-10-24 11:27   ` Daniel P. Berrangé
  2025-10-24 13:41     ` Markus Armbruster
  2025-10-24 11:53   ` Bin Guo
  1 sibling, 1 reply; 13+ messages in thread
From: Daniel P. Berrangé @ 2025-10-24 11:27 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Bin Guo, qemu-devel, peterx, farosas

On Fri, Oct 24, 2025 at 01:15:40PM +0200, Markus Armbruster wrote:
> Bin Guo <guobin@linux.alibaba.com> writes:
> 
> > Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
> > ---
> >  hw/intc/arm_gicv3_kvm.c | 1 -
> >  target/i386/sev.c       | 1 -
> >  2 files changed, 2 deletions(-)
> >
> > diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
> > index 66b0dddfd4..6f311e37ef 100644
> > --- a/hw/intc/arm_gicv3_kvm.c
> > +++ b/hw/intc/arm_gicv3_kvm.c
> > @@ -841,7 +841,6 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
> >          error_setg(&kvm_nv_migration_blocker,
> >                     "Live migration disabled because KVM nested virt is enabled");
> >          if (migrate_add_blocker(&kvm_nv_migration_blocker, errp)) {
> > -            error_free(kvm_nv_migration_blocker);
> >              return;
> >          }
> >  
> > diff --git a/target/i386/sev.c b/target/i386/sev.c
> > index 1057b8ab2c..fd2dada013 100644
> > --- a/target/i386/sev.c
> > +++ b/target/i386/sev.c
> > @@ -1661,7 +1661,6 @@ sev_snp_launch_finish(SevCommonState *sev_common)
> >      ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
> >      if (local_err) {
> >          error_report_err(local_err);
> > -        error_free(sev_mig_blocker);
> >          exit(1);
> >      }
> >  }
> 
> Does this fix use-after-free bugs?

I don't think so, because when migrate_add_blocker() returns an error,
the Error for the blocker will have been propagated into the errp
parameter, and the caller's blocker pointer set to NULL. So these two
error_free() calls should be no-ops.
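
To illustrate, here is a minimal toy model of that contract. It is a
sketch, not QEMU code: ToyError, toy_add_blocker() and the "busy" flag
are names made up for the example, and only the failure path discussed
here is exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for QEMU's Error; only the message is modelled. */
typedef struct ToyError {
    char *msg;
} ToyError;

static ToyError *toy_error_new(const char *msg)
{
    size_t len = strlen(msg) + 1;
    ToyError *err = malloc(sizeof(*err));

    assert(err);
    err->msg = malloc(len);
    assert(err->msg);
    memcpy(err->msg, msg, len);
    return err;
}

/* Like error_free(): freeing NULL does nothing. */
static void toy_error_free(ToyError *err)
{
    if (err) {
        free(err->msg);
        free(err);
    }
}

static ToyError *toy_installed;  /* hypothetical single-slot blocker list */

/*
 * Hypothetical model of the add-blocker contract.
 * On failure the reason moves into *errp and *reasonp is cleared.
 * On success the reason is stored and *reasonp is left untouched.
 */
static int toy_add_blocker(ToyError **reasonp, ToyError **errp, bool busy)
{
    if (busy) {
        *errp = *reasonp;
        *reasonp = NULL;
        return -1;
    }
    toy_installed = *reasonp;
    return 0;
}

/* Returns true if the caller's pointer was NULL after a failed add. */
static bool toy_failed_add_clears_reason(void)
{
    ToyError *reason = toy_error_new("nested virt enabled");
    ToyError *err = NULL;
    int ret = toy_add_blocker(&reason, &err, true);
    bool cleared = (ret < 0 && reason == NULL && err != NULL);

    toy_error_free(reason);  /* no-op: reason is NULL here */
    toy_error_free(err);     /* err now owns the message */
    return cleared;
}
```

In this model the extra free after a failed add is harmless but also
pointless, which matches the two call sites touched by the patch.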

But wow, the migrate_add_blocker API design is unpleasant with its
pair of "Error **" parameters - it is practically designed to
maximise confusion & surprise.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 11:15 ` Markus Armbruster
  2025-10-24 11:27   ` Daniel P. Berrangé
@ 2025-10-24 11:53   ` Bin Guo
  2025-10-24 13:40     ` Markus Armbruster
  1 sibling, 1 reply; 13+ messages in thread
From: Bin Guo @ 2025-10-24 11:53 UTC (permalink / raw)
  To: armbru; +Cc: farosas, peterx, qemu-devel, berrange

Markus Armbruster <armbru@redhat.com> writes:

> > Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
> > ---
> >  hw/intc/arm_gicv3_kvm.c | 1 -
> >  target/i386/sev.c       | 1 -
> >  2 files changed, 2 deletions(-)
> >
> > diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
> > index 66b0dddfd4..6f311e37ef 100644
> > --- a/hw/intc/arm_gicv3_kvm.c
> > +++ b/hw/intc/arm_gicv3_kvm.c
> > @@ -841,7 +841,6 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
> >          error_setg(&kvm_nv_migration_blocker,
> >                     "Live migration disabled because KVM nested virt is enabled");
> >          if (migrate_add_blocker(&kvm_nv_migration_blocker, errp)) {
> > -            error_free(kvm_nv_migration_blocker);
> >              return;
> >          }
> >  
> > diff --git a/target/i386/sev.c b/target/i386/sev.c
> > index 1057b8ab2c..fd2dada013 100644
> > --- a/target/i386/sev.c
> > +++ b/target/i386/sev.c
> > @@ -1661,7 +1661,6 @@ sev_snp_launch_finish(SevCommonState *sev_common)
> >      ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
> >      if (local_err) {
> >          error_report_err(local_err);
> > -        error_free(sev_mig_blocker);
> >          exit(1);
> >      }
> >  }
> 
> Does this fix use-after-free bugs?

No, it just deletes unnecessary code and follows best practice.
migrate_add_blocker() will free the reason and set it to NULL
if it returns failure.




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 11:53   ` Bin Guo
@ 2025-10-24 13:40     ` Markus Armbruster
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2025-10-24 13:40 UTC (permalink / raw)
  To: Bin Guo; +Cc: farosas, peterx, qemu-devel, berrange

Bin Guo <guobin@linux.alibaba.com> writes:

> Markus Armbruster <armbru@redhat.com> writes:
>
>> > Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
>> > ---
>> >  hw/intc/arm_gicv3_kvm.c | 1 -
>> >  target/i386/sev.c       | 1 -
>> >  2 files changed, 2 deletions(-)
>> >
>> > diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
>> > index 66b0dddfd4..6f311e37ef 100644
>> > --- a/hw/intc/arm_gicv3_kvm.c
>> > +++ b/hw/intc/arm_gicv3_kvm.c
>> > @@ -841,7 +841,6 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>> >          error_setg(&kvm_nv_migration_blocker,
>> >                     "Live migration disabled because KVM nested virt is enabled");
>> >          if (migrate_add_blocker(&kvm_nv_migration_blocker, errp)) {
>> > -            error_free(kvm_nv_migration_blocker);
>> >              return;
>> >          }
>> >  
>> > diff --git a/target/i386/sev.c b/target/i386/sev.c
>> > index 1057b8ab2c..fd2dada013 100644
>> > --- a/target/i386/sev.c
>> > +++ b/target/i386/sev.c
>> > @@ -1661,7 +1661,6 @@ sev_snp_launch_finish(SevCommonState *sev_common)
>> >      ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
>> >      if (local_err) {
>> >          error_report_err(local_err);
>> > -        error_free(sev_mig_blocker);
>> >          exit(1);
>> >      }
>> >  }
>> 
>> Does this fix use-after-free bugs?
>
> No, it just deletes unnecessary code and follows best practice.
> migrate_add_blocker() will free the reason and set it to NULL
> if it returns failure.

Please work the second sentence into the commit message.




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 11:27   ` Daniel P. Berrangé
@ 2025-10-24 13:41     ` Markus Armbruster
  2025-10-24 14:08       ` Markus Armbruster
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Armbruster @ 2025-10-24 13:41 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Bin Guo, qemu-devel, peterx, farosas

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Oct 24, 2025 at 01:15:40PM +0200, Markus Armbruster wrote:
>> Bin Guo <guobin@linux.alibaba.com> writes:
>> 
>> > Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
>> > ---
>> >  hw/intc/arm_gicv3_kvm.c | 1 -
>> >  target/i386/sev.c       | 1 -
>> >  2 files changed, 2 deletions(-)
>> >
>> > diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
>> > index 66b0dddfd4..6f311e37ef 100644
>> > --- a/hw/intc/arm_gicv3_kvm.c
>> > +++ b/hw/intc/arm_gicv3_kvm.c
>> > @@ -841,7 +841,6 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>> >          error_setg(&kvm_nv_migration_blocker,
>> >                     "Live migration disabled because KVM nested virt is enabled");
>> >          if (migrate_add_blocker(&kvm_nv_migration_blocker, errp)) {
>> > -            error_free(kvm_nv_migration_blocker);
>> >              return;
>> >          }
>> >  
>> > diff --git a/target/i386/sev.c b/target/i386/sev.c
>> > index 1057b8ab2c..fd2dada013 100644
>> > --- a/target/i386/sev.c
>> > +++ b/target/i386/sev.c
>> > @@ -1661,7 +1661,6 @@ sev_snp_launch_finish(SevCommonState *sev_common)
>> >      ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
>> >      if (local_err) {
>> >          error_report_err(local_err);
>> > -        error_free(sev_mig_blocker);
>> >          exit(1);
>> >      }
>> >  }
>> 
>> Does this fix use-after-free bugs?
>
> I don't think so, because when migrate_add_blocker() returns error,
> the Error for the blocker will have been propagated into the errp
> parameter, and then set to NULL. So these two error_free calls
> should be a no-op.
>
> But wow, the migrate_add_blocker API design is unpleasant with its
> pair of "Error **" parameters - it is practically designed to
> maximise confusion & surprise.

It's quite a sight, isn't it?

I'll give it a quick Friday afternoon try.




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 13:41     ` Markus Armbruster
@ 2025-10-24 14:08       ` Markus Armbruster
  2025-10-24 16:17         ` Peter Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Armbruster @ 2025-10-24 14:08 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Bin Guo, qemu-devel, peterx, farosas

Markus Armbruster <armbru@redhat.com> writes:

> Daniel P. Berrangé <berrange@redhat.com> writes:

[...]

>> But wow, the migrate_add_blocker API design is unpleasant with its
>> pair of "Error **" parameters - it is practically designed to
>> maximise confusion & surprise.
>
> It's quite a sight, isn't it?
>
> I'll give it a quick Friday afternoon try.

Alright, my confusion has been maximised.  Giving up on this.




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 14:08       ` Markus Armbruster
@ 2025-10-24 16:17         ` Peter Xu
  2025-10-24 16:31           ` Daniel P. Berrangé
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2025-10-24 16:17 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Daniel P. Berrangé, Bin Guo, qemu-devel, farosas

On Fri, Oct 24, 2025 at 04:08:39PM +0200, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> [...]
> 
> >> But wow, the migrate_add_blocker API design is unpleasant with its
> >> pair of "Error **" parameters - it is practically designed to
> >> maximise confusion & surprise.
> >
> > It's quite a sight, isn't it?
> >
> > I'll give it a quick Friday afternoon try.
> 
> Alright, my confusion has been maximised.  Giving up on this.

Besides the use of two Error** that might be confusing, what is more
confusing (if not wrong..): migrate_add_blocker() will take ownership of
the 1st Error**, no matter whether the helper succeeded or not. However, it
only resets the first Error** if failed.

I think it means if migrate_add_blocker() succeeded, the caller will have a
non-NULL pointer, even if it has lost the ownership of that pointer.

I'm guessing it never caused issue only because we don't usually
error_free() the migration blocker anywhere.. but I think maybe we should
at least do an error_copy() in add_blockers()..
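
A rough sketch of what I mean by copying, as a toy model rather than
the real code: ToyError and toy_add_blocker_copy() are made-up names,
and the single-slot "list" is only there to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for QEMU's Error; only the message is modelled. */
typedef struct ToyError {
    char *msg;
} ToyError;

static ToyError *toy_error_new(const char *msg)
{
    size_t len = strlen(msg) + 1;
    ToyError *err = malloc(sizeof(*err));

    assert(err);
    err->msg = malloc(len);
    assert(err->msg);
    memcpy(err->msg, msg, len);
    return err;
}

static void toy_error_free(ToyError *err)
{
    if (err) {
        free(err->msg);
        free(err);
    }
}

static ToyError *toy_installed;  /* hypothetical single-slot blocker list */

/* Copying variant: the list owns a private copy, the caller keeps its own. */
static void toy_add_blocker_copy(const ToyError *reason)
{
    toy_error_free(toy_installed);
    toy_installed = toy_error_new(reason->msg);
}

/* Returns true if the stored blocker is intact after the caller frees. */
static bool toy_copy_survives_caller_free(void)
{
    ToyError *reason = toy_error_new("cannot migrate");
    bool ok;

    toy_add_blocker_copy(reason);
    toy_error_free(reason);  /* safe: the list never aliased this object */
    ok = strcmp(toy_installed->msg, "cannot migrate") == 0;
    toy_error_free(toy_installed);
    toy_installed = NULL;
    return ok;
}
```

With a copy, a successful add no longer leaves the caller holding a
pointer it does not own. The catch is that the stored object is then
no longer the caller's pointer, so removing the blocker later would
need some other way to identify it.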

-- 
Peter Xu




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 16:17         ` Peter Xu
@ 2025-10-24 16:31           ` Daniel P. Berrangé
  2025-10-24 18:15             ` Fabiano Rosas
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel P. Berrangé @ 2025-10-24 16:31 UTC (permalink / raw)
  To: Peter Xu; +Cc: Markus Armbruster, Bin Guo, qemu-devel, farosas

On Fri, Oct 24, 2025 at 12:17:20PM -0400, Peter Xu wrote:
> On Fri, Oct 24, 2025 at 04:08:39PM +0200, Markus Armbruster wrote:
> > Markus Armbruster <armbru@redhat.com> writes:
> > 
> > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > 
> > [...]
> > 
> > >> But wow, the migrate_add_blocker API design is unpleasant with its
> > >> pair of "Error **" parameters - it is practically designed to
> > >> maximise confusion & surprise.
> > >
> > > It's quite a sight, isn't it?
> > >
> > > I'll give it a quick Friday afternoon try.
> > 
> > Alright, my confusion has been maximised.  Giving up on this.
> 
> Besides the use of two Error** that might be confusing, what is more
> confusing (if not wrong..): migrate_add_blocker() will take ownership of
> the 1st Error**, no matter whether the helper succeeded or not. However, it
> only resets the first Error** if failed.
> 
> I think it means if migrate_add_blocker() succeeded, the caller will have a
> non-NULL pointer, even if it has lost the ownership of that pointer.
> 
> I'm guessing it never caused issue only because we don't usually
> error_free() the migration blocker anywhere.. but I think maybe we should
> at least do an error_copy() in add_blockers()..

IMHO we should not even be using an Error object for the blocker.
AFAICT, internally all we care about is the formatted string. The main
reason for using an Error object appears to be to have a convenient
pointer to use as an identifier to later pass to del_blocker.

I'd be inclined to just have passed in a fixed string, and return an
integer identifier for the blocker. eg

    int64 migrate_add_blocker(const char *reason, Error **errp);

    void migrate_del_blocker(int64 blockerid);

The migrate_add_blocker method would strdup(reason) to keep its own
copy.

The usage would thus be clear & simple:

    int64 blockerid = migrate_add_blocker("cannot migrate vfio", errp);
    if (!blockerid) {
         return;
    }

    ... some time later...

    migrate_del_blocker(blockerid);


In some cases we needed dynamically formatted strings, which could have
been achieved thus:

    g_autofree char *msg = g_strdup_printf("cannot migrate vfio %d", blah);
    int64 blockerid = migrate_add_blocker(msg, errp);
    ...the rest as above...

yes, this costs an extra strdup(), but that is an acceptable & negligible
overhead in the context in which we're doing this.
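
For what it's worth, the bookkeeping behind such an API could be as
small as the sketch below. Everything here is hypothetical: the toy_
names, the fixed-size table and the use of int64_t from <stdint.h> are
assumptions for illustration, and the errp reporting is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_BLOCKERS 16

typedef struct ToyBlocker {
    int64_t id;    /* 0 means the slot is free */
    char *reason;  /* private copy owned by the registry */
} ToyBlocker;

static ToyBlocker toy_blockers[TOY_MAX_BLOCKERS];
static int64_t toy_next_id = 1;

/* Returns a non-zero id on success, 0 if the table is full or out of memory. */
static int64_t toy_blocker_add(const char *reason)
{
    for (int i = 0; i < TOY_MAX_BLOCKERS; i++) {
        if (toy_blockers[i].id == 0) {
            size_t len = strlen(reason) + 1;
            char *copy = malloc(len);

            if (!copy) {
                return 0;
            }
            memcpy(copy, reason, len);
            toy_blockers[i].reason = copy;
            toy_blockers[i].id = toy_next_id++;
            return toy_blockers[i].id;
        }
    }
    return 0;
}

/* Removes the blocker with the given id; unknown ids and 0 are ignored. */
static void toy_blocker_del(int64_t id)
{
    if (id == 0) {
        return;
    }
    for (int i = 0; i < TOY_MAX_BLOCKERS; i++) {
        if (toy_blockers[i].id == id) {
            free(toy_blockers[i].reason);
            toy_blockers[i].reason = NULL;
            toy_blockers[i].id = 0;
            return;
        }
    }
}

/* Number of blockers currently installed. */
static int toy_blocker_count(void)
{
    int n = 0;

    for (int i = 0; i < TOY_MAX_BLOCKERS; i++) {
        if (toy_blockers[i].id != 0) {
            n++;
        }
    }
    return n;
}
```

The caller never hands over ownership of anything: it passes a string,
keeps an integer, and the registry frees its own copy on removal.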

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 16:31           ` Daniel P. Berrangé
@ 2025-10-24 18:15             ` Fabiano Rosas
  2025-10-27 10:25               ` Daniel P. Berrangé
  0 siblings, 1 reply; 13+ messages in thread
From: Fabiano Rosas @ 2025-10-24 18:15 UTC (permalink / raw)
  To: Daniel P. Berrangé, Peter Xu; +Cc: Markus Armbruster, Bin Guo, qemu-devel

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Oct 24, 2025 at 12:17:20PM -0400, Peter Xu wrote:
>> On Fri, Oct 24, 2025 at 04:08:39PM +0200, Markus Armbruster wrote:
>> > Markus Armbruster <armbru@redhat.com> writes:
>> > 
>> > > Daniel P. Berrangé <berrange@redhat.com> writes:
>> > 
>> > [...]
>> > 
>> > >> But wow, the migrate_add_blocker API design is unpleasant with its
>> > >> pair of "Error **" parameters - it is practically designed to
>> > >> maximise confusion & surprise.
>> > >
>> > > It's quite a sight, isn't it?
>> > >
>> > > I'll give it a quick Friday afternoon try.
>> > 
>> > Alright, my confusion has been maximised.  Giving up on this.
>> 
>> Besides the use of two Error** that might be confusing, what is more
>> confusing (if not wrong..): migrate_add_blocker() will take ownership of
>> the 1st Error**, no matter whether the helper succeeded or not. However, it
>> only resets the first Error** if failed.
>> 
>> I think it means if migrate_add_blocker() succeeded, the caller will have a
>> non-NULL pointer, even if it has lost the ownership of that pointer.
>> 
>> I'm guessing it never caused issue only because we don't usually
>> error_free() the migration blocker anywhere.. but I think maybe we should
>> at least do an error_copy() in add_blockers()..
>
> IMHO we should not even be using an Error object for the blocker.
> AFAICT, internally all we care about is the formatted string. The main
> reason for using an Error object appears to be to have a convenient
> pointer to use as an identifier to later pass to del_blocker.
>
> I'd be inclined to just have passed in a fixed string, and return an
> integer identifier for the blocker. eg
>
>     int64 migrate_add_blocker(const char *reason, Error **errp);
>
>     void migrate_del_blocker(int64 blockerid);
>
> The migrate_add_blocker method would strdup(reason) to keep its own
> copy.
>
> The usage would thus be clear & simple:
>
>     int64 blockerid = migrate_add_blocker("cannot migrate vfio", errp);
>     if (!blockerid) {
>          return;
>     }
>
>     ... some time later...
>
>     migrate_del_blocker(blockerid);
>
>
> In some cases we needed dynamically formatted strings, which could have
> been achieved thus:
>
>     g_autofree char *msg = g_strdup_printf("cannot migrate vfio %d", blah);
>     int64 blockerid = migrate_add_blocker(msg, errp);
>     ...the rest as above...
>
> yes, this costs an extra strdup(), but that is an acceptable & negligible
> overhead in the context in which we're doing this.
>

Hmm, I must disagree. This is more complex than what we have
today. Calling error_setg(err, "msg") is pretty standard, already gives
us formatting and keeps all (potentially) user-facing messages uniform.

Asking for people to deal with strings and storing an int64 in their
code is not improving the situation. Besides, the Error is already used
by the block layer when blocking operations, for instance. If anything
we should be integrating the two usages instead of inventing yet another
for the migration code. See:

replication.c:
  error_setg(&s->blocker,
             "Block device is in use by internal backup job");
  ...
  bdrv_op_block_all(top_bs, s->blocker);

block.c:
  void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
  {
      BdrvOpBlocker *blocker;
      assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);

      blocker = g_new0(BdrvOpBlocker, 1);
      blocker->reason = reason;
      QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
  }


> With regards,
> Daniel



* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-24 18:15             ` Fabiano Rosas
@ 2025-10-27 10:25               ` Daniel P. Berrangé
  2025-10-27 12:32                 ` Markus Armbruster
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel P. Berrangé @ 2025-10-27 10:25 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Peter Xu, Markus Armbruster, Bin Guo, qemu-devel

On Fri, Oct 24, 2025 at 03:15:05PM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Fri, Oct 24, 2025 at 12:17:20PM -0400, Peter Xu wrote:
> >> On Fri, Oct 24, 2025 at 04:08:39PM +0200, Markus Armbruster wrote:
> >> > Markus Armbruster <armbru@redhat.com> writes:
> >> > 
> >> > > Daniel P. Berrangé <berrange@redhat.com> writes:
> >> > 
> >> > [...]
> >> > 
> >> > >> But wow, the migrate_add_blocker API design is unpleasant with its
> >> > >> pair of "Error **" parameters - it is practically designed to
> >> > >> maximise confusion & surprise.
> >> > >
> >> > > It's quite a sight, isn't it?
> >> > >
> >> > > I'll give it a quick Friday afternoon try.
> >> > 
> >> > Alright, my confusion has been maximised.  Giving up on this.
> >> 
> >> Besides the use of two Error** that might be confusing, what is more
> >> confusing (if not wrong..): migrate_add_blocker() will take ownership of
> >> the 1st Error**, no matter whether the helper succeeded or not. However, it
> >> only resets the first Error** if failed.
> >> 
> >> I think it means if migrate_add_blocker() succeeded, the caller will have a
> >> non-NULL pointer, even if it has lost the ownership of that pointer.
> >> 
> >> I'm guessing it never caused issue only because we don't usually
> >> error_free() the migration blocker anywhere.. but I think maybe we should
> >> at least do an error_copy() in add_blockers()..
> >
> > IMHO we should not even be using an Error object for the blocker.
> > AFAICT, internally all we care about is the formatted string. The main
> > reason for using an Error object appears to be to have a convenient
> > pointer to use as an identifier to later pass to del_blocker.
> >
> > I'd be inclined to just have passed in a fixed string, and return an
> > integer identifier for the blocker. eg
> >
> >     int64 migrate_add_blocker(const char *reason, Error **errp);
> >
> >     void migrate_del_blocker(int64 blockerid);
> >
> > The migrate_add_blocker method would strdup(reason) to keep its own
> > copy.
> >
> > The usage would thus be clear & simple:
> >
> >     int64 blockerid = migrate_add_blocker("cannot migrate vfio", errp);
> >     if (!blockerid) {
> >          return;
> >     }
> >
> >     ... some time later...
> >
> >     migrate_del_blocker(blockerid);
> >
> >
> > In some cases we needed dynamically formatted strings, which could have
> > been achieved thus:
> >
> >     g_autofree char *msg = g_strdup_printf("cannot migrate vfio %d", blah);
> >     int64 blockerid = migrate_add_blocker(msg, errp);
> >     ...the rest as above...
> >
> > yes, this costs an extra strdup(), but that is an acceptable & negligible
> > overhead in the context in which we're doing this.
> >
> 
> Hmm, I must disagree. This is more complex than what we have
> today. Calling error_setg(err, "msg") is pretty standard, already gives
> us formatting and keeps all (potentially) user-facing messages uniform.

IMHO this usage in migration is not really about error reporting
though, and the lifecycle ownership of the Error objects in this
migration usage is very different from the typical lifecycle
ownership of Error objects used in reporting errors, which I think
leads to a surprising / unusual API.

> Asking for people to deal with strings and storing an int64 in their
> code is not improving the situation. Besides, the Error is already used
> by the block layer when blocking operations, for instance. If anything
> we should be integrating the two usages instead of inventing yet another
> for the migration code. See:

Yes, having a common API for these two similar use cases would be
a useful thing. I'm just not convinced we should be (mis|re)using
the Error object for either of these two situations.

> 
> replication.c:
>   error_setg(&s->blocker,
>              "Block device is in use by internal backup job");
>   ...
>   bdrv_op_block_all(top_bs, s->blocker);
> 
> block.c:
>   void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
>   {
>       BdrvOpBlocker *blocker;
>       assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
> 
>       blocker = g_new0(BdrvOpBlocker, 1);
>       blocker->reason = reason;
>       QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
>   }


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-27 10:25               ` Daniel P. Berrangé
@ 2025-10-27 12:32                 ` Markus Armbruster
  2025-10-27 13:54                   ` Fabiano Rosas
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Armbruster @ 2025-10-27 12:32 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Fabiano Rosas, Peter Xu, Bin Guo, qemu-devel

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Oct 24, 2025 at 03:15:05PM -0300, Fabiano Rosas wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> > IMHO we should not even be using an Error object for the blocker.
>> > AFAICT, internally all we care about is the formatted string. The main
>> > reason for using an Error object appears to be to have a convenient
>> > pointer to use as an identifier to later pass to del_blocker.
>> >
>> > I'd be inclined to just have passed in a fixed string, and return an
>> > integer identifier for the blocker. eg
>> >
>> >     int64 migrate_add_blocker(const char *reason, Error **errp);
>> >
>> >     void migrate_del_blocker(int64 blockerid);
>> >
>> > The migrate_add_blocker method would strdup(reason) to keep its own
>> > copy.
>> >
>> > The usage would thus be clear & simple:
>> >
>> >     int64 blockerid = migrate_add_blocker("cannot migrate vfio", errp);
>> >     if (!blockerid) {
>> >          return;
>> >     }
>> >
>> >     ... some time later...
>> >
>> >     migrate_del_blocker(blockerid);
>> >
>> >
>> > In some cases we needed dynamically formatted strings, which could have
>> > been achieved thus:
>> >
>> >     g_autofree char *msg = g_strdup_printf("cannot migrate vfio %d", blah);
>> >     int64 blockerid = migrate_add_blocker(msg, errp);
>> >     ...the rest as above...
>> >
>> > yes, this costs an extra strdup(), but that is an acceptable & negligible
>> > overhead in the context in which we're doing this.
>> >
>> 
>> Hmm, I must disagree. This is more complex than what we have
>> today. Calling error_setg(err, "msg") is pretty standard, already gives
>> us formatting and keeps all (potentially) user-facing messages uniform.
>
> IMHO this usage in migration is not really about error reporting
> though, and the lifecycle ownership of the Error objects in this
> migration usage is very different from the typical lifecycle
> ownership of Error objects used in reporting errors, which I think
> leads to a surprising / unusual API.

I think a blocker interface where you pass the error to use when the
blocker blocks something is defensible.

Passing an error message or even a text snippet to be interpolated into
the error message would also be defensible.

We're using the former, and it has turned out to be confusing.  Less so
in the block layer, where we sensibly pass Error *.  More so in
migration, where we pass Error **.  Error ** is almost always used to
receive an error, so when we use it for something else, we risk
confusion.

>> Asking for people to deal with strings and storing an int64 in their
>> code is not improving the situation. Besides, the Error is already used
>> by the block layer when blocking operations, for instance. If anything
>> we should be integrating the two usages instead of inventing yet another
>> for the migration code. See:
>
> Yes, having a common API for these two similar use cases would be
> a useful thing. I'm just not convinced we should be (mis|re)using
> the Error object for either of these two situations.

I guess we could have a generic Blocker object instead of using Error
for the purpose.

In addition to an error message, an Error object has an error class
(rarely used remnant of the past), where in the source code the Error
object was created (reported to the user when handling &error_abort),
and an optional hint.  Is any of this useful for blockers?

>> replication.c:
>>   error_setg(&s->blocker,
>>              "Block device is in use by internal backup job");
>>   ...
>>   bdrv_op_block_all(top_bs, s->blocker);
>> 
>> block.c:
>>   void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
>>   {
>>       BdrvOpBlocker *blocker;
>>       assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
>> 
>>       blocker = g_new0(BdrvOpBlocker, 1);
>>       blocker->reason = reason;
>>       QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
>>   }
>
>
> With regards,
> Daniel




* Re: [PATCH] migration: Don't free the reason after calling migrate_add_blocker
  2025-10-27 12:32                 ` Markus Armbruster
@ 2025-10-27 13:54                   ` Fabiano Rosas
  0 siblings, 0 replies; 13+ messages in thread
From: Fabiano Rosas @ 2025-10-27 13:54 UTC (permalink / raw)
  To: Markus Armbruster, Daniel P. Berrangé; +Cc: Peter Xu, Bin Guo, qemu-devel

Markus Armbruster <armbru@redhat.com> writes:

> Daniel P. Berrangé <berrange@redhat.com> writes:
>
>> On Fri, Oct 24, 2025 at 03:15:05PM -0300, Fabiano Rosas wrote:
>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>> > IMHO we should not even be using an Error object for the blocker.
>>> > AFAICT, internally all we care about is the formatted string. The main
>>> > reason for using an Error object appears to be to have a convenient
>>> > pointer to use as an identifier to later pass to del_blocker.
>>> >
>>> > I'd be inclined to just have passed in a fixed string, and return an
>>> > integer identifier for the blocker. eg
>>> >
>>> >     int64_t migrate_add_blocker(const char *reason, Error **errp);
>>> >
>>> >     void migrate_del_blocker(int64_t blockerid);
>>> >
>>> > The migrate_add_blocker method would strdup(reason) to keep its own
>>> > copy.
>>> >
>>> > The usage would thus be clear & simple:
>>> >
>>> >     int64_t blockerid = migrate_add_blocker("cannot migrate vfio", errp);
>>> >     if (!blockerid) {
>>> >          return;
>>> >     }
>>> >
>>> >     ... some time later...
>>> >
>>> >     migrate_del_blocker(blockerid);
>>> >
>>> >
>>> > In some cases we needed dynamically formatted strings, which could have
>>> > been achieved thus:
>>> >
>>> >     g_autofree char *msg = g_strdup_printf("cannot migrate vfio %d", blah);
>>> >     int64_t blockerid = migrate_add_blocker(msg, errp);
>>> >     ...the rest as above...
>>> >
>>> > yes, this costs an extra strdup(), but that is an acceptable & negligible
>>> > overhead in the context in which we're doing this.
>>> >
>>> 
>>> Hmm, I must disagree. This is more complex than what we have
>>> today. Calling error_setg(err, "msg") is pretty standard, already gives
>>> us formatting and keeps all (potentially) user-facing messages uniform.
>>
>> IMHO this usage in migration is not really about error reporting
>> though, and the lifecycle ownership of the Error objects in this
>> migration usage is very different from the typical lifecycle
>> ownership of Error objects used in reporting errors, which I think
>> leads to a surprising / unusual API.

The blocker does eventually show up to the user via
migration_is_blocked().

I get that there is an initial surprise to passing an Error around for
something that is not strictly an error, but IMO this is just a small
idiosyncrasy. As for the lifecycle point, we could probably simplify what
we're doing in migration; I don't see why the Error ** needs to be
cleared in case the blocker cannot be installed.
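
To make the lifecycle point concrete, here is a toy model of the contract
as I read it: when the blocker cannot be installed, the reason is freed
and the caller's pointer is cleared, so a later free by the caller is a
no-op. Everything named toy_* or Toy* is made up for illustration and only
mimics the shape of migrate_add_blocker(); it is not the migration code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an Error object; only the message is modelled. */
typedef struct ToyError {
    char *msg;
} ToyError;

/* Toy stand-in for the core's list of installed blocker reasons. */
typedef struct ToyReasonList {
    ToyError *items[8];
    int count;
} ToyReasonList;

static ToyError *toy_error_new(const char *msg)
{
    ToyError *err = malloc(sizeof(*err));
    size_t len = strlen(msg) + 1;

    if (!err) {
        abort();
    }
    err->msg = malloc(len);
    if (!err->msg) {
        abort();
    }
    memcpy(err->msg, msg, len);
    return err;
}

/* Freeing NULL is a no-op, so a redundant free after a failed add is harmless. */
static void toy_error_free(ToyError *err)
{
    if (err) {
        free(err->msg);
        free(err);
    }
}

/*
 * Assumed contract: on success the list keeps *reasonp and the caller keeps
 * its pointer as a handle; on failure *reasonp is freed and set to NULL.
 * 'refuse' is a made-up knob standing in for whatever makes installation fail.
 */
static int toy_add_blocker(ToyReasonList *list, ToyError **reasonp, int refuse)
{
    if (refuse || list->count == 8) {
        toy_error_free(*reasonp);
        *reasonp = NULL;
        return -1;
    }
    list->items[list->count++] = *reasonp;
    return 0;
}

/* Exercises both paths; returns the number of installed blockers. */
int toy_lifecycle_demo(void)
{
    ToyReasonList list = { { NULL }, 0 };
    ToyError *ok = toy_error_new("device A cannot migrate");
    ToyError *bad = toy_error_new("device B cannot migrate");
    int installed;

    assert(toy_add_blocker(&list, &ok, 0) == 0);
    assert(ok != NULL && list.items[0] == ok);   /* same pointer, held twice */

    assert(toy_add_blocker(&list, &bad, 1) == -1);
    assert(bad == NULL);                         /* consumed and cleared */
    toy_error_free(bad);                         /* no-op on NULL */

    installed = list.count;
    toy_error_free(ok);
    return installed;
}
```

In this model the clearing only matters to callers that look at their
pointer after a failed add, which is the part I think could be simplified.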

>
> I think a blocker interface where you pass the error to use when the
> blocker blocks something is defensible.
>
> Passing an error message or even a text snippet to be interpolated into
> the error message would also be defensible.
>
> We're using the former, and it has turned out to be confusing.  Less so
> in the block layer, where we sensibly pass Error *.  More so in
> migration, where we pass Error **.  Error ** is almost always used to
> receive an error, so when we use it for something else, we risk
> confusion.
>

I don't understand exactly why we need the Error **. It looks like we're
just storing that error twice: once via the device's migration_blocker
and again via the migration core's GSList *migration_blockers.

The block layer doesn't have the use case of blocking the blocker like
migration does, but it still looks like the two are doing pretty much
the same thing, with the block "op" being analogous to the migration "mode".
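
As a rough sketch of what a shared shape could look like, assuming we
treat a block "op" and a migration "mode" as the same kind of category
bit: the owner embeds a small blocker node in its own state, so the node's
address is the handle and the reason is stored only once. All names here
(ToyBlocker, toy_blocker_*, TOY_CAT_*) are hypothetical; none of this
exists in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical categories; think block "op" or migration "mode". */
enum { TOY_CAT_FIRST = 0, TOY_CAT_SECOND = 1 };

/* Intrusive node embedded in the owner's state; its address is the handle. */
typedef struct ToyBlocker {
    const char *reason;      /* borrowed from the owner, not copied */
    uint32_t categories;     /* bitmask: bit N set means category N is blocked */
    struct ToyBlocker *next;
} ToyBlocker;

typedef struct ToyBlockerList {
    ToyBlocker *head;
} ToyBlockerList;

static void toy_blocker_add(ToyBlockerList *list, ToyBlocker *b,
                            const char *reason, uint32_t categories)
{
    b->reason = reason;
    b->categories = categories;
    b->next = list->head;
    list->head = b;
}

/* Unlinks b if present; removing a blocker that is not listed does nothing. */
static void toy_blocker_del(ToyBlockerList *list, ToyBlocker *b)
{
    for (ToyBlocker **p = &list->head; *p; p = &(*p)->next) {
        if (*p == b) {
            *p = b->next;
            b->next = NULL;
            return;
        }
    }
}

/* Returns the first matching blocker's reason, or NULL if nothing blocks it. */
static const char *toy_blocked_reason(const ToyBlockerList *list,
                                      unsigned category)
{
    for (const ToyBlocker *b = list->head; b; b = b->next) {
        if (b->categories & (UINT32_C(1) << category)) {
            return b->reason;
        }
    }
    return NULL;
}

/* Exercises add, lookup and delete; returns 0 when the list ends up empty. */
int toy_blocker_demo(void)
{
    ToyBlockerList list = { NULL };
    ToyBlocker dev;          /* would live inside the device state */

    toy_blocker_add(&list, &dev, "device cannot migrate",
                    UINT32_C(1) << TOY_CAT_FIRST);
    assert(toy_blocked_reason(&list, TOY_CAT_FIRST) == dev.reason);
    assert(toy_blocked_reason(&list, TOY_CAT_SECOND) == NULL);

    toy_blocker_del(&list, &dev);
    assert(toy_blocked_reason(&list, TOY_CAT_FIRST) == NULL);
    return list.head == NULL ? 0 : 1;
}
```

This leaves out the "blocking the blocker" case entirely, so it only
covers the part the two layers have in common.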

>>> Asking for people to deal with strings and storing an int64 in their
>>> code is not improving the situation. Besides, the Error is already used
>>> by the block layer when blocking operations, for instance. If anything
>>> we should be integrating the two usages instead of inventing yet another
>>> for the migration code. See:
>>
>> Yes, having a common API for these two similar use cases would be
>> a useful thing. I'm just not convinced we should be (mis|re)using
>> the Error object for either of these two situations.
>
> I guess we could have a generic Blocker object instead of using Error
> for the purpose.
>
> In addition to an error message, an Error object has an error class
> (rarely used remnant of the past), where in the source code the Error
> object was created (reported to the user when handling &error_abort),
> and an optional hint.  Is any of this useful for blockers?
>

I haven't found the need for those.

>>> replication.c:
>>>   error_setg(&s->blocker,
>>>              "Block device is in use by internal backup job");
>>>   ...
>>>   bdrv_op_block_all(top_bs, s->blocker);
>>> 
>>> block.c:
>>>   void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
>>>   {
>>>       BdrvOpBlocker *blocker;
>>>       assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
>>> 
>>>       blocker = g_new0(BdrvOpBlocker, 1);
>>>       blocker->reason = reason;
>>>       QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
>>> }
>>
>>
>> With regards,
>> Daniel



Thread overview: 13+ messages
2025-10-24  9:28 [PATCH] migration: Don't free the reason after calling migrate_add_blocker Bin Guo
2025-10-24 11:15 ` Markus Armbruster
2025-10-24 11:27   ` Daniel P. Berrangé
2025-10-24 13:41     ` Markus Armbruster
2025-10-24 14:08       ` Markus Armbruster
2025-10-24 16:17         ` Peter Xu
2025-10-24 16:31           ` Daniel P. Berrangé
2025-10-24 18:15             ` Fabiano Rosas
2025-10-27 10:25               ` Daniel P. Berrangé
2025-10-27 12:32                 ` Markus Armbruster
2025-10-27 13:54                   ` Fabiano Rosas
2025-10-24 11:53   ` Bin Guo
2025-10-24 13:40     ` Markus Armbruster
