qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Claudio Fontana <cfontana@suse.de>
To: Markus Armbruster <armbru@redhat.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Kevin Wolf" <kwolf@redhat.com>,
	qemu-devel@nongnu.org, dinechin@redhat.com,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Daniel P.Berrangé" <berrange@redhat.com>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>
Subject: Re: [PATCH v7 3/5] module: add Error arguments to module_load and module_load_qom
Date: Wed, 28 Sep 2022 14:02:48 +0200	[thread overview]
Message-ID: <3487e03f-30dc-6545-4d24-cfee9ad15642@suse.de> (raw)
In-Reply-To: <87zgejybu4.fsf@pond.sub.org>

On 9/28/22 13:31, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> improve error handling during module load, by changing:
>>
>> bool module_load(const char *prefix, const char *lib_name);
>> void module_load_qom(const char *type);
>>
>> to:
>>
>> int module_load(const char *prefix, const char *name, Error **errp);
>> int module_load_qom(const char *type, Error **errp);
>>
>> where the return value is:
>>
>>  -1 on module load error, and errp is set with the error
>>   0 on module or one of its dependencies are not installed
>>   1 on module load success
>>   2 on module load success (module already loaded or built-in)
> 
> Two changes, if I understand things correctly:
> 
> 1. Convert to Error API from fprintf(stderr, ...)
> 
> 2. Return a more useful value
> 
> Right?

Yes.

> 
> Do you add any new errors here that weren't reported before?  Just

Yes.

> trying to calibrate my expectations before I dig into the actual patch.
> 
>> module_load_qom_one has been introduced in:
>>
>> commit 28457744c345 ("module: qom module support"), which built on top of
>> module_load_one, but discarded the bool return value. Restore it.
>>
>> Adapt all callers to emit errors, or ignore them, or fail hard,
>> as appropriate in each context.
>>
>> Some memory leaks also fixed as part of the module_load changes.
> 
> Where?

I misinterpreted the use of g_hash_table_add, so there is a fix for me here too (module_name should not be freed as the g_hash_table_add takes ownership).
The g_* API is a bit messy here, as the ownership and whether the key is freed depends on how the table was created (supplying a free function or not).

However, the code in the loop following the g_hash_table_add may "return false;" from the function directly,
skipping the cleanup code:

g_hash_table_add(loaded_modules, module_name);

-    for (modinfo = module_info; modinfo->name != NULL; modinfo++) {
-        if (modinfo->arch) {
-            if (strcmp(modinfo->name, module_name) == 0) {
-                if (!module_check_arch(modinfo)) {
-                    return false;
-                }
-            }
-        }
     }

So there is no chance to free at:

-    if (!success) {
-        g_hash_table_remove(loaded_modules, module_name);
-        g_free(module_name);
      }

But, I should not have taken the g_free(...) out of the test, will fix.



> 
>> audio: when attempting to load an audio module, report module load errors.
>> block: when attempting to load a block module, report module load errors.
>> console: when attempting to load a display module, report module load errors.
>>
>> qdev: when creating a new qdev Device object (DeviceState), report load errors.
>>       If a module cannot be loaded to create that device, now abort execution.
>>
>> qom/object.c: when initializing a QOM object, or looking up class_by_name,
>>               report module load errors.
>>
>> qtest: when processing the "module_load" qtest command, report errors
>>        in the load of the module.
> 
> This looks like a list of behavioral changes.  Appreciated!  It's a bit
> terse, though.  I might come back to this and suggest improvement.  But
> first, I need to understand the patch.
> 
>>
>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>> ---
>>  audio/audio.c         |  16 ++--
>>  block.c               |  20 +++-
>>  block/dmg.c           |  14 ++-
>>  hw/core/qdev.c        |  17 +++-
>>  include/qemu/module.h |  37 +++++++-
>>  qom/object.c          |  18 +++-
>>  softmmu/qtest.c       |   8 +-
>>  ui/console.c          |  18 +++-
>>  util/module.c         | 211 +++++++++++++++++++++++-------------------
>>  9 files changed, 235 insertions(+), 124 deletions(-)
>>
>> diff --git a/audio/audio.c b/audio/audio.c
>> index 0a682336a0..ea51793843 100644
>> --- a/audio/audio.c
>> +++ b/audio/audio.c
>> @@ -72,20 +72,24 @@ void audio_driver_register(audio_driver *drv)
>>  audio_driver *audio_driver_lookup(const char *name)
>>  {
>>      struct audio_driver *d;
>> +    Error *local_err = NULL;
>> +    int rv;
>>  
>>      QLIST_FOREACH(d, &audio_drivers, next) {
>>          if (strcmp(name, d->name) == 0) {
>>              return d;
>>          }
>>      }
>> -
>> -    audio_module_load(name);
>> -    QLIST_FOREACH(d, &audio_drivers, next) {
>> -        if (strcmp(name, d->name) == 0) {
>> -            return d;
>> +    rv = audio_module_load(name, &local_err);
>> +    if (rv > 0) {
>> +        QLIST_FOREACH(d, &audio_drivers, next) {
>> +            if (strcmp(name, d->name) == 0) {
>> +                return d;
>> +            }
>>          }
>> +    } else if (rv < 0) {
>> +        error_report_err(local_err);
>>      }
>> -
>>      return NULL;
>>  }
>>  
> 
> Before: audio_module_load() reports to stderr, but the caller can't

before: reports _some_ errors to stderr

> know.  So, error or no error, search the driver registry for the one we
> want.  Return it if found, else fail.
> 
> After: if audio_module_load() fails, report to stderr or current
> monitor, and fail.  If it could find no module or loaded one, search the
> driver registry.  Return it if found, else fail.
> 
> What if audio_module_load() fails, but a search for the driver succeeds?
> Before the patch, we succeed.  

audio_module_load() is the only way that audio_drivers can be updated and the search would return a different result.
The previous code was just lazily running the same code twice to make it simpler for the programmer in those conditions.

Afterwards, we fail.  Can this happen?

No.

> Yes: module_load() starts with
> 
>        if (!g_module_supported()) {
>            error_setg(errp, "%s", "this platform does not support GLib modules");
>            return -1;
>        }
> 
> Regression.
> 
> One way to fix: ensure module_load() cannot fail when the search will
> succeed.
> 
> Another: search first, and module_load() only if not found, then search
> again.
> 

Does not apply.

> Now let's review your use of the Error API.  Error objects should be
> passed up the call chain to the place that handles then.  The call
> chains are:
> 
> * audio_init(), which is called by
> 
>   - AUD_register_card(), which is called by a bunch of device creation
>     helpers called within machine initialization functions (I think),
>     which are called from qemu_init() via qemu_init_board() and
>     machine_run_board_init()
> 
>   - AUD_add_capture(), which is called by
> 
>     · wav_start_capture(), which is called by hmp_wavcapture on behalf
>       of HMP command wavcapture
> 
>     · audio_add(), which is called by protocol_client_msg() in respone
>       of a VNC_MSG_CLIENT_QEMU_AUDIO_ENABLE message from a VNC client (I
>       think).  Since audio_add() does return anything, its callers
>       remain oblivious of failure.
> 
>   - audio_init_audiodevs(), which is called by
>     qemu_create_early_backends(), which is called by qemu_init()
> 
> * audio_help(), which is called by
> 
>   - audio_parse_option(), which is called by qemu_init()
> 
>   - qemu_init()
> 
> * audio_handle_legacy_opts()
> 
>   - audio_init(); see above
> 
>   - audio_legacy_help(), which is called by qemu_init()
> 
> For the call chains that end in qemu_init(), error_report_err() is okay.
> Just beware of error cascades, i.e. one issue triggering multiple
> reports.  That's bad UX.  The extra reporting should predate this patch,
> which means it's no reason to reject this patch.
> 
> Likewise for call chains that end in an HMP command, because
> error_report_err() does the right thing then: it reports to the current
> monitor.  Bug fix: before the patch, it reports to stderr.  Commit
> message should mention the fix.
> 
> The problematic case is audio_add().  Reporting an error and then
> ignoring it feels wrong.  But it's how audio_add() works even before
> this patch.  Not this patch's problem to solve.
> 
>> diff --git a/block.c b/block.c
>> index 72c7f6d47d..7a94739aed 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -464,12 +464,18 @@ BlockDriver *bdrv_find_format(const char *format_name)
>>      /* The driver isn't registered, maybe we need to load a module */
>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
>> -            block_module_load(block_driver_modules[i].library_name);
>> +            Error *local_err = NULL;
>> +            int rv = block_module_load(block_driver_modules[i].library_name,
>> +                                       &local_err);
>> +            if (rv > 0) {
>> +                return bdrv_do_find_format(format_name);
>> +            } else if (rv < 0) {
>> +                error_report_err(local_err);
>> +            }
>>              break;
>>          }
>>      }
>> -
>> -    return bdrv_do_find_format(format_name);
>> +    return NULL;
>>  }
>>  
> 
> Same regression, I think.

There is no problem here either.

> 
> Callers:
> 
> * bdrv_open_common()
> 
> * bdrv_fill_options()
> 
> * bdrv_open_inherit()
> 
> * bdrv_insert_node()
> 
> * bdrv_img_create()
> 
> * qmp_x_blockdev_amend()
> 
> * qmp_blockdev_create()
> 
> * qcow_co_create_opts()
> 
> * enable_write_target()
> 
> These all use the Error API.  Thus, we have instances of the
> "error_report() or similar from within a user of th Error API"
> anti-pattern.
> 
> Let's look more closely at just one of them: qmp_blockdev_create().
> Since we're in QMP monitor context, bdrv_find_format()'s
> error_report_err() will report a specific error to stderr, and then
> qmp_blockdev_create() will report a less specific one to the QMP client.
> This is wrong.
> 
> The obvious fix is to convert bdrv_find_format() to the Error API.
> 
> If you can't do that now, please note the issue in the commit message.
> 
>>  static int bdrv_format_is_whitelisted(const char *format_name, bool read_only)
>> @@ -976,12 +982,16 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>          if (block_driver_modules[i].protocol_name &&
>>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
>> -            block_module_load(block_driver_modules[i].library_name);
>> +            int rv = block_module_load(block_driver_modules[i].library_name, errp);
>> +            if (rv > 0) {
>> +                drv1 = bdrv_do_find_protocol(protocol);
>> +            } else if (rv < 0) {
>> +                return NULL;
>> +            }
>>              break;
>>          }
>>      }
>>  
>> -    drv1 = bdrv_do_find_protocol(protocol);
>>      if (!drv1) {
>>          error_setg(errp, "Unknown protocol '%s'", protocol);
>>      }
> 
> Same regression, I think.
> 
> bdrv_find_protocol() already uses the Error API, and your patch passes
> the Error up the chain.  Good.

This can be done here.

> 
> [...]
> 
> Out of steam for today, intend to continue later.  If you want to respin
> meanwhile, that's okay.  If you prefer to wait for me to finish
> reviewing this one, that's okay, too.
> 

I will be in vacation for two weeks, my concern is that this issue will be lost to the noise.
Will respin with the actual fixes, and hope this can be put into qemu and then move from there.

Thanks,

Claudio


  reply	other threads:[~2022-09-28 13:45 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-27 13:38 [PATCH v7 0/5] improve error handling for module load Claudio Fontana
2022-09-27 13:38 ` [PATCH v7 1/5] module: removed unused function argument "mayfail" Claudio Fontana
2022-09-27 13:38 ` [PATCH v7 2/5] module: rename module_load_one to module_load Claudio Fontana
2022-09-27 13:38 ` [PATCH v7 3/5] module: add Error arguments to module_load and module_load_qom Claudio Fontana
2022-09-28 11:31   ` Markus Armbruster
2022-09-28 12:02     ` Claudio Fontana [this message]
2022-09-30 11:50       ` Markus Armbruster
2022-10-24 11:29         ` Claudio Fontana
2022-09-27 13:38 ` [PATCH v7 4/5] dmg: warn when opening dmg images containing blocks of unknown type Claudio Fontana
2022-09-28  8:19   ` Markus Armbruster
2022-09-27 13:38 ` [PATCH v7 5/5] accel: abort if we fail to load the accelerator plugin Claudio Fontana
2022-09-28  8:28   ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3487e03f-30dc-6545-4d24-cfee9ad15642@suse.de \
    --to=cfontana@suse.de \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dinechin@redhat.com \
    --cc=f4bug@amsat.org \
    --cc=kraxel@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).