From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Manos Pitsidianakis" <manos.pitsidianakis@linaro.org>,
	qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Gustavo Romero" <gustavo.romero@linaro.org>,
	"Pierrick Bouvier" <pierrick.bouvier@linaro.org>
Subject: Re: [PATCH RFC] util/error.c: Print backtrace on error
Date: Wed, 06 Aug 2025 12:11:38 +0100
Message-ID: <87h5ykzout.fsf@draig.linaro.org>
In-Reply-To: <aJJGvL8feHr7Wme7@redhat.com> ("Daniel P. Berrangé"'s message of "Tue, 5 Aug 2025 19:00:28 +0100")

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Tue, Aug 05, 2025 at 07:57:38PM +0300, Manos Pitsidianakis wrote:
>> On Tue, Aug 5, 2025 at 7:49 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>> >
>> > On Tue, Aug 05, 2025 at 07:22:14PM +0300, Manos Pitsidianakis wrote:
>> > > On Tue, Aug 5, 2025 at 7:00 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>> > > >
>> > > > On Tue, Aug 05, 2025 at 12:19:26PM +0300, Manos Pitsidianakis wrote:
>> > > > > Add a backtrace_on_error meson feature (enabled with
>> > > > > --enable-backtrace-on-error) that compiles system binaries with
>> > > > > -rdynamic option and prints a function backtrace on error to stderr.
>> > > > >
>> > > > > Example output by adding an unconditional error_setg on error_abort in hw/arm/boot.c:
>> > > > >
>> > > > >   ./qemu-system-aarch64(+0x13b4a2c) [0x55d015406a2c]
>> > > > >   ./qemu-system-aarch64(+0x13b4abd) [0x55d015406abd]
>> > > > >   ./qemu-system-aarch64(+0x13b4d49) [0x55d015406d49]
>> > > > >   ./qemu-system-aarch64(error_setg_internal+0xe7) [0x55d015406f62]
>> > > > >   ./qemu-system-aarch64(arm_load_dtb+0xbf) [0x55d014d7686f]
>> > > > >   ./qemu-system-aarch64(+0xd2f1d8) [0x55d014d811d8]
>> > > > >   ./qemu-system-aarch64(notifier_list_notify+0x44) [0x55d01540a282]
>> > > > >   ./qemu-system-aarch64(qdev_machine_creation_done+0xa0) [0x55d01476ae17]
>> > > > >   ./qemu-system-aarch64(+0xaa691e) [0x55d014af891e]
>> > > > >   ./qemu-system-aarch64(qmp_x_exit_preconfig+0x72) [0x55d014af8a5d]
>> > > > >   ./qemu-system-aarch64(qemu_init+0x2a89) [0x55d014afb657]
>> > > > >   ./qemu-system-aarch64(main+0x2f) [0x55d01521e836]
>> > > > >   /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7f3033d67ca8]
>> > > > >   /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f3033d67d65]
>> > > > >   ./qemu-system-aarch64(_start+0x21) [0x55d0146814f1]
>> > > > >
>> > > > >   Unexpected error in arm_load_dtb() at ../hw/arm/boot.c:529:
>> > > >
>> > > > From an end-user POV, IMHO the error messages need to be good enough
>> > > > that such backtraces aren't needed to understand the problem. For
>> > > > developers, GDB can give much better backtraces (file+line numbers,
>> > > > plus parameters plus local variables) in the ideally rare cases that
>> > > > the error message alone has insufficient info. So I'm not really
>> > > > convinced that programs (in general, not just QEMU) should try to
>> > > > create backtraces themselves.
>> > >
>> > > I don't think there's value in replacing gdb debugging with this, I
>> > > agree. I think it has value for "fire and forget" uses, when errors
>> > > happen unexpectedly and are hard to replicate and you only end up with
>> > > log entries and no easy way to debug it.
>> >
>> > If the log entry with the error message is useless for devs, then it
>> > is even worse for end users... who will be copying that message into
>> > bug reports anyway. This patch doesn't feel like something we could
>> > enable in formal builds in the distro, so we still need better error
>> > reporting without it, such that user bug reports are actionable.
>> >
>> > Was there a specific place where you found things hard to debug
>> > from the error message alone ?  I'm sure we have plenty of examples
>> > of errors that can be improved, but wondering if there are some
>> > general patterns we're doing badly that would be a good win
>> > to improve ?
>> 
>> Some months ago I was debugging a MemoryRegion use-after-free and used
>> this code to figure out that the free was called from RCU context
>> instead of the main thread.
>
> We give useful names to many (but not necessarily all) threads that we
> spawn. Perhaps we should call pthread_getname_np() to fetch the current
> thread name, and use that as a prefix on the error message we print
> out, as a bit of extra context ?

Do we always have sensible names for threads or only if we enable the
option?
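
For illustration, a minimal sketch of the thread-name prefix idea using
glibc's pthread_getname_np(). The helper name format_with_thread_name()
and the "[name] message" layout are hypothetical and not taken from
QEMU's error API. An unnamed thread typically inherits the process name,
so the prefix only adds context where a name was actually set.

```c
#define _GNU_SOURCE            /* pthread_getname_np() is a GNU extension */
#include <pthread.h>
#include <stdio.h>

/* Linux limits thread names to 16 bytes, including the trailing NUL. */
#define THREAD_NAME_MAX 16

/*
 * Hypothetical helper: write "[thread-name] msg" into buf.
 * Returns what snprintf() returns (the length it wanted to write).
 */
int format_with_thread_name(char *buf, size_t len, const char *msg)
{
    char name[THREAD_NAME_MAX];

    /* Fall back to a placeholder if the name cannot be fetched. */
    if (pthread_getname_np(pthread_self(), name, sizeof(name)) != 0) {
        snprintf(name, sizeof(name), "?");
    }
    return snprintf(buf, len, "[%s] %s", name, msg);
}
```

A caller would format into a stack buffer and pass the result to
whatever reporting function it already uses.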

> Obviously not as much info as a full stack trace, but that is something
> we could likely enable unconditionally without any overheads to worry
> about, so a likely incremental win.

The place where it comes in useful is when we get bug reports from users
who have crashed QEMU in an embedded docker container and can't give us a
reasonable reproducer. If we can encourage such users to enable this
option (or maybe make it part of --enable-debug-info) then we could get
a slightly more useful backtrace for those bugs.

I agree most sane configurations (i.e. a distro) would just attach gdb
and use whatever symbol resolution the distro provides.
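
For reference, a minimal sketch of how a program can print its own
backtrace with glibc's backtrace() and backtrace_symbols_fd(). This is
an assumption about the general technique, not the code from the patch;
dump_backtrace() and MAX_FRAMES are hypothetical names. Linking with
-rdynamic makes exported function names resolvable, while static
functions still show up as bare offsets, as in the example output quoted
above.

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd() (glibc) */
#include <unistd.h>     /* STDERR_FILENO */

#define MAX_FRAMES 64   /* arbitrary cap on the number of frames captured */

/*
 * Hypothetical helper: write the current call stack to fd, one frame
 * per line, and return the number of frames captured.
 */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    /* Writes directly to fd without calling malloc(). */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}
```

Calling dump_backtrace(STDERR_FILENO) from an error path would emit
lines in the same "binary(symbol+offset) [address]" style.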

>
> With regards,
> Daniel

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


Thread overview: 13+ messages
2025-08-05  9:19 [PATCH RFC] util/error.c: Print backtrace on error Manos Pitsidianakis
2025-08-05 15:59 ` Daniel P. Berrangé
2025-08-05 16:22   ` Manos Pitsidianakis
2025-08-05 16:48     ` Daniel P. Berrangé
2025-08-05 16:57       ` Manos Pitsidianakis
2025-08-05 18:00         ` Daniel P. Berrangé
2025-08-06 11:11           ` Alex Bennée [this message]
2025-08-06 11:34             ` Daniel P. Berrangé
2025-08-06 20:26               ` Pierrick Bouvier
2025-08-07  5:41                 ` Markus Armbruster
2025-08-18 16:49                   ` Daniel P. Berrangé
2025-08-18 16:52               ` Daniel P. Berrangé
2025-08-07  5:23     ` Markus Armbruster
