From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Manos Pitsidianakis" <manos.pitsidianakis@linaro.org>,
qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Markus Armbruster" <armbru@redhat.com>,
"Gustavo Romero" <gustavo.romero@linaro.org>,
"Pierrick Bouvier" <pierrick.bouvier@linaro.org>
Subject: Re: [PATCH RFC] util/error.c: Print backtrace on error
Date: Wed, 06 Aug 2025 12:11:38 +0100
Message-ID: <87h5ykzout.fsf@draig.linaro.org>
In-Reply-To: <aJJGvL8feHr7Wme7@redhat.com> ("Daniel P. Berrangé"'s message of "Tue, 5 Aug 2025 19:00:28 +0100")
Daniel P. Berrangé <berrange@redhat.com> writes:
> On Tue, Aug 05, 2025 at 07:57:38PM +0300, Manos Pitsidianakis wrote:
>> On Tue, Aug 5, 2025 at 7:49 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>> >
>> > On Tue, Aug 05, 2025 at 07:22:14PM +0300, Manos Pitsidianakis wrote:
>> > > On Tue, Aug 5, 2025 at 7:00 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>> > > >
>> > > > On Tue, Aug 05, 2025 at 12:19:26PM +0300, Manos Pitsidianakis wrote:
>> > > > > Add a backtrace_on_error meson feature (enabled with
>> > > > > --enable-backtrace-on-error) that compiles system binaries with
>> > > > > -rdynamic option and prints a function backtrace on error to stderr.
>> > > > >
>> > > > > Example output by adding an unconditional error_setg on error_abort in hw/arm/boot.c:
>> > > > >
>> > > > > ./qemu-system-aarch64(+0x13b4a2c) [0x55d015406a2c]
>> > > > > ./qemu-system-aarch64(+0x13b4abd) [0x55d015406abd]
>> > > > > ./qemu-system-aarch64(+0x13b4d49) [0x55d015406d49]
>> > > > > ./qemu-system-aarch64(error_setg_internal+0xe7) [0x55d015406f62]
>> > > > > ./qemu-system-aarch64(arm_load_dtb+0xbf) [0x55d014d7686f]
>> > > > > ./qemu-system-aarch64(+0xd2f1d8) [0x55d014d811d8]
>> > > > > ./qemu-system-aarch64(notifier_list_notify+0x44) [0x55d01540a282]
>> > > > > ./qemu-system-aarch64(qdev_machine_creation_done+0xa0) [0x55d01476ae17]
>> > > > > ./qemu-system-aarch64(+0xaa691e) [0x55d014af891e]
>> > > > > ./qemu-system-aarch64(qmp_x_exit_preconfig+0x72) [0x55d014af8a5d]
>> > > > > ./qemu-system-aarch64(qemu_init+0x2a89) [0x55d014afb657]
>> > > > > ./qemu-system-aarch64(main+0x2f) [0x55d01521e836]
>> > > > > /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7f3033d67ca8]
>> > > > > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f3033d67d65]
>> > > > > ./qemu-system-aarch64(_start+0x21) [0x55d0146814f1]
>> > > > >
>> > > > > Unexpected error in arm_load_dtb() at ../hw/arm/boot.c:529:
>> > > >
>> > > > From an end-user POV, IMHO the error messages need to be good enough
>> > > > that such backtraces aren't needed to understand the problem. For
>> > > > developers, GDB can give much better backtraces (file+line numbers,
>> > > > plus parameters plus local variables) in the ideally rare cases that
>> > > > the error message alone has insufficient info. So I'm not really
>> > > > convinced that programs (in general, not just QEMU) should try to
>> > > > create backtraces themselves.
>> > >
>> > > I don't think there's value in replacing gdb debugging with this, I
>> > > agree. I think it has value for "fire and forget" uses, when errors
>> > > happen unexpectedly and are hard to replicate and you only end up with
>> > > log entries and no easy way to debug it.
>> >
>> > If the log entry with the error message is useless for devs, then it
>> > is even worse for end users... who will be copying that message into
>> > bug reports anyway. This patch doesn't feel like something we could
>> > enable in formal builds in the distro, so we still need better error
>> > reporting without it, such that user bug reports are actionable.
>> >
>> > Was there a specific place where you found things hard to debug
>> > from the error message alone ? I'm sure we have plenty of examples
>> > of errors that can be improved, but wondering if there are some
>> > general patterns we're doing badly that would be a good win
>> > to improve ?
>>
>> Some months ago I was debugging a MemoryRegion use-after-free and used
>> this code to figure out that the free was called from RCU context
>> instead of the main thread.
>
> We give useful names to many (but not necessarily all) threads that we
> spawn. Perhaps we should call pthread_getname_np() to fetch the current
> thread name, and use that as a prefix on the error message we print
> out, as a bit of extra context ?
Do we always have sensible names for threads or only if we enable the
option?
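
For reference, the thread-name prefix idea suggested above could look
roughly like the sketch below. It assumes glibc on Linux, where
pthread_getname_np() is a GNU extension and thread names are at most 16
bytes including the terminating NUL. The helper name
format_with_thread_name() is hypothetical and is not an existing QEMU
function.

```c
#define _GNU_SOURCE             /* pthread_getname_np() is a GNU extension */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not QEMU API): write "[thread-name] msg" into buf.
 * Returns the snprintf() result, i.e. the length the full string needs.
 */
static int format_with_thread_name(char *buf, size_t len, const char *msg)
{
    char name[16];              /* Linux limit: 16 bytes including NUL */

    assert(buf != NULL && msg != NULL);

    /* Fall back to a placeholder if the name cannot be fetched. */
    if (pthread_getname_np(pthread_self(), name, sizeof(name)) != 0) {
        strcpy(name, "unknown");
    }
    return snprintf(buf, len, "[%s] %s", name, msg);
}
```

Called from a thread that was named "rcu-demo" with
pthread_setname_np(), this yields a message starting with "[rcu-demo]".
A thread that was never named simply inherits its creator's name, so the
prefix is only as informative as the names that were actually set.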
> Obviously not as much info as a full stack trace, but that is something
> we could likely enable unconditionally without any overheads to worry
> about, so a likely incremental win.
The place where it comes in useful is when we get bug reports from users
who have crashed QEMU in an embedded docker container and can't give us a
reasonable reproducer. If we can encourage such users to enable this
option (or maybe make it part of --enable-debug-info) then we could get
a slightly more useful backtrace for those bugs.
I agree most sane configurations (i.e. a distro) would just attach gdb
and use whatever symbol resolution the distro provides.
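
As a point of reference for the kind of in-process backtrace being
discussed, here is a minimal sketch. It assumes glibc's backtrace() and
backtrace_symbols_fd() from <execinfo.h>, which produce lines in the
same "binary(symbol+offset) [address]" shape as the example output
quoted above. The actual patch may be implemented differently, and the
helper name dump_backtrace() is hypothetical.

```c
#include <assert.h>
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 64           /* arbitrary cap chosen for this sketch */

/*
 * Hypothetical helper (not QEMU API): write the current call stack to fd,
 * one frame per line. Returns the number of frames captured.
 */
static int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n;

    assert(fd >= 0);

    /* Collect return addresses for the current thread's stack. */
    n = backtrace(frames, MAX_FRAMES);

    /* Symbolise straight to the descriptor, without calling malloc(). */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}
```

Calling dump_backtrace(STDERR_FILENO) prints the frames to stderr.
Function names only show up for symbols present in the dynamic symbol
table, which is why linking with -rdynamic matters. Static functions
still appear as bare "+0x..." offsets, as in the first three frames of
the quoted output.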
>
> With regards,
> Daniel
--
Alex Bennée
Virtualisation Tech Lead @ Linaro