From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Manos Pitsidianakis" <manos.pitsidianakis@linaro.org>,
qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Markus Armbruster" <armbru@redhat.com>,
"Gustavo Romero" <gustavo.romero@linaro.org>,
"Pierrick Bouvier" <pierrick.bouvier@linaro.org>
Subject: Re: [PATCH RFC] util/error.c: Print backtrace on error
Date: Wed, 06 Aug 2025 12:11:38 +0100
Message-ID: <87h5ykzout.fsf@draig.linaro.org>
In-Reply-To: <aJJGvL8feHr7Wme7@redhat.com> ("Daniel P. Berrangé"'s message of "Tue, 5 Aug 2025 19:00:28 +0100")

Daniel P. Berrangé <berrange@redhat.com> writes:
> On Tue, Aug 05, 2025 at 07:57:38PM +0300, Manos Pitsidianakis wrote:
>> On Tue, Aug 5, 2025 at 7:49 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>> >
>> > On Tue, Aug 05, 2025 at 07:22:14PM +0300, Manos Pitsidianakis wrote:
>> > > On Tue, Aug 5, 2025 at 7:00 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>> > > >
>> > > > On Tue, Aug 05, 2025 at 12:19:26PM +0300, Manos Pitsidianakis wrote:
>> > > > > Add a backtrace_on_error meson feature (enabled with
>> > > > > --enable-backtrace-on-error) that compiles system binaries with
>> > > > > -rdynamic option and prints a function backtrace on error to stderr.
>> > > > >
>> > > > > Example output by adding an unconditional error_setg on error_abort in hw/arm/boot.c:
>> > > > >
>> > > > > ./qemu-system-aarch64(+0x13b4a2c) [0x55d015406a2c]
>> > > > > ./qemu-system-aarch64(+0x13b4abd) [0x55d015406abd]
>> > > > > ./qemu-system-aarch64(+0x13b4d49) [0x55d015406d49]
>> > > > > ./qemu-system-aarch64(error_setg_internal+0xe7) [0x55d015406f62]
>> > > > > ./qemu-system-aarch64(arm_load_dtb+0xbf) [0x55d014d7686f]
>> > > > > ./qemu-system-aarch64(+0xd2f1d8) [0x55d014d811d8]
>> > > > > ./qemu-system-aarch64(notifier_list_notify+0x44) [0x55d01540a282]
>> > > > > ./qemu-system-aarch64(qdev_machine_creation_done+0xa0) [0x55d01476ae17]
>> > > > > ./qemu-system-aarch64(+0xaa691e) [0x55d014af891e]
>> > > > > ./qemu-system-aarch64(qmp_x_exit_preconfig+0x72) [0x55d014af8a5d]
>> > > > > ./qemu-system-aarch64(qemu_init+0x2a89) [0x55d014afb657]
>> > > > > ./qemu-system-aarch64(main+0x2f) [0x55d01521e836]
>> > > > > /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7f3033d67ca8]
>> > > > > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f3033d67d65]
>> > > > > ./qemu-system-aarch64(_start+0x21) [0x55d0146814f1]
>> > > > >
>> > > > > Unexpected error in arm_load_dtb() at ../hw/arm/boot.c:529:
>> > > >
>> > > > From an end-user POV, IMHO the error messages need to be good enough
>> > > > that such backtraces aren't needed to understand the problem. For
>> > > > developers, GDB can give much better backtraces (file+line numbers,
>> > > > plus parameters plus local variables) in the ideally rare cases that
>> > > > the error message alone has insufficient info. So I'm not really
>> > > > convinced that programs (in general, not just QEMU) should try to
>> > > > create backtraces themselves.
>> > >
>> > > I don't think there's value in replacing gdb debugging with this, I
>> > > agree. I think it has value for "fire and forget" uses, when errors
>> > > happen unexpectedly and are hard to replicate and you only end up with
>> > > log entries and no easy way to debug it.
>> >
>> > If the log entry with the error message is useless for devs, then it
>> > is even worse for end users... who will be copying that message into
>> > bug reports anyway. This patch doesn't feel like something we could
>> > enable in formal builds in the distro, so we still need better error
>> > reporting without it, such that user bug reports are actionable.
>> >
>> > Was there a specific place where you found things hard to debug
>> > from the error message alone ? I'm sure we have plenty of examples
>> > of errors that can be improved, but wondering if there are some
>> > general patterns we're doing badly that would be a good win
>> > to improve ?
>>
>> Some months ago I was debugging a MemoryRegion use-after-free and used
>> this code to figure out that the free was called from RCU context
>> instead of the main thread.
>
> We give useful names to many (but not necessarily all) threads that we
> spawn. Perhaps we should call pthread_getname_np() to fetch the current
> thread name, and use that as a prefix on the error message we print
> out, as a bit of extra context ?

Do we always have sensible names for threads or only if we enable the
option?
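
For illustration, a minimal sketch of the thread-name prefix idea, assuming
glibc's pthread_getname_np() on Linux. The helper name
format_with_thread_name() and the "[name] message" layout are hypothetical
and not taken from QEMU's error code; a thread that was never given a name
simply reports whatever default the kernel assigned.

```c
#define _GNU_SOURCE             /* pthread_getname_np() is a GNU extension */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "[thread-name] msg" into buf.
 * Returns the snprintf() result (length the full string needs).
 */
static int format_with_thread_name(char *buf, size_t len, const char *msg)
{
    /* Linux thread names are at most 16 bytes including the NUL */
    char name[16];

    assert(buf != NULL && len > 0);

    /* Returns 0 on success; fall back to a placeholder otherwise */
    if (pthread_getname_np(pthread_self(), name, sizeof(name)) != 0) {
        strcpy(name, "unknown");
    }
    return snprintf(buf, len, "[%s] %s", name, msg);
}
```

A caller would format the prefixed message into a local buffer just before
printing it, so the only added cost is one library call on the error path.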
> Obviously not as much info as a full stack trace, but that is something
> we could likely enable unconditionally without any overheads to worry
> about, so a likely incremental win.

The place where it comes in useful is when we get bug reports from users
who have crashed QEMU in an embedded Docker container and can't give us a
reasonable reproducer. If we can encourage such users to enable this
option (or maybe make it part of --enable-debug-info) then we could get
a slightly more useful backtrace for those bugs.

I agree most sane configurations (i.e. a distro) would just attach gdb
and use whatever symbol resolution the distro provides.
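
For readers unfamiliar with the mechanism, here is a minimal sketch of
in-process backtrace printing, assuming glibc's <execinfo.h>. The patch
itself is not shown in this message, so this is an assumption about the
general approach rather than its actual code; print_backtrace_to_stderr()
and MAX_FRAMES are hypothetical names.

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 64   /* hypothetical cap on captured frames */

/*
 * Hypothetical helper: dump the current call stack to stderr.
 * Returns the number of frames captured.
 */
static int print_backtrace_to_stderr(void)
{
    void *frames[MAX_FRAMES];

    /* Collect return addresses for the calling thread's stack */
    int n = backtrace(frames, MAX_FRAMES);

    /* Write one line per frame straight to the fd, without malloc() */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}
```

Linking with -rdynamic exports the executable's global symbols so that
names such as arm_load_dtb can be resolved; static functions still show
up only as offsets, as in the "(+0x13b4a2c)" lines of the example output.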
>
> With regards,
> Daniel
--
Alex Bennée
Virtualisation Tech Lead @ Linaro