From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, peterx@redhat.com,
vsementsov@yandex-team.ru, den@virtuozzo.com
Subject: Re: [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump
Date: Wed, 3 Dec 2025 11:39:07 +0200 [thread overview]
Message-ID: <3eda31c5-84e5-4ad1-a4b0-74b112e33256@virtuozzo.com> (raw)
In-Reply-To: <20251202193001.GB964933@fedora>
On 12/2/25 9:30 PM, Stefan Hajnoczi wrote:
> On Tue, Dec 02, 2025 at 06:31:19PM +0200, Andrey Drobyshev wrote:
>> Commit 772f86839f ("scripts/qemu-gdb: Support coroutine dumps in
>> coredumps") introduced coroutine traces in coredumps using raw stack
>> unwinding. While this works, this approach does not allow to view the
>> function arguments in the corresponding stack frames.
>>
>> As an alternative, we can obtain saved registers from the coroutine's
>> jmpbuf, patch them into the coredump's struct elf_prstatus in place, and
>> execute another gdb subprocess to get backtrace from the patched temporary
>> coredump.
>>
>> While providing more detailed info, this alternative approach, however, is
>> more invasive as it might potentially corrupt the coredump file. We do take
>> precautions by saving the original registers values into a separate binary
>> blob /path/to/coredump.ptregs, so that it can be restores in the next
>> GDB session. Still, instead of making it a new deault, let's keep raw unwind
>> the default behaviour, but add the '--detailed' option for 'qemu bt' and
>> 'qemu coroutine' command which would enforce the new behaviour.
>>
>> That's how this looks:
>>
>> (gdb) qemu coroutine 0x7fda9335a508
>> #0 0x5602bdb41c26 in qemu_coroutine_switch<+214> () at ../util/coroutine-ucontext.c:321
>> #1 0x5602bdb3e8fe in qemu_aio_coroutine_enter<+493> () at ../util/qemu-coroutine.c:293
>> #2 0x5602bdb3c4eb in co_schedule_bh_cb<+538> () at ../util/async.c:547
>> #3 0x5602bdb3b518 in aio_bh_call<+119> () at ../util/async.c:172
>> #4 0x5602bdb3b79a in aio_bh_poll<+457> () at ../util/async.c:219
>> #5 0x5602bdb10f22 in aio_poll<+1201> () at ../util/aio-posix.c:719
>> #6 0x5602bd8fb1ac in iothread_run<+123> () at ../iothread.c:63
>> #7 0x5602bdb18a24 in qemu_thread_start<+355> () at ../util/qemu-thread-posix.c:393
>>
>> (gdb) qemu coroutine 0x7fda9335a508 --detailed
>> patching core file /tmp/tmpq4hmk2qc
>> found "CORE" at 0x10c48
>> assume pt_regs at 0x10cbc
>> write r15 at 0x10cbc
>> write r14 at 0x10cc4
>> write r13 at 0x10ccc
>> write r12 at 0x10cd4
>> write rbp at 0x10cdc
>> write rbx at 0x10ce4
>> write rip at 0x10d3c
>> write rsp at 0x10d54
>>
>> #0 0x00005602bdb41c26 in qemu_coroutine_switch (from_=0x7fda9335a508, to_=0x7fda8400c280, action=COROUTINE_ENTER) at ../util/coroutine-ucontext.c:321
>> #1 0x00005602bdb3e8fe in qemu_aio_coroutine_enter (ctx=0x5602bf7147c0, co=0x7fda8400c280) at ../util/qemu-coroutine.c:293
>> #2 0x00005602bdb3c4eb in co_schedule_bh_cb (opaque=0x5602bf7147c0) at ../util/async.c:547
>> #3 0x00005602bdb3b518 in aio_bh_call (bh=0x5602bf714a40) at ../util/async.c:172
>> #4 0x00005602bdb3b79a in aio_bh_poll (ctx=0x5602bf7147c0) at ../util/async.c:219
>> #5 0x00005602bdb10f22 in aio_poll (ctx=0x5602bf7147c0, blocking=true) at ../util/aio-posix.c:719
>> #6 0x00005602bd8fb1ac in iothread_run (opaque=0x5602bf42b100) at ../iothread.c:63
>> #7 0x00005602bdb18a24 in qemu_thread_start (args=0x5602bf7164a0) at ../util/qemu-thread-posix.c:393
>> #8 0x00007fda9e89f7f2 in start_thread (arg=<optimized out>) at pthread_create.c:443
>> #9 0x00007fda9e83f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>>
>> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> CC: Peter Xu <peterx@redhat.com>
>> Originally-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
>> ---
>> scripts/qemugdb/coroutine.py | 243 +++++++++++++++++++++++++++++++++--
>> 1 file changed, 233 insertions(+), 10 deletions(-)
>>
>> diff --git a/scripts/qemugdb/coroutine.py b/scripts/qemugdb/coroutine.py
>> index e98fc48a4b..280c02c12d 100644
>> --- a/scripts/qemugdb/coroutine.py
>> +++ b/scripts/qemugdb/coroutine.py
>> @@ -10,9 +10,116 @@
>> # or later. See the COPYING file in the top-level directory.
>>
>> import gdb
>> +import os
>> +import pty
>> +import re
>> +import struct
>> +import textwrap
>> +
>> +from collections import OrderedDict
>> +from copy import deepcopy
>>
>> VOID_PTR = gdb.lookup_type('void').pointer()
>>
>> +# Registers in the same order they're present in ELF coredump file.
>> +# See asm/ptrace.h
>> +PT_REGS = ['r15', 'r14', 'r13', 'r12', 'rbp', 'rbx', 'r11', 'r10', 'r9',
>> + 'r8', 'rax', 'rcx', 'rdx', 'rsi', 'rdi', 'orig_rax', 'rip', 'cs',
>> + 'eflags', 'rsp', 'ss']
>> +
>> +coredump = None
>> +
>> +
>> +class Coredump:
>> + _ptregs_suff = '.ptregs'
>> +
>> + def __init__(self, coredump, executable):
>> + gdb.events.exited.connect(self._cleanup)
>
> It's not clear to me that this cleanup mechanism is reliable:
>
> - The restore_regs() method is called from invoke(), but not in a
> `finally` block that would guarantee it runs even when an exception is
> thrown. Maybe _cleanup() can be called without a prior restore_regs()
> call. It would be inconvenient to lose the original register values.
>
Agreed. We might as well put restore_regs() call into a `finally` block
to make sure it's called in any case, like that:
> try:
> while True:
> co = co_cast(co_ptr)
> co_ptr = co["base"]["caller"]
> if co_ptr == 0:
> break
> gdb.write("\nCoroutine at " + str(co_ptr) + ":\n")
> bt_jmpbuf(coroutine_to_jmpbuf(co_ptr), detailed=detailed)
>
> finally:
> coredump.restore_regs()
And also we should probably call restore_regs() during the cleanup if
the dirty flag is set.
> - I'm not sure if gdb.events.exited (when GDB's inferior terminates) is
> the correct event to ensure cleanup. The worst case is that the
> temporary file is leaked, which is not a serious problem.
>
Hmm indeed, this callback isn't called upon signals. I guess we can
just call atexit.register(self._cleanup). This seems to handle both
normal and abnormal exit (except SIGKILL of course).
> But then this is a debugging script and it's probably fine:
>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
prev parent reply other threads:[~2025-12-03 9:41 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-02 16:31 [PATCH v2 0/4] Fixes and improvements for scripts/qemugdb commands Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 1/4] scripts/qemugdb: mtree: Fix OverflowError in mtree with 128-bit addresses Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 2/4] scripts/qemugdb: timers: Fix KeyError in 'qemu timers' command Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 3/4] scripts/qemugdb: timers: Improve 'qemu timers' command readability Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump Andrey Drobyshev
2025-12-02 19:30 ` Stefan Hajnoczi
2025-12-03 9:39 ` Andrey Drobyshev [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3eda31c5-84e5-4ad1-a4b0-74b112e33256@virtuozzo.com \
--to=andrey.drobyshev@virtuozzo.com \
--cc=den@virtuozzo.com \
--cc=kwolf@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).