qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, peterx@redhat.com,
	vsementsov@yandex-team.ru, den@virtuozzo.com
Subject: Re: [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump
Date: Wed, 3 Dec 2025 11:39:07 +0200	[thread overview]
Message-ID: <3eda31c5-84e5-4ad1-a4b0-74b112e33256@virtuozzo.com> (raw)
In-Reply-To: <20251202193001.GB964933@fedora>

On 12/2/25 9:30 PM, Stefan Hajnoczi wrote:
> On Tue, Dec 02, 2025 at 06:31:19PM +0200, Andrey Drobyshev wrote:
>> Commit 772f86839f ("scripts/qemu-gdb: Support coroutine dumps in
>> coredumps") introduced coroutine traces in coredumps using raw stack
>> unwinding.  While this works, this approach does not allow to view the
>> function arguments in the corresponding stack frames.
>>
>> As an alternative, we can obtain saved registers from the coroutine's
>> jmpbuf, patch them into the coredump's struct elf_prstatus in place, and
>> execute another gdb subprocess to get backtrace from the patched temporary
>> coredump.
>>
>> While providing more detailed info, this alternative approach, however, is
>> more invasive as it might potentially corrupt the coredump file. We do take
>> precautions by saving the original registers values into a separate binary
>> blob /path/to/coredump.ptregs, so that it can be restores in the next
>> GDB session.  Still, instead of making it a new deault, let's keep raw unwind
>> the default behaviour, but add the '--detailed' option for 'qemu bt' and
>> 'qemu coroutine' command which would enforce the new behaviour.
>>
>> That's how this looks:
>>
>>   (gdb) qemu coroutine 0x7fda9335a508
>>   #0  0x5602bdb41c26 in qemu_coroutine_switch<+214> () at ../util/coroutine-ucontext.c:321
>>   #1  0x5602bdb3e8fe in qemu_aio_coroutine_enter<+493> () at ../util/qemu-coroutine.c:293
>>   #2  0x5602bdb3c4eb in co_schedule_bh_cb<+538> () at ../util/async.c:547
>>   #3  0x5602bdb3b518 in aio_bh_call<+119> () at ../util/async.c:172
>>   #4  0x5602bdb3b79a in aio_bh_poll<+457> () at ../util/async.c:219
>>   #5  0x5602bdb10f22 in aio_poll<+1201> () at ../util/aio-posix.c:719
>>   #6  0x5602bd8fb1ac in iothread_run<+123> () at ../iothread.c:63
>>   #7  0x5602bdb18a24 in qemu_thread_start<+355> () at ../util/qemu-thread-posix.c:393
>>
>>   (gdb) qemu coroutine 0x7fda9335a508 --detailed
>>   patching core file /tmp/tmpq4hmk2qc
>>   found "CORE" at 0x10c48
>>   assume pt_regs at 0x10cbc
>>   write r15 at 0x10cbc
>>   write r14 at 0x10cc4
>>   write r13 at 0x10ccc
>>   write r12 at 0x10cd4
>>   write rbp at 0x10cdc
>>   write rbx at 0x10ce4
>>   write rip at 0x10d3c
>>   write rsp at 0x10d54
>>
>>   #0  0x00005602bdb41c26 in qemu_coroutine_switch (from_=0x7fda9335a508, to_=0x7fda8400c280, action=COROUTINE_ENTER) at ../util/coroutine-ucontext.c:321
>>   #1  0x00005602bdb3e8fe in qemu_aio_coroutine_enter (ctx=0x5602bf7147c0, co=0x7fda8400c280) at ../util/qemu-coroutine.c:293
>>   #2  0x00005602bdb3c4eb in co_schedule_bh_cb (opaque=0x5602bf7147c0) at ../util/async.c:547
>>   #3  0x00005602bdb3b518 in aio_bh_call (bh=0x5602bf714a40) at ../util/async.c:172
>>   #4  0x00005602bdb3b79a in aio_bh_poll (ctx=0x5602bf7147c0) at ../util/async.c:219
>>   #5  0x00005602bdb10f22 in aio_poll (ctx=0x5602bf7147c0, blocking=true) at ../util/aio-posix.c:719
>>   #6  0x00005602bd8fb1ac in iothread_run (opaque=0x5602bf42b100) at ../iothread.c:63
>>   #7  0x00005602bdb18a24 in qemu_thread_start (args=0x5602bf7164a0) at ../util/qemu-thread-posix.c:393
>>   #8  0x00007fda9e89f7f2 in start_thread (arg=<optimized out>) at pthread_create.c:443
>>   #9  0x00007fda9e83f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>>
>> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> CC: Peter Xu <peterx@redhat.com>
>> Originally-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
>> ---
>>  scripts/qemugdb/coroutine.py | 243 +++++++++++++++++++++++++++++++++--
>>  1 file changed, 233 insertions(+), 10 deletions(-)
>>
>> diff --git a/scripts/qemugdb/coroutine.py b/scripts/qemugdb/coroutine.py
>> index e98fc48a4b..280c02c12d 100644
>> --- a/scripts/qemugdb/coroutine.py
>> +++ b/scripts/qemugdb/coroutine.py
>> @@ -10,9 +10,116 @@
>>  # or later.  See the COPYING file in the top-level directory.
>>  
>>  import gdb
>> +import os
>> +import pty
>> +import re
>> +import struct
>> +import textwrap
>> +
>> +from collections import OrderedDict
>> +from copy import deepcopy
>>  
>>  VOID_PTR = gdb.lookup_type('void').pointer()
>>  
>> +# Registers in the same order they're present in ELF coredump file.
>> +# See asm/ptrace.h
>> +PT_REGS = ['r15', 'r14', 'r13', 'r12', 'rbp', 'rbx', 'r11', 'r10', 'r9',
>> +           'r8', 'rax', 'rcx', 'rdx', 'rsi', 'rdi', 'orig_rax', 'rip', 'cs',
>> +           'eflags', 'rsp', 'ss']
>> +
>> +coredump = None
>> +
>> +
>> +class Coredump:
>> +    _ptregs_suff = '.ptregs'
>> +
>> +    def __init__(self, coredump, executable):
>> +        gdb.events.exited.connect(self._cleanup)
> 
> It's not clear to me that this cleanup mechanism is reliable:
> 
> - The restore_regs() method is called from invoke(), but not in a
>   `finally` block that would guarantee it runs even when an exception is
>   thrown. Maybe _cleanup() can be called without a prior restore_regs()
>   call. It would be inconvenient to lose the original register values.
> 

Agreed.  We might as well put restore_regs() call into a `finally` block
to make sure it's called in any case, like that:

>         try:
>             while True:
>                 co = co_cast(co_ptr)
>                 co_ptr = co["base"]["caller"]
>                 if co_ptr == 0:
>                     break
>                 gdb.write("\nCoroutine at " + str(co_ptr) + ":\n")
>                 bt_jmpbuf(coroutine_to_jmpbuf(co_ptr), detailed=detailed)
> 
>         finally:
>             coredump.restore_regs()

And also we should probably call restore_regs() during the cleanup if
the dirty flag is set.

> - I'm not sure if gdb.events.exited (when GDB's inferior terminates) is
>   the correct event to ensure cleanup. The worst case is that the
>   temporary file is leaked, which is not a serious problem.
> 

Hmm indeed, this callback isn't called upon signals.  I guess we can
just call atexit.register(self._cleanup).  This seems to handle both
normal and abnormal exit (except SIGKILL of course).

> But then this is a debugging script and it's probably fine:
> 
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>



      reply	other threads:[~2025-12-03  9:41 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-02 16:31 [PATCH v2 0/4] Fixes and improvements for scripts/qemugdb commands Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 1/4] scripts/qemugdb: mtree: Fix OverflowError in mtree with 128-bit addresses Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 2/4] scripts/qemugdb: timers: Fix KeyError in 'qemu timers' command Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 3/4] scripts/qemugdb: timers: Improve 'qemu timers' command readability Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump Andrey Drobyshev
2025-12-02 19:30   ` Stefan Hajnoczi
2025-12-03  9:39     ` Andrey Drobyshev [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3eda31c5-84e5-4ad1-a4b0-74b112e33256@virtuozzo.com \
    --to=andrey.drobyshev@virtuozzo.com \
    --cc=den@virtuozzo.com \
    --cc=kwolf@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).