qemu-devel.nongnu.org archive mirror
From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, peterx@redhat.com,
	vsementsov@yandex-team.ru, den@virtuozzo.com
Subject: Re: [PATCH 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump
Date: Thu, 27 Nov 2025 18:13:49 +0200
Message-ID: <1ec1b39c-96a9-498a-bbcd-f16aa33e58c3@virtuozzo.com>
In-Reply-To: <20251127144806.GB609942@fedora>

On 11/27/25 4:48 PM, Stefan Hajnoczi wrote:
> On Thu, Nov 27, 2025 at 03:14:43PM +0200, Andrey Drobyshev wrote:
>> On 11/26/25 10:58 PM, Stefan Hajnoczi wrote:
>>> On Tue, Nov 25, 2025 at 04:21:05PM +0200, andrey.drobyshev@virtuozzo.com wrote:
>>>> From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
>>>>
>>>> Commit 772f86839f ("scripts/qemu-gdb: Support coroutine dumps in
>>>> coredumps") introduced coroutine traces in coredumps using raw stack
>>>> unwinding.  While this works, this approach does not allow viewing the
>>>> function arguments in the corresponding stack frames.
>>>>
>>>> As an alternative, we can obtain saved registers from the coroutine's
>>>> jmpbuf, copy the original coredump file into a temporary file, patch the
>>>> saved registers into the tmp coredump's struct elf_prstatus and execute
>>>> another gdb subprocess to get a backtrace from the patched temporary coredump.
>>>>
>>>> While providing more detailed info, this alternative approach is
>>>> quite heavyweight, as it takes significantly more time and disk space.
>>>> So, instead of making it the new default, let's keep raw unwinding as the
>>>> default behaviour, but add a '--detailed' option to the 'qemu bt' and
>>>> 'qemu coroutine' commands that enables the new behaviour.
>>>
>>> Wow, that's a big hack around GDB limitations but I don't see any harm
>>> in offering this as an option.
>>>
>>>>
>>>> This is how it looks:
>>>>
>>>>   (gdb) qemu coroutine 0x7fda9335a508
>>>>   #0  0x5602bdb41c26 in qemu_coroutine_switch<+214> () at ../util/coroutine-ucontext.c:321
>>>>   #1  0x5602bdb3e8fe in qemu_aio_coroutine_enter<+493> () at ../util/qemu-coroutine.c:293
>>>>   #2  0x5602bdb3c4eb in co_schedule_bh_cb<+538> () at ../util/async.c:547
>>>>   #3  0x5602bdb3b518 in aio_bh_call<+119> () at ../util/async.c:172
>>>>   #4  0x5602bdb3b79a in aio_bh_poll<+457> () at ../util/async.c:219
>>>>   #5  0x5602bdb10f22 in aio_poll<+1201> () at ../util/aio-posix.c:719
>>>>   #6  0x5602bd8fb1ac in iothread_run<+123> () at ../iothread.c:63
>>>>   #7  0x5602bdb18a24 in qemu_thread_start<+355> () at ../util/qemu-thread-posix.c:393
>>>>
>>>>   (gdb) qemu coroutine 0x7fda9335a508 --detailed
>>>>   patching core file /tmp/tmpq4hmk2qc
>>>>   found "CORE" at 0x10c48
>>>>   assume pt_regs at 0x10cbc
>>>>   write r15 at 0x10cbc
>>>>   write r14 at 0x10cc4
>>>>   write r13 at 0x10ccc
>>>>   write r12 at 0x10cd4
>>>>   write rbp at 0x10cdc
>>>>   write rbx at 0x10ce4
>>>>   write rip at 0x10d3c
>>>>   write rsp at 0x10d54
>>>>
>>>>   #0  0x00005602bdb41c26 in qemu_coroutine_switch (from_=0x7fda9335a508, to_=0x7fda8400c280, action=COROUTINE_ENTER) at ../util/coroutine-ucontext.c:321
>>>>   #1  0x00005602bdb3e8fe in qemu_aio_coroutine_enter (ctx=0x5602bf7147c0, co=0x7fda8400c280) at ../util/qemu-coroutine.c:293
>>>>   #2  0x00005602bdb3c4eb in co_schedule_bh_cb (opaque=0x5602bf7147c0) at ../util/async.c:547
>>>>   #3  0x00005602bdb3b518 in aio_bh_call (bh=0x5602bf714a40) at ../util/async.c:172
>>>>   #4  0x00005602bdb3b79a in aio_bh_poll (ctx=0x5602bf7147c0) at ../util/async.c:219
>>>>   #5  0x00005602bdb10f22 in aio_poll (ctx=0x5602bf7147c0, blocking=true) at ../util/aio-posix.c:719
>>>>   #6  0x00005602bd8fb1ac in iothread_run (opaque=0x5602bf42b100) at ../iothread.c:63
>>>>   #7  0x00005602bdb18a24 in qemu_thread_start (args=0x5602bf7164a0) at ../util/qemu-thread-posix.c:393
>>>>   #8  0x00007fda9e89f7f2 in start_thread (arg=<optimized out>) at pthread_create.c:443
>>>>   #9  0x00007fda9e83f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>>>>
>>>> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>>> CC: Peter Xu <peterx@redhat.com>
>>>> Originally-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
>>>> ---
>>>>  scripts/qemugdb/coroutine.py | 126 ++++++++++++++++++++++++++++++++---
>>>>  1 file changed, 115 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/scripts/qemugdb/coroutine.py b/scripts/qemugdb/coroutine.py
>>>> index e98fc48a4b..b1c7f96af5 100644
>>>> --- a/scripts/qemugdb/coroutine.py
>>>> +++ b/scripts/qemugdb/coroutine.py
>>>> @@ -10,6 +10,13 @@
>>>>  # or later.  See the COPYING file in the top-level directory.
>>>>  
>>>>  import gdb
>>>> +import os
>>>> +import re
>>>> +import struct
>>>> +import shutil
>>>> +import subprocess
>>>> +import tempfile
>>>> +import textwrap
>>>>  
>>>>  VOID_PTR = gdb.lookup_type('void').pointer()
>>>>  
>>>> @@ -77,6 +84,65 @@ def symbol_lookup(addr):
>>>>  
>>>>      return f"{func_str} at {path}:{line}"
>>>>  
>>>> +def write_regs_to_coredump(corefile, set_regs):
>>>> +    # asm/ptrace.h
>>>> +    pt_regs = ['r15', 'r14', 'r13', 'r12', 'rbp', 'rbx', 'r11', 'r10',
>>>> +               'r9', 'r8', 'rax', 'rcx', 'rdx', 'rsi', 'rdi', 'orig_rax',
>>>> +               'rip', 'cs', 'eflags', 'rsp', 'ss']
>>>> +
>>>> +    with open(corefile, 'r+b') as f:
>>>> +        gdb.write(f'patching core file {corefile}\n')
>>>> +
>>>> +        while f.read(4) != b'CORE':
>>>> +            pass
>>>> +        gdb.write(f'found "CORE" at 0x{f.tell():x}\n')
>>>> +
>>>> +        # Looking for struct elf_prstatus and pr_reg field in it (an array
>>>> +        # of general purpose registers).  See sys/procfs.h
>>>> +
>>>> +        # lseek(f.fileno(), 4, SEEK_CUR): go to elf_prstatus
>>>> +        f.seek(4, 1)
>>>> +        # lseek(f.fileno(), 112, SEEK_CUR): offsetof(struct elf_prstatus, pr_reg)
>>>> +        f.seek(112, 1)
>>>> +
>>>> +        gdb.write(f'assume pt_regs at 0x{f.tell():x}\n')
>>>> +        for reg in pt_regs:
>>>> +            if reg in set_regs:
>>>> +                gdb.write(f'write {reg} at 0x{f.tell():x}\n')
>>>> +                f.write(struct.pack('q', set_regs[reg]))
>>>> +            else:
>>>> +                # lseek(f.fileno(), 8, SEEK_CUR): go to the next reg
>>>> +                f.seek(8, 1)
>>>> +
>>>> +def clone_coredump(source, target, set_regs):
>>>> +    shutil.copyfile(source, target)
>>>> +    write_regs_to_coredump(target, set_regs)
>>>> +
>>>> +def dump_backtrace_patched(regs):
>>>> +    files = gdb.execute('info files', False, True).split('\n')
>>>> +    executable = re.match('^Symbols from "(.*)".$', files[0]).group(1)
>>>> +    dump = re.search("`(.*)'", files[2]).group(1)
>>>> +
>>>> +    with tempfile.NamedTemporaryFile(dir='/tmp', delete=False) as f:
>>>> +        tmpcore = f.name
>>>> +
>>>> +    clone_coredump(dump, tmpcore, regs)
>>>> +
>>>> +    cmd = ['script', '-qec',
>>>> +           'gdb -batch ' +
>>>> +           '-ex "set complaints 0" ' +
>>>> +           '-ex "set verbose off" ' +
>>>> +           '-ex "set style enabled on" ' +
>>>> +           '-ex "python print(\'----split----\')" ' +
>>>> +           f'-ex bt {executable} {tmpcore}',
>>>> +           '/dev/null']
>>>> +    out = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
>>>
>>> Is script(1) necessary or just something you used for debugging?
>>>
>>> On Fedora 43 the script(1) utility isn't installed by default. Due to
>>> its generic name it's also a little hard to find the package name
>>> online. It would be nice to print a help message pointing to the
>>> packages. From what I can tell, script(1) is available in
>>> util-linux-script on Red Hat-based distros, bsdutils on Debian-based
>>> distros, and util-linux on Arch.
>>>
>>> [...]
>> My sole purpose for using script(1) was to make the GDB subprocess produce
>> a colored stack trace, just like what we get when calling 'bt' in a
>> regular GDB session.  I just find it easier to read.  So, unless there's
>> an easier way to achieve that same result, I'd prefer to keep using
>> script(1).
> 
> Have you tried the pty Python standard library module?
> https://docs.python.org/3/library/pty.html
> 

I haven't until now.  Although using it requires manually calling
fork() and managing the master/slave fds, it does serve its purpose and
eliminates the dependency on external programs.  So thank you, I'll add
it in v2.
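
A minimal sketch of the pty-based idea, not the code planned for v2: it
combines pty.openpty() with subprocess.Popen so that the child sees a
terminal on stdout without a manual fork().  The helper name run_with_pty
is hypothetical, and whether GDB emits styled output in this setup is an
assumption that would need checking.

```python
import os
import pty
import subprocess


def run_with_pty(cmd):
    """Run cmd with stdout/stderr on a pseudo-terminal; return its output as bytes."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(cmd, stdin=subprocess.DEVNULL,
                                stdout=slave_fd, stderr=slave_fd)
    except Exception:
        os.close(master_fd)
        raise
    finally:
        # The child holds its own copies of the slave end; the parent
        # must close its copy so that reads on the master can terminate.
        os.close(slave_fd)

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                # On Linux, reading the master after the last slave fd
                # is closed fails with EIO, which marks end of output.
                break
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
    proc.wait()
    return b''.join(chunks)
```

Note that output passing through a terminal typically has "\n" translated
to "\r\n", so a caller splitting the result into lines may want to strip
the carriage returns.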

>>
>> But your point is of course valid -- I didn't think of the case when
>> the script(1) program might not be installed.  Since we're just decorating
>> the output here, instead of failing with a help message I'd suggest
>> simply checking whether the script(1) binary is present on the system with
>> something like shutil.which(), and only using it if it is.  I'll update the
>> patch accordingly, if there are no objections.
>>
>> Andrey
>>
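
For comparison, the shutil.which() fallback suggested in the quoted text
could look roughly like the sketch below.  The helper name
wrap_with_script and its "which" parameter are hypothetical (the
parameter only exists to make the function testable); the script(1)
options are the ones used in the quoted patch.  Building the command with
shlex.join() (Python 3.8+) also quotes paths that contain spaces, which
the f-string in the patch does not.

```python
import shlex
import shutil


def wrap_with_script(argv, which=shutil.which):
    """Wrap argv in script(1) if it is installed, else return argv unchanged."""
    script = which('script')
    if script is None:
        # script(1) is only used to decorate the output, so fall back
        # to running the command directly.
        return list(argv)
    # script -qec '<command>' /dev/null runs the command on a pty and
    # discards the typescript file.
    return [script, '-qec', shlex.join(argv), '/dev/null']
```

The returned list can be passed to subprocess.check_output() as in the
patch; without script(1) the backtrace is simply produced with stdout
attached to a pipe rather than a terminal.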

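As an aside on the core-patching step in the quoted
write_regs_to_coredump(): the same layout arithmetic can be sketched on an
in-memory buffer.  This is an illustration under the patch's own
assumptions (x86_64, the first "CORE" note is the NT_PRSTATUS of interest,
pr_reg at offset 112), not a replacement for the patch.  The name
patch_prstatus_regs is hypothetical.  Unlike the quoted loop, it searches
with find() instead of reading 4 bytes at a time, so it does not depend on
alignment and raises an error rather than spinning at end of file.

```python
import struct

# Register order in the x86_64 pr_reg array, as listed in the quoted patch.
PT_REGS = ['r15', 'r14', 'r13', 'r12', 'rbp', 'rbx', 'r11', 'r10',
           'r9', 'r8', 'rax', 'rcx', 'rdx', 'rsi', 'rdi', 'orig_rax',
           'rip', 'cs', 'eflags', 'rsp', 'ss']

NOTE_NAME_SIZE = 8    # "CORE\0" padded to 8 bytes (the patch reads 4, skips 4)
PR_REG_OFFSET = 112   # offsetof(struct elf_prstatus, pr_reg), from the patch


def patch_prstatus_regs(core, set_regs):
    """Patch registers into a bytearray core image; return {reg: offset}."""
    name_at = core.find(b'CORE\0')
    if name_at < 0:
        raise ValueError('no "CORE" note name found')
    base = name_at + NOTE_NAME_SIZE + PR_REG_OFFSET
    if base + 8 * len(PT_REGS) > len(core):
        raise ValueError('core image is truncated')

    written = {}
    for index, reg in enumerate(PT_REGS):
        if reg in set_regs:
            offset = base + 8 * index
            # Unsigned little-endian 64-bit, so values >= 2**63 also pack.
            value = set_regs[reg] & 0xFFFFFFFFFFFFFFFF
            struct.pack_into('<Q', core, offset, value)
            written[reg] = offset
    return written
```

The offsets agree with the sample output in the quoted commit message:
"CORE" is reported at 0x10c48 (the position after the 4 bytes read), and
0x10c48 + 4 + 112 = 0x10cbc, which is where pt_regs is assumed to start.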

