* [PATCH v2 0/4] Fixes and improvements for scripts/qemugdb commands
@ 2025-12-02 16:31 Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 1/4] scripts/qemugdb: mtree: Fix OverflowError in mtree with 128-bit addresses Andrey Drobyshev
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Andrey Drobyshev @ 2025-12-02 16:31 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, peterx, stefanha, vsementsov, den, andrey.drobyshev
v1 -> v2:
* Use pty module instead of script(1) for producing colored output;
* Patch coredump file in place instead of full copy;
* Save and restore original pt_regs values in a separate file;
* Wrap this logic in a separate class.
v1: https://lore.kernel.org/qemu-devel/20251125142105.448289-1-andrey.drobyshev@virtuozzo.com/
Andrey Drobyshev (4):
scripts/qemugdb: mtree: Fix OverflowError in mtree with 128-bit
addresses
scripts/qemugdb: timers: Fix KeyError in 'qemu timers' command
scripts/qemugdb: timers: Improve 'qemu timers' command readability
scripts/qemugdb: coroutine: Add option for obtaining detailed trace in
coredump
scripts/qemugdb/coroutine.py | 243 +++++++++++++++++++++++++++++++++--
scripts/qemugdb/mtree.py | 2 +-
scripts/qemugdb/timers.py | 54 ++++++--
3 files changed, 280 insertions(+), 19 deletions(-)
--
2.43.5
* [PATCH v2 1/4] scripts/qemugdb: mtree: Fix OverflowError in mtree with 128-bit addresses
2025-12-02 16:31 [PATCH v2 0/4] Fixes and improvements for scripts/qemugdb commands Andrey Drobyshev
@ 2025-12-02 16:31 ` Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 2/4] scripts/qemugdb: timers: Fix KeyError in 'qemu timers' command Andrey Drobyshev
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Andrey Drobyshev @ 2025-12-02 16:31 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, peterx, stefanha, vsementsov, den, andrey.drobyshev
The 'qemu mtree' command fails with "OverflowError: int too big to
convert" when memory regions have 128-bit addresses.
Fix by changing conversion base from 16 to 0 (automatic detection based
on string prefix). This works more reliably in GDB's embedded
Python.
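As a quick illustration of the base change (a minimal sketch in plain Python, outside GDB; the helper name and the sample strings are made up for the example and are not actual GDB output), base 0 lets int() pick the radix from the string's prefix, whereas base 16 treats every input as hexadecimal:

```python
def parse_value_text(text):
    """Hypothetical helper: parse an integer string with prefix detection."""
    # Base 0 infers the radix from the prefix: 0x (hex), 0o (octal),
    # 0b (binary), or plain decimal when there is no prefix.
    return int(text, 0)


# Illustrative inputs only; real gdb.Value formatting may differ.
for sample in ["0x1f", "255", "0x" + "f" * 32]:
    print(sample, "->", parse_value_text(sample))

# With base 16 a decimal-looking string is silently read as hex.
print(int("255", 16))
```

Python integers are arbitrary precision, so a full 128-bit value parses without any special handling once the string is interpreted with the right radix.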
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
scripts/qemugdb/mtree.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/qemugdb/mtree.py b/scripts/qemugdb/mtree.py
index 8fe42c3c12..77603c04b1 100644
--- a/scripts/qemugdb/mtree.py
+++ b/scripts/qemugdb/mtree.py
@@ -25,7 +25,7 @@ def int128(p):
     if p.type.code == gdb.TYPE_CODE_STRUCT:
         return int(p['lo']) + (int(p['hi']) << 64)
     else:
-        return int(("%s" % p), 16)
+        return int(("%s" % p), 0)

 class MtreeCommand(gdb.Command):
     '''Display the memory tree hierarchy'''
--
2.43.5
* [PATCH v2 2/4] scripts/qemugdb: timers: Fix KeyError in 'qemu timers' command
2025-12-02 16:31 [PATCH v2 0/4] Fixes and improvements for scripts/qemugdb commands Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 1/4] scripts/qemugdb: mtree: Fix OverflowError in mtree with 128-bit addresses Andrey Drobyshev
@ 2025-12-02 16:31 ` Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 3/4] scripts/qemugdb: timers: Improve 'qemu timers' command readability Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump Andrey Drobyshev
3 siblings, 0 replies; 7+ messages in thread
From: Andrey Drobyshev @ 2025-12-02 16:31 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, peterx, stefanha, vsementsov, den, andrey.drobyshev
Currently, invoking the 'qemu timers' command results in: "gdb.error: There
is no member named last". Let's drop the reference to the legacy 'last'
field of QEMUClock, as the field was removed in v4.2.0 by commit 3c2d4c8aa6a
("timer: last, remove last bits of last").
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
scripts/qemugdb/timers.py | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/scripts/qemugdb/timers.py b/scripts/qemugdb/timers.py
index 5714f92cc2..1219a96b32 100644
--- a/scripts/qemugdb/timers.py
+++ b/scripts/qemugdb/timers.py
@@ -36,10 +36,9 @@ def dump_timers(self, timer):

     def process_timerlist(self, tlist, ttype):
         gdb.write("Processing %s timers\n" % (ttype))
-        gdb.write(" clock %s is enabled:%s, last:%s\n" % (
+        gdb.write(" clock %s is enabled:%s\n" % (
             tlist['clock']['type'],
-            tlist['clock']['enabled'],
-            tlist['clock']['last']))
+            tlist['clock']['enabled']))
         if int(tlist['active_timers']) > 0:
             self.dump_timers(tlist['active_timers'])

--
2.43.5
* [PATCH v2 3/4] scripts/qemugdb: timers: Improve 'qemu timers' command readability
2025-12-02 16:31 [PATCH v2 0/4] Fixes and improvements for scripts/qemugdb commands Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 1/4] scripts/qemugdb: mtree: Fix OverflowError in mtree with 128-bit addresses Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 2/4] scripts/qemugdb: timers: Fix KeyError in 'qemu timers' command Andrey Drobyshev
@ 2025-12-02 16:31 ` Andrey Drobyshev
2025-12-02 16:31 ` [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump Andrey Drobyshev
3 siblings, 0 replies; 7+ messages in thread
From: Andrey Drobyshev @ 2025-12-02 16:31 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, peterx, stefanha, vsementsov, den, andrey.drobyshev
* Add the 'attributes' field from QEMUTimer;
* Stringify the field's value in accordance with macros from
include/qemu/timer.h;
* Make timer expiration times human-readable by converting from nanoseconds
to appropriate units (ms/s/min/hrs/days) and showing the scale factor
(ns/us/ms/s).
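The unit selection described above can be tried outside GDB with a standalone sketch like the following. The function and constant names are illustrative, and it assumes, as the patch does, that the expiry value is in nanoseconds:

```python
# Known QEMUTimer scale factors mapped to a short label (illustrative table).
SCALE_NAMES = {1: "ns", 1000: "us", 1000000: "ms", 1000000000: "s"}


def format_expire_time(expire_ns, scale):
    """Return a human-readable expiry time plus the timer's scale label."""
    secs = expire_ns / 1e9

    # Pick the largest unit that keeps the value readable.
    if secs < 1:
        val, unit = secs * 1000, "ms"
    elif secs < 60:
        val, unit = secs, "s"
    elif secs < 3600:
        val, unit = secs / 60, "min"
    elif secs < 86400:
        val, unit = secs / 3600, "hrs"
    else:
        val, unit = secs / 86400, "days"

    # Unknown scale factors are shown verbatim instead of being guessed.
    scale_str = SCALE_NAMES.get(scale, f"scale={scale}")
    return f"{val:.2f} {unit} [{scale_str}]"


print(format_expire_time(1_500_000_000, 1))
print(format_expire_time(7_200_000_000_000, 1_000_000))
```

Keeping the formatter free of gdb.Value objects makes it easy to check the unit boundaries without a debugging session.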
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
scripts/qemugdb/timers.py | 49 +++++++++++++++++++++++++++++++++++----
1 file changed, 44 insertions(+), 5 deletions(-)
diff --git a/scripts/qemugdb/timers.py b/scripts/qemugdb/timers.py
index 1219a96b32..916c71b74a 100644
--- a/scripts/qemugdb/timers.py
+++ b/scripts/qemugdb/timers.py
@@ -21,14 +21,53 @@ def __init__(self):
         gdb.Command.__init__(self, 'qemu timers', gdb.COMMAND_DATA,
                              gdb.COMPLETE_NONE)

+    def _format_expire_time(self, expire_time, scale):
+        "Return human-readable expiry time (ns) with scale info."
+        secs = expire_time / 1e9
+
+        # Select unit and compute value
+        if secs < 1:
+            val, unit = secs * 1000, "ms"
+        elif secs < 60:
+            val, unit = secs, "s"
+        elif secs < 3600:
+            val, unit = secs / 60, "min"
+        elif secs < 86400:
+            val, unit = secs / 3600, "hrs"
+        else:
+            val, unit = secs / 86400, "days"
+
+        scale_map = {1: "ns", 1000: "us", 1000000: "ms",
+                     1000000000: "s"}
+        scale_str = scale_map.get(scale, f"scale={scale}")
+        return f"{val:.2f} {unit} [{scale_str}]"
+
+    def _format_attribute(self, attr):
+        "Given QEMUTimer attributes value, return a human-readable string"
+
+        # From include/qemu/timer.h
+        if attr == 0:
+            value = 'NONE'
+        elif attr == 1 << 0:
+            value = 'ATTR_EXTERNAL'
+        elif attr == int(0xffffffff):
+            value = 'ATTR_ALL'
+        else:
+            value = 'UNKNOWN'
+        return f'{attr} <{value}>'
+
     def dump_timers(self, timer):
         "Follow a timer and recursively dump each one in the list."
         # timer should be of type QemuTimer
-        gdb.write(" timer %s/%s (cb:%s,opq:%s)\n" % (
-            timer['expire_time'],
-            timer['scale'],
-            timer['cb'],
-            timer['opaque']))
+        scale = int(timer['scale'])
+        expire_time = int(timer['expire_time'])
+        attributes = int(timer['attributes'])
+
+        time_str = self._format_expire_time(expire_time, scale)
+        attr_str = self._format_attribute(attributes)
+
+        gdb.write(f" timer at {time_str} (attr:{attr_str}, "
+                  f"cb:{timer['cb']}, opq:{timer['opaque']})\n")

         if int(timer['next']) > 0:
             self.dump_timers(timer['next'])
--
2.43.5
* [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump
2025-12-02 16:31 [PATCH v2 0/4] Fixes and improvements for scripts/qemugdb commands Andrey Drobyshev
` (2 preceding siblings ...)
2025-12-02 16:31 ` [PATCH v2 3/4] scripts/qemugdb: timers: Improve 'qemu timers' command readability Andrey Drobyshev
@ 2025-12-02 16:31 ` Andrey Drobyshev
2025-12-02 19:30 ` Stefan Hajnoczi
3 siblings, 1 reply; 7+ messages in thread
From: Andrey Drobyshev @ 2025-12-02 16:31 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, peterx, stefanha, vsementsov, den, andrey.drobyshev
Commit 772f86839f ("scripts/qemu-gdb: Support coroutine dumps in
coredumps") introduced coroutine traces in coredumps using raw stack
unwinding. While this works, this approach does not allow viewing the
function arguments in the corresponding stack frames.
As an alternative, we can obtain saved registers from the coroutine's
jmpbuf, patch them into the coredump's struct elf_prstatus in place, and
execute another gdb subprocess to get a backtrace from the patched
coredump.
While providing more detailed info, this alternative approach is, however,
more invasive, as it might potentially corrupt the coredump file. We do take
precautions by saving the original register values into a separate binary
blob /path/to/coredump.ptregs, so that they can be restored in the next
GDB session. Still, instead of making it the new default, let's keep raw
unwind the default behaviour, but add the '--detailed' option for the
'qemu bt' and 'qemu coroutine' commands, which enforces the new behaviour.
That's how this looks:
(gdb) qemu coroutine 0x7fda9335a508
#0 0x5602bdb41c26 in qemu_coroutine_switch<+214> () at ../util/coroutine-ucontext.c:321
#1 0x5602bdb3e8fe in qemu_aio_coroutine_enter<+493> () at ../util/qemu-coroutine.c:293
#2 0x5602bdb3c4eb in co_schedule_bh_cb<+538> () at ../util/async.c:547
#3 0x5602bdb3b518 in aio_bh_call<+119> () at ../util/async.c:172
#4 0x5602bdb3b79a in aio_bh_poll<+457> () at ../util/async.c:219
#5 0x5602bdb10f22 in aio_poll<+1201> () at ../util/aio-posix.c:719
#6 0x5602bd8fb1ac in iothread_run<+123> () at ../iothread.c:63
#7 0x5602bdb18a24 in qemu_thread_start<+355> () at ../util/qemu-thread-posix.c:393
(gdb) qemu coroutine 0x7fda9335a508 --detailed
patching core file /tmp/tmpq4hmk2qc
found "CORE" at 0x10c48
assume pt_regs at 0x10cbc
write r15 at 0x10cbc
write r14 at 0x10cc4
write r13 at 0x10ccc
write r12 at 0x10cd4
write rbp at 0x10cdc
write rbx at 0x10ce4
write rip at 0x10d3c
write rsp at 0x10d54
#0 0x00005602bdb41c26 in qemu_coroutine_switch (from_=0x7fda9335a508, to_=0x7fda8400c280, action=COROUTINE_ENTER) at ../util/coroutine-ucontext.c:321
#1 0x00005602bdb3e8fe in qemu_aio_coroutine_enter (ctx=0x5602bf7147c0, co=0x7fda8400c280) at ../util/qemu-coroutine.c:293
#2 0x00005602bdb3c4eb in co_schedule_bh_cb (opaque=0x5602bf7147c0) at ../util/async.c:547
#3 0x00005602bdb3b518 in aio_bh_call (bh=0x5602bf714a40) at ../util/async.c:172
#4 0x00005602bdb3b79a in aio_bh_poll (ctx=0x5602bf7147c0) at ../util/async.c:219
#5 0x00005602bdb10f22 in aio_poll (ctx=0x5602bf7147c0, blocking=true) at ../util/aio-posix.c:719
#6 0x00005602bd8fb1ac in iothread_run (opaque=0x5602bf42b100) at ../iothread.c:63
#7 0x00005602bdb18a24 in qemu_thread_start (args=0x5602bf7164a0) at ../util/qemu-thread-posix.c:393
#8 0x00007fda9e89f7f2 in start_thread (arg=<optimized out>) at pthread_create.c:443
#9 0x00007fda9e83f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
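The save, patch, and restore cycle described in this commit message can be sketched outside GDB on a throwaway file. This is only an illustration under stated assumptions: the three-register layout, the offset, the file names, and the helper names are all made up, and the file is opened in 'r+b' mode so that writes land at the seek position rather than at the end of the file:

```python
import os
import struct
import tempfile

REGS = ["r15", "r14", "rip"]      # shortened, illustrative register list
FMT = f"={len(REGS)}q"            # native byte order, 64-bit signed values


def read_regs(path, offset):
    """Read the register block stored at the given file offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return struct.unpack(FMT, f.read(struct.calcsize(FMT)))


def write_regs(path, offset, values):
    """Overwrite the register block in place without changing file size."""
    # "r+b" honours seek(); append mode may ignore it on some systems.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(struct.pack(FMT, *values))


def demo():
    offset = 16                   # hypothetical position of the registers
    with tempfile.TemporaryDirectory() as tmp:
        core = os.path.join(tmp, "fake.core")
        blob = core + ".ptregs"
        with open(core, "wb") as f:
            f.write(b"\0" * offset + struct.pack(FMT, 1, 2, 3) + b"tail")
        size = os.path.getsize(core)

        # 1. Save the original values in a side file before touching them.
        original = read_regs(core, offset)
        with open(blob, "wb") as b:
            b.write(struct.pack(FMT, *original))

        # 2. Patch the registers in place.
        write_regs(core, offset, (10, 20, 30))
        patched = read_regs(core, offset)

        # 3. Restore from the side file, then drop it.
        with open(blob, "rb") as b:
            write_regs(core, offset, struct.unpack(FMT, b.read()))
        os.unlink(blob)
        restored = read_regs(core, offset)

        return original, patched, restored, os.path.getsize(core) == size


print(demo())
```

Writing the side file before the first in-place write is what makes recovery possible if the session dies between steps 2 and 3.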
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Peter Xu <peterx@redhat.com>
Originally-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
---
scripts/qemugdb/coroutine.py | 243 +++++++++++++++++++++++++++++++++--
1 file changed, 233 insertions(+), 10 deletions(-)
diff --git a/scripts/qemugdb/coroutine.py b/scripts/qemugdb/coroutine.py
index e98fc48a4b..280c02c12d 100644
--- a/scripts/qemugdb/coroutine.py
+++ b/scripts/qemugdb/coroutine.py
@@ -10,9 +10,116 @@
 # or later. See the COPYING file in the top-level directory.

 import gdb
+import os
+import pty
+import re
+import struct
+import textwrap
+
+from collections import OrderedDict
+from copy import deepcopy

 VOID_PTR = gdb.lookup_type('void').pointer()

+# Registers in the same order they're present in ELF coredump file.
+# See asm/ptrace.h
+PT_REGS = ['r15', 'r14', 'r13', 'r12', 'rbp', 'rbx', 'r11', 'r10', 'r9',
+           'r8', 'rax', 'rcx', 'rdx', 'rsi', 'rdi', 'orig_rax', 'rip', 'cs',
+           'eflags', 'rsp', 'ss']
+
+coredump = None
+
+
+class Coredump:
+    _ptregs_suff = '.ptregs'
+
+    def __init__(self, coredump, executable):
+        gdb.events.exited.connect(self._cleanup)
+
+        self.coredump = coredump
+        self.executable = executable
+        self._ptregs_blob = coredump + self._ptregs_suff
+        self._dirty = False
+
+        with open(coredump, 'rb') as f:
+            while f.read(4) != b'CORE':
+                pass
+            gdb.write(f'core file {coredump}: found "CORE" at 0x{f.tell():x}\n')
+
+            # Looking for struct elf_prstatus and pr_reg field in it (an array
+            # of general purpose registers). See sys/procfs.h.
+
+            # lseek(f.fileno(), 4, SEEK_CUR): go to elf_prstatus
+            f.seek(4, 1)
+
+            # lseek(f.fileno(), 112, SEEK_CUR):
+            # offsetof(struct elf_prstatus, pr_reg)
+            f.seek(112, 1)
+
+            self._ptregs_offset = f.tell()
+
+            # If binary blob with the name /path/to/coredump + '.ptregs'
+            # exists, that means proper cleanup didn't happen during previous
+            # GDB session with the same coredump, and registers in the dump
+            # itself might've remained patched. Thus we restore original
+            # registers values from this blob
+            if os.path.exists(self._ptregs_blob):
+                with open(self._ptregs_blob, 'rb') as b:
+                    orig_ptregs_bytes = b.read()
+                self._dirty = True
+            else:
+                orig_ptregs_bytes = f.read(len(PT_REGS) * 8)
+
+        values = struct.unpack(f"={len(PT_REGS)}q", orig_ptregs_bytes)
+        self._orig_ptregs = OrderedDict(zip(PT_REGS, values))
+
+        if not os.path.exists(self._ptregs_blob):
+            gdb.write(f'saving original pt_regs in {self._ptregs_blob}\n')
+            with open(self._ptregs_blob, 'wb') as b:
+                b.write(orig_ptregs_bytes)
+
+        gdb.write('\n')
+
+    def patch_regs(self, regs):
+        gdb.write(f'patching core file {self.coredump}\n')
+        patched_ptregs = deepcopy(self._orig_ptregs)
+        int_regs = {k: int(v) for k, v in regs.items()}
+        patched_ptregs.update(int_regs)
+
+        with open(self.coredump, 'r+b') as f:
+            gdb.write(f'assume pt_regs at 0x{self._ptregs_offset:x}\n')
+            f.seek(self._ptregs_offset, 0)
+            gdb.write('writing regs:\n')
+            for reg in self._orig_ptregs.keys():
+                if reg in int_regs:
+                    gdb.write(f" {reg}: {int_regs[reg]:#16x}\n")
+            f.write(struct.pack(f"={len(PT_REGS)}q", *patched_ptregs.values()))
+
+        self._dirty = True
+        gdb.write('\n')
+
+    def restore_regs(self):
+        if not self._dirty:
+            return
+
+        gdb.write(f'\nrestoring original regs in core file {self.coredump}\n')
+        with open(self.coredump, 'r+b') as f:
+            gdb.write(f'assume pt_regs at 0x{self._ptregs_offset:x}\n')
+            f.seek(self._ptregs_offset, 0)
+            f.write(struct.pack(f"={len(PT_REGS)}q",
+                                *self._orig_ptregs.values()))
+
+        self._dirty = False
+        gdb.write('\n')
+
+    def _cleanup(self, event):
+        # If we've come to the proper cleanup upon the end of GDB session,
+        # that means original regs are already restored
+        if os.path.exists(self._ptregs_blob):
+            gdb.write(f'\nremoving saved pt_regs file {self._ptregs_blob}\n')
+            os.unlink(self._ptregs_blob)
+
+
 def pthread_self():
     '''Fetch the base address of TLS.'''
     return gdb.parse_and_eval("$fs_base")
@@ -77,6 +184,55 @@ def symbol_lookup(addr):

     return f"{func_str} at {path}:{line}"

+def run_with_pty(cmd):
+    # Create a PTY pair
+    master_fd, slave_fd = pty.openpty()
+
+    pid = os.fork()
+    if pid == 0:  # Child
+        os.close(master_fd)
+        # Attach stdin/stdout/stderr to the PTY slave side
+        os.dup2(slave_fd, 0)
+        os.dup2(slave_fd, 1)
+        os.dup2(slave_fd, 2)
+        os.close(slave_fd)
+        os.execvp("gdb", cmd)  # Runs gdb and doesn't return
+
+    # Parent
+    os.close(slave_fd)
+
+    output = bytearray()
+    try:
+        while True:
+            data = os.read(master_fd, 65536)
+            if not data:
+                break
+            output.extend(data)
+    except OSError:  # in case subprocess exits and we get EBADF on read()
+        pass
+    finally:
+        try:
+            os.close(master_fd)
+        except OSError:  # in case we get EBADF on close()
+            pass
+
+    # Wait for child to finish (reap zombie)
+    os.waitpid(pid, 0)
+
+    return output.decode('utf-8')
+
+def dump_backtrace_patched(regs):
+    cmd = ['gdb', '-batch',
+           '-ex', 'set debuginfod enabled off',
+           '-ex', 'set complaints 0',
+           '-ex', 'set style enabled on',
+           '-ex', 'python print("----split----")',
+           '-ex', 'bt', coredump.executable, coredump.coredump]
+
+    coredump.patch_regs(regs)
+    out = run_with_pty(cmd).split('----split----')[1]
+    gdb.write(out)
+
 def dump_backtrace(regs):
     '''
     Backtrace dump with raw registers, mimic GDB command 'bt'.
@@ -120,7 +276,7 @@ def dump_backtrace_live(regs):
     selected_frame.select()


-def bt_jmpbuf(jmpbuf):
+def bt_jmpbuf(jmpbuf, detailed=False):
     '''Backtrace a jmpbuf'''
     regs = get_jmpbuf_regs(jmpbuf)
     try:
@@ -128,8 +284,12 @@ def bt_jmpbuf(jmpbuf):
         # but only works with live sessions.
         dump_backtrace_live(regs)
     except:
-        # If above doesn't work, fallback to poor man's unwind
-        dump_backtrace(regs)
+        if detailed:
+            # Obtain detailed trace by patching regs in the coredump
+            dump_backtrace_patched(regs)
+        else:
+            # If above doesn't work, fallback to poor man's unwind
+            dump_backtrace(regs)

 def co_cast(co):
     return co.cast(gdb.lookup_type('CoroutineUContext').pointer())
@@ -138,28 +298,89 @@ def coroutine_to_jmpbuf(co):
     coroutine_pointer = co_cast(co)
     return coroutine_pointer['env']['__jmpbuf']

+def init_coredump():
+    global coredump
+
+    files = gdb.execute('info files', False, True).split('\n')
+
+    if not 'core dump' in files[1]:
+        return False
+
+    core_path = re.search("`(.*)'", files[2]).group(1)
+    exec_path = re.match('^Symbols from "(.*)".$', files[0]).group(1)
+
+    if coredump is None:
+        coredump = Coredump(core_path, exec_path)
+
+    return True

 class CoroutineCommand(gdb.Command):
-    '''Display coroutine backtrace'''
+    __doc__ = textwrap.dedent("""\
+        Display coroutine backtrace
+
+        Usage: qemu coroutine COROPTR [--detailed]
+        Show backtrace for a coroutine specified by COROPTR
+
+          --detailed       obtain detailed trace by patching regs in the
+                           coredump in place, and running gdb subprocess to
+                           get backtrace from the patched coredump
+    """)
+
     def __init__(self):
         gdb.Command.__init__(self, 'qemu coroutine', gdb.COMMAND_DATA,
                              gdb.COMPLETE_NONE)

+    def _usage(self):
+        gdb.write('usage: qemu coroutine <coroutine-pointer> [--detailed]\n')
+        return
+
     def invoke(self, arg, from_tty):
         argv = gdb.string_to_argv(arg)
-        if len(argv) != 1:
-            gdb.write('usage: qemu coroutine <coroutine-pointer>\n')
+        argc = len(argv)
+        if argc == 0 or argc > 2 or (argc == 2 and argv[1] != '--detailed'):
+            return self._usage()
+        detailed = True if argc == 2 else False
+
+        is_coredump = init_coredump()
+        if detailed and not is_coredump:
+            gdb.write('--detailed is only valid when debugging core dumps\n')
             return

-        bt_jmpbuf(coroutine_to_jmpbuf(gdb.parse_and_eval(argv[0])))
+        bt_jmpbuf(coroutine_to_jmpbuf(gdb.parse_and_eval(argv[0])),
+                  detailed=detailed)
+
+        coredump.restore_regs()

 class CoroutineBt(gdb.Command):
-    '''Display backtrace including coroutine switches'''
+    __doc__ = textwrap.dedent("""\
+        Display backtrace including coroutine switches
+
+        Usage: qemu bt [--detailed]
+
+          --detailed       obtain detailed trace by patching regs in the
+                           coredump in place, and running gdb subprocess to
+                           get backtrace from the patched coredump
+    """)
+
     def __init__(self):
         gdb.Command.__init__(self, 'qemu bt', gdb.COMMAND_STACK,
                              gdb.COMPLETE_NONE)

+    def _usage(self):
+        gdb.write('usage: qemu bt [--detailed]\n')
+        return
+
     def invoke(self, arg, from_tty):
+        argv = gdb.string_to_argv(arg)
+        argc = len(argv)
+        if argc > 1 or (argc == 1 and argv[0] != '--detailed'):
+            return self._usage()
+        detailed = True if argc == 1 else False
+
+        is_coredump = init_coredump()
+        if detailed and not is_coredump:
+            gdb.write('--detailed is only valid when debugging core dumps\n')
+            return

         gdb.execute("bt")

@@ -178,8 +399,10 @@ def invoke(self, arg, from_tty):
             co_ptr = co["base"]["caller"]
             if co_ptr == 0:
                 break
-            gdb.write("Coroutine at " + str(co_ptr) + ":\n")
-            bt_jmpbuf(coroutine_to_jmpbuf(co_ptr))
+            gdb.write("\nCoroutine at " + str(co_ptr) + ":\n")
+            bt_jmpbuf(coroutine_to_jmpbuf(co_ptr), detailed=detailed)
+
+        coredump.restore_regs()

 class CoroutineSPFunction(gdb.Function):
     def __init__(self):
--
2.43.5
* Re: [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump
2025-12-02 16:31 ` [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump Andrey Drobyshev
@ 2025-12-02 19:30 ` Stefan Hajnoczi
2025-12-03 9:39 ` Andrey Drobyshev
0 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-12-02 19:30 UTC (permalink / raw)
To: Andrey Drobyshev; +Cc: qemu-devel, kwolf, peterx, vsementsov, den
On Tue, Dec 02, 2025 at 06:31:19PM +0200, Andrey Drobyshev wrote:
> +class Coredump:
> +    _ptregs_suff = '.ptregs'
> +
> +    def __init__(self, coredump, executable):
> +        gdb.events.exited.connect(self._cleanup)
It's not clear to me that this cleanup mechanism is reliable:
- The restore_regs() method is called from invoke(), but not in a
`finally` block that would guarantee it runs even when an exception is
thrown. Maybe _cleanup() can be called without a prior restore_regs()
call. It would be inconvenient to lose the original register values.
- I'm not sure if gdb.events.exited (when GDB's inferior terminates) is
the correct event to ensure cleanup. The worst case is that the
temporary file is leaked, which is not a serious problem.
But then this is a debugging script and it's probably fine:
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
* Re: [PATCH v2 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump
2025-12-02 19:30 ` Stefan Hajnoczi
@ 2025-12-03 9:39 ` Andrey Drobyshev
0 siblings, 0 replies; 7+ messages in thread
From: Andrey Drobyshev @ 2025-12-03 9:39 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel, kwolf, peterx, vsementsov, den
On 12/2/25 9:30 PM, Stefan Hajnoczi wrote:
> On Tue, Dec 02, 2025 at 06:31:19PM +0200, Andrey Drobyshev wrote:
>> +class Coredump:
>> +    _ptregs_suff = '.ptregs'
>> +
>> +    def __init__(self, coredump, executable):
>> +        gdb.events.exited.connect(self._cleanup)
>
> It's not clear to me that this cleanup mechanism is reliable:
>
> - The restore_regs() method is called from invoke(), but not in a
> `finally` block that would guarantee it runs even when an exception is
> thrown. Maybe _cleanup() can be called without a prior restore_regs()
> call. It would be inconvenient to lose the original register values.
>
Agreed. We might as well put the restore_regs() call into a `finally` block
to make sure it's called in any case, like this:
>         try:
>             while True:
>                 co = co_cast(co_ptr)
>                 co_ptr = co["base"]["caller"]
>                 if co_ptr == 0:
>                     break
>                 gdb.write("\nCoroutine at " + str(co_ptr) + ":\n")
>                 bt_jmpbuf(coroutine_to_jmpbuf(co_ptr), detailed=detailed)
>
>         finally:
>             coredump.restore_regs()
And also we should probably call restore_regs() during the cleanup if
the dirty flag is set.
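To make the effect of the `finally` block concrete, here is a minimal sketch that runs without GDB. FakeCoredump, run_detailed, and failing_backtrace are hypothetical stand-ins, not names from the patch; the stand-in only tracks the dirty flag instead of touching a file. The `is not None` guard is an assumption about the case where no Coredump object was ever created (for example, a live session):

```python
class FakeCoredump:
    """Stand-in for the Coredump class: tracks state, does no file I/O."""

    def __init__(self):
        self.dirty = False
        self.restored = 0

    def patch_regs(self):
        self.dirty = True

    def restore_regs(self):
        # Mirrors the dirty-flag check: restoring twice is harmless.
        if not self.dirty:
            return
        self.dirty = False
        self.restored += 1


def run_detailed(core, backtrace):
    """Patch, run the backtrace callable, and always restore afterwards."""
    if core is not None:
        core.patch_regs()
    try:
        backtrace()
    finally:
        # Runs on success and on any exception raised by backtrace().
        if core is not None:
            core.restore_regs()


def failing_backtrace():
    raise RuntimeError("simulated failure while dumping the trace")


core = FakeCoredump()
try:
    run_detailed(core, failing_backtrace)
except RuntimeError as exc:
    print("backtrace failed:", exc)
print("dirty after failure:", core.dirty)
```

The exception still propagates to the caller; the `finally` block only guarantees that the registers are put back first.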
> - I'm not sure if gdb.events.exited (when GDB's inferior terminates) is
> the correct event to ensure cleanup. The worst case is that the
> temporary file is leaked, which is not a serious problem.
>
Hmm indeed, this callback isn't called upon signals. I guess we can
just call atexit.register(self._cleanup). This seems to handle both
normal and abnormal exit (except SIGKILL of course).
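A rough sketch of the atexit-based cleanup follows. BlobGuard and its arguments are hypothetical, and how atexit handlers behave inside GDB's embedded interpreter is an assumption that would need checking. The handler restores first and only then removes the saved blob, so the originals are never deleted while the dump is still patched:

```python
import atexit
import os
import tempfile


class BlobGuard:
    """Hypothetical holder for the saved-register blob."""

    def __init__(self, blob_path, restore):
        self.blob_path = blob_path
        self.restore = restore      # callable that puts original regs back
        # Registered once; invoked at normal interpreter termination.
        atexit.register(self.cleanup)

    def cleanup(self):
        # Restore before unlinking so the saved values outlive the patch.
        self.restore()
        if os.path.exists(self.blob_path):
            os.unlink(self.blob_path)


# Small demonstration with a temporary file standing in for the blob.
fd, path = tempfile.mkstemp(suffix=".ptregs")
os.close(fd)
guard = BlobGuard(path, restore=lambda: None)
guard.cleanup()                     # safe to call early; idempotent
print(os.path.exists(path))
```

Python's documentation notes that atexit handlers do not run when the process is killed by a signal Python does not handle or when os._exit() is called, so the saved blob remains the fallback for those cases.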
> But then this is a debugging script and it's probably fine:
>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>