qemu-devel.nongnu.org archive mirror
From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>,
	"Kevin Wolf" <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, peterx@redhat.com, stefanha@redhat.com,
	vsementsov@yandex-team.ru, den@virtuozzo.com
Subject: Re: [PATCH 4/4] scripts/qemugdb: coroutine: Add option for obtaining detailed trace in coredump
Date: Thu, 27 Nov 2025 16:31:29 +0200
Message-ID: <ef51cf63-16b1-48c4-8070-0acaf618ef3c@virtuozzo.com>
In-Reply-To: <aSghvhrBXL0xxL1a@redhat.com>

On 11/27/25 12:02 PM, Daniel P. Berrangé wrote:
> On Thu, Nov 27, 2025 at 10:56:12AM +0100, Kevin Wolf wrote:
>> Am 25.11.2025 um 15:21 hat andrey.drobyshev@virtuozzo.com geschrieben:
>>> From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
>>>
>>> Commit 772f86839f ("scripts/qemu-gdb: Support coroutine dumps in
>>> coredumps") introduced coroutine traces in coredumps using raw stack
>>> unwinding.  While this works, this approach does not allow to view the
>>> function arguments in the corresponding stack frames.
>>>
>>> As an alternative, we can obtain saved registers from the coroutine's
>>> jmpbuf, copy the original coredump file into a temporary file, patch the
>>> saved registers into the tmp coredump's struct elf_prstatus and execute
>>> another gdb subprocess to get backtrace from the patched temporary coredump.
>>>
>>> While providing more detailed info, this alternative approach, however, is
>>> quite heavyweight as it takes significantly more time and disk space.
>>> So, instead of making it a new default, let's keep raw unwind the default
>>> behaviour, but add the '--detailed' option for 'qemu bt' and 'qemu coroutine'
>>> command which would enforce the new behaviour.
>>> [...]
>>
>>> +def clone_coredump(source, target, set_regs):
>>> +    shutil.copyfile(source, target)
>>> +    write_regs_to_coredump(target, set_regs)
>>> +
>>> +def dump_backtrace_patched(regs):
>>> +    files = gdb.execute('info files', False, True).split('\n')
>>> +    executable = re.match('^Symbols from "(.*)".$', files[0]).group(1)
>>> +    dump = re.search("`(.*)'", files[2]).group(1)
>>> +
>>> +    with tempfile.NamedTemporaryFile(dir='/tmp', delete=False) as f:
>>> +        tmpcore = f.name
>>> +
>>> +    clone_coredump(dump, tmpcore, regs)
>>
>> I think this is what makes it so heavy, right? Coredumps can be quite
>> large and /tmp is probably a different filesystem, so you end up really
>> copying the full size of the coredump around.
> 
> On my system /tmp is tmpfs, so this is actually bringing the whole
> coredump into RAM, which is not a sensible approach.
> 
>> Wouldn't it be better in the general case if we could just do a reflink
>> copy of the coredump and then do only very few writes for updating the
>> register values? Then the overhead should actually be quite negligible
>> both in terms of time and disk space.
> 

That's correct, copying the file to /tmp takes most of the time with
this approach.

As for a reflink copy, it could have been a great solution.  However, it
would depend heavily on the FS used.  E.g. on my system coredumpctl
places the uncompressed coredump in /var/tmp, which is mounted as ext4.
And in this case:

# cp --reflink /var/tmp/coredump-MQCZQc /root
cp: failed to clone '/root/coredump-MQCZQc' from
'/var/tmp/coredump-MQCZQc': Invalid cross-device link

# cp --reflink /var/tmp/coredump-MQCZQc /var/tmp/coredump.ref
cp: failed to clone '/var/tmp/coredump.ref' from
'/var/tmp/coredump-MQCZQc': Operation not supported

Apparently, ext4 doesn't support reflink copies, while xfs and btrfs do.
But I guess our implementation had better be FS-agnostic.

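One way to stay FS-agnostic is to attempt a reflink and fall back to a plain copy when it fails.  The following is only a sketch under stated assumptions, not part of the patch: the helper name clone_or_copy is hypothetical, the code is Linux-only, and the FICLONE value shown is the one produced by the generic ioctl encoding (as on x86-64 and arm64).  Any ioctl failure is treated as "reflink unavailable", which covers both errors shown above (cross-device link, operation not supported).

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int) with the generic
# ioctl encoding.  Assumption: other encodings (e.g. powerpc, mips) differ.
FICLONE = 0x40049409


def clone_or_copy(source, target):
    """Copy source to target, sharing extents when the FS supports it.

    Returns True if a reflink was created, False if a full copy was made.
    """
    try:
        with open(source, 'rb') as src, open(target, 'wb') as dst:
            # Ask the kernel to make dst share all of src's data blocks.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        return True
    except OSError:
        # E.g. EXDEV (different filesystems) or EOPNOTSUPP (ext4):
        # treat any failure as "no reflink here" and copy the data.
        pass
    shutil.copyfile(source, target)
    return False
```

With such a helper the temporary file would have to be created next to the dump rather than in /tmp for the reflink to have a chance of succeeding; on ext4 the cost would remain that of a full copy.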
> Personally I'd be fine with just modifying the core dump in place
> most of the time. I don't need to keep the current file untouched,
> as it is just a temporary download acquired from systemd's
> coredumpctl, or from a bug tracker.

Hmm, that's an interesting proposal.  But I still see some potential
pitfalls with it:

1. When dealing with a core dump stored by coredumpctl, the original file
is indeed stored compressed and is not modified.  We don't really
care about the uncompressed temporary dump placed in /var/tmp.  What we
do care about is that the current GDB session keeps working smoothly.  I
tried patching the dump in place without copying, and it doesn't seem to
break subsequent commands.  However, GDB keeps the temporary dump open
throughout the whole session, which means it may at some point read
modified data from it.  I'm not sure we have a solid guarantee that
things will keep working with the patched dump.

2. If we're dealing with an external core dump downloaded from a bug
report, we surely want to be able to create new GDB sessions with it.
That means we'll want its unmodified version.  Having to download it
again is even slower than plain copying.

The solution to both problems would be to save the original registers and
patch them back into the core dump once we've obtained our coroutine
trace.  It's still potentially fragile in the 2nd case if the GDB process
abruptly gets killed or dies, leaving the registers unrestored.  But I guess
we can live with it?
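The save-and-restore idea could be sketched roughly as below.  This is an illustration only: patched_bytes is a hypothetical helper, and it assumes the caller already knows the file offset of the saved registers inside the dump's struct elf_prstatus and has them packed as bytes.

```python
import contextlib
import os


@contextlib.contextmanager
def patched_bytes(path, offset, new_bytes):
    """Temporarily overwrite len(new_bytes) bytes at offset in path."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Remember the original bytes so they can be written back later.
        saved = os.pread(fd, len(new_bytes), offset)
        if len(saved) != len(new_bytes):
            raise ValueError('patch range extends past end of file')
        os.pwrite(fd, new_bytes, offset)
        try:
            yield saved
        finally:
            # Runs on normal exit and on Python exceptions, but not if
            # the process is killed outright (e.g. by SIGKILL).
            os.pwrite(fd, saved, offset)
    finally:
        os.close(fd)
```

The body of the "with" block would be the place to run the second gdb process against the patched dump.  The finally clause narrows the window in which the dump is left modified, but it cannot close it: a killed process still leaves the patched registers in the file, which is the fragility mentioned above.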

What do you think?


> With regards,
> Daniel


