From: Steven Sistare <steven.sistare@oracle.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
qemu-devel@nongnu.org, "Michael Roth" <michael.roth@amd.com>,
"Dr. David Alan Gilbert" <dave@treblig.org>
Subject: Re: [PATCH] monitor: flush messages on abort
Date: Thu, 16 Nov 2023 12:03:28 -0500 [thread overview]
Message-ID: <45bfd9b2-09dd-471f-9c54-b7f746bf0baf@oracle.com> (raw)
In-Reply-To: <8734x5ennk.fsf@pond.sub.org>
On 11/16/2023 10:14 AM, Markus Armbruster wrote:
> Steven Sistare <steven.sistare@oracle.com> writes:
>> On 11/15/2023 11:15 AM, Markus Armbruster wrote:
>>> Steven Sistare <steven.sistare@oracle.com> writes:
>>>> On 11/15/2023 3:41 AM, Markus Armbruster wrote:
>>>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>>>
>>>>>> On Fri, Nov 03, 2023 at 03:51:00PM -0400, Steven Sistare wrote:
>>>>>>> On 11/3/2023 1:33 PM, Daniel P. Berrangé wrote:
>>>>>>>> On Fri, Nov 03, 2023 at 09:01:29AM -0700, Steve Sistare wrote:
>>>>>>>>> Buffered monitor output is lost when abort() is called. The pattern
>>>>>>>>> error_report() followed by abort() occurs about 60 times, so valuable
>>>>>>>>> information is being lost when the abort is called in the context of a
>>>>>>>>> monitor command.
>>>>>>>>
>>>>>>>> I'm curious, was there a particular abort() scenario that you hit ?
>>>>>>>
>>>>>>> Yes, while tweaking the suspended state, and forgetting to add transitions:
>>>>>>>
>>>>>>> error_report("invalid runstate transition: '%s' -> '%s'",
>>>>>>> abort();
>>>>>>>
>>>>>>> But I have previously hit this for other errors.
>>>>>
>>>>> Can you provide a reproducer?
>>>>
>>>> I sometimes hit this when developing new code. I do not have a reproducer for upstream
>>>> branches. The patch is aimed at helping developers, not users.
>>>
>>> I'm asking because I can't see how the error message could be lost. A
>>> reproducer would let me find out. "Apply this set of broken patches,
>>> then do that" would serve.
>>
>> $ patch -p1 << EOF
>> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
>> index b0f948d..c9a3aee 100644
>> --- a/monitor/qmp-cmds.c
>> +++ b/monitor/qmp-cmds.c
>> @@ -47,8 +47,12 @@ void qmp_quit(Error **errp)
>> qemu_system_shutdown_request(SHUTDOWN_CAUSE_HOST_QMP_QUIT);
>> }
>>
>> +#include "qemu/error-report.h"
>> +
>> void qmp_stop(Error **errp)
>> {
>> + error_report("injected failure");
>> + abort();
>> /* if there is a dump in background, we should wait until the dump
>> * finished */
>> if (qemu_system_dump_in_progress()) {
>> EOF
>>
>> # This example loses the error message:
>>
>> $ args='-display none -chardev socket,id=mon1,server=on,path=mon1.sock,wait=off -mon mon1,mode=control'
>> $ qemu-system-x86_64 $args < /dev/null &
>> [1] 18048
>> $ echo '{"execute":"qmp_capabilities"} {"execute":"human-monitor-command","arguments":{"command-line":"stop"}}' | ncat -U mon1.sock
>> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 8}, "package": "v8.1.0-2976-g4025fde-dirty"}, "capabilities": ["oob"]}}
>> {"return": {}}
>> Ncat: Connection reset by peer.
>> $
>> [1]+ Aborted qemu-system-x86_64 $args < /dev/null
>>
>>
>> # This example preserves the error message. I include it to show the ncat-based test is valid.
>>
>> $ qemu-system-x86_64 $args < /dev/null &
>> [1] 18060
>> $ echo '{"execute":"qmp_capabilities"} {"execute":"stop"}' | ncat -U mon1.sock
>> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 8}, "package": "v8.1.0-2976-g4025fde-dirty"}, "capabilities": ["oob"]}}
>> {"return": {}}
>> injected failure <============= qemu stderr
>> Ncat: Connection reset by peer.
>> $
>> [1]+ Aborted qemu-system-x86_64 $args < /dev/null
>>
>> - Steve
>
> Reproduced, thanks!
>
> Confirms my reading of the code. Two cases:
>
> 1. Normal monitor output
>
> Flushed on newline. A partial last line can be lost on crash or
> exit().
>
> Works as intended.
>
> 2. Output being captured for QMP command human-monitor-command
>
> Never flushed. Instead, the entire buffer is included in the
> command's success response. Naturally there is no response on crash.
>
> Works as intended.
>
> Here's how to fish unflushed output out of a core dump:
>
> (gdb) bt
> #0 0x00007efeba52fecc in __pthread_kill_implementation () at /lib64/libc.so.6
> #1 0x00007efeba4dfab6 in raise () at /lib64/libc.so.6
> #2 0x00007efeba4c97fc in abort () at /lib64/libc.so.6
> #3 0x000055588dcc848c in qmp_stop (errp=0x0) at ../monitor/qmp-cmds.c:53
> #4 0x000055588dcc36ab in hmp_stop (mon=0x7ffc01d47cc0, qdict=0x555891d19200)
> at ../monitor/hmp-cmds.c:119
> #5 0x000055588dcc714e in handle_hmp_command_exec
> (mon=0x7ffc01d47cc0, cmd=0x55588f0689d0 <hmp_cmds+7760>, qdict=0x555891d19200) at ../monitor/hmp.c:1106
> #6 0x000055588dcc737b in handle_hmp_command
> (mon=0x7ffc01d47cc0, cmdline=0x7efea80036f4 "") at ../monitor/hmp.c:1158
> #7 0x000055588dcc8922 in qmp_human_monitor_command
> (command_line=0x7efea80036f0 "stop", has_cpu_index=false, cpu_index=0, errp=0x7ffc01d47dd0) at ../monitor/qmp-cmds.c:184
>
> [...]
>
> (gdb) up 4
> #4 0x000055588dcc36ab in hmp_stop (mon=0x7ffc01d47cc0, qdict=0x555891d19200)
> at ../monitor/hmp-cmds.c:119
> 119 qmp_stop(NULL);
> (gdb) p mon->outbuf
> $1 = (GString *) 0x555890a65260
> (gdb) p *$
> $2 = {str = 0x7efea8002c00 "injected failure\r\n", len = 18,
> allocated_len = 128}
Yes. thanks. With the patch, the computer does the work for us, and sometimes the message
is definitive enough to fix without further investigation, eg
invalid runstate transition: 'suspended -> 'save-vm'
But, it's no big deal, and I withdraw the patch request. Thanks for looking at it.
- Steve
next prev parent reply other threads:[~2023-11-16 17:04 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-03 16:01 [PATCH] monitor: flush messages on abort Steve Sistare
2023-11-03 17:33 ` Daniel P. Berrangé
2023-11-03 19:51 ` Steven Sistare
2023-11-03 20:28 ` Steven Sistare
2023-11-06 10:10 ` Daniel P. Berrangé
2023-11-15 8:41 ` Markus Armbruster
2023-11-15 15:52 ` Steven Sistare
2023-11-15 16:15 ` Markus Armbruster
2023-11-15 17:41 ` Steven Sistare
2023-11-16 15:14 ` Markus Armbruster
2023-11-16 17:03 ` Steven Sistare [this message]
2023-11-15 15:30 ` Steven Sistare
2023-11-15 16:05 ` Markus Armbruster
2023-11-16 11:21 ` Daniel P. Berrangé
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45bfd9b2-09dd-471f-9c54-b7f746bf0baf@oracle.com \
--to=steven.sistare@oracle.com \
--cc=armbru@redhat.com \
--cc=berrange@redhat.com \
--cc=dave@treblig.org \
--cc=michael.roth@amd.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).