qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Hanna Reitz <hreitz@redhat.com>
To: John Snow <jsnow@redhat.com>, qemu-devel <qemu-devel@nongnu.org>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	qemu-block@nongnu.org, Cleber Rosa <crosa@redhat.com>,
	Eric Blake <eblake@redhat.com>
Subject: Re: [PATCH v2 12/17] python/machine: Handle QMP errors on close more meticulously
Date: Tue, 12 Oct 2021 17:56:59 +0200	[thread overview]
Message-ID: <f1b4ddcf-7b0c-eada-1132-da1b0cbe99b0@redhat.com> (raw)
In-Reply-To: <CAFn=p-YwL+v7_rvC40z_T3DiKhH0cdgu53KTghwTjmixakHC+Q@mail.gmail.com>

On 07.10.21 18:52, John Snow wrote:
>
>
> On Wed, Sep 22, 2021 at 8:50 PM John Snow <jsnow@redhat.com> wrote:
>
>     To use the AQMP backend, Machine just needs to be a little more
>     diligent
>     about what happens when closing a QMP connection. The operation is no
>     longer a freebie in the async world; it may return errors
>     encountered in
>     the async bottom half on incoming message receipt, etc.
>
>     (AQMP's disconnect, ultimately, serves as the quiescence point
>     where all
>     async contexts are gathered together, and any final errors reported at
>     that point.)
>
>     Because async QMP continues to check for messages asynchronously, it's
>     almost certainly likely that the loop will have exited due to EOF
>     after
>     issuing the last 'quit' command. That error will ultimately be bubbled
>     up when attempting to close the QMP connection. The manager class here
>     then is free to discard it -- if it was expected.
>
>     Signed-off-by: John Snow <jsnow@redhat.com>
>
>     ---
>
>     Yes, I regret that this class has become quite a dumping ground for
>     complexity around the exit path. It's in need of a refactor to help
>     separate out the exception handling and cleanup mechanisms from the
>     VM-related stuff, but it's not a priority to do that just yet -- but
>     it's on the list.
>
>     Signed-off-by: John Snow <jsnow@redhat.com>
>     ---
>      python/qemu/machine/machine.py | 48
>     +++++++++++++++++++++++++++++-----
>      1 file changed, 42 insertions(+), 6 deletions(-)
>
>     diff --git a/python/qemu/machine/machine.py
>     b/python/qemu/machine/machine.py
>     index 0bd40bc2f76..c33a78a2d9f 100644
>     --- a/python/qemu/machine/machine.py
>     +++ b/python/qemu/machine/machine.py
>     @@ -342,9 +342,15 @@ def _post_shutdown(self) -> None:
>              # Comprehensive reset for the failed launch case:
>              self._early_cleanup()
>
>     -        if self._qmp_connection:
>     -            self._qmp.close()
>     -            self._qmp_connection = None
>     +        try:
>     +            self._close_qmp_connection()
>     +        except Exception as err:  # pylint: disable=broad-except
>     +            LOG.warning(
>     +                "Exception closing QMP connection: %s",
>     +                str(err) if str(err) else type(err).__name__
>     +            )
>     +        finally:
>     +            assert self._qmp_connection is None
>
>              self._close_qemu_log_file()
>
>     @@ -420,6 +426,31 @@ def _launch(self) -> None:
>                                             close_fds=False)
>              self._post_launch()
>
>     +    def _close_qmp_connection(self) -> None:
>     +        """
>     +        Close the underlying QMP connection, if any.
>     +
>     +        Dutifully report errors that occurred while closing, but
>     assume
>     +        that any error encountered indicates an abnormal termination
>     +        process and not a failure to close.
>     +        """
>     +        if self._qmp_connection is None:
>     +            return
>     +
>     +        try:
>     +            self._qmp.close()
>     +        except EOFError:
>     +            # EOF can occur as an Exception here when using the Async
>     +            # QMP backend. It indicates that the server closed the
>     +            # stream. If we successfully issued 'quit' at any point,
>     +            # then this was expected. If the remote went away without
>     +            # our permission, it's worth reporting that as an
>     abnormal
>     +            # shutdown case.
>     +            if not self._quit_issued:
>
>
> I strongly suspect that this line needs to actually be "if not 
> (self._user_killed or self._quit_issued):" to also handle the 
> force-kill cases.

Sounds right.

(Other than that, the patch looks good to me.)

> I wrote a different sync compatibility layer in a yet-to-be-published 
> branch and noticed this code still allows EOFError through. Why it did 
> not seem to come up in *this* series I still don't know -- I think the 
> timing of when exactly the coroutines get scheduled can be finnicky 
> depending on what else is running. So, sometimes, we manage to close 
> the loop before we get EOF and sometimes we don't.
>
> I am mulling over adding a "tolerate_eof: bool = False" argument to 
> disconnect() to allow the caller to handle this stuff with a little 
> less boilerplate. Anyone have strong feelings on that?

FWIW I don’t.

> (Ultimately, because there's no canonical "goodbye" in-band message, 
> the QMP layer itself can never really know if EOF was expected or not 
> -- that's really up to whomever is managing the connection, but it 
> does add a layer of complexity around the exit path that I am starting 
> to find is a nuisance. Not to mention that there are likely many 
> possible cases in which we might expect the remote to disappear on us 
> that don't involve using QMP at all -- kill is one, using the guest 
> agent to coerce the guest into shutdown is another. Networking and 
> migration are other theoretical(?) avenues for intended disconnects.)

For the iotests we don’t have a guest, so I suppose this doesn’t concern 
me, does it? :)

Hanna



  reply	other threads:[~2021-10-12 15:58 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-23  0:49 [PATCH v2 00/17] Switch iotests to using Async QMP John Snow
2021-09-23  0:49 ` [PATCH v2 01/17] python/aqmp: add greeting property to QMPClient John Snow
2021-09-23  0:49 ` [PATCH v2 02/17] python/aqmp: add .empty() method to EventListener John Snow
2021-09-23  0:49 ` [PATCH v2 03/17] python/aqmp: Return cleared events from EventListener.clear() John Snow
2021-09-23  0:49 ` [PATCH v2 04/17] python/aqmp: add send_fd_scm John Snow
2021-10-07 14:52   ` Eric Blake
2021-10-07 16:27     ` John Snow
2021-10-07 21:46       ` Eric Blake
2021-09-23  0:49 ` [PATCH v2 05/17] python/aqmp: Add dict conversion method to Greeting object John Snow
2021-10-07 14:53   ` Eric Blake
2021-09-23  0:49 ` [PATCH v2 06/17] python/aqmp: Reduce severity of EOFError-caused loop terminations John Snow
2021-10-07 14:55   ` Eric Blake
2021-09-23  0:49 ` [PATCH v2 07/17] python/aqmp: Disable logging messages by default John Snow
2021-10-07 15:02   ` Eric Blake
2021-09-23  0:49 ` [PATCH v2 08/17] python/qmp: clear events on get_events() call John Snow
2021-09-23  0:49 ` [PATCH v2 09/17] python/qmp: add send_fd_scm directly to QEMUMonitorProtocol John Snow
2021-09-23  0:49 ` [PATCH v2 10/17] python, iotests: remove socket_scm_helper John Snow
2021-09-23  0:49 ` [PATCH v2 11/17] python/machine: remove has_quit argument John Snow
2021-10-12 15:30   ` Hanna Reitz
2021-09-23  0:49 ` [PATCH v2 12/17] python/machine: Handle QMP errors on close more meticulously John Snow
2021-10-07 15:07   ` Eric Blake
2021-10-07 16:34     ` John Snow
2021-10-07 16:52   ` John Snow
2021-10-12 15:56     ` Hanna Reitz [this message]
2021-09-23  0:49 ` [PATCH v2 13/17] iotests: Accommodate async QMP Exception classes John Snow
2021-10-12 16:06   ` Hanna Reitz
2021-10-12 16:38     ` John Snow
2021-09-23  0:49 ` [PATCH v2 14/17] iotests: Conditionally silence certain AQMP errors John Snow
2021-09-23  0:49 ` [PATCH v2 15/17] python/aqmp: Create sync QMP wrapper for iotests John Snow
2021-10-06 10:13   ` Paolo Bonzini
2021-10-06 14:24     ` John Snow
2021-10-06 14:32       ` Paolo Bonzini
2021-10-06 15:12         ` John Snow
2021-09-23  0:49 ` [PATCH v2 16/17] python/aqmp: Remove scary message John Snow
2021-09-23  0:49 ` [PATCH v2 17/17] python, iotests: replace qmp with aqmp John Snow
2021-10-06 10:14 ` [PATCH v2 00/17] Switch iotests to using Async QMP Paolo Bonzini
2021-10-06 15:01   ` John Snow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f1b4ddcf-7b0c-eada-1132-da1b0cbe99b0@redhat.com \
    --to=hreitz@redhat.com \
    --cc=crosa@redhat.com \
    --cc=eblake@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).