* [PATCH 0/2] python: misc machine.py improvements @ 2022-10-24 19:53 John Snow 2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow 2022-10-24 19:53 ` [PATCH 2/2] python/machine: Handle termination cases without QMP John Snow 0 siblings, 2 replies; 5+ messages in thread From: John Snow @ 2022-10-24 19:53 UTC (permalink / raw) To: qemu-devel; +Cc: John Snow, Beraldo Leal, Cleber Rosa Improve machine.py logging and fix a shutdown bug that we *probably* weren't actually hitting anywhere. Changes for console input logging are on the way, but separate because it will touch avocado. More comprehensive fixes for multiple socket polling are also on the way, but decidedly separate. This is the simple stuff that might make it easier to diagnose failures in the meantime. John Snow (2): python/machine: Add debug logging to key state changes python/machine: Handle termination cases without QMP python/qemu/machine/machine.py | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) -- 2.37.3 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] python/machine: Add debug logging to key state changes 2022-10-24 19:53 [PATCH 0/2] python: misc machine.py improvements John Snow @ 2022-10-24 19:53 ` John Snow 2022-10-25 8:24 ` Daniel P. Berrangé 2022-10-24 19:53 ` [PATCH 2/2] python/machine: Handle termination cases without QMP John Snow 1 sibling, 1 reply; 5+ messages in thread From: John Snow @ 2022-10-24 19:53 UTC (permalink / raw) To: qemu-devel; +Cc: John Snow, Beraldo Leal, Cleber Rosa When key decisions are made about the lifetime of the VM process being managed, there's no log entry. Juxtaposed with the very verbose runstate change logging of the QMP module, machine seems a bit too introverted now. Season the machine.py module with logging statements to taste to help make a tastier soup. Signed-off-by: John Snow <jsnow@redhat.com> --- python/qemu/machine/machine.py | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py index 37191f433b2..c467f951d5d 100644 --- a/python/qemu/machine/machine.py +++ b/python/qemu/machine/machine.py @@ -373,6 +373,7 @@ def _post_shutdown(self) -> None: Called to cleanup the VM instance after the process has exited. May also be called after a failed launch. """ + LOG.debug("Cleaning up after VM process") try: self._close_qmp_connection() except Exception as err: # pylint: disable=broad-except @@ -497,6 +498,7 @@ def _early_cleanup(self) -> None: # for QEMU to exit, while QEMU is waiting for the socket to # become writable. if self._console_socket is not None: + LOG.debug("Closing console socket") self._console_socket.close() self._console_socket = None @@ -507,6 +509,7 @@ def _hard_shutdown(self) -> None: :raise subprocess.Timeout: When timeout is exceeds 60 seconds waiting for the QEMU process to terminate. """ + LOG.debug("Performing hard shutdown") self._early_cleanup() self._subp.kill() self._subp.wait(timeout=60) @@ -523,6 +526,13 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None: :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for the QEMU process to terminate. """ + LOG.debug("Attempting graceful termination") + if self._quit_issued: + LOG.debug( + "Anticipating QEMU termination due to prior 'quit' command, " + "or explicit call to wait()" + ) + self._early_cleanup() if self._qmp_connection: @@ -553,6 +563,10 @@ def _do_shutdown(self, timeout: Optional[int]) -> None: try: self._soft_shutdown(timeout) except Exception as exc: + if isinstance(exc, subprocess.TimeoutExpired): + LOG.debug("Timed out waiting for QEMU process to exit") + LOG.debug("Graceful shutdown failed, " + "falling back to hard shutdown") self._hard_shutdown() raise AbnormalShutdown("Could not perform graceful shutdown") \ from exc @@ -575,6 +589,10 @@ def shutdown(self, if not self._launched: return + LOG.debug("Shutting down VM appliance; timeout=%s", timeout) + if hard: + LOG.debug("Caller requests immediate termination of QEMU process.") + try: if hard: self._user_killed = True -- 2.37.3 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] python/machine: Add debug logging to key state changes 2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow @ 2022-10-25 8:24 ` Daniel P. Berrangé 2022-10-25 21:48 ` John Snow 0 siblings, 1 reply; 5+ messages in thread From: Daniel P. Berrangé @ 2022-10-25 8:24 UTC (permalink / raw) To: John Snow; +Cc: qemu-devel, Beraldo Leal, Cleber Rosa On Mon, Oct 24, 2022 at 03:53:54PM -0400, John Snow wrote: > When key decisions are made about the lifetime of the VM process being > managed, there's no log entry. Juxtaposed with the very verbose runstate > change logging of the QMP module, machine seems a bit too introverted > now. > > Season the machine.py module with logging statements to taste to help > make a tastier soup. > > Signed-off-by: John Snow <jsnow@redhat.com> > --- > python/qemu/machine/machine.py | 18 ++++++++++++++++++ > 1 file changed, 18 insertions(+) > > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py > index 37191f433b2..c467f951d5d 100644 > --- a/python/qemu/machine/machine.py > +++ b/python/qemu/machine/machine.py > @@ -373,6 +373,7 @@ def _post_shutdown(self) -> None: > Called to cleanup the VM instance after the process has exited. > May also be called after a failed launch. > """ > + LOG.debug("Cleaning up after VM process") > try: > self._close_qmp_connection() > except Exception as err: # pylint: disable=broad-except > @@ -497,6 +498,7 @@ def _early_cleanup(self) -> None: > # for QEMU to exit, while QEMU is waiting for the socket to > # become writable. > if self._console_socket is not None: > + LOG.debug("Closing console socket") > self._console_socket.close() > self._console_socket = None > > @@ -507,6 +509,7 @@ def _hard_shutdown(self) -> None: > :raise subprocess.Timeout: When timeout is exceeds 60 seconds > waiting for the QEMU process to terminate. > """ > + LOG.debug("Performing hard shutdown") > self._early_cleanup() > self._subp.kill() > self._subp.wait(timeout=60) > @@ -523,6 +526,13 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None: > :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for > the QEMU process to terminate. > """ > + LOG.debug("Attempting graceful termination") > + if self._quit_issued: > + LOG.debug( > + "Anticipating QEMU termination due to prior 'quit' command, " > + "or explicit call to wait()" > + ) > + > self._early_cleanup() > > if self._qmp_connection: > @@ -553,6 +563,10 @@ def _do_shutdown(self, timeout: Optional[int]) -> None: > try: > self._soft_shutdown(timeout) > except Exception as exc: > + if isinstance(exc, subprocess.TimeoutExpired): > + LOG.debug("Timed out waiting for QEMU process to exit") > + LOG.debug("Graceful shutdown failed, " > + "falling back to hard shutdown") If you add 'exc_info=True' when logging from inside an 'except' clause, it captures the stack trace which is often critical for debugging unexpected exceptions, especially when you're catchintg the top level 'Exception' class instead of a very specialized class. > self._hard_shutdown() > raise AbnormalShutdown("Could not perform graceful shutdown") \ > from exc > @@ -575,6 +589,10 @@ def shutdown(self, > if not self._launched: > return > > + LOG.debug("Shutting down VM appliance; timeout=%s", timeout) > + if hard: > + LOG.debug("Caller requests immediate termination of QEMU process.") > + > try: > if hard: > self._user_killed = True > -- > 2.37.3 > > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] python/machine: Add debug logging to key state changes 2022-10-25 8:24 ` Daniel P. Berrangé @ 2022-10-25 21:48 ` John Snow 0 siblings, 0 replies; 5+ messages in thread From: John Snow @ 2022-10-25 21:48 UTC (permalink / raw) To: Daniel P. Berrangé; +Cc: qemu-devel, Beraldo Leal, Cleber Rosa On Tue, Oct 25, 2022 at 4:24 AM Daniel P. Berrangé <berrange@redhat.com> wrote: > > On Mon, Oct 24, 2022 at 03:53:54PM -0400, John Snow wrote: > > When key decisions are made about the lifetime of the VM process being > > managed, there's no log entry. Juxtaposed with the very verbose runstate > > change logging of the QMP module, machine seems a bit too introverted > > now. > > > > Season the machine.py module with logging statements to taste to help > > make a tastier soup. > > > > Signed-off-by: John Snow <jsnow@redhat.com> > > --- > > python/qemu/machine/machine.py | 18 ++++++++++++++++++ > > 1 file changed, 18 insertions(+) > > > > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py > > index 37191f433b2..c467f951d5d 100644 > > --- a/python/qemu/machine/machine.py > > +++ b/python/qemu/machine/machine.py > > @@ -373,6 +373,7 @@ def _post_shutdown(self) -> None: > > Called to cleanup the VM instance after the process has exited. > > May also be called after a failed launch. > > """ > > + LOG.debug("Cleaning up after VM process") > > try: > > self._close_qmp_connection() > > except Exception as err: # pylint: disable=broad-except > > @@ -497,6 +498,7 @@ def _early_cleanup(self) -> None: > > # for QEMU to exit, while QEMU is waiting for the socket to > > # become writable. > > if self._console_socket is not None: > > + LOG.debug("Closing console socket") > > self._console_socket.close() > > self._console_socket = None > > > > @@ -507,6 +509,7 @@ def _hard_shutdown(self) -> None: > > :raise subprocess.Timeout: When timeout is exceeds 60 seconds > > waiting for the QEMU process to terminate. > > """ > > + LOG.debug("Performing hard shutdown") > > self._early_cleanup() > > self._subp.kill() > > self._subp.wait(timeout=60) > > @@ -523,6 +526,13 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None: > > :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for > > the QEMU process to terminate. > > """ > > + LOG.debug("Attempting graceful termination") > > + if self._quit_issued: > > + LOG.debug( > > + "Anticipating QEMU termination due to prior 'quit' command, " > > + "or explicit call to wait()" > > + ) > > + > > self._early_cleanup() > > > > if self._qmp_connection: > > @@ -553,6 +563,10 @@ def _do_shutdown(self, timeout: Optional[int]) -> None: > > try: > > self._soft_shutdown(timeout) > > except Exception as exc: > > + if isinstance(exc, subprocess.TimeoutExpired): > > + LOG.debug("Timed out waiting for QEMU process to exit") > > + LOG.debug("Graceful shutdown failed, " > > + "falling back to hard shutdown") > > If you add 'exc_info=True' when logging from inside an 'except' > clause, it captures the stack trace which is often critical for > debugging unexpected exceptions, especially when you're catchintg > the top level 'Exception' class instead of a very specialized > class. > Sure. If the exception goes unhandled, ultimately, we'll see that stack trace twice. On the other hand, if someone handles this trace and you still want to see the exception in the debug log somewhere, it's probably a fair trade-off. I'll touch it up while we're here. --js ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/2] python/machine: Handle termination cases without QMP 2022-10-24 19:53 [PATCH 0/2] python: misc machine.py improvements John Snow 2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow @ 2022-10-24 19:53 ` John Snow 1 sibling, 0 replies; 5+ messages in thread From: John Snow @ 2022-10-24 19:53 UTC (permalink / raw) To: qemu-devel; +Cc: John Snow, Beraldo Leal, Cleber Rosa If we request a shutdown of a VM without a QMP console, we'll just hang waiting. Not ideal. Add in code that attempts graceful termination in these cases. Tested lightly; it appears to work and I doubt we rely on this case anywhere, but it's a corner you're allowed to wedge yourself in, so it should be handled. Signed-off-by: John Snow <jsnow@redhat.com> --- python/qemu/machine/machine.py | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py index c467f951d5d..e1ba5d01b86 100644 --- a/python/qemu/machine/machine.py +++ b/python/qemu/machine/machine.py @@ -544,6 +544,12 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None: finally: # Regardless, we want to quiesce the connection. self._close_qmp_connection() + elif not self._quit_issued: + LOG.debug( + "Not anticipating QEMU quit and no QMP connection present, " + "issuing SIGTERM" + ) + self._subp.terminate() # May raise subprocess.TimeoutExpired self._subp.wait(timeout=timeout) -- 2.37.3 ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-10-25 21:50 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-10-24 19:53 [PATCH 0/2] python: misc machine.py improvements John Snow 2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow 2022-10-25 8:24 ` Daniel P. Berrangé 2022-10-25 21:48 ` John Snow 2022-10-24 19:53 ` [PATCH 2/2] python/machine: Handle termination cases without QMP John Snow
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).