qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] python: misc machine.py improvements
@ 2022-10-24 19:53 John Snow
  2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow
  2022-10-24 19:53 ` [PATCH 2/2] python/machine: Handle termination cases without QMP John Snow
  0 siblings, 2 replies; 5+ messages in thread
From: John Snow @ 2022-10-24 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: John Snow, Beraldo Leal, Cleber Rosa

Improve machine.py logging and fix a shutdown bug that we *probably*
weren't actually hitting anywhere.

Changes for console input logging are on the way, but separate because
it will touch avocado.

More comprehensive fixes for multiple socket polling are also on the
way, but decidedly separate. This is the simple stuff that might make it
easier to diagnose failures in the meantime.

John Snow (2):
  python/machine: Add debug logging to key state changes
  python/machine: Handle termination cases without QMP

 python/qemu/machine/machine.py | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

-- 
2.37.3




^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] python/machine: Add debug logging to key state changes
  2022-10-24 19:53 [PATCH 0/2] python: misc machine.py improvements John Snow
@ 2022-10-24 19:53 ` John Snow
  2022-10-25  8:24   ` Daniel P. Berrangé
  2022-10-24 19:53 ` [PATCH 2/2] python/machine: Handle termination cases without QMP John Snow
  1 sibling, 1 reply; 5+ messages in thread
From: John Snow @ 2022-10-24 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: John Snow, Beraldo Leal, Cleber Rosa

When key decisions are made about the lifetime of the VM process being
managed, there's no log entry. Juxtaposed with the very verbose runstate
change logging of the QMP module, machine seems a bit too introverted
now.

Season the machine.py module with logging statements to taste to help
make a tastier soup.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 python/qemu/machine/machine.py | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 37191f433b2..c467f951d5d 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -373,6 +373,7 @@ def _post_shutdown(self) -> None:
         Called to cleanup the VM instance after the process has exited.
         May also be called after a failed launch.
         """
+        LOG.debug("Cleaning up after VM process")
         try:
             self._close_qmp_connection()
         except Exception as err:  # pylint: disable=broad-except
@@ -497,6 +498,7 @@ def _early_cleanup(self) -> None:
         # for QEMU to exit, while QEMU is waiting for the socket to
         # become writable.
         if self._console_socket is not None:
+            LOG.debug("Closing console socket")
             self._console_socket.close()
             self._console_socket = None
 
@@ -507,6 +509,7 @@ def _hard_shutdown(self) -> None:
         :raise subprocess.Timeout: When timeout is exceeds 60 seconds
             waiting for the QEMU process to terminate.
         """
+        LOG.debug("Performing hard shutdown")
         self._early_cleanup()
         self._subp.kill()
         self._subp.wait(timeout=60)
@@ -523,6 +526,13 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
         :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for
             the QEMU process to terminate.
         """
+        LOG.debug("Attempting graceful termination")
+        if self._quit_issued:
+            LOG.debug(
+                "Anticipating QEMU termination due to prior 'quit' command, "
+                "or explicit call to wait()"
+            )
+
         self._early_cleanup()
 
         if self._qmp_connection:
@@ -553,6 +563,10 @@ def _do_shutdown(self, timeout: Optional[int]) -> None:
         try:
             self._soft_shutdown(timeout)
         except Exception as exc:
+            if isinstance(exc, subprocess.TimeoutExpired):
+                LOG.debug("Timed out waiting for QEMU process to exit")
+            LOG.debug("Graceful shutdown failed, "
+                      "falling back to hard shutdown")
             self._hard_shutdown()
             raise AbnormalShutdown("Could not perform graceful shutdown") \
                 from exc
@@ -575,6 +589,10 @@ def shutdown(self,
         if not self._launched:
             return
 
+        LOG.debug("Shutting down VM appliance; timeout=%s", timeout)
+        if hard:
+            LOG.debug("Caller requests immediate termination of QEMU process.")
+
         try:
             if hard:
                 self._user_killed = True
-- 
2.37.3



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] python/machine: Handle termination cases without QMP
  2022-10-24 19:53 [PATCH 0/2] python: misc machine.py improvements John Snow
  2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow
@ 2022-10-24 19:53 ` John Snow
  1 sibling, 0 replies; 5+ messages in thread
From: John Snow @ 2022-10-24 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: John Snow, Beraldo Leal, Cleber Rosa

If we request a shutdown of a VM without a QMP console, we'll just hang
waiting. Not ideal.

Add in code that attempts graceful termination in these cases.  Tested
lightly; it appears to work and I doubt we rely on this case anywhere,
but it's a corner you're allowed to wedge yourself in, so it should be
handled.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 python/qemu/machine/machine.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index c467f951d5d..e1ba5d01b86 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -544,6 +544,12 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
             finally:
                 # Regardless, we want to quiesce the connection.
                 self._close_qmp_connection()
+        elif not self._quit_issued:
+            LOG.debug(
+                "Not anticipating QEMU quit and no QMP connection present, "
+                "issuing SIGTERM"
+            )
+            self._subp.terminate()
 
         # May raise subprocess.TimeoutExpired
         self._subp.wait(timeout=timeout)
-- 
2.37.3



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] python/machine: Add debug logging to key state changes
  2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow
@ 2022-10-25  8:24   ` Daniel P. Berrangé
  2022-10-25 21:48     ` John Snow
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel P. Berrangé @ 2022-10-25  8:24 UTC (permalink / raw)
  To: John Snow; +Cc: qemu-devel, Beraldo Leal, Cleber Rosa

On Mon, Oct 24, 2022 at 03:53:54PM -0400, John Snow wrote:
> When key decisions are made about the lifetime of the VM process being
> managed, there's no log entry. Juxtaposed with the very verbose runstate
> change logging of the QMP module, machine seems a bit too introverted
> now.
> 
> Season the machine.py module with logging statements to taste to help
> make a tastier soup.
> 
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  python/qemu/machine/machine.py | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> index 37191f433b2..c467f951d5d 100644
> --- a/python/qemu/machine/machine.py
> +++ b/python/qemu/machine/machine.py
> @@ -373,6 +373,7 @@ def _post_shutdown(self) -> None:
>          Called to cleanup the VM instance after the process has exited.
>          May also be called after a failed launch.
>          """
> +        LOG.debug("Cleaning up after VM process")
>          try:
>              self._close_qmp_connection()
>          except Exception as err:  # pylint: disable=broad-except
> @@ -497,6 +498,7 @@ def _early_cleanup(self) -> None:
>          # for QEMU to exit, while QEMU is waiting for the socket to
>          # become writable.
>          if self._console_socket is not None:
> +            LOG.debug("Closing console socket")
>              self._console_socket.close()
>              self._console_socket = None
>  
> @@ -507,6 +509,7 @@ def _hard_shutdown(self) -> None:
>          :raise subprocess.Timeout: When timeout is exceeds 60 seconds
>              waiting for the QEMU process to terminate.
>          """
> +        LOG.debug("Performing hard shutdown")
>          self._early_cleanup()
>          self._subp.kill()
>          self._subp.wait(timeout=60)
> @@ -523,6 +526,13 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
>          :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for
>              the QEMU process to terminate.
>          """
> +        LOG.debug("Attempting graceful termination")
> +        if self._quit_issued:
> +            LOG.debug(
> +                "Anticipating QEMU termination due to prior 'quit' command, "
> +                "or explicit call to wait()"
> +            )
> +
>          self._early_cleanup()
>  
>          if self._qmp_connection:
> @@ -553,6 +563,10 @@ def _do_shutdown(self, timeout: Optional[int]) -> None:
>          try:
>              self._soft_shutdown(timeout)
>          except Exception as exc:
> +            if isinstance(exc, subprocess.TimeoutExpired):
> +                LOG.debug("Timed out waiting for QEMU process to exit")
> +            LOG.debug("Graceful shutdown failed, "
> +                      "falling back to hard shutdown")

If you add 'exc_info=True' when logging from inside an 'except'
clause, it captures the stack trace which is often critical for
debugging unexpected exceptions, especially when you're catchintg
the top level 'Exception' class instead of a very specialized
class.

>              self._hard_shutdown()
>              raise AbnormalShutdown("Could not perform graceful shutdown") \
>                  from exc
> @@ -575,6 +589,10 @@ def shutdown(self,
>          if not self._launched:
>              return
>  
> +        LOG.debug("Shutting down VM appliance; timeout=%s", timeout)
> +        if hard:
> +            LOG.debug("Caller requests immediate termination of QEMU process.")
> +
>          try:
>              if hard:
>                  self._user_killed = True
> -- 
> 2.37.3
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] python/machine: Add debug logging to key state changes
  2022-10-25  8:24   ` Daniel P. Berrangé
@ 2022-10-25 21:48     ` John Snow
  0 siblings, 0 replies; 5+ messages in thread
From: John Snow @ 2022-10-25 21:48 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel, Beraldo Leal, Cleber Rosa

On Tue, Oct 25, 2022 at 4:24 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Mon, Oct 24, 2022 at 03:53:54PM -0400, John Snow wrote:
> > When key decisions are made about the lifetime of the VM process being
> > managed, there's no log entry. Juxtaposed with the very verbose runstate
> > change logging of the QMP module, machine seems a bit too introverted
> > now.
> >
> > Season the machine.py module with logging statements to taste to help
> > make a tastier soup.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
> > ---
> >  python/qemu/machine/machine.py | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> > index 37191f433b2..c467f951d5d 100644
> > --- a/python/qemu/machine/machine.py
> > +++ b/python/qemu/machine/machine.py
> > @@ -373,6 +373,7 @@ def _post_shutdown(self) -> None:
> >          Called to cleanup the VM instance after the process has exited.
> >          May also be called after a failed launch.
> >          """
> > +        LOG.debug("Cleaning up after VM process")
> >          try:
> >              self._close_qmp_connection()
> >          except Exception as err:  # pylint: disable=broad-except
> > @@ -497,6 +498,7 @@ def _early_cleanup(self) -> None:
> >          # for QEMU to exit, while QEMU is waiting for the socket to
> >          # become writable.
> >          if self._console_socket is not None:
> > +            LOG.debug("Closing console socket")
> >              self._console_socket.close()
> >              self._console_socket = None
> >
> > @@ -507,6 +509,7 @@ def _hard_shutdown(self) -> None:
> >          :raise subprocess.Timeout: When timeout is exceeds 60 seconds
> >              waiting for the QEMU process to terminate.
> >          """
> > +        LOG.debug("Performing hard shutdown")
> >          self._early_cleanup()
> >          self._subp.kill()
> >          self._subp.wait(timeout=60)
> > @@ -523,6 +526,13 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
> >          :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for
> >              the QEMU process to terminate.
> >          """
> > +        LOG.debug("Attempting graceful termination")
> > +        if self._quit_issued:
> > +            LOG.debug(
> > +                "Anticipating QEMU termination due to prior 'quit' command, "
> > +                "or explicit call to wait()"
> > +            )
> > +
> >          self._early_cleanup()
> >
> >          if self._qmp_connection:
> > @@ -553,6 +563,10 @@ def _do_shutdown(self, timeout: Optional[int]) -> None:
> >          try:
> >              self._soft_shutdown(timeout)
> >          except Exception as exc:
> > +            if isinstance(exc, subprocess.TimeoutExpired):
> > +                LOG.debug("Timed out waiting for QEMU process to exit")
> > +            LOG.debug("Graceful shutdown failed, "
> > +                      "falling back to hard shutdown")
>
> If you add 'exc_info=True' when logging from inside an 'except'
> clause, it captures the stack trace which is often critical for
> debugging unexpected exceptions, especially when you're catchintg
> the top level 'Exception' class instead of a very specialized
> class.
>

Sure. If the exception goes unhandled, ultimately, we'll see that
stack trace twice. On the other hand, if someone handles this trace
and you still want to see the exception in the debug log somewhere,
it's probably a fair trade-off.

I'll touch it up while we're here.

--js



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-10-25 21:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-24 19:53 [PATCH 0/2] python: misc machine.py improvements John Snow
2022-10-24 19:53 ` [PATCH 1/2] python/machine: Add debug logging to key state changes John Snow
2022-10-25  8:24   ` Daniel P. Berrangé
2022-10-25 21:48     ` John Snow
2022-10-24 19:53 ` [PATCH 2/2] python/machine: Handle termination cases without QMP John Snow

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).