From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: "Murilo Opsfelder Araújo" <muriloo@linux.ibm.com>,
	pbonzini@redhat.com, qemu-devel@nongnu.org, david@redhat.com,
	cohuck@redhat.com, thuth@redhat.com, borntraeger@de.ibm.com,
	frankja@linux.ibm.com, fiuczy@linux.ibm.com, pasic@linux.ibm.com,
	alex.bennee@linaro.org, armbru@redhat.com
Subject: Re: [PATCH v3 1/1] os-posix: asynchronous teardown for shutdown on Linux
Date: Tue, 23 Aug 2022 18:09:35 +0100
Message-ID: <YwUJz0ldXjcPdmDF@redhat.com>
In-Reply-To: <20220812092623.19058f32@p-imbrenda>

On Fri, Aug 12, 2022 at 09:26:23AM +0200, Claudio Imbrenda wrote:
> On Thu, 11 Aug 2022 23:05:52 -0300
> Murilo Opsfelder Araújo <muriloo@linux.ibm.com> wrote:
> 
> > On 8/11/22 11:02, Daniel P. Berrangé wrote:
> > [...]
> > >>> Hmm, I was hoping you could just use SIGKILL to guarantee that this
> > >>> gets killed off.  Is SIGKILL delivered too soon to allow for the
> > >>> main QEMU process to have exited quickly ?  
> > >>
> > >> yes, I tried. qemu has not finished exiting when the signal is
> > >> delivered, the cleanup process dies before qemu, which defeats the
> > >> purpose  
> > >
> > > Ok, too bad.
> > >  
> > >>> If so I wonder what happens when systemd just delivers SIGKILL to
> > >>> all processes in the cgroup - I'm not sure there's a guarantee it
> > >>> will SIGKILL the main qemu before it SIGKILLs this helper  
> > >>
> > >> I'm afraid in that case there is no guarantee.
> > >>
> > >> for what it's worth, both virsh shutdown and destroy seem to do things
> > >> properly.  
> > >
> > > Hmm, probably because libvirt tells QEMU to exit before systemd comes
> > > along and tells everything in the cgroup to die with SIGKILL.  
> > 
> > It seems Libvirt sends SIGKILL if qemu process doesn't terminate within 10
> > seconds after Libvirt sent SIGTERM:
> > 
> > https://gitlab.com/libvirt/libvirt/-/blob/0615df084ec9996b5df88d6a1b59c557e22f3a12/src/util/virprocess.c#L375
> 
> but this is fine.
> 
> with asynchronous teardown, qemu will exit almost immediately when
> receiving SIGTERM, and the cleanup process will start cleaning up.

Note, when you have PCI host devices attached it can take a very
long time for QEMU to exit. For this reason, the 10 second wait
before switching to SIGKILL is extended by 2 seconds for each
attached PCI hostdev.
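
To make the arithmetic concrete, here is a rough sketch of that
escalation in Python. It is not libvirt's code: the function names,
the constants' names and the 0.2 second poll interval are my own
assumptions, only the "10 seconds plus 2 per PCI hostdev" rule comes
from the text above.

```python
import os
import signal
import time

BASE_WAIT_SECONDS = 10          # base wait before SIGKILL, per the text above
EXTRA_PER_HOSTDEV_SECONDS = 2   # extension per attached PCI hostdev


def sigterm_grace_seconds(n_pci_hostdevs: int) -> int:
    """Seconds to wait after SIGTERM before escalating to SIGKILL."""
    return BASE_WAIT_SECONDS + EXTRA_PER_HOSTDEV_SECONDS * n_pci_hostdevs


def kill_with_escalation(pid: int, n_pci_hostdevs: int,
                         poll_interval: float = 0.2) -> bool:
    """Hypothetical helper: SIGTERM one PID, SIGKILL it if it lingers.

    Returns True if the process was gone before SIGKILL was needed.
    Caveat: an unreaped child (zombie) still counts as existing for
    the signal-0 probe, so this is meant for non-child processes.
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + sigterm_grace_seconds(n_pci_hostdevs)
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the PID exists
        except ProcessLookupError:
            return True
        time.sleep(poll_interval)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return True  # it exited between the last probe and now
    return False
```

So a guest with three PCI hostdevs would get 16 seconds before the
switch to SIGKILL under this rule.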

I think the main time we will have problems is where there are
storage failures that cause QEMU to get stuck in an uninterruptible
sleep in kernel space.  The classic example of this is an NFS server
that goes away: QEMU will get stuck waiting for the NFS server to
come back to life and will be unkillable during this time, even with
SIGKILL.
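
For anyone wanting to check whether a stuck QEMU is in this situation,
the state letter in /proc/<pid>/stat is 'D' for uninterruptible sleep
(see proc(5)). A minimal sketch, with helper names that are purely
illustrative:

```python
def parse_proc_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or parentheses, so split after the last ')'.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def is_uninterruptible(pid: int) -> bool:
    """True if the process is currently in state 'D' (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_state(f.read()) == "D"
```

A process that stays in 'D' will not act on SIGKILL until the kernel
operation it is blocked on completes.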

That said, this call to virProcessKillPainfully shouldn't impact
the cleanup process, because the SIGTERM/KILL are both directed
to the QEMU PID alone, not the process group.
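
The difference between signalling one PID and signalling a whole
process group can be seen with a small stand-alone experiment. This is
only an illustration: the two sleeping Python children stand in for
QEMU and the cleanup process, and nothing here is taken from libvirt.

```python
import os
import signal
import subprocess
import sys

# Stand-in workload: a child that just sleeps for a while.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def signal_one_pid_only():
    """SIGTERM one child by PID and report whether its sibling survives."""
    main = subprocess.Popen(SLEEPER)    # plays the role of QEMU
    helper = subprocess.Popen(SLEEPER)  # plays the role of the cleanup process
    try:
        # kill(2) with a positive PID targets exactly that process.
        # os.killpg() would instead hit every member of the group.
        os.kill(main.pid, signal.SIGTERM)
        main_status = main.wait(timeout=10)
        helper_alive = helper.poll() is None
        return main_status, helper_alive
    finally:
        # Clean up whichever children are still running.
        for p in (main, helper):
            if p.poll() is None:
                p.kill()
            p.wait()
```

Both children share the parent's process group here, yet only the one
whose PID was signalled goes away.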

The cleanup process should only get a signal later: once libvirt
has finished sending SIGTERM/KILL, it asks systemd to clean up
the cgroups, and at that point systemd can send SIGKILL to the
cleanup process. So in fact I think we should be fine in all
respects, except for the unkillable sleeps in kernel space.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



