From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Thomas Huth" <thuth@redhat.com>,
qemu-devel@nongnu.org, "Juan Quintela" <quintela@redhat.com>,
"Lukáš Doktor" <ldoktor@redhat.com>,
"Cornelia Huck" <cohuck@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] io/channel-command: Delay the killing of the child after closing the pipe
Date: Tue, 13 Feb 2018 15:49:42 +0000
Message-ID: <20180213154942.GL2378@work-vm>
In-Reply-To: <20180213154512.GT573@redhat.com>

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Tue, Feb 13, 2018 at 03:41:45PM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Tue, Feb 13, 2018 at 03:25:30PM +0000, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > On Tue, Feb 13, 2018 at 03:09:12PM +0000, Dr. David Alan Gilbert wrote:
> > > > > > * Thomas Huth (thuth@redhat.com) wrote:
> > > > > > > We are currently facing some migration failure on s390x when running
> > > > > > > certain avocado tests, e.g. when running the test
> > > > > > > type_specific.io-github-autotest-qemu.migrate.with_reboot.exec.gzip_exec.
> > > > > > > This test is using 'migrate -d "exec:nc localhost 5200"' for the migration.
> > > > > > > The problem is detected at the receiving side, where the migration stream
> > > > > > > apparently ends too early. However, the cause for the problem is the
> > > > > > > sending side: After writing the migration stream into the pipe to netcat,
> > > > > > > the source QEMU calls qio_channel_command_close() which closes the pipe
> > > > > > > and immediately (!) kills the child process afterwards. So if the
> > > > > > > sending netcat did not read the final bytes from the pipe yet, or
> > > > > > > if it did not manage to send out all its buffers yet, it is killed
> > > > > > > before the whole migration stream is passed to the destination side.
> > > > > >
> > > > > > Thanks for tracking that down!
> > > > > >
> > > > > > > To ease the situation at least a little bit, we should give the child
> > > > > > > > process a few more time slices before we kill it with
> > > > > > > SIGTERM and then with SIGKILL. With this change, the avocado test now
> > > > > > > succeeds here in 10 out of 10 runs.
> > > > > > >
> > > > > > > Signed-off-by: Thomas Huth <thuth@redhat.com>
> > > > > > > ---
> > > > > > > io/channel-command.c | 6 +++---
> > > > > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > > >
> > > > > > > diff --git a/io/channel-command.c b/io/channel-command.c
> > > > > > > index 319c5ed..f64db3e 100644
> > > > > > > --- a/io/channel-command.c
> > > > > > > +++ b/io/channel-command.c
> > > > > > > @@ -177,11 +177,11 @@ static int qio_channel_command_abort(QIOChannelCommand *ioc,
> > > > > > > return -1;
> > > > > > > }
> > > > > > > } else if (ret == 0) {
> > > > > > > - if (step == 0) {
> > > > > > > + if (step == 4) {
> > > > > > > kill(ioc->pid, SIGTERM);
> > > > > > > - } else if (step == 1) {
> > > > > > > + } else if (step == 8) {
> > > > > > > kill(ioc->pid, SIGKILL);
> > > > > > > - } else {
> > > > > > > + } else if (step >= 9) {
> > > > > >
> > > > > > Hmm. This seems pretty arbitrary; if I understand correctly you're
> > > > > > saying it'll get a SIGTERM after 4 (arbitrary) * 10ms (arbitrary).
> > > > > >
> > > > > > Who is to say that's enough for a scp or gzip or the like?
> > > > >
> > > > > We could conceivably implement the qio_channel_shutdown() operation
> > > > > for the QIOChannelCommand class. It would merely close the FD to the
> > > > > child process, but leave it running. That would give it time to read
> > > > > any data still in the pipe from QEMU IIUC.
> > > >
> > > > Yeh that's better; although when would we call shutdown or close on it?
> > >
> > > Doesn't QEMU already use shutdown() during the right part of migration,
> > > or is that only wrt post-copy ?
> >
> > We only use it for cancel and errors, not during the normal behaviour.
>
> So we could do with shutdown() for sake of post-copy anyway, but for
> normal behaviour maybe the right answer is for close() to just wait a
> real long time for the child app to exit ? If we close the pipes, and
> then wait 5 seconds or more before giving up ?

Yes, I'm happier with a much longer arbitrary value than a short
arbitrary value; but I do wonder if there's any real need to kill it.
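
For illustration, the "close the pipe, wait a long time, and only then
escalate" idea could look roughly like the sketch below. It is standalone
POSIX C, not a patch against io/channel-command.c: the names
wait_child_timeout() and close_and_reap(), the 10ms poll interval and the
two timeout parameters are all assumptions made up for the example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical poll interval; not a value taken from QEMU. */
#define POLL_INTERVAL_MS 10

/*
 * Poll for the child for up to timeout_ms.
 * Returns 1 if it was reaped (*status is filled in), 0 if it is still
 * running when the timeout expires, -1 on a waitpid() error.
 */
static int wait_child_timeout(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec ts = { 0, POLL_INTERVAL_MS * 1000000L };
    int waited;

    for (waited = 0; ; waited += POLL_INTERVAL_MS) {
        pid_t ret = waitpid(pid, status, WNOHANG);

        if (ret == pid) {
            return 1;
        }
        if (ret < 0 && errno != EINTR) {
            return -1;
        }
        if (waited >= timeout_ms) {
            return 0;
        }
        nanosleep(&ts, NULL);
    }
}

/*
 * Close our end of the pipe so the child sees EOF, give it grace_ms to
 * drain its buffers and exit by itself, and only then send SIGTERM,
 * followed by SIGKILL after another term_ms.
 * Returns 0 once the child has been reaped, -1 on error.
 */
static int close_and_reap(int pipe_fd, pid_t pid,
                          int grace_ms, int term_ms, int *status)
{
    pid_t r;
    int ret;

    if (close(pipe_fd) < 0) {
        return -1;
    }

    ret = wait_child_timeout(pid, grace_ms, status);
    if (ret != 0) {
        return ret < 0 ? -1 : 0;
    }

    kill(pid, SIGTERM);
    ret = wait_child_timeout(pid, term_ms, status);
    if (ret != 0) {
        return ret < 0 ? -1 : 0;
    }

    kill(pid, SIGKILL);
    do {
        r = waitpid(pid, status, 0);
    } while (r < 0 && errno == EINTR);

    return r == pid ? 0 : -1;
}
```

With something like close_and_reap(fd, pid, 5000, 1000, &status), a child
that exits on EOF is never signalled at all; the signals only come into
play for a child that is still around after the grace period.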

Dave

> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK