From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Laszlo Ersek <lersek@redhat.com>
Cc: Shaun Reitan <shaun.reitan@ndchost.com>,
pbonzini@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] QEMU leaves pidfile behind on exit
Date: Wed, 14 Feb 2018 08:46:28 +0000
Message-ID: <20180214084628.GC13644@redhat.com>
In-Reply-To: <7a31ffe6-03a2-8add-3d24-399651cd856f@redhat.com>

On Tue, Feb 13, 2018 at 08:35:23PM +0100, Laszlo Ersek wrote:
> On 02/13/18 17:28, Daniel P. Berrangé wrote:
> > On Fri, Feb 09, 2018 at 07:12:59PM +0000, Shaun Reitan wrote:
> >> QEMU leaves the pidfile behind on a clean exit when using the option
> >> -pidfile /var/run/qemu.pid.
> >>
> >> Should QEMU leave it behind or should it clean up after itself?
> >>
> >> I'm willing to take a crack at a patch to fix the issue, but before I do, I
> >> want to make sure that leaving the pidfile behind was not intentional?
> >
> > If QEMU deletes the pidfile on exit then, with the current pidfile
> > acquisition logic, there's a race condition possible:
> >
> > To acquire we do
> >
> > 1. fd = open()
> > 2. lockf(fd)
> >
> > If the first QEMU that currently owns the pidfile unlinks it while
> > a second qemu is in between steps 1 & 2, the second QEMU will
> > acquire the pidfile successfully (which is fine) but the pidfile
> > is now unlinked. This is not fine, because a 3rd qemu can now come
> > and try to acquire the pidfile (by creating a new one) and succeed,
> > despite the second qemu still owning the (now unlinked) pidfile.
> >
> > It is possible to deal with this race by making qemu_create_pidfile
> > more intelligent [1]. It would have to do
> >
> > 1. fd = open(filename)
> > 2. fstat(fd)
> > 3. lockf(fd)
> > 4. stat(filename)
> >
> > It must then compare the results of 2 + 4 to ensure the pidfile it
> > acquired is the same as the one on disk. With this change, it would
> > be safe for QEMU to delete the pidfile on exit.
>
> Why don't we just open the pidfile with (O_CREAT | O_EXCL)? O_EXCL is
> supposed to be atomic.

O_EXCL isn't a good idea because if QEMU crashes without cleaning up,
you have a stale pidfile and O_EXCL will turn that into a failure to
acquire the pidfile. The key point of using lockf() is to ensure we can
cope reliably with stale pidfiles.

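To illustrate the difference, here is a minimal sketch, not QEMU code.
The helper names claim_with_excl() and claim_with_lockf() are made up
for this example. With a leftover pidfile on disk and no process holding
a lock on it, the O_EXCL variant is expected to fail with EEXIST, while
the lockf() variant can still take ownership.

```c
#define _XOPEN_SOURCE 700   /* exposes lockf() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: claim the pidfile by exclusive creation.
 * Fails with EEXIST if the file exists, even if its owner is gone. */
int claim_with_excl(const char *path)
{
    assert(path != NULL);
    return open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
}

/* Hypothetical helper: claim the pidfile by locking it.
 * A stale file is harmless because the lock died with its owner. */
int claim_with_lockf(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return -1;

    /* F_TLOCK is the non-blocking form: fail instead of waiting. */
    if (lockf(fd, F_TLOCK, 0) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;  /* keep this fd open for as long as the pidfile is owned */
}
```

The lock is released by the kernel when the owning process exits or
crashes, which is why the stale file does not get in the way.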
>
> ... The open(2) manual on Linux says,
>
> On NFS, O_EXCL is supported only when using NFSv3 or
> later on kernel 2.6 or later. In NFS environments where
> O_EXCL support is not provided, programs that rely on it
> for performing locking tasks will contain a race
> condition. [...]
>
> Sigh.
>
> > [1] See the equiv libvirt logic for pidfile acquisition in
> > https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virpidfile.c;h=58ab29f77f2cfb8583447112dae77a07446bc627;hb=HEAD#l384
> >
>
> To my knowledge, "same file" should be checked with:
>
> a.st_dev == b.st_dev && a.st_ino == b.st_ino
>
> Example:
> - "filename" is "/var/run/qemu.pid"
> - "/var/run" is originally a symbolic link to "/mnt/fs1/"
> - between steps #1 and #4, "/var/run" is re-created as a symbolic link
> to "/mnt/fs2/" -- a different filesystem from fs1
> - "/mnt/fs2/qemu.pid" happens to have the same inode number as
> "/mnt/fs1/qemu.pid"

I don't really think we need to worry about the admin changing symlinks
like this while QEMU is in the middle of acquiring the pidfile.

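For reference, the open / fstat / lockf / stat sequence discussed above
could look roughly like the sketch below. This is an illustration under
my own assumptions, not the actual qemu_create_pidfile() and not a copy
of the libvirt code: the name pidfile_acquire(), the retry loop, the
0644 mode and the use of non-blocking F_TLOCK are all choices made for
the example. It compares both st_dev and st_ino, since that costs
nothing extra.

```c
#define _XOPEN_SOURCE 700   /* exposes lockf()/ftruncate() under strict -std= */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Close fd but report the errno of the call that failed before it. */
static int fail_and_close(int fd)
{
    int saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: returns a locked fd on success, -1 on failure. */
int pidfile_acquire(const char *path, pid_t pid)
{
    assert(path != NULL);

    for (;;) {
        struct stat by_fd, by_name;
        char buf[32];
        int len;
        int fd;

        fd = open(path, O_CREAT | O_WRONLY, 0644);      /* step 1 */
        if (fd < 0)
            return -1;

        if (fstat(fd, &by_fd) < 0)                      /* step 2 */
            return fail_and_close(fd);

        if (lockf(fd, F_TLOCK, 0) < 0)                  /* step 3 */
            return fail_and_close(fd);  /* another process owns it */

        if (stat(path, &by_name) < 0) {                 /* step 4 */
            if (errno != ENOENT)
                return fail_and_close(fd);
            close(fd);      /* the path vanished under us: start over */
            continue;
        }

        /* Compare 2 and 4: did we lock the file that is on disk now? */
        if (by_fd.st_dev != by_name.st_dev ||
            by_fd.st_ino != by_name.st_ino) {
            close(fd);      /* we locked an unlinked or replaced file */
            continue;
        }

        len = snprintf(buf, sizeof(buf), "%ld\n", (long)pid);
        if (ftruncate(fd, 0) < 0 ||
            write(fd, buf, (size_t)len) != (ssize_t)len)
            return fail_and_close(fd);

        return fd;  /* caller keeps it open; unlink(path) before exit */
    }
}
```

The returned fd has to stay open for the lifetime of the process,
because these locks are dropped as soon as the process closes any
descriptor referring to the file.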
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|