From: Kevin Wolf <kwolf@redhat.com>
To: Corey Bryant <coreyb@linux.vnet.ibm.com>
Cc: aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
libvir-list@redhat.com, qemu-devel@nongnu.org,
lcapitulino@redhat.com, eblake@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Date: Tue, 24 Jul 2012 14:07:57 +0200 [thread overview]
Message-ID: <500E901D.3080801@redhat.com> (raw)
In-Reply-To: <1343048885-1701-7-git-send-email-coreyb@linux.vnet.ibm.com>
Am 23.07.2012 15:08, schrieb Corey Bryant:
> When qemu_open is passed a filename of the "/dev/fdset/nnn"
> format (where nnn is the fdset ID), an fd with matching access
> mode flags will be searched for within the specified monitor
> fd set. If the fd is found, a dup of the fd will be returned
> from qemu_open.
>
> Each fd set has a reference count. The purpose of the reference
> count is to determine if an fd set contains file descriptors that
> have open dup() references that have not yet been closed. It is
> incremented on qemu_open and decremented on qemu_close. It is
> not until the refcount is zero that file desriptors in an fd set
> can be closed. If an fd set has dup() references open, then we
> must keep the other fds in the fd set open in case a reopen
> of the file occurs that requires an fd with a different access
> mode.
>
> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
>
> v2:
> -Get rid of file_open and move dup code to qemu_open
> (kwolf@redhat.com)
> -Use strtol wrapper instead of atoi (kwolf@redhat.com)
>
> v3:
> -Add note about fd leakage (eblake@redhat.com)
>
> v4
> -Moved patch to be later in series (lcapitulino@redhat.com)
> -Update qemu_open to check access mode flags and set flags that
> can be set (eblake@redhat.com, kwolf@redhat.com)
>
> v5:
> -This patch was overhauled quite a bit in this version, with
> the addition of fd set and refcount support.
> -Use qemu_set_cloexec() on dup'd fd (eblake@redhat.com)
> -Modify flags set by fcntl on dup'd fd (eblake@redhat.com)
> -Reduce syscalls when setting flags for dup'd fd (eblake@redhat.com)
> -Fix O_RDWR, O_RDONLY, O_WRONLY checks (eblake@redhat.com)
> ---
> block/raw-posix.c | 24 +++++-----
> block/raw-win32.c | 2 +-
> block/vmdk.c | 4 +-
> block/vpc.c | 2 +-
> block/vvfat.c | 12 ++---
> cutils.c | 5 ++
> monitor.c | 85 +++++++++++++++++++++++++++++++++
> monitor.h | 4 ++
> osdep.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
> qemu-common.h | 3 +-
> qemu-tool.c | 12 +++++
> 11 files changed, 267 insertions(+), 24 deletions(-)
>
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index a172de3..5d0a801 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
> out_free_buf:
> qemu_vfree(s->aligned_buf);
> out_close:
> - qemu_close(fd);
> + qemu_close(fd, filename);
> return -errno;
> }
Hm, not a nice interface where qemu_close() needs the filename and
(worse) could be given a wrong filename. Maybe it would be better to
maintain a list of fd -> fdset mappings in qemu_open/close?
But if we decided to keep it like this, please use the right interface
from the beginning in patch 5 instead of updating it here.
> @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use)
> }
> }
>
> +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
> +{
> + mon_fdset_t *mon_fdset;
> +
> + if (!mon) {
> + return;
> + }
> +
> + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> + if (mon_fdset->id == fdset_id) {
> + mon_fdset->refcount++;
> + break;
> + }
> + }
> +}
> +
> +void monitor_fdset_decrement_refcount(Monitor *mon, int64_t fdset_id)
> +{
> + mon_fdset_t *mon_fdset;
> +
> + if (!mon) {
> + return;
> + }
> +
> + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> + if (mon_fdset->id == fdset_id) {
> + mon_fdset->refcount--;
> + if (mon_fdset->refcount == 0) {
> + monitor_fdset_cleanup(mon_fdset);
> + }
> + break;
> + }
> + }
> +}
These two functions are almost the same. Would a
monitor_fdset_update_refcount(mon, fdset_id, value) make sense? These
functions could then be kept as thin wrappers around it, or they could
even be dropped completely.
> +
> +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
> +{
> + mon_fdset_t *mon_fdset;
> + mon_fdset_fd_t *mon_fdset_fd;
> + int mon_fd_flags;
> +
> + if (!mon) {
> + errno = ENOENT;
> + return -1;
> + }
> +
> + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> + if (mon_fdset->id != fdset_id) {
> + continue;
> + }
> + QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
> + if (mon_fdset_fd->removed) {
> + continue;
> + }
> +
> + mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
> + if (mon_fd_flags == -1) {
> + return -1;
> + }
> +
> + switch (flags & O_ACCMODE) {
> + case O_RDWR:
> + if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
> + return mon_fdset_fd->fd;
> + }
> + break;
> + case O_RDONLY:
> + if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
> + return mon_fdset_fd->fd;
> + }
> + break;
> + case O_WRONLY:
> + if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
> + return mon_fdset_fd->fd;
> + }
> + break;
> + }
I think you mean:
if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
return mon_fdset_fd->fd;
}
What about other flags that cannot be set with fcntl(), like O_SYNC on
older kernels or possibly non-Linux? (The block layer doesn't use it any
more, but I think we want to keep the function generally useful)
> + }
> + errno = EACCES;
> + return -1;
> + }
> + errno = ENOENT;
> + return -1;
> +}
> +
> /* mon_cmds and info_cmds would be sorted at runtime */
> static mon_cmd_t mon_cmds[] = {
> #include "hmp-commands.h"
> @@ -75,6 +76,79 @@ int qemu_madvise(void *addr, size_t len, int advice)
> #endif
> }
>
> +/*
> + * Dups an fd and sets the flags
> + */
> +static int qemu_dup(int fd, int flags)
> +{
> + int i;
> + int ret;
> + int serrno;
> + int dup_flags;
> + int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
> + O_NONBLOCK, 0 };
> +
> + if (flags & O_CLOEXEC) {
> + ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
> + if (ret == -1 && errno == EINVAL) {
> + ret = dup(fd);
> + if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {
> + goto fail;
> + }
> + }
> + } else {
> + ret = dup(fd);
> + }
> +
> + if (ret == -1) {
> + goto fail;
> + }
> +
> + dup_flags = fcntl(ret, F_GETFL);
> + if (dup_flags == -1) {
> + goto fail;
> + }
> +
> + if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
> + errno = EINVAL;
> + goto fail;
> + }
It's worth trying to set it before failing, newer kernels can do it. But
as I said above, if you can fail here, it makes sense to consider O_SYNC
when selecting the right file descriptor from the fdset.
> + /* Set/unset flags that we can with fcntl */
> + i = 0;
> + while (setfl_flags[i] != 0) {
> + if (flags & setfl_flags[i]) {
> + dup_flags = (dup_flags | setfl_flags[i]);
> + } else {
> + dup_flags = (dup_flags & ~setfl_flags[i]);
> + }
> + i++;
> + }
What about this instead of the loop:
int setfl_flags = O_APPEND | O_ASYNC | ... ;
dup_flags &= ~setfl_flags;
dup_flags |= (flags & setfl_flags);
> +
> + if (fcntl(ret, F_SETFL, dup_flags) == -1) {
> + goto fail;
> + }
> +
> + /* Truncate the file in the cases that open() would truncate it */
> + if (flags & O_TRUNC ||
> + ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
> + if (ftruncate(ret, 0) == -1) {
> + goto fail;
> + }
> + }
O_CREAT | O_EXCL kind of loses its meaning here, but okay, it's hard to
do better with file descriptors.
> +
> + qemu_set_cloexec(ret);
Wait... If O_CLOEXEC is set, you set the flag immediately and if it
isn't you set it at the end of the function? What's the intended use
case for not using O_CLOEXEC then?
Kevin
next prev parent reply other threads:[~2012-07-24 12:08 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-23 13:07 [Qemu-devel] [PATCH v5 0/6] file descriptor passing using fd sets Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-07-23 22:50 ` Eric Blake
2012-07-24 2:19 ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets Corey Bryant
2012-07-25 18:16 ` Eric Blake
2012-07-26 2:55 ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 3/6] monitor: Clean up fd sets on monitor disconnect Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 4/6] block: Convert open calls to qemu_open Corey Bryant
2012-07-25 19:22 ` Eric Blake
2012-07-26 3:11 ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 5/6] block: Convert close calls to qemu_close Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets Corey Bryant
2012-07-23 13:14 ` Corey Bryant
2012-08-02 22:21 ` Corey Bryant
2012-08-06 9:15 ` Kevin Wolf
2012-08-06 13:32 ` Corey Bryant
2012-08-06 13:51 ` Kevin Wolf
2012-08-06 14:15 ` Corey Bryant
2012-08-07 16:43 ` Corey Bryant
2012-07-24 12:07 ` Kevin Wolf [this message]
2012-07-25 3:41 ` Corey Bryant
2012-07-25 8:22 ` Kevin Wolf
2012-07-25 19:25 ` Eric Blake
2012-07-26 3:21 ` Corey Bryant
2012-07-26 13:13 ` Eric Blake
2012-07-26 13:16 ` Kevin Wolf
2012-07-27 4:07 ` Corey Bryant
2012-07-25 19:43 ` Eric Blake
2012-07-26 3:57 ` Corey Bryant
2012-07-26 9:07 ` Kevin Wolf
2012-07-27 3:59 ` Corey Bryant
2012-07-27 4:03 ` Corey Bryant
2012-08-02 15:08 ` Corey Bryant
2012-07-24 12:09 ` [Qemu-devel] [PATCH v5 0/6] file descriptor passing using " Kevin Wolf
2012-07-25 3:42 ` Corey Bryant
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=500E901D.3080801@redhat.com \
--to=kwolf@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=coreyb@linux.vnet.ibm.com \
--cc=eblake@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=libvir-list@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).