From: Kevin Wolf <kwolf@redhat.com>
To: Corey Bryant <coreyb@linux.vnet.ibm.com>
Cc: aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
libvir-list@redhat.com, qemu-devel@nongnu.org,
lcapitulino@redhat.com, eblake@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Date: Tue, 24 Jul 2012 14:07:57 +0200 [thread overview]
Message-ID: <500E901D.3080801@redhat.com> (raw)
In-Reply-To: <1343048885-1701-7-git-send-email-coreyb@linux.vnet.ibm.com>
Am 23.07.2012 15:08, schrieb Corey Bryant:
> When qemu_open is passed a filename of the "/dev/fdset/nnn"
> format (where nnn is the fdset ID), an fd with matching access
> mode flags will be searched for within the specified monitor
> fd set. If the fd is found, a dup of the fd will be returned
> from qemu_open.
>
> Each fd set has a reference count. The purpose of the reference
> count is to determine if an fd set contains file descriptors that
> have open dup() references that have not yet been closed. It is
> incremented on qemu_open and decremented on qemu_close. It is
> not until the refcount is zero that file desriptors in an fd set
> can be closed. If an fd set has dup() references open, then we
> must keep the other fds in the fd set open in case a reopen
> of the file occurs that requires an fd with a different access
> mode.
>
> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
>
> v2:
> -Get rid of file_open and move dup code to qemu_open
> (kwolf@redhat.com)
> -Use strtol wrapper instead of atoi (kwolf@redhat.com)
>
> v3:
> -Add note about fd leakage (eblake@redhat.com)
>
> v4
> -Moved patch to be later in series (lcapitulino@redhat.com)
> -Update qemu_open to check access mode flags and set flags that
> can be set (eblake@redhat.com, kwolf@redhat.com)
>
> v5:
> -This patch was overhauled quite a bit in this version, with
> the addition of fd set and refcount support.
> -Use qemu_set_cloexec() on dup'd fd (eblake@redhat.com)
> -Modify flags set by fcntl on dup'd fd (eblake@redhat.com)
> -Reduce syscalls when setting flags for dup'd fd (eblake@redhat.com)
> -Fix O_RDWR, O_RDONLY, O_WRONLY checks (eblake@redhat.com)
> ---
> block/raw-posix.c | 24 +++++-----
> block/raw-win32.c | 2 +-
> block/vmdk.c | 4 +-
> block/vpc.c | 2 +-
> block/vvfat.c | 12 ++---
> cutils.c | 5 ++
> monitor.c | 85 +++++++++++++++++++++++++++++++++
> monitor.h | 4 ++
> osdep.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
> qemu-common.h | 3 +-
> qemu-tool.c | 12 +++++
> 11 files changed, 267 insertions(+), 24 deletions(-)
>
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index a172de3..5d0a801 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
> out_free_buf:
> qemu_vfree(s->aligned_buf);
> out_close:
> - qemu_close(fd);
> + qemu_close(fd, filename);
> return -errno;
> }
Hm, not a nice interface where qemu_close() needs the filename and
(worse) could be given a wrong filename. Maybe it would be better to
maintain a list of fd -> fdset mappings in qemu_open/close?
But if we decided to keep it like this, please use the right interface
from the beginning in patch 5 instead of updating it here.
> @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use)
> }
> }
>
> +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
> +{
> + mon_fdset_t *mon_fdset;
> +
> + if (!mon) {
> + return;
> + }
> +
> + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> + if (mon_fdset->id == fdset_id) {
> + mon_fdset->refcount++;
> + break;
> + }
> + }
> +}
> +
> +void monitor_fdset_decrement_refcount(Monitor *mon, int64_t fdset_id)
> +{
> + mon_fdset_t *mon_fdset;
> +
> + if (!mon) {
> + return;
> + }
> +
> + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> + if (mon_fdset->id == fdset_id) {
> + mon_fdset->refcount--;
> + if (mon_fdset->refcount == 0) {
> + monitor_fdset_cleanup(mon_fdset);
> + }
> + break;
> + }
> + }
> +}
These two functions are almost the same. Would a
monitor_fdset_update_refcount(mon, fdset_id, value) make sense? These
functions could then be kept as thin wrappers around it, or they could
even be dropped completely.
> +
> +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
> +{
> + mon_fdset_t *mon_fdset;
> + mon_fdset_fd_t *mon_fdset_fd;
> + int mon_fd_flags;
> +
> + if (!mon) {
> + errno = ENOENT;
> + return -1;
> + }
> +
> + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> + if (mon_fdset->id != fdset_id) {
> + continue;
> + }
> + QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
> + if (mon_fdset_fd->removed) {
> + continue;
> + }
> +
> + mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
> + if (mon_fd_flags == -1) {
> + return -1;
> + }
> +
> + switch (flags & O_ACCMODE) {
> + case O_RDWR:
> + if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
> + return mon_fdset_fd->fd;
> + }
> + break;
> + case O_RDONLY:
> + if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
> + return mon_fdset_fd->fd;
> + }
> + break;
> + case O_WRONLY:
> + if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
> + return mon_fdset_fd->fd;
> + }
> + break;
> + }
I think you mean:
if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
return mon_fdset_fd->fd;
}
What about other flags that cannot be set with fcntl(), like O_SYNC on
older kernels or possibly non-Linux? (The block layer doesn't use it any
more, but I think we want to keep the function generally useful)
> + }
> + errno = EACCES;
> + return -1;
> + }
> + errno = ENOENT;
> + return -1;
> +}
> +
> /* mon_cmds and info_cmds would be sorted at runtime */
> static mon_cmd_t mon_cmds[] = {
> #include "hmp-commands.h"
> @@ -75,6 +76,79 @@ int qemu_madvise(void *addr, size_t len, int advice)
> #endif
> }
>
> +/*
> + * Dups an fd and sets the flags
> + */
> +static int qemu_dup(int fd, int flags)
> +{
> + int i;
> + int ret;
> + int serrno;
> + int dup_flags;
> + int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
> + O_NONBLOCK, 0 };
> +
> + if (flags & O_CLOEXEC) {
> + ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
> + if (ret == -1 && errno == EINVAL) {
> + ret = dup(fd);
> + if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {
> + goto fail;
> + }
> + }
> + } else {
> + ret = dup(fd);
> + }
> +
> + if (ret == -1) {
> + goto fail;
> + }
> +
> + dup_flags = fcntl(ret, F_GETFL);
> + if (dup_flags == -1) {
> + goto fail;
> + }
> +
> + if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
> + errno = EINVAL;
> + goto fail;
> + }
It's worth trying to set it before failing, newer kernels can do it. But
as I said above, if you can fail here, it makes sense to consider O_SYNC
when selecting the right file descriptor from the fdset.
> + /* Set/unset flags that we can with fcntl */
> + i = 0;
> + while (setfl_flags[i] != 0) {
> + if (flags & setfl_flags[i]) {
> + dup_flags = (dup_flags | setfl_flags[i]);
> + } else {
> + dup_flags = (dup_flags & ~setfl_flags[i]);
> + }
> + i++;
> + }
What about this instead of the loop:
int setfl_flags = O_APPEND | O_ASYNC | ... ;
dup_flags &= ~setfl_flags;
dup_flags |= (flags & setfl_flags);
> +
> + if (fcntl(ret, F_SETFL, dup_flags) == -1) {
> + goto fail;
> + }
> +
> + /* Truncate the file in the cases that open() would truncate it */
> + if (flags & O_TRUNC ||
> + ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
> + if (ftruncate(ret, 0) == -1) {
> + goto fail;
> + }
> + }
O_CREAT | O_EXCL kind of loses its meaning here, but okay, it's hard to
do better with file descriptors.
> +
> + qemu_set_cloexec(ret);
Wait... If O_CLOEXEC is set, you set the flag immediately and if it
isn't you set it at the end of the function? What's the intended use
case for not using O_CLOEXEC then?
Kevin
next prev parent reply other threads:[~2012-07-24 12:08 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-23 13:07 [Qemu-devel] [PATCH v5 0/6] file descriptor passing using fd sets Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-07-23 22:50 ` Eric Blake
2012-07-24 2:19 ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets Corey Bryant
2012-07-25 18:16 ` Eric Blake
2012-07-26 2:55 ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 3/6] monitor: Clean up fd sets on monitor disconnect Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 4/6] block: Convert open calls to qemu_open Corey Bryant
2012-07-25 19:22 ` Eric Blake
2012-07-26 3:11 ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 5/6] block: Convert close calls to qemu_close Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets Corey Bryant
2012-07-23 13:14 ` Corey Bryant
2012-08-02 22:21 ` Corey Bryant
2012-08-06 9:15 ` Kevin Wolf
2012-08-06 13:32 ` Corey Bryant
2012-08-06 13:51 ` Kevin Wolf
2012-08-06 14:15 ` Corey Bryant
2012-08-07 16:43 ` Corey Bryant
2012-07-24 12:07 ` Kevin Wolf [this message]
2012-07-25 3:41 ` Corey Bryant
2012-07-25 8:22 ` Kevin Wolf
2012-07-25 19:25 ` Eric Blake
2012-07-26 3:21 ` Corey Bryant
2012-07-26 13:13 ` Eric Blake
2012-07-26 13:16 ` Kevin Wolf
2012-07-27 4:07 ` Corey Bryant
2012-07-25 19:43 ` Eric Blake
2012-07-26 3:57 ` Corey Bryant
2012-07-26 9:07 ` Kevin Wolf
2012-07-27 3:59 ` Corey Bryant
2012-07-27 4:03 ` Corey Bryant
2012-08-02 15:08 ` Corey Bryant
2012-07-24 12:09 ` [Qemu-devel] [PATCH v5 0/6] file descriptor passing using " Kevin Wolf
2012-07-25 3:42 ` Corey Bryant
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=500E901D.3080801@redhat.com \
--to=kwolf@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=coreyb@linux.vnet.ibm.com \
--cc=eblake@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=libvir-list@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.