Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

From: Corey Bryant <coreyb@linux.vnet.ibm.com>
To: Eric Blake <eblake@redhat.com>
Cc: kwolf@redhat.com, aliguori@us.ibm.com,
	stefanha@linux.vnet.ibm.com, libvir-list@redhat.com,
	qemu-devel@nongnu.org, lcapitulino@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Date: Wed, 25 Jul 2012 23:57:37 -0400
Message-ID: <5010C031.9040602@linux.vnet.ibm.com>
In-Reply-To: <50104C44.1050206@redhat.com>



On 07/25/2012 03:43 PM, Eric Blake wrote:
> On 07/23/2012 07:08 AM, Corey Bryant wrote:
>> When qemu_open is passed a filename of the "/dev/fdset/nnn"
>> format (where nnn is the fdset ID), an fd with matching access
>> mode flags will be searched for within the specified monitor
>> fd set.  If the fd is found, a dup of the fd will be returned
>> from qemu_open.
>>
>> Each fd set has a reference count.  The purpose of the reference
>> count is to determine if an fd set contains file descriptors that
>> have open dup() references that have not yet been closed.  It is
>> incremented on qemu_open and decremented on qemu_close.  It is
>> not until the refcount is zero that file descriptors in an fd set
>> can be closed.  If an fd set has dup() references open, then we
>> must keep the other fds in the fd set open in case a reopen
>> of the file occurs that requires an fd with a different access
>> mode.
>>
>
>> +++ b/monitor.c
>> @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use)
>>       }
>>   }
>>
>> +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
>> +{
>> +    mon_fdset_t *mon_fdset;
>> +
>> +    if (!mon) {
>> +        return;
>> +    }
>
> Am I reading this code right by stating that 'if there is no monitor, we
> don't increment the refcount'?  How does a monitor reattach affect
> things?  Or am I missing something fundamental about the cases when
> 'mon==NULL' will exist?
>

Yes, you're reading this correctly.

I'm pretty sure that mon will only be NULL if QEMU is started without a 
monitor.

If QEMU has a monitor, and libvirt disconnects its connection to the
QEMU monitor, then I believe mon will remain non-NULL.

I plan to test this to verify, though.  (I'm out most of this
week and will be back full time starting next Tues.)

>> +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
>> +{
>> +    mon_fdset_t *mon_fdset;
>> +    mon_fdset_fd_t *mon_fdset_fd;
>> +    int mon_fd_flags;
>> +
>> +    if (!mon) {
>> +        errno = ENOENT;
>> +        return -1;
>> +    }
>> +
>> +    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
>> +        if (mon_fdset->id != fdset_id) {
>> +            continue;
>> +        }
>> +        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
>> +            if (mon_fdset_fd->removed) {
>> +                continue;
>> +            }
>> +
>> +            mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
>> +            if (mon_fd_flags == -1) {
>> +                return -1;
>
> This says we fail on the first fcntl() failure, instead of trying other
> fds in the set.  Granted, an fcntl() failure is probably the sign of a
> bigger bug (such as closing an fd at the wrong point in time), so I
> guess trying to go on doesn't make much sense once we already know we
> are hosed.
>

I think I'll stick with it the way it is.  If fcntl() fails, we might
have a tainted fd set, so I think we should fail.

>> +            }
>> +
>> +            switch (flags & O_ACCMODE) {
>> +            case O_RDWR:
>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
>> +                    return mon_fdset_fd->fd;
>> +                }
>> +                break;
>> +            case O_RDONLY:
>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
>> +                    return mon_fdset_fd->fd;
>> +                }
>> +                break;
>
> Do we want to allow the case where the caller asked for O_RDONLY, but
> the set only has O_RDWR?  After all, the caller is getting a compatible
> subset of what the set offers.
>

I don't see a problem with returning an O_RDWR fd for an O_RDONLY request.

>> +            case O_WRONLY:
>> +                if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
>> +                    return mon_fdset_fd->fd;
>> +                }
>> +                break;
>
> Likewise, should we allow a caller asking for O_WRONLY when the set
> provides only O_RDWR?
>

I don't see a problem with returning an O_RDWR fd for an O_WRONLY request.
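
To make the relaxed matching concrete, here is a minimal sketch of an
access-mode check that accepts either an exact match or an O_RDWR fd.
The helper name access_mode_compatible() is hypothetical and is not part
of this patch; it only illustrates the rule discussed above.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not in the patch): returns true if an fd opened
 * with fd_flags can satisfy a request made with requested_flags.
 * An exact access-mode match always qualifies; an O_RDWR fd also
 * qualifies for O_RDONLY and O_WRONLY requests.
 */
static bool access_mode_compatible(int requested_flags, int fd_flags)
{
    int want = requested_flags & O_ACCMODE;
    int have = fd_flags & O_ACCMODE;

    return have == want || have == O_RDWR;
}
```

With something like this, the switch statement in monitor_fdset_get_fd()
could collapse into a single test.  Whether an exact match should be
preferred over an O_RDWR fd when both are in the set is left open here.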

>>
>> +/*
>> + * Dups an fd and sets the flags
>> + */
>> +static int qemu_dup(int fd, int flags)
>> +{
>> +    int i;
>> +    int ret;
>> +    int serrno;
>> +    int dup_flags;
>> +    int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
>> +                          O_NONBLOCK, 0 };
>> +
>> +    if (flags & O_CLOEXEC) {
>> +        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
>
> F_DUPFD_CLOEXEC is required by POSIX but not implemented on all modern
> OS yet; you probably need some #ifdef and/or configure guards.
>

Ok
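
One possible shape for the guard, as a rough sketch only: use
F_DUPFD_CLOEXEC when the headers define it, and otherwise (or when the
kernel rejects it with EINVAL) fall back to dup() followed by F_SETFD.
The function name dup_cloexec() is an assumption made for illustration;
a configure check could be used instead of the plain #ifdef.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: duplicate fd with FD_CLOEXEC set on the copy.
 * Returns the new fd, or -1 with errno set.
 */
static int dup_cloexec(int fd)
{
    int ret;
    int saved_errno;

#ifdef F_DUPFD_CLOEXEC
    ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
    if (ret != -1 || errno != EINVAL) {
        return ret;
    }
    /* EINVAL: the running kernel does not know F_DUPFD_CLOEXEC. */
#endif

    ret = dup(fd);
    if (ret == -1) {
        return -1;
    }
    /* FD_CLOEXEC is a descriptor flag, so it is set with F_SETFD. */
    if (fcntl(ret, F_SETFD, FD_CLOEXEC) == -1) {
        saved_errno = errno;
        close(ret);
        errno = saved_errno;
        return -1;
    }
    return ret;
}
```

The fallback path is not atomic, so another thread could fork and exec
between dup() and F_SETFD; that is the window F_DUPFD_CLOEXEC closes.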

>> +        if (ret == -1 && errno == EINVAL) {
>> +            ret = dup(fd);
>> +            if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {
>
> You _can't_ call F_SETFL with O_CLOEXEC.  O_CLOEXEC only causes open()
> to set FD_CLOEXEC; thereafter, including in the case of this dup, what
> you want to do is instead set FD_CLOEXEC via F_SETFD (aka call
> qemu_set_cloexec, not fcntl_setfl).
>

I know, this is a mistake.  I'm planning to replace fcntl_setfl() here 
with qemu_set_cloexec().
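
For reference, the distinction is that F_GETFL/F_SETFL operate on file
status flags, while close-on-exec is a file descriptor flag handled by
F_GETFD/F_SETFD.  A minimal sketch of the F_SETFD approach follows; the
name set_cloexec() is a stand-in used for illustration and is not
claimed to match the existing qemu_set_cloexec() implementation.

```c
#include <fcntl.h>

/*
 * Illustrative stand-in: set FD_CLOEXEC on fd while preserving any
 * other descriptor flags.  Returns 0 on success, -1 with errno set.
 */
static int set_cloexec(int fd)
{
    int fd_flags = fcntl(fd, F_GETFD);

    if (fd_flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFD, fd_flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```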

>> +                goto fail;
>> +            }
>> +        }
>> +    } else {
>> +        ret = dup(fd);
>> +    }
>> +
>> +    if (ret == -1) {
>> +        goto fail;
>> +    }
>> +
>> +    dup_flags = fcntl(ret, F_GETFL);
>> +    if (dup_flags == -1) {
>> +        goto fail;
>> +    }
>> +
>> +    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
>> +        errno = EINVAL;
>> +        goto fail;
>> +    }
>> +
>> +    /* Set/unset flags that we can with fcntl */
>> +    i = 0;
>> +    while (setfl_flags[i] != 0) {
>> +        if (flags & setfl_flags[i]) {
>> +            dup_flags = (dup_flags | setfl_flags[i]);
>> +        } else {
>> +            dup_flags = (dup_flags & ~setfl_flags[i]);
>> +        }
>> +        i++;
>> +    }
>
> Rather than looping one bit at a time, you should do this as a mask
> operation.
>

I agree.
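
Roughly what I have in mind for the mask version, as a sketch: build one
mask of the flags that F_SETFL can change, clear those bits in the dup's
current flags, and OR in the requested ones.  The names
setfl_settable_mask() and merge_setfl_flags() are hypothetical, and the
#ifdef guards are an assumption to cover hosts that lack some of these
flags.

```c
#include <fcntl.h>

/* Hypothetical helper: all status flags we try to change via F_SETFL. */
static int setfl_settable_mask(void)
{
    int mask = O_APPEND | O_NONBLOCK;

#ifdef O_ASYNC
    mask |= O_ASYNC;
#endif
#ifdef O_DIRECT
    mask |= O_DIRECT;
#endif
#ifdef O_NOATIME
    mask |= O_NOATIME;
#endif
    return mask;
}

/*
 * Hypothetical helper: take the settable bits from requested_flags and
 * keep every other bit of current_flags unchanged.
 */
static int merge_setfl_flags(int current_flags, int requested_flags)
{
    int mask = setfl_settable_mask();

    return (current_flags & ~mask) | (requested_flags & mask);
}
```

The result would then be passed to fcntl(ret, F_SETFL, ...) in place of
the dup_flags value built by the loop.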

>> +
>> +    if (fcntl(ret, F_SETFL, dup_flags) == -1) {
>> +        goto fail;
>> +    }
>> +
>> +    /* Truncate the file in the cases that open() would truncate it */
>> +    if (flags & O_TRUNC ||
>> +            ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
>> +        if (ftruncate(ret, 0) == -1) {
>> +            goto fail;
>> +        }
>> +    }
>> +
>> +    qemu_set_cloexec(ret);
>
> If we're going to blindly set FD_CLOEXEC at the end of the day, rather
> than try to honor O_CLOEXEC, then why not simplify the beginning of this
> function:

This call to qemu_set_cloexec() was a mistake.  I'm planning on removing it.

>
>      ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
>      if (ret == -1 && errno == EINVAL) {
>          ret = dup(fd);
>          if (ret != -1) {
>              qemu_set_cloexec(ret);
>          }
>      }
>      if (ret == -1) {
>          goto fail;
>      }
>

I plan to stick with the existing code at the beginning of this
function, with the modifications mentioned above.

-- 
Regards,
Corey
