All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Corey Bryant <coreyb@linux.vnet.ibm.com>
Cc: aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
	libvir-list@redhat.com, qemu-devel@nongnu.org,
	lcapitulino@redhat.com, eblake@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Date: Wed, 25 Jul 2012 10:22:46 +0200	[thread overview]
Message-ID: <500FACD6.700@redhat.com> (raw)
In-Reply-To: <500F6B04.4020508@linux.vnet.ibm.com>

Am 25.07.2012 05:41, schrieb Corey Bryant:
>>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>>> index a172de3..5d0a801 100644
>>> --- a/block/raw-posix.c
>>> +++ b/block/raw-posix.c
>>> @@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
>>>   out_free_buf:
>>>       qemu_vfree(s->aligned_buf);
>>>   out_close:
>>> -    qemu_close(fd);
>>> +    qemu_close(fd, filename);
>>>       return -errno;
>>>   }
>>
>> Hm, not a nice interface where qemu_close() needs the filename and
>> (worse) could be given a wrong filename. Maybe it would be better to
>> maintain a list of fd -> fdset mappings in qemu_open/close?
>>
> 
> I agree, I don't really like it either.
> 
> We already have a list of fd -> fdset mappings (mon_fdset_fd_t -> 
> mon_fdset_t).  Would it be too costly to loop through all the fdsets/fds 
> at the beginning of every qemu_close()?

I don't think so. qemu_close() is not a fast path and happens almost
never, and the list is short enough that searching it isn't a problem
anyway.

>>> +            switch (flags & O_ACCMODE) {
>>> +            case O_RDWR:
>>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
>>> +                    return mon_fdset_fd->fd;
>>> +                }
>>> +                break;
>>> +            case O_RDONLY:
>>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
>>> +                    return mon_fdset_fd->fd;
>>> +                }
>>> +                break;
>>> +            case O_WRONLY:
>>> +                if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
>>> +                    return mon_fdset_fd->fd;
>>> +                }
>>> +                break;
>>> +            }
>>
>> I think you mean:
>>
>>    if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
>>        return mon_fdset_fd->fd;
>>    }
> 
> Yes, that would be a bit simpler wouldn't it. :)
> 
>>
>> What about other flags that cannot be set with fcntl(), like O_SYNC on
>> older kernels or possibly non-Linux? (The block layer doesn't use it any
>> more, but I think we want to keep the function generally useful)
>>
> 
> I see what you're getting at here.  Basically you could have 2 fds in an 
> fdset with the same access mode flags, but one has O_SYNC on and the 
> other has O_SYNC off.  That seems like it would make sense to implement. 
>   As a work-around, I think a user could just create a separate fdset 
> for the same file with different O_SYNC value.  But from a client 
> perspective, it would be nicer to have this taken care of for you.

Now that the block layer doesn't use O_SYNC any more, it's more of a
theoretical point. I don't think there's any other place, where we'd
need to switch O_SYNC during runtime.

Taking it into consideration is complicated by the fact that some
kernels allow to fcntl() O_SYNC and others don't, so enforcing a match
here wouldn't feel completely right either.

Maybe just leave it as it is. :-)

> 
>>> +        }
>>> +        errno = EACCES;
>>> +        return -1;
>>> +    }
>>> +    errno = ENOENT;
>>> +    return -1;
>>> +}
>>> +
>>>   /* mon_cmds and info_cmds would be sorted at runtime */
>>>   static mon_cmd_t mon_cmds[] = {
>>>   #include "hmp-commands.h"
>>
>>> @@ -75,6 +76,79 @@ int qemu_madvise(void *addr, size_t len, int advice)
>>>   #endif
>>>   }
>>>
>>> +/*
>>> + * Dups an fd and sets the flags
>>> + */
>>> +static int qemu_dup(int fd, int flags)
>>> +{
>>> +    int i;
>>> +    int ret;
>>> +    int serrno;
>>> +    int dup_flags;
>>> +    int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
>>> +                          O_NONBLOCK, 0 };
>>> +
>>> +    if (flags & O_CLOEXEC) {
>>> +        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
>>> +        if (ret == -1 && errno == EINVAL) {
>>> +            ret = dup(fd);
>>> +            if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {
>>> +                goto fail;
>>> +            }
>>> +        }
>>> +    } else {
>>> +        ret = dup(fd);
>>> +    }
>>> +
>>> +    if (ret == -1) {
>>> +        goto fail;
>>> +    }
>>> +
>>> +    dup_flags = fcntl(ret, F_GETFL);
>>> +    if (dup_flags == -1) {
>>> +        goto fail;
>>> +    }
>>> +
>>> +    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
>>> +        errno = EINVAL;
>>> +        goto fail;
>>> +    }
>>
>> It's worth trying to set it before failing, newer kernels can do it. But
>> as I said above, if you can fail here, it makes sense to consider O_SYNC
>> when selecting the right file descriptor from the fdset.
>>
> 
> I'm on a 3.4.4 Fedora kernel that doesn't appear to support 
> fcntl(O_SYNC), but perhaps I'm doing something wrong.  Here's my test 
> code (shortened for simplicty): [...]

Hm, true. So it seems that patch has never made it into the kernel, in
fact...

>>> +
>>> +    qemu_set_cloexec(ret);
>>
>> Wait... If O_CLOEXEC is set, you set the flag immediately and if it
>> isn't you set it at the end of the function? What's the intended use
>> case for not using O_CLOEXEC then?
>>
> 
> This is a mistake.  I think I just need to be using qemu_set_cloexec() 
> instead of fcntl_setfl() earlier in this function and get rid of this 
> latter call to qemu_set_cloexec().

Yes, probably. And in fact, I think this shouldn't even be conditional
on flags & O_CLOEXEC. The whole reason qemu_open() was introduced
originally was to always set O_CLOEXEC.

Kevin

  reply	other threads:[~2012-07-25  8:23 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-23 13:07 [Qemu-devel] [PATCH v5 0/6] file descriptor passing using fd sets Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-07-23 22:50   ` Eric Blake
2012-07-24  2:19     ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets Corey Bryant
2012-07-25 18:16   ` Eric Blake
2012-07-26  2:55     ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 3/6] monitor: Clean up fd sets on monitor disconnect Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 4/6] block: Convert open calls to qemu_open Corey Bryant
2012-07-25 19:22   ` Eric Blake
2012-07-26  3:11     ` Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 5/6] block: Convert close calls to qemu_close Corey Bryant
2012-07-23 13:08 ` [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets Corey Bryant
2012-07-23 13:14   ` Corey Bryant
2012-08-02 22:21     ` Corey Bryant
2012-08-06  9:15       ` Kevin Wolf
2012-08-06 13:32         ` Corey Bryant
2012-08-06 13:51           ` Kevin Wolf
2012-08-06 14:15             ` Corey Bryant
2012-08-07 16:43               ` Corey Bryant
2012-07-24 12:07   ` Kevin Wolf
2012-07-25  3:41     ` Corey Bryant
2012-07-25  8:22       ` Kevin Wolf [this message]
2012-07-25 19:25         ` Eric Blake
2012-07-26  3:21           ` Corey Bryant
2012-07-26 13:13             ` Eric Blake
2012-07-26 13:16               ` Kevin Wolf
2012-07-27  4:07                 ` Corey Bryant
2012-07-25 19:43   ` Eric Blake
2012-07-26  3:57     ` Corey Bryant
2012-07-26  9:07       ` Kevin Wolf
2012-07-27  3:59         ` Corey Bryant
2012-07-27  4:03         ` Corey Bryant
2012-08-02 15:08       ` Corey Bryant
2012-07-24 12:09 ` [Qemu-devel] [PATCH v5 0/6] file descriptor passing using " Kevin Wolf
2012-07-25  3:42   ` Corey Bryant

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=500FACD6.700@redhat.com \
    --to=kwolf@redhat.com \
    --cc=aliguori@us.ibm.com \
    --cc=coreyb@linux.vnet.ibm.com \
    --cc=eblake@redhat.com \
    --cc=lcapitulino@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.