From: Corey Bryant <coreyb@linux.vnet.ibm.com>
To: Eric Blake <eblake@redhat.com>
Cc: kwolf@redhat.com, aliguori@us.ibm.com,
stefanha@linux.vnet.ibm.com, libvir-list@redhat.com,
qemu-devel@nongnu.org, lcapitulino@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Date: Wed, 25 Jul 2012 23:57:37 -0400
Message-ID: <5010C031.9040602@linux.vnet.ibm.com>
In-Reply-To: <50104C44.1050206@redhat.com>
On 07/25/2012 03:43 PM, Eric Blake wrote:
> On 07/23/2012 07:08 AM, Corey Bryant wrote:
>> When qemu_open is passed a filename of the "/dev/fdset/nnn"
>> format (where nnn is the fdset ID), an fd with matching access
>> mode flags will be searched for within the specified monitor
>> fd set. If the fd is found, a dup of the fd will be returned
>> from qemu_open.
>>
>> Each fd set has a reference count. The purpose of the reference
>> count is to determine if an fd set contains file descriptors that
>> have open dup() references that have not yet been closed. It is
>> incremented on qemu_open and decremented on qemu_close. It is
>> not until the refcount is zero that file descriptors in an fd set
>> can be closed. If an fd set has dup() references open, then we
>> must keep the other fds in the fd set open in case a reopen
>> of the file occurs that requires an fd with a different access
>> mode.
>>
>
>> +++ b/monitor.c
>> @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use)
>>      }
>>  }
>>
>> +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
>> +{
>> +    mon_fdset_t *mon_fdset;
>> +
>> +    if (!mon) {
>> +        return;
>> +    }
>
> Am I reading this code right by stating that 'if there is no monitor, we
> don't increment the refcount'? How does a monitor reattach affect
> things? Or am I missing something fundamental about the cases when
> 'mon==NULL' will exist?
>
Yes, you're reading this correctly.
I'm pretty sure that mon will only be NULL if QEMU is started without a
monitor.
If QEMU has a monitor, and libvirt disconnects its connection to the
QEMU monitor, then I believe mon will remain non-NULL.
I'll plan on testing this out to verify, though. (I'm out most of this
week and will be back full time starting next Tues.)
>> +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
>> +{
>> +    mon_fdset_t *mon_fdset;
>> +    mon_fdset_fd_t *mon_fdset_fd;
>> +    int mon_fd_flags;
>> +
>> +    if (!mon) {
>> +        errno = ENOENT;
>> +        return -1;
>> +    }
>> +
>> +    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
>> +        if (mon_fdset->id != fdset_id) {
>> +            continue;
>> +        }
>> +        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
>> +            if (mon_fdset_fd->removed) {
>> +                continue;
>> +            }
>> +
>> +            mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
>> +            if (mon_fd_flags == -1) {
>> +                return -1;
>
> This says we fail on the first fcntl() failure, instead of trying other
> fds in the set. Granted, an fcntl() failure is probably the sign of a
> bigger bug (such as closing an fd at the wrong point in time), so I
> guess trying to go on doesn't make much sense once we already know we
> are hosed.
>
I think I'll stick with it the way it is. If fcntl() fails, we might
have a tainted fd set, so I think we should fail.
>> +            }
>> +
>> +            switch (flags & O_ACCMODE) {
>> +            case O_RDWR:
>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
>> +                    return mon_fdset_fd->fd;
>> +                }
>> +                break;
>> +            case O_RDONLY:
>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
>> +                    return mon_fdset_fd->fd;
>> +                }
>> +                break;
>
> Do we want to allow the case where the caller asked for O_RDONLY, but
> the set only has O_RDWR? After all, the caller is getting a compatible
> subset of what the set offers.
>
I don't see a problem with it.
>> +            case O_WRONLY:
>> +                if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
>> +                    return mon_fdset_fd->fd;
>> +                }
>> +                break;
>
> Likewise, should we allow a caller asking for O_WRONLY when the set
> provides only O_RDWR?
>
I don't see a problem with it.
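
For illustration only, the relaxed matching Eric describes could be
expressed roughly as below. This is a minimal sketch, not code from the
patch: access_mode_compatible() is a hypothetical helper, and it assumes
that an O_RDWR fd is acceptable for an O_RDONLY or O_WRONLY request.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): returns true if an fd
 * whose F_GETFL flags are fd_flags can satisfy a request made with
 * req_flags.
 */
static bool access_mode_compatible(int req_flags, int fd_flags)
{
    int req = req_flags & O_ACCMODE;
    int have = fd_flags & O_ACCMODE;

    /*
     * An exact match always works. An O_RDWR fd also covers requests
     * for O_RDONLY or O_WRONLY, since it offers a superset of access.
     */
    return have == req || have == O_RDWR;
}
```

A caller that prefers exact matches could still scan the fd set for one
first and only fall back to an O_RDWR fd afterwards.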
>>
>> +/*
>> + * Dups an fd and sets the flags
>> + */
>> +static int qemu_dup(int fd, int flags)
>> +{
>> +    int i;
>> +    int ret;
>> +    int serrno;
>> +    int dup_flags;
>> +    int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
>> +                          O_NONBLOCK, 0 };
>> +
>> +    if (flags & O_CLOEXEC) {
>> +        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
>
> F_DUPFD_CLOEXEC is required by POSIX but not implemented on all modern
> OS yet; you probably need some #ifdef and/or configure guards.
>
Ok
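
For illustration, one possible shape for such a guard is sketched below.
It is not the code that will go into the next version: dup_cloexec() is
a hypothetical name, and the sketch assumes a plain #ifdef on
F_DUPFD_CLOEXEC is enough (a configure check would be the alternative).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): duplicate fd so that the
 * new descriptor has FD_CLOEXEC set, using F_DUPFD_CLOEXEC when the
 * platform headers provide it and falling back to dup() + F_SETFD.
 */
static int dup_cloexec(int fd)
{
    int ret;
    int fdflags;
    int serrno;

#ifdef F_DUPFD_CLOEXEC
    ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
    if (ret != -1 || errno != EINVAL) {
        /* Success, or a failure other than "command not supported" */
        return ret;
    }
#endif

    ret = dup(fd);
    if (ret == -1) {
        return -1;
    }

    /* FD_CLOEXEC is a descriptor flag, so it is set with F_SETFD */
    fdflags = fcntl(ret, F_GETFD);
    if (fdflags == -1 || fcntl(ret, F_SETFD, fdflags | FD_CLOEXEC) == -1) {
        serrno = errno;
        close(ret);
        errno = serrno;
        return -1;
    }
    return ret;
}
```

Note that the fallback path is not atomic: another thread could fork and
exec between dup() and F_SETFD.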
>> +        if (ret == -1 && errno == EINVAL) {
>> +            ret = dup(fd);
>> +            if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {
>
> You _can't_ call F_SETFL with O_CLOEXEC. O_CLOEXEC only causes open()
> to set FD_CLOEXEC; thereafter, including in the case of this dup, what
> you want to do is instead set FD_CLOEXEC via F_SETFD (aka call
> qemu_set_cloexec, not fcntl_setfl).
>
I know, this is a mistake. I'm planning to replace fcntl_setfl() here
with qemu_set_cloexec().
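
To make the distinction concrete, here is a minimal sketch of setting
the close-on-exec flag with F_GETFD/F_SETFD. set_cloexec() is a
hypothetical stand-in used only for illustration; it is not claimed to
match the body of qemu_set_cloexec().

```c
#include <fcntl.h>

/*
 * Hypothetical stand-in for illustration: FD_CLOEXEC is a file
 * descriptor flag, so it is read and written with F_GETFD/F_SETFD.
 * F_SETFL only changes file status flags such as O_APPEND or
 * O_NONBLOCK and cannot set close-on-exec.
 */
static int set_cloexec(int fd)
{
    int fdflags = fcntl(fd, F_GETFD);

    if (fdflags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC);
}
```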
>> +                goto fail;
>> +            }
>> +        }
>> +    } else {
>> +        ret = dup(fd);
>> +    }
>> +
>> +    if (ret == -1) {
>> +        goto fail;
>> +    }
>> +
>> +    dup_flags = fcntl(ret, F_GETFL);
>> +    if (dup_flags == -1) {
>> +        goto fail;
>> +    }
>> +
>> +    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
>> +        errno = EINVAL;
>> +        goto fail;
>> +    }
>> +
>> +    /* Set/unset flags that we can with fcntl */
>> +    i = 0;
>> +    while (setfl_flags[i] != 0) {
>> +        if (flags & setfl_flags[i]) {
>> +            dup_flags = (dup_flags | setfl_flags[i]);
>> +        } else {
>> +            dup_flags = (dup_flags & ~setfl_flags[i]);
>> +        }
>> +        i++;
>> +    }
>
> Rather than looping one bit at a time, you should do this as a mask
> operation.
>
I agree.
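
As a rough sketch of the mask approach (illustrative only, with
hypothetical helper names that are not in the patch): build one mask of
the F_SETFL-settable flags, clear those bits in dup_flags, then copy in
the requested ones. The #ifdef guards are an assumption added here so
the sketch builds where some of the flags are not defined.

```c
#include <fcntl.h>

/* Hypothetical helper: the status flags we try to change via F_SETFL */
static int setfl_mask(void)
{
    int mask = O_APPEND | O_NONBLOCK;

#ifdef O_ASYNC
    mask |= O_ASYNC;
#endif
#ifdef O_DIRECT
    mask |= O_DIRECT;
#endif
#ifdef O_NOATIME
    mask |= O_NOATIME;
#endif
    return mask;
}

/*
 * Hypothetical helper: returns dup_flags with every bit in the mask
 * replaced by the corresponding bit from flags; all other bits of
 * dup_flags are left untouched.
 */
static int apply_setfl_flags(int dup_flags, int flags)
{
    int mask = setfl_mask();

    return (dup_flags & ~mask) | (flags & mask);
}
```

The result would then be passed to fcntl(ret, F_SETFL, ...) exactly as
the loop-based version does.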
>> +
>> +    if (fcntl(ret, F_SETFL, dup_flags) == -1) {
>> +        goto fail;
>> +    }
>> +
>> +    /* Truncate the file in the cases that open() would truncate it */
>> +    if (flags & O_TRUNC ||
>> +            ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
>> +        if (ftruncate(ret, 0) == -1) {
>> +            goto fail;
>> +        }
>> +    }
>> +
>> +    qemu_set_cloexec(ret);
>
> If we're going to blindly set FD_CLOEXEC at the end of the day, rather
> than try to honor O_CLOEXEC, then why not simplify the beginning of this
> function:
This call to qemu_set_cloexec() was a mistake. I'm planning on removing it.
>
>     ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
>     if (ret == -1 && errno == EINVAL) {
>         ret = dup(fd);
>         if (ret != -1) {
>             qemu_set_cloexec(ret);
>         }
>     }
>     if (ret == -1) {
>         goto fail;
>     }
>
I'll plan on sticking with the existing code in the beginning of this
function with the modifications mentioned above.
--
Regards,
Corey