All of lore.kernel.org
 help / color / mirror / Atom feed
From: Corey Bryant <coreyb@linux.vnet.ibm.com>
To: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
	libvir-list@redhat.com, qemu-devel@nongnu.org,
	Luiz Capitulino <lcapitulino@redhat.com>,
	pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using	pass-fd
Date: Mon, 02 Jul 2012 18:02:15 -0400	[thread overview]
Message-ID: <4FF21A67.8010100@linux.vnet.ibm.com> (raw)
In-Reply-To: <4FEA3D9C.8080205@redhat.com>



On 06/26/2012 06:54 PM, Eric Blake wrote:
> On 06/26/2012 04:28 PM, Corey Bryant wrote:
>
>>>>> With this proposed series, we have usage akin to:
>>>>>
>>>>>     1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's
>>>>>        view of the FD
>>>>>     2. drive_add file=/dev/fd/N
>>>>>     3. if failure:
>>>>>          close_fd "/dev/fd/N"
>>>>
>>>> In fact, there are more steps:
>>>>
>>>> 4. use it successfully
>>>> 5. close_fd "/dev/fd/N"
>>>>
>>>> I think it would well be possible that qemu just closes the fd when it's
>>>> not used internally any more.
>>>
>>> pass-fd could have a flag indicating qemu to do that.
>>>
>>
>> It seems like libvirt would be in a better position to understand when a
>> file is no longer in use, and then it can call close_fd.  No?  Of course
>> the the only fd that needs to be closed is the originally passed fd. The
>> dup'd fd's are closed by QEMU.
>
> The more we discuss this, the more I'm convinced that commands like
> 'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
> will need to have an optional argument that says the name of the file to
> reopen.  That is, I've seen three proposals:
>
> Proposal One: enhance every command to take an fd:name protocol.
> PRO: Successful use of a name removes the fd from the 'getfd' list.
> CON: Lots of commands to change.
> CON: The command sequence is not atomic.
> CON: Block layer does not have a file name tied to the fd, so the reopen
> operation also needs to learn the fd:name protocol, but since we're
> already touching the command it's not much more work.
> USAGE:
> 1. getfd FDSET={M}, ties new fd to "name"
> 2. drive_add fd=name - looks up the fd by name from the list
> 3a. on success, fd is no longer in the list, drive_add consumed it
> 3b. on failure, libvirt must call 'closefd name' to avoid a leak
> 4. getfd FDSET={M2}, ties new fd to "newname"
> 5. block-commit fd=newname - looks up fd by name from the list
> 6a. on success, fd is no longer in the list, block-commit consumed it
> 6b. on failure, libvirt must call 'closefd newname' to avoid a leak
>
> Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then
> update that name to the new fd in advance of any operation that needs to
> do a reopen.
> PRO: All other commands work without impact by using qemu_open(), by
> getting a clone of the permanent name.
> CON: The permanent name must remain open as long as qemu might need to
> reopen it, and reopening needs the pass-fd force option.
> CON: The command sequence is not atomic.
> USAGE:
> 1. pass_fd FDSET={M} -> returns an integer N (or a string "/dev/fd/N")
> showing QEMU's permanent name of the FD
> 2. drive_add file=<permanent name> means that qemu_open() will clone the
> fd, and drive_add is now using yet another FD while N remains open;
> meanwhile, the block layer knows that the drive is tied to the permanent
> name
> 3. pass-fd force FDSET={M2} -> replaces fd N with the passed M2, and
> still returns /dev/fd/N
> 4. block-commit - looks up the block-device name (/dev/fd/N), which maps
> back to the permanent fd, and gets a copy of the newly passed M2
> 5. on completion (success or failure), libvirt must call 'closefd name'
> to avoid a leak
>
> Proposal Three: Use thread-local fds passed alongside any command,
> rather than passing via a separate command
> PRO: All commands can now atomically handle fd passing
> PRO: Commands that only need a one-shot open can now use fd
> CON: Commands that need to reopen still need modification to allow a
> name override during the reopen
> 1. drive_add nfds=1 file="/dev/fdlist/1" FDSET={M} -> on success, the fd
> is used as the block file, on failure it is atomically closed, so there
> is no leak either way. However, the block file has no permanent name.
> 2. block-commit nfds=1 file-override="/dev/fdlist/1" FDSET={M2} -> file
> override must be passed again (since we have no guarantee that the
> /dev/fdlist/1 of _this_ command matches the command-local naming used in
> the previous command), but the fd is again used atomically
>
> Under proposal 3, there is no need to dup fd's, merely only to check
> that qemu_open("/dev/fdlist/n", flags) has compatible flags with the fd
> received via FDSET.  And the fact that things are made atomic means that
> libvirt no longer has to worry about calling closefd, nor does it have
> to worry about being interrupted between two monitor commands and not
> knowing what state qemu is in.  But it does mean teaching every possible
> command that wants to do a reopen to learn how to use fd overrides
> rather than to reuse the permanent name that was originally in place on
> the first open.
>

Here's another option that Kevin and I discussed today on IRC.  I've 
modified a few minor details since the discussion. And Kevin please 
correct me if anything is wrong.

Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds 
should all refer to the same file, but may have different access flags 
(ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the 
matching access mode flags.
pass-fds:
     { 'command': 'pass-fds',
       'data': {'fdsetname': 'str', '*close': 'bool'},
       'returns': 'string' }
close-fds:
     { 'command': 'close-fds',
       'data': {'fdsetname': 'str' }
where:
      @fdsetname - the file descriptor set name
      @close - *if true QEMU closes the monitor fds when they expire.
               if false, fds remain on the monitor until close-fds
               command.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage if close==true specified
CON: Determining when to close fds when close==true may be tricky
USAGE:
1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to
the passed set of fds.
2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
list that has access flags matching the qemu_open() action flags.
3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first
fd from the list that has access flags matching the qemu_open() action 
flags.
4. The monitor fds are closed:
    A) *If @close == true, qemu closes the set of fds when the timer
       expires
    B) If @close == false, libvirt must issue close-fds command to close
       the set of fds

*How to solve this has yet to be determined.  Kevin mentioned 
potentially using a bottom-half or a timer in qemu_close().

-- 
Regards,
Corey

  parent reply	other threads:[~2012-07-02 22:02 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-06-22 19:31   ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant
2012-07-11 18:51   ` Luiz Capitulino
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant
2012-06-22 20:24   ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant
2012-06-22 19:58   ` Eric Blake
     [not found] ` <20120626091004.GA14451@redhat.com>
     [not found]   ` <4FE9A0F0.2050809@redhat.com>
     [not found]     ` <20120626175045.2c7011b3@doriath.home>
     [not found]       ` <4FEA37A9.10707@linux.vnet.ibm.com>
     [not found]         ` <4FEA3D9C.8080205@redhat.com>
2012-07-02 22:02           ` Corey Bryant [this message]
2012-07-02 22:31             ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Eric Blake
2012-07-03  9:07               ` Daniel P. Berrange
2012-07-03  9:40               ` Kevin Wolf
2012-07-03 13:42               ` Corey Bryant
2012-07-03 15:40             ` Corey Bryant
2012-07-03 15:59               ` Kevin Wolf
2012-07-03 16:25                 ` Corey Bryant
2012-07-03 17:03                   ` Eric Blake
2012-07-03 17:46                     ` Corey Bryant
2012-07-03 18:00                       ` Eric Blake
2012-07-03 18:21                         ` Corey Bryant
2012-07-04  8:09                           ` Kevin Wolf
2012-07-05 15:06                             ` Corey Bryant
2012-07-09 14:05                               ` Luiz Capitulino
2012-07-09 15:05                                 ` Corey Bryant
2012-07-09 15:46                                   ` Kevin Wolf
2012-07-09 16:18                                     ` Luiz Capitulino
2012-07-09 17:59                                       ` Corey Bryant
2012-07-09 17:35                                     ` Corey Bryant
2012-07-09 17:48                                       ` Luiz Capitulino
2012-07-09 18:02                                         ` Corey Bryant
2012-07-10  7:53                                       ` Kevin Wolf
2012-07-09 18:20                                   ` Corey Bryant
2012-07-04  8:00                     ` Kevin Wolf
2012-07-05 14:22                       ` Corey Bryant
2012-07-05 14:51                         ` Kevin Wolf
2012-07-05 16:35                           ` Corey Bryant
2012-07-05 16:37                             ` Corey Bryant
2012-07-06  9:06                               ` Kevin Wolf
2012-07-05 17:00                             ` Eric Blake
2012-07-05 17:36                               ` Corey Bryant
2012-07-06  9:11                               ` Kevin Wolf
2012-07-06 17:14                                 ` Corey Bryant
2012-07-06 17:15                                   ` Corey Bryant
2012-07-06 17:40                                 ` Corey Bryant
2012-07-06 18:19                                   ` [Qemu-devel] [libvirt] " Corey Bryant
2012-07-09 14:04                                   ` [Qemu-devel] " Kevin Wolf
2012-07-09 15:23                                     ` Corey Bryant
2012-07-09 15:30                                       ` Kevin Wolf
2012-07-09 18:40   ` Anthony Liguori
2012-07-09 19:00     ` Luiz Capitulino
2012-07-10  8:54       ` Daniel P. Berrange
2012-07-10  7:58     ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FF21A67.8010100@linux.vnet.ibm.com \
    --to=coreyb@linux.vnet.ibm.com \
    --cc=aliguori@us.ibm.com \
    --cc=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=lcapitulino@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.