From: Corey Bryant <coreyb@linux.vnet.ibm.com>
To: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
libvir-list@redhat.com, qemu-devel@nongnu.org,
Luiz Capitulino <lcapitulino@redhat.com>,
pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
Date: Mon, 02 Jul 2012 18:02:15 -0400 [thread overview]
Message-ID: <4FF21A67.8010100@linux.vnet.ibm.com> (raw)
In-Reply-To: <4FEA3D9C.8080205@redhat.com>
On 06/26/2012 06:54 PM, Eric Blake wrote:
> On 06/26/2012 04:28 PM, Corey Bryant wrote:
>
>>>>> With this proposed series, we have usage akin to:
>>>>>
>>>>> 1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's
>>>>> view of the FD
>>>>> 2. drive_add file=/dev/fd/N
>>>>> 3. if failure:
>>>>> close_fd "/dev/fd/N"
>>>>
>>>> In fact, there are more steps:
>>>>
>>>> 4. use it successfully
>>>> 5. close_fd "/dev/fd/N"
>>>>
>>>> I think it would well be possible that qemu just closes the fd when it's
>>>> not used internally any more.
>>>
>>> pass-fd could have a flag indicating qemu to do that.
>>>
>>
>> It seems like libvirt would be in a better position to understand when a
>> file is no longer in use, and then it can call close_fd. No? Of course
>> the the only fd that needs to be closed is the originally passed fd. The
>> dup'd fd's are closed by QEMU.
>
> The more we discuss this, the more I'm convinced that commands like
> 'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
> will need to have an optional argument that says the name of the file to
> reopen. That is, I've seen three proposals:
>
> Proposal One: enhance every command to take an fd:name protocol.
> PRO: Successful use of a name removes the fd from the 'getfd' list.
> CON: Lots of commands to change.
> CON: The command sequence is not atomic.
> CON: Block layer does not have a file name tied to the fd, so the reopen
> operation also needs to learn the fd:name protocol, but since we're
> already touching the command it's not much more work.
> USAGE:
> 1. getfd FDSET={M}, ties new fd to "name"
> 2. drive_add fd=name - looks up the fd by name from the list
> 3a. on success, fd is no longer in the list, drive_add consumed it
> 3b. on failure, libvirt must call 'closefd name' to avoid a leak
> 4. getfd FDSET={M2}, ties new fd to "newname"
> 5. block-commit fd=newname - looks up fd by name from the list
> 6a. on success, fd is no longer in the list, block-commit consumed it
> 6b. on failure, libvirt must call 'closefd newname' to avoid a leak
>
> Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then
> update that name to the new fd in advance of any operation that needs to
> do a reopen.
> PRO: All other commands work without impact by using qemu_open(), by
> getting a clone of the permanent name.
> CON: The permanent name must remain open as long as qemu might need to
> reopen it, and reopening needs the pass-fd force option.
> CON: The command sequence is not atomic.
> USAGE:
> 1. pass_fd FDSET={M} -> returns an integer N (or a string "/dev/fd/N")
> showing QEMU's permanent name of the FD
> 2. drive_add file=<permanent name> means that qemu_open() will clone the
> fd, and drive_add is now using yet another FD while N remains open;
> meanwhile, the block layer knows that the drive is tied to the permanent
> name
> 3. pass-fd force FDSET={M2} -> replaces fd N with the passed M2, and
> still returns /dev/fd/N
> 4. block-commit - looks up the block-device name (/dev/fd/N), which maps
> back to the permanent fd, and gets a copy of the newly passed M2
> 5. on completion (success or failure), libvirt must call 'closefd name'
> to avoid a leak
>
> Proposal Three: Use thread-local fds passed alongside any command,
> rather than passing via a separate command
> PRO: All commands can now atomically handle fd passing
> PRO: Commands that only need a one-shot open can now use fd
> CON: Commands that need to reopen still need modification to allow a
> name override during the reopen
> 1. drive_add nfds=1 file="/dev/fdlist/1" FDSET={M} -> on success, the fd
> is used as the block file, on failure it is atomically closed, so there
> is no leak either way. However, the block file has no permanent name.
> 2. block-commit nfds=1 file-override="/dev/fdlist/1" FDSET={M2} -> file
> override must be passed again (since we have no guarantee that the
> /dev/fdlist/1 of _this_ command matches the command-local naming used in
> the previous command), but the fd is again used atomically
>
> Under proposal 3, there is no need to dup fd's, merely only to check
> that qemu_open("/dev/fdlist/n", flags) has compatible flags with the fd
> received via FDSET. And the fact that things are made atomic means that
> libvirt no longer has to worry about calling closefd, nor does it have
> to worry about being interrupted between two monitor commands and not
> knowing what state qemu is in. But it does mean teaching every possible
> command that wants to do a reopen to learn how to use fd overrides
> rather than to reuse the permanent name that was originally in place on
> the first open.
>
Here's another option that Kevin and I discussed today on IRC. I've
modified a few minor details since the discussion. And Kevin please
correct me if anything is wrong.
Proposal Four: Pass a set of fds via 'pass-fds'. The group of fds
should all refer to the same file, but may have different access flags
(ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the
matching access mode flags.
pass-fds:
{ 'command': 'pass-fds',
'data': {'fdsetname': 'str', '*close': 'bool'},
'returns': 'string' }
close-fds:
{ 'command': 'close-fds',
'data': {'fdsetname': 'str' }
where:
@fdsetname - the file descriptor set name
@close - *if true QEMU closes the monitor fds when they expire.
if false, fds remain on the monitor until close-fds
command.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage if close==true specified
CON: Determining when to close fds when close==true may be tricky
USAGE:
1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to
the passed set of fds.
2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
list that has access flags matching the qemu_open() action flags.
3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first
fd from the list that has access flags matching the qemu_open() action
flags.
4. The monitor fds are closed:
A) *If @close == true, qemu closes the set of fds when the timer
expires
B) If @close == false, libvirt must issue close-fds command to close
the set of fds
*How to solve this has yet to be determined. Kevin mentioned
potentially using a bottom-half or a timer in qemu_close().
--
Regards,
Corey
next prev parent reply other threads:[~2012-07-02 22:02 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-06-22 19:31 ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant
2012-07-11 18:51 ` Luiz Capitulino
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant
2012-06-22 20:24 ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant
2012-06-22 19:58 ` Eric Blake
[not found] ` <20120626091004.GA14451@redhat.com>
[not found] ` <4FE9A0F0.2050809@redhat.com>
[not found] ` <20120626175045.2c7011b3@doriath.home>
[not found] ` <4FEA37A9.10707@linux.vnet.ibm.com>
[not found] ` <4FEA3D9C.8080205@redhat.com>
2012-07-02 22:02 ` Corey Bryant [this message]
2012-07-02 22:31 ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Eric Blake
2012-07-03 9:07 ` Daniel P. Berrange
2012-07-03 9:40 ` Kevin Wolf
2012-07-03 13:42 ` Corey Bryant
2012-07-03 15:40 ` Corey Bryant
2012-07-03 15:59 ` Kevin Wolf
2012-07-03 16:25 ` Corey Bryant
2012-07-03 17:03 ` Eric Blake
2012-07-03 17:46 ` Corey Bryant
2012-07-03 18:00 ` Eric Blake
2012-07-03 18:21 ` Corey Bryant
2012-07-04 8:09 ` Kevin Wolf
2012-07-05 15:06 ` Corey Bryant
2012-07-09 14:05 ` Luiz Capitulino
2012-07-09 15:05 ` Corey Bryant
2012-07-09 15:46 ` Kevin Wolf
2012-07-09 16:18 ` Luiz Capitulino
2012-07-09 17:59 ` Corey Bryant
2012-07-09 17:35 ` Corey Bryant
2012-07-09 17:48 ` Luiz Capitulino
2012-07-09 18:02 ` Corey Bryant
2012-07-10 7:53 ` Kevin Wolf
2012-07-09 18:20 ` Corey Bryant
2012-07-04 8:00 ` Kevin Wolf
2012-07-05 14:22 ` Corey Bryant
2012-07-05 14:51 ` Kevin Wolf
2012-07-05 16:35 ` Corey Bryant
2012-07-05 16:37 ` Corey Bryant
2012-07-06 9:06 ` Kevin Wolf
2012-07-05 17:00 ` Eric Blake
2012-07-05 17:36 ` Corey Bryant
2012-07-06 9:11 ` Kevin Wolf
2012-07-06 17:14 ` Corey Bryant
2012-07-06 17:15 ` Corey Bryant
2012-07-06 17:40 ` Corey Bryant
2012-07-06 18:19 ` [Qemu-devel] [libvirt] " Corey Bryant
2012-07-09 14:04 ` [Qemu-devel] " Kevin Wolf
2012-07-09 15:23 ` Corey Bryant
2012-07-09 15:30 ` Kevin Wolf
2012-07-09 18:40 ` Anthony Liguori
2012-07-09 19:00 ` Luiz Capitulino
2012-07-10 8:54 ` Daniel P. Berrange
2012-07-10 7:58 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FF21A67.8010100@linux.vnet.ibm.com \
--to=coreyb@linux.vnet.ibm.com \
--cc=aliguori@us.ibm.com \
--cc=eblake@redhat.com \
--cc=kwolf@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=libvir-list@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).