From: Corey Bryant <coreyb@linux.vnet.ibm.com>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
libvir-list@redhat.com, qemu-devel@nongnu.org,
pbonzini@redhat.com, Eric Blake <eblake@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
Date: Mon, 09 Jul 2012 11:05:24 -0400 [thread overview]
Message-ID: <4FFAF334.9000807@linux.vnet.ibm.com> (raw)
In-Reply-To: <20120709110510.12214347@doriath.home>
On 07/09/2012 10:05 AM, Luiz Capitulino wrote:
> On Thu, 05 Jul 2012 11:06:56 -0400
> Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>
>>
>>
>> On 07/04/2012 04:09 AM, Kevin Wolf wrote:
>>> Am 03.07.2012 20:21, schrieb Corey Bryant:
>>>> On 07/03/2012 02:00 PM, Eric Blake wrote:
>>>>> On 07/03/2012 11:46 AM, Corey Bryant wrote:
>>>>>
>>>>>>
>>>>>> Yes, I think adding a +1 to the refcount for the monitor makes sense.
>>>>>>
>>>>>> I'm a bit unsure how to increment the refcount when a monitor reconnects
>>>>>> though. Maybe it is as simple as adding a +1 to each fd's refcount when
>>>>>> the next QMP monitor connects.
>>>>>
>>>>> Or maybe delay a +1 until after a 'query-fds' - it is not until the
>>>>> monitor has reconnected and learned what fds it should be aware of that
>>>>> incrementing the refcount again makes sense. But that would mean making
>>>>> 'query-fds' track whether this is the first call since the monitor
>>>>> reconnected, as it shouldn't normally increase refcounts.
>>>>
>>>> This doesn't sound ideal.
>>>
>>> Yes, it's less than ideal.
>>>
>>>>> The other alternative is that the monitor never re-increments a
>>>>> refcount. Once a monitor disconnects, that fd is lost to the monitor,
>>>>> and a reconnected monitor must pass in a new fd to be re-associated with
>>>>> the fdset. In other words, the monitor's use of an fd is a one-way
>>>>> operation, starting life in use but ending at the first disconnect or
>>>>> remove-fd.
>>>>
>>>> I would vote for this 2nd alternative. As long as we're not introducing
>>>> an fd leak. And I don't think we are if we decrement the refcount on
>>>> remove-fd or on QMP disconnect.
>>>
>>> In fact, I believe this one is even worse. I can already see hacks like
>>> adding a dummy FD with invalid flags and removing it again just to
>>> regain control over the fdset...
>>>
>>> You earlier suggestion made a lot of sense to me: Whenever a new QMP
>>> monitor is connected, increase the refcount. That is, as long as any
>>> monitor is there, don't drop any fdsets unless explicitly requested via QMP.
>>
>> Ok. So refcount would be incremented (for the fd or fdset, whatever we
>> decide on) when QMP reconnects. I'm assuming we wouldn't wait until
>> after a query-fds call.
>
> I'm not sure this is a good idea because we will leak fds if the client forgets
> about the fds when re-connecting (ie. it was restarted) or if a different
> client connects to QMP.
>
> If we really want to do that, I think that the right way of doing this is to
> add a command for clients to re-again ownership of the fds on reconnection.
>
> But to be honest, I don't fully understand why this is needed.
>
I'm not sure this is an issue with current design. I know things have
changed a bit as the email threads evolved, so I'll paste the current
design that I am working from. Please let me know if you still see any
issues.
FD passing:
-----------
New monitor commands enable adding/removing an fd to/from a set. New
monitor command query-fdsets enables querying of current monitor fdsets.
The set of fds should all refer to the same file, with each fd having
different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup
the fd that has the matching access mode flags.
Design points:
--------------
1. add-fd
-> fd is passed via SCM rights and qemu adds fd to first unused fdset
(e.g. /dev/fdset/1)
-> add-fd monitor function initializes the monitor inuse flag for the
fdset to true
-> add-fd monitor function initializes the remove flag for the fd to false
-> add-fd returns fdset number and received fd number (e.g fd=3) to caller
2. drive_add file=/dev/fdset/1
-> qemu_open uses the first fd in fdset1 that has access flags matching
the qemu_open action flags and has remove flag set to false
-> qemu_open increments refcount for the fdset
-> Need to make sure that if a command like 'device-add' fails that
refcount is not incremented
3. add-fd fdset=1
-> fd is passed via SCM rights
-> add-fd monitor function adds the received fd to the specified fdset
(or fails if fdset doesn't exist)
-> add-fd monitor function initializes the remove flag for the fd to false
-> add-fd returns fdset number and received fd number (e.g fd=4) to caller
4. block-commit
-> qemu_open performs "reopen" by using the first fd from the fdset that
has access flags matching the qemu_open action flags and has remove flag
set to false
-> qemu_open increments refcount for the fdset
-> Need to make sure that if a command like 'block-commit' fails that
refcount is not incremented
5. remove-fd fdset=1 fd=4
-> remove-fd monitor function fails if fdset doesn't exist
-> remove-fd monitor function turns on remove flag for fd=4
6. qemu_close (need to replace all close calls in block layer with
qemu_close)
-> qemu_close decrements refcount for fdset
-> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove))
-> qemu_close frees the fdset if no fds remain in it
7. disconnecting the QMP monitor
-> monitor disconnect visits all fdsets on monitor and turns off monitor
in-use flag for fdset
8. connecting the QMP monitor
-> monitor connect visits all fdsets on monitor and turns on monitor
in-use flag for fdset
9. query-fdsets
-> returns all fdsets and fds that don't have remove flag on
QMP command examples
--------------------
-> { "execute": "add-fd", "arguments": { "fdset": 1 } }
<- { "return": { "fdset": 1, "fd": 3 } }
-> { "execute": "remove-fd", "arguments": { "fdset": 1, "fd": 3 } }
<- { "return": {} }
-> { "execute":"query-fdsets" } =>
<- { "return" : { "fdsets": [
{ "name": "fdset1",
"fds": [ { "fd": 4, "removed": false } ],
"refcount": 1,
"monitor": true },
{ "name": "fdset2",
"fds": [ { "fd": 5, "removed": false },
{ "fd": 6, "removed": true } ],
"refcount": 1,
"monitor": true } }
--
Regards,
Corey
next prev parent reply other threads:[~2012-07-09 15:06 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-06-22 19:31 ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant
2012-07-11 18:51 ` Luiz Capitulino
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant
2012-06-22 20:24 ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant
2012-06-22 19:58 ` Eric Blake
[not found] ` <20120626091004.GA14451@redhat.com>
[not found] ` <4FE9A0F0.2050809@redhat.com>
[not found] ` <20120626175045.2c7011b3@doriath.home>
[not found] ` <4FEA37A9.10707@linux.vnet.ibm.com>
[not found] ` <4FEA3D9C.8080205@redhat.com>
2012-07-02 22:02 ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
2012-07-02 22:31 ` Eric Blake
2012-07-03 9:07 ` Daniel P. Berrange
2012-07-03 9:40 ` Kevin Wolf
2012-07-03 13:42 ` Corey Bryant
2012-07-03 15:40 ` Corey Bryant
2012-07-03 15:59 ` Kevin Wolf
2012-07-03 16:25 ` Corey Bryant
2012-07-03 17:03 ` Eric Blake
2012-07-03 17:46 ` Corey Bryant
2012-07-03 18:00 ` Eric Blake
2012-07-03 18:21 ` Corey Bryant
2012-07-04 8:09 ` Kevin Wolf
2012-07-05 15:06 ` Corey Bryant
2012-07-09 14:05 ` Luiz Capitulino
2012-07-09 15:05 ` Corey Bryant [this message]
2012-07-09 15:46 ` Kevin Wolf
2012-07-09 16:18 ` Luiz Capitulino
2012-07-09 17:59 ` Corey Bryant
2012-07-09 17:35 ` Corey Bryant
2012-07-09 17:48 ` Luiz Capitulino
2012-07-09 18:02 ` Corey Bryant
2012-07-10 7:53 ` Kevin Wolf
2012-07-09 18:20 ` Corey Bryant
2012-07-04 8:00 ` Kevin Wolf
2012-07-05 14:22 ` Corey Bryant
2012-07-05 14:51 ` Kevin Wolf
2012-07-05 16:35 ` Corey Bryant
2012-07-05 16:37 ` Corey Bryant
2012-07-06 9:06 ` Kevin Wolf
2012-07-05 17:00 ` Eric Blake
2012-07-05 17:36 ` Corey Bryant
2012-07-06 9:11 ` Kevin Wolf
2012-07-06 17:14 ` Corey Bryant
2012-07-06 17:15 ` Corey Bryant
2012-07-06 17:40 ` Corey Bryant
2012-07-06 18:19 ` [Qemu-devel] [libvirt] " Corey Bryant
2012-07-09 14:04 ` [Qemu-devel] " Kevin Wolf
2012-07-09 15:23 ` Corey Bryant
2012-07-09 15:30 ` Kevin Wolf
2012-07-09 18:40 ` Anthony Liguori
2012-07-09 19:00 ` Luiz Capitulino
2012-07-10 8:54 ` Daniel P. Berrange
2012-07-10 7:58 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FFAF334.9000807@linux.vnet.ibm.com \
--to=coreyb@linux.vnet.ibm.com \
--cc=aliguori@us.ibm.com \
--cc=eblake@redhat.com \
--cc=kwolf@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=libvir-list@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).