From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org,
Blue Swirl <blauwirbel@gmail.com>,
ceph-devel@vger.kernel.org, Christian Brunner <chb@muc.de>,
MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Subject: Re: [Qemu-devel] [RFC PATCH 1/1] ceph/rbd block driver for qemu-kvm
Date: Mon, 24 May 2010 14:16:32 -0500
Message-ID: <4BFAD090.3000203@codemonkey.ws>
In-Reply-To: <4BFA696D.2060606@redhat.com>

On 05/24/2010 06:56 AM, Avi Kivity wrote:
> On 05/24/2010 02:42 PM, MORITA Kazutaka wrote:
>>
>>> The server would be local and talk over a unix domain socket, perhaps
>>> anonymous.
>>>
>>> nbd has other issues though, such as requiring a copy and no support
>>> for metadata operations such as snapshot and file size extension.
>>>
>> Sorry, my explanation was unclear. I'm not sure how running servers
>> on localhost can solve the problem.
>
> The local server can convert from the local (nbd) protocol to the
> remote (sheepdog, ceph) protocol.
>
>> What I wanted to say was that we cannot specify the image of VM. With
>> nbd protocol, command line arguments are as follows:
>>
>> $ qemu nbd:hostname:port
>>
>> As this syntax shows, with nbd protocol the client cannot pass the VM
>> image name to the server.
>
> We would extend it to allow it to connect to a unix domain socket:
>
> qemu nbd:unix:/path/to/socket

nbd is a no-go because it only supports a single, synchronous I/O
operation at a time and has no mechanism for extensibility.

If we go this route, I think two options are worth considering. The
first would be a purely socket-based approach where we just accepted the
extra copy.
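
As a rough illustration of the socket-based option, here is a minimal sketch
in Python. Everything in it is an assumption made for illustration: the wire
format (`REQ`, `RSP`), the opcodes, and the helper names are hypothetical and
are neither nbd nor anything proposed in this thread. Each request carries a
tag so that replies could later be matched out of order, and the data crosses
the socket by value, which is the extra copy being accepted.

```python
import socket
import struct
import threading

# Hypothetical wire format (an assumption, not nbd and not from this thread):
#   request  = op (u8: 0=read, 1=write), tag (u32), offset (u64), length (u32)
#   response = tag (u32), status (i32), then `length` data bytes for a good read
REQ = struct.Struct("!BIQI")
RSP = struct.Struct("!Ii")
OP_READ, OP_WRITE = 0, 1


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes the connection."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return bytes(buf)


def serve(sock, image):
    """Local server loop. It serves a bytearray here; a real server would
    translate each request to the remote (sheepdog, ceph) protocol."""
    try:
        while True:
            op, tag, offset, length = REQ.unpack(recv_exact(sock, REQ.size))
            payload = recv_exact(sock, length) if op == OP_WRITE else b""
            if offset + length > len(image):
                sock.sendall(RSP.pack(tag, -1))  # out of range
            elif op == OP_WRITE:
                image[offset:offset + length] = payload  # copy into the image
                sock.sendall(RSP.pack(tag, 0))
            else:
                data = bytes(image[offset:offset + length])  # copy out
                sock.sendall(RSP.pack(tag, 0) + data)
    except ConnectionError:
        pass


def write_at(sock, tag, offset, data):
    sock.sendall(REQ.pack(OP_WRITE, tag, offset, len(data)) + data)
    return RSP.unpack(recv_exact(sock, RSP.size))


def read_at(sock, tag, offset, length):
    sock.sendall(REQ.pack(OP_READ, tag, offset, length))
    rtag, status = RSP.unpack(recv_exact(sock, RSP.size))
    data = recv_exact(sock, length) if status == 0 else b""
    return rtag, status, data


def demo():
    client, server = socket.socketpair()  # connected local socket pair
    image = bytearray(4096)
    worker = threading.Thread(target=serve, args=(server, image), daemon=True)
    worker.start()
    wrote = write_at(client, 1, 512, b"hello")
    read = read_at(client, 2, 512, 5)
    client.close()  # server sees EOF and leaves its loop
    worker.join()
    server.close()
    return wrote, read
```

The sketch issues one request at a time for brevity; the tag field is what
would let a client keep several requests in flight on the same socket.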

The other potential approach would be shared-memory based. We export
all guest RAM as shared memory along with a small bounce buffer pool.
We would then use a ring queue (potentially even using virtio-blk) and
an eventfd for notification.
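
To make the ring-plus-notification idea concrete, here is a minimal
single-producer, single-consumer sketch in Python. The region layout, the
descriptor fields, and the `Ring` class are hypothetical and chosen only for
illustration; this is not the virtio ring format. A pipe stands in for the
eventfd so the sketch stays portable, and a real implementation would also
need memory barriers around the index updates, which Python does not expose.

```python
import mmap
import os
import struct

# Hypothetical layout of the shared region (an assumption for illustration):
#   [0:4]  head index, advanced by the producer (the qemu side)
#   [4:8]  tail index, advanced by the consumer (the storage server side)
#   [8:]   SLOTS descriptors: offset (u64), length (u32), is_write (u32)
HDR = struct.Struct("<II")
DESC = struct.Struct("<QII")
SLOTS = 8  # power of two, so free-running 32-bit indices wrap consistently
MASK = 0xFFFFFFFF


class Ring:
    def __init__(self, mem):
        self.mem = mem

    def push(self, offset, length, is_write):
        """Producer side: queue one descriptor, or return False if full."""
        head, tail = HDR.unpack_from(self.mem, 0)
        if (head - tail) & MASK == SLOTS:
            return False
        slot = HDR.size + (head % SLOTS) * DESC.size
        DESC.pack_into(self.mem, slot, offset, length, is_write)
        struct.pack_into("<I", self.mem, 0, (head + 1) & MASK)  # publish
        return True

    def pop(self):
        """Consumer side: take the oldest descriptor, or None if empty."""
        head, tail = HDR.unpack_from(self.mem, 0)
        if head == tail:
            return None
        slot = HDR.size + (tail % SLOTS) * DESC.size
        desc = DESC.unpack_from(self.mem, slot)
        struct.pack_into("<I", self.mem, 4, (tail + 1) & MASK)  # release slot
        return desc


def demo():
    # Anonymous mapping; a forked or fd-passing peer would share the same pages.
    mem = mmap.mmap(-1, HDR.size + SLOTS * DESC.size)
    ring = Ring(mem)
    rfd, wfd = os.pipe()  # stand-in for an eventfd

    # Producer: queue two requests, then send a single notification.
    ring.push(0, 4096, 0)
    ring.push(8192, 512, 1)
    os.write(wfd, b"\x01")

    # Consumer: block until notified, then drain everything that is queued.
    os.read(rfd, 1)
    seen = []
    while (desc := ring.pop()) is not None:
        seen.append(desc)

    os.close(rfd)
    os.close(wfd)
    mem.close()
    return seen
```

Only small descriptors travel through the ring; the data they describe would
stay in the shared guest memory (or the bounce buffer pool), which is where
this approach avoids the extra copy.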

> The server at the other end would associate the socket with a filename
> and forward it to the server using the remote protocol.
>
> However, I don't think nbd would be a good protocol. My preference
> would be for a plugin API, or for a new local protocol that uses
> splice() to avoid copies.

I think a good shared memory implementation would be preferable to
plugins. I think it's worth attempting to do a plugin interface for the
block layer but I strongly suspect it would not be sufficient.

I would not want to see plugins that interacted with BlockDriverState
directly, for instance. We change it far too often. Our main loop
functions are also not terribly stable so I'm not sure how we would
handle that (unless we forced all block plugins to be in a separate thread).

Regards,

Anthony Liguori