From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=46138 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OGVUI-0002k3-1z
	for qemu-devel@nongnu.org; Mon, 24 May 2010 07:06:27 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	id 1OGVUG-0002Tr-69
	for qemu-devel@nongnu.org; Mon, 24 May 2010 07:06:25 -0400
Received: from mx1.redhat.com ([209.132.183.28]:54579)
	by eggs.gnu.org with esmtp (Exim 4.69) id 1OGVUF-0002Tj-UZ
	for qemu-devel@nongnu.org; Mon, 24 May 2010 07:06:24 -0400
Message-ID: <4BFA5D96.3030603@redhat.com>
Date: Mon, 24 May 2010 14:05:58 +0300
From: Avi Kivity
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [RFC PATCH 1/1] ceph/rbd block driver for qemu-kvm
References: <20100519192222.GD61706@ncolin.muc.de>
	<4BF5A9D2.5080609@codemonkey.ws>
	<4BF91937.2070801@redhat.com>
	<87wrutg4dk.wl%morita.kazutaka@lab.ntt.co.jp>
In-Reply-To: <87wrutg4dk.wl%morita.kazutaka@lab.ntt.co.jp>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: MORITA Kazutaka
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Blue Swirl,
	ceph-devel@vger.kernel.org, Christian Brunner

On 05/24/2010 10:12 AM, MORITA Kazutaka wrote:
> At Sun, 23 May 2010 15:01:59 +0300,
> Avi Kivity wrote:
>
>> On 05/21/2010 12:29 AM, Anthony Liguori wrote:
>>
>>> I'd be more interested in enabling people to build these types of
>>> storage systems without touching qemu.
>>>
>>> Both sheepdog and ceph ultimately transmit I/O over a socket to a
>>> central daemon, right?
>>>
>> That incurs an extra copy.
>>
>>
>>> So could we not standardize a protocol for this that both sheepdog and
>>> ceph could implement?
>>>
>> The protocol already exists, nbd. It doesn't support snapshotting etc.
>> but we could extend it.
>>
>>
> I have no objection to using another protocol for Sheepdog support, but
> I think the nbd protocol is unsuitable for a large storage pool with
> many VM images, because the nbd protocol doesn't support specifying
> a file name to open. If we use nbd with such a storage system, the
> server needs to listen on as many ports as there are VM images. As
> far as I can see from the protocol, it looks difficult to extend it
> without breaking backward compatibility.
>

The server would be local and talk over a unix domain socket, perhaps
anonymous.

nbd has other issues though, such as requiring a copy and having no
support for metadata operations such as snapshot and file size extension.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
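
To illustrate the "local server over an anonymous unix domain socket" idea, here is a minimal Python sketch. It is only an illustration under stated assumptions: the request framing (`REQ`), the helper names (`serve_image`, `read_block`, `open_image`) and the in-memory image are hypothetical and are not the NBD wire format or any qemu interface. It shows that one socket pair per image avoids the one-listening-port-per-image problem, and also that every read still copies the data through the socket, which is the extra copy mentioned above.

```python
import socket
import struct
import threading

# Hypothetical toy framing (NOT the NBD wire format):
# request = offset (u64) + length (u32), reply = `length` raw bytes.
REQ = struct.Struct("!QI")


def recv_exact(sock, n):
    """Read exactly n bytes from sock or raise if the peer goes away."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return bytes(buf)


def serve_image(sock, image):
    """Stand-in for a local storage daemon serving one image on one socket."""
    with sock:
        while True:
            try:
                header = recv_exact(sock, REQ.size)
            except ConnectionError:
                return  # client closed its end
            offset, length = REQ.unpack(header)
            # Reads past the end of the image are zero-padded.
            data = image[offset:offset + length].ljust(length, b"\0")
            sock.sendall(data)  # the data is copied through the socket here


def read_block(sock, offset, length):
    """Client side: request `length` bytes starting at `offset`."""
    sock.sendall(REQ.pack(offset, length))
    return recv_exact(sock, length)


def open_image(image):
    """Create an anonymous socket pair; no port or file name is involved."""
    client, server = socket.socketpair()  # AF_UNIX by default on Unix
    threading.Thread(target=serve_image, args=(server, image),
                     daemon=True).start()
    return client


if __name__ == "__main__":
    conn = open_image(bytes(range(256)))
    print(read_block(conn, 16, 4).hex())
    conn.close()
```

Because the socket pair is created by the parent and one end is handed to the server, the image is identified by which socket is used rather than by a name sent over the protocol. Metadata operations such as snapshots or resizing would still need protocol extensions, which this sketch does not attempt.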