Message-ID: <4BFB96C6.8030006@redhat.com>
Date: Tue, 25 May 2010 12:22:14 +0300
From: Avi Kivity
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [RFC PATCH 1/1] ceph/rbd block driver for qemu-kvm
References: <20100519192222.GD61706@ncolin.muc.de> <4BF5A9D2.5080609@codemonkey.ws> <4BF91937.2070801@redhat.com> <4BFA5D07.8030309@redhat.com> <4BFAD146.9090708@codemonkey.ws>
In-Reply-To: <4BFAD146.9090708@codemonkey.ws>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Anthony Liguori
Cc: kvm@vger.kernel.org, Stefan Hajnoczi, qemu-devel@nongnu.org, Blue Swirl, ceph-devel@vger.kernel.org, Christian Brunner

On 05/24/2010 10:19 PM, Anthony Liguori wrote:
> On 05/24/2010 06:03 AM, Avi Kivity wrote:
>> On 05/24/2010 11:27 AM, Stefan Hajnoczi wrote:
>>> On Sun, May 23, 2010 at 1:01 PM, Avi Kivity wrote:
>>>> On 05/21/2010 12:29 AM, Anthony Liguori wrote:
>>>>> I'd be more interested in enabling people to build these types
>>>>> of storage systems without touching qemu.
>>>>>
>>>>> Both sheepdog and ceph ultimately transmit I/O over a socket to
>>>>> a central daemon, right?
>>>> That incurs an extra copy.
>>> Besides a shared memory approach, I wonder if the splice() family of
>>> syscalls could be used to send/receive data through a storage daemon
>>> without the daemon looking at or copying the data?
>>
>> Excellent idea.
>
> splice() eventually requires a copy.  You cannot splice() to linux-aio
> so you'd have to splice() to a temporary buffer and then call into
> linux-aio.  With shared memory, you can avoid ever bringing the data
> into memory via O_DIRECT and linux-aio.

If the final destination is a socket, then you end up queuing guest
memory as an skbuff.  In theory we could do an aio splice to block
devices but I don't think that's realistic given our experience with aio
changes.

-- 
error compiling committee.c: too many arguments to function
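
For reference, a minimal sketch of the splice() relay being discussed
above, assuming Linux and a daemon that only forwards bytes between two
descriptors.  The helper name relay_with_splice() is hypothetical and
is not taken from qemu, sheepdog or ceph.  splice() needs a pipe on one
side of every call, so the data is staged through a pipe; that
intermediate step is the "temporary buffer" mentioned above, and this
sketch does not involve linux-aio or O_DIRECT at all.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes splice() in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: forward up to `len` bytes from in_fd to out_fd.
 * The payload is staged in a kernel pipe and is never read into a
 * buffer owned by this process.  Whether the kernel copies internally
 * depends on the descriptor types; this sketch makes no claim there.
 * Returns the number of bytes forwarded, or -1 on error.
 */
ssize_t relay_with_splice(int in_fd, int out_fd, size_t len)
{
    int pfd[2];
    ssize_t total = 0;

    if (pipe(pfd) < 0)
        return -1;

    while ((size_t)total < len) {
        /* Stage 1: source -> pipe. */
        ssize_t n = splice(in_fd, NULL, pfd[1], NULL,
                           len - (size_t)total, SPLICE_F_MOVE);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0)
            total = -1;
        if (n <= 0)
            break;              /* error, or end of input */

        /* Stage 2: pipe -> destination, until stage 1 is drained. */
        ssize_t left = n;
        while (left > 0) {
            ssize_t m = splice(pfd[0], NULL, out_fd, NULL,
                               (size_t)left, SPLICE_F_MOVE);
            if (m < 0 && errno == EINTR)
                continue;
            if (m <= 0) {
                total = -1;
                goto out;
            }
            left -= m;
        }
        total += n;
    }
out:
    close(pfd[0]);
    close(pfd[1]);
    return total;
}
```

A forwarding daemon would call this with the client connection as in_fd
and the backing socket or file as out_fd.  With a socket as out_fd this
is the case described above where the pages end up queued as an skbuff.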