From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 23 Oct 2009 10:10:03 -0500
From: Javier Guerra
To: Chris Webb
Cc: linux-fsdevel@vger.kernel.org, qemu-devel@nongnu.org, Avi Kivity, MORITA Kazutaka, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM
Message-ID: <90eb1dc70910230810o5b86d14egec2b3514711c6bc4@mail.gmail.com>
In-Reply-To: <20091023145815.GE18955@arachsys.com>
References: <4ADE988B.2070303@lab.ntt.co.jp> <4AE07A7F.8000002@redhat.com> <8fd1d76d0910230341w7978ac09te203ef34b79a86c6@mail.gmail.com> <90eb1dc70910230714h65e918a4n255bcf97634b26b0@mail.gmail.com> <20091023145815.GE18955@arachsys.com>
List-Id: qemu-devel.nongnu.org

On Fri, Oct 23, 2009 at 9:58 AM, Chris Webb wrote:
> If the chunks into which the virtual drives are split are quite small (say
> the 64MB used by Hadoop), LVM may be a less appropriate choice. It doesn't
> support very large numbers of very small logical volumes very well.

absolutely.
the 'nicest' way to do it would be to use a single block device per sheep process, and do the splitting there. it's an extra layer of code, though, and once you add non-naïve behavior for deletion and fragmentation, you quickly approach filesystem-like complexity... unless you can do some very clever mapping that reuses the consistent-hash algorithms to find not only which server(s) you want, but also which chunk to hit.

the kind of thing i'd love to code, but i've never found a use for it. i'll definitely dig deeper into the code.

-- 
Javier
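P.S. the "clever mapping" idea above could look roughly like this (a hypothetical sketch, not Sheepdog's actual algorithm; the server names and slot count are made up): hash (volume, chunk index) once, use the hash both to pick a server on a consistent-hash ring and to derive a deterministic slot inside that server's block device, so no per-chunk metadata table is needed.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Stable 64-bit hash derived from SHA-1 (Python's hash() is not
    # stable across runs, so it can't be used for placement).
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

class HashRing:
    """Minimal consistent-hash ring: keys map to servers, and only a
    small fraction of keys move when a server joins or leaves."""

    def __init__(self, servers, vnodes=64):
        # Each server gets several virtual points on the ring to even
        # out the load.
        self._ring = sorted(
            (_hash(f"{s}#{v}"), s) for s in servers for v in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    def locate(self, volume: str, chunk_index: int):
        """Map one fixed-size chunk of a virtual disk to (server, slot)."""
        h = _hash(f"{volume}/{chunk_index}")
        # First ring point clockwise from the key's hash owns the chunk.
        i = bisect.bisect(self._points, h) % len(self._ring)
        server = self._ring[i][1]
        # Reuse the same hash to name the chunk inside the server's
        # block device: 2**20 slots here is an arbitrary example size.
        slot = h % (1 << 20)
        return server, slot

ring = HashRing(["sheep0", "sheep1", "sheep2"])
server, slot = ring.locate("vm-disk-7", 42)
```

the nice property is that the lookup is pure computation: any client can locate any chunk with no metadata server in the path, which is exactly what makes the "filesystem-like complexity" avoidable for reads and writes (deletion and rebalancing still need real bookkeeping).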