qemu-devel.nongnu.org archive mirror
From: Javier Guerra <javier@guerrag.com>
To: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: linux-fsdevel@vger.kernel.org, Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM
Date: Fri, 23 Oct 2009 09:14:29 -0500
Message-ID: <90eb1dc70910230714h65e918a4n255bcf97634b26b0@mail.gmail.com>
In-Reply-To: <8fd1d76d0910230341w7978ac09te203ef34b79a86c6@mail.gmail.com>

On Fri, Oct 23, 2009 at 5:41 AM, MORITA Kazutaka
<morita.kazutaka@lab.ntt.co.jp> wrote:
> On Fri, Oct 23, 2009 at 12:30 AM, Avi Kivity <avi@redhat.com> wrote:
>> If so, is it reasonable to compare this to a cluster file system setup (like
>> GFS) with images as files on this filesystem?  The difference would be that
>> clustering is implemented in userspace in sheepdog, but in the kernel for a
>> clustering filesystem.
>
> I think that the major difference between sheepdog and cluster file
> systems such as Google File system, pNFS, etc is the interface between
> clients and a storage system.

note that GFS here is "Global File System" (written by Sistina, the
same folks behind LVM, and bought by Red Hat).  Google File System is
a different thing, and ironically its client/storage interface is a
little more like sheepdog's and unlike a regular cluster filesystem's.

>> How is load balancing implemented?  Can you move an image transparently
>> while a guest is running?  Will an image be moved closer to its guest?
>
> Sheepdog uses consistent hashing to decide where objects store; I/O
> load is balanced across the nodes. When a new node is added or the
> existing node is removed, the hash table changes and the data
> automatically and transparently are moved over nodes.
>
> We plan to implement a mechanism to distribute the data not randomly
> but intelligently; we could use machine load, the locations of VMs, etc.

i don't have much hands-on experience with consistent hashing, but it
sounds reasonable to make each node's ring segment proportional to its
storage capacity.  dynamic load balancing seems a tougher nut to
crack, especially while keeping every client's mapping consistent.
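
to make the 'proportional ring segment' idea concrete, here is a rough
sketch in Python.  it is not sheepdog's code: the WeightedRing class,
the node names, the integer capacity units and the virtual-node count
are all assumptions made up for illustration.  each node gets a number
of points on the ring proportional to its capacity, and an object goes
to the owner of the first point at or after its hash.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class WeightedRing:
    """Consistent-hash ring; hypothetical helper, not sheepdog's API.

    capacities maps a node name to an integer number of capacity units.
    Each unit contributes vnodes_per_unit points on the ring, so a node's
    share of the ring is roughly proportional to its capacity.
    """

    def __init__(self, capacities, vnodes_per_unit=100):
        points = []
        for node, units in capacities.items():
            for i in range(units * vnodes_per_unit):
                points.append((ring_hash(f"{node}#{i}"), node))
        points.sort()
        self._positions = [pos for pos, _ in points]
        self._owners = [node for _, node in points]

    def locate(self, object_id: str) -> str:
        """Return the node owning the first ring point after the object's hash."""
        idx = bisect.bisect_right(self._positions, ring_hash(object_id))
        return self._owners[idx % len(self._owners)]  # wrap around the ring


# Assumed example cluster: node-c has four times node-a's capacity.
ring = WeightedRing({"node-a": 1, "node-b": 2, "node-c": 4})
placement = Counter(ring.locate(f"obj-{i}") for i in range(20000))
print(placement)
```

since every client computes the same ring from the same node list,
placement stays consistent as long as the node list itself is.  adding
a node only pulls objects onto the new node; nothing moves between the
existing ones.  load-aware placement would need extra state on top of
this, which is the hard part mentioned above.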

>> Do you support multiple guests accessing the same image?
>
> A VM image can be attached to any VMs but one VM at a time; multiple
> running VMs cannot access to the same VM image.

this is a must-have safety measure, but a 'manual override' is quite
useful for those who know how to manage a cluster-aware filesystem
inside a VM image, maybe like Xen's "w!" flag does.  just be sure to
avoid distributed caching for a shared image!
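
roughly what i mean, as a Python sketch.  the AttachTable class and
its force_shared flag are hypothetical names invented here, not
anything in sheepdog or Xen: the default refuses a second VM, and the
override has to be requested explicitly.

```python
class ImageAttachError(Exception):
    """Raised when an image is already attached to another VM."""


class AttachTable:
    """Tracks which VMs hold each image; illustrative only."""

    def __init__(self):
        self._holders = {}  # image name -> set of VM names

    def attach(self, image, vm, force_shared=False):
        holders = self._holders.setdefault(image, set())
        others = holders - {vm}
        if others and not force_shared:
            # Default policy: one VM per image at a time.
            raise ImageAttachError(
                f"{image} is already attached to {sorted(others)}"
            )
        # force_shared is the manual override: the caller takes
        # responsibility for running a cluster-aware filesystem inside.
        holders.add(vm)

    def detach(self, image, vm):
        self._holders.get(image, set()).discard(vm)

    def holders(self, image):
        return set(self._holders.get(image, set()))
```

in a real cluster this table would itself have to be kept consistent
across nodes; the sketch only shows the policy, not the coordination.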

all in all, great project, and with such a clean patch into KVM/Qemu,
i have high hopes of it making it into regular use.

i'd just like to add my '+1' votes for both getting rid of the JVM
dependency and using block devices (usually LVM) instead of ext3/btrfs.

-- 
Javier
