From: Dan van der Ster <daniel.vanderster@cern.ch>
To: Jeff Darcy <jdarcy@redhat.com>, ceph-devel@vger.kernel.org
Cc: gluster-devel@gluster.org
Subject: Re: RADOS translator for GlusterFS
Date: Mon, 5 May 2014 17:37:22 +0200
Message-ID: <5367B032.9030901@cern.ch>
In-Reply-To: <355696287.706122.1399303290204.JavaMail.zimbra@redhat.com>

Hi,

On 05/05/14 17:21, Jeff Darcy wrote:
> Now that we're all one big happy family, I've been mulling over
> different ways that the two technology stacks could work together.  One
> idea would be to use some of the GlusterFS upper layers for their
> interface and integration possibilities, but then drop down to RADOS
> instead of GlusterFS's own distribution and replication.  I must
> emphasize that I don't necessarily think this is The Right Way for
> anything real, but I think it's an important experiment just to see what
> the problems are and how well it performs.  So here's what I'm thinking.
>
> For the Ceph folks, I'll describe just a tiny bit of how GlusterFS
> works.  The core concept in GlusterFS is a "translator" which accepts
> file system requests and generates file system requests in exactly the
> same form.  This allows them to be stacked in arbitrary orders, moved
> back and forth across the server/client divide, etc.  There are several
> broad classes of translators:
>
> * Some, such as FUSE or GFAPI, inject new requests into the translator
>    stack.
>
> * Some, such as "posix", satisfy requests by calling a server-local FS.
>
> * The "client" and "server" translators together get requests from one
>    machine to another.
>
> * Some translators *route* requests (one in to one of several out).
>
> * Some translators *fan out* requests (one in to all of several out).
>
> * Most are one in, one out, to add e.g. locks or caching etc.
>
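
To make the classes in the list above concrete, here is a toy Python sketch
of a stack of translators that all accept and emit requests in one form. It
is only an illustration: the class names, the handle() signature and the
hash used for routing are assumptions made for this example, not the
GlusterFS translator API (which is C and considerably richer).

```python
import hashlib


class Translator:
    """Accepts a request and passes requests of the same form to children."""

    def __init__(self, *children):
        self.children = list(children)

    def handle(self, op, path, data=None):
        raise NotImplementedError


class Store(Translator):
    """Leaf playing the "posix" role: satisfies requests locally."""

    def __init__(self):
        super().__init__()
        self.files = {}

    def handle(self, op, path, data=None):
        if op == "write":
            self.files[path] = data
            return len(data)
        if op == "read":
            return self.files.get(path)
        raise ValueError(f"unsupported op: {op}")


class Route(Translator):
    """One in, one of several out (the DHT-like role)."""

    def handle(self, op, path, data=None):
        # Hypothetical placement rule: pick a child from a hash of the path.
        digest = hashlib.sha256(path.encode()).digest()
        child = self.children[digest[0] % len(self.children)]
        return child.handle(op, path, data)


class FanOut(Translator):
    """One in, all of several out (the AFR-like role)."""

    def handle(self, op, path, data=None):
        replies = [c.handle(op, path, data) for c in self.children]
        return replies[0]


class Cache(Translator):
    """One in, one out: remembers reads and invalidates on write."""

    def __init__(self, child):
        super().__init__(child)
        self.cached = {}

    def handle(self, op, path, data=None):
        if op == "read" and path in self.cached:
            return self.cached[path]
        reply = self.children[0].handle(op, path, data)
        if op == "read":
            self.cached[path] = reply
        else:
            self.cached.pop(path, None)
        return reply


# Four leaves arranged as two replica pairs behind a router and a cache.
bricks = [Store() for _ in range(4)]
stack = Cache(Route(FanOut(bricks[0], bricks[1]),
                    FanOut(bricks[2], bricks[3])))
stack.handle("write", "/a.txt", b"hello")
```

Because every layer exposes the same handle() call, the layers can be
reordered or swapped without the caller noticing, which is the property the
proposal relies on when it replaces everything from DHT down.
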
> Of particular interest here are the DHT (routing/distribution) and AFR
> (fan-out/replication) translators, which mirror functionality in RADOS.
> My idea is to cut out everything from these on below, in favor of a
> translator based on librados instead.  How this works is pretty obvious
> for file data - just read and write to RADOS objects instead of to
> files.  It's a bit less obvious for metadata, especially directory
> entries.  One really simple idea is to store metadata as data, in some
> format defined by the translator itself, and have it handle the
> read/modify/write for adding/deleting entries and such.  That would be
> enough to get some basic performance tests done.  A slightly more
> sophisticated idea might be to use OSD class methods to do the
> read/modify/write, but I don't know much about that mechanism so I'm not
> sure that's even feasible.
>
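
The "metadata as data" idea above could look roughly like the following
Python sketch. Everything here is an assumption for illustration: the
FakeObjectStore class is an in-memory stand-in for a RADOS pool rather than
librados itself, and the JSON layout and object names are made up.

```python
import json


class FakeObjectStore:
    """In-memory stand-in for an object pool: whole-object read and write."""

    def __init__(self):
        self.objects = {}

    def read(self, name):
        return self.objects.get(name, b"")

    def write_full(self, name, data):
        self.objects[name] = data


def load_dir(store, dir_obj):
    """Decode a directory object into a {entry name: file object} dict."""
    raw = store.read(dir_obj)
    return json.loads(raw) if raw else {}


def save_dir(store, dir_obj, entries):
    store.write_full(dir_obj, json.dumps(entries, sort_keys=True).encode())


def add_entry(store, dir_obj, name, file_obj):
    entries = load_dir(store, dir_obj)   # read
    if name in entries:
        raise FileExistsError(name)
    entries[name] = file_obj             # modify
    save_dir(store, dir_obj, entries)    # write


def remove_entry(store, dir_obj, name):
    entries = load_dir(store, dir_obj)
    if name not in entries:
        raise FileNotFoundError(name)
    del entries[name]
    save_dir(store, dir_obj, entries)


store = FakeObjectStore()
add_entry(store, "dir.root", "a.txt", "file.0001")
add_entry(store, "dir.root", "b.txt", "file.0002")
remove_entry(store, "dir.root", "a.txt")
listing = load_dir(store, "dir.root")
```

Nothing in this sketch makes the read/modify/write atomic, so two clients
adding entries to the same directory at once could lose an update. That is
fine for a first performance test with one client, but anything beyond that
would need some way to serialize the update.
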
> This is not something I'm going to be working on as part of my main job,
> but I'd like to get the experiment started in some of my "spare" time.
> Is there anyone else interested in collaborating, or are there any other
> obvious ideas I'm missing?

Regarding obvious ideas, FWIW, I've been testing GlusterFS volumes that
distribute over a few VMs, each with a locally attached RBD. That seems to
be usable today and shouldn't lose data, but I guess it would do something
bad while an individual VM or RBD is down.
I'm very new to Gluster, but I can't think of a way to make this HA
without either replicating at the Gluster level (expensive) or making
Gluster speak to RADOS directly.

Cheers, Dan


