From: Jeff Garzik <jeff@garzik.org>
To: Pete Zaitcev <zaitcev@redhat.com>
Cc: Project Hail List <hail-devel@vger.kernel.org>
Subject: Re: Metadata replication in tabled
Date: Fri, 25 Jun 2010 03:26:05 -0400
Message-ID: <4C245A0D.4020204@garzik.org>
In-Reply-To: <20100624183123.08248d80@lembas.zaitcev.lan>

On 06/24/2010 08:31 PM, Pete Zaitcev wrote:
> I worked on fixing the metadata replication in tabled. There were some
> difficulties in existing code, in particular the aliasing between the
> hostname used to identify nodes and the hostname used in bind() for
> listening was impossible to work around in repmgr. In the end I gave
> up on repmgr and switched tabled to the "Base" API. So, the replication
> works now... for some values of "works", which is still progress.
>
> We essentially have a tabled that can really be considered as replicated.
> Before, it was only data replication, which was great and all but
> useless against disk failures in the tabled's database. I think it's
> a major threshold for tabled.

er, huh?  In addition to data replication, we already have metadata 
replication via db4 repmgr in tabled.git, which ensures metadata db 
integrity in the case of disk or tabled node failure.

The core problem with current tabled.git is that S3 clients expect all 
nodes to support PUT/DELETE as well as GET.  Our current use w/ db4 
slave mode does not fulfill this client requirement.

Your work here, moving to the base replication API, eliminates several 
obstacles on the path to making all tabled nodes support PUT/DELETE. 
But it is not true to say that metadata replication did not exist prior 
to this patch.

With either repmgr or base API, we still need to make failover more 
transparent to our S3 clients.
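
To make the PUT/DELETE problem concrete, here is a rough sketch in Python. It is not tabled code: the Node class, the 503 status for a refused write, and request_with_failover() are all hypothetical names invented for illustration, and replication between nodes is deliberately not modeled. It only shows that a client has to hunt for the master on writes, which is the part that is not transparent today.

```python
class Node:
    """Hypothetical stand-in for one tabled node; not the real tabled API."""

    def __init__(self, name, is_master=False, up=True):
        self.name = name
        self.is_master = is_master
        self.up = up
        self.store = {}  # local metadata only; replication is not modeled

    def handle(self, method, key, value=None):
        if not self.up:
            raise ConnectionError(self.name)
        if method == "GET":
            return 200, self.store.get(key)
        # PUT/DELETE change metadata, so a node in slave mode refuses them.
        # The 503 status is an assumption made for this sketch.
        if not self.is_master:
            return 503, None
        if method == "PUT":
            self.store[key] = value
        elif method == "DELETE":
            self.store.pop(key, None)
        return 200, None


def request_with_failover(nodes, method, key, value=None):
    """Try each node in turn until one accepts the request."""
    for node in nodes:
        try:
            status, body = node.handle(method, key, value)
        except ConnectionError:
            continue  # node is down, try the next one
        if status == 200:
            return node.name, body
    raise RuntimeError("no node accepted " + method)
```

With nodes = [Node("a"), Node("b", is_master=True)], a GET is answered by "a" but a PUT only succeeds once the loop reaches "b". An ordinary S3 client does no such loop, which is why every node needs to accept writes, or failover needs to be hidden from the client some other way.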


> Unfortunately, the code is rather ugly. I tried to create a kind
> of an optional replication layer, so that tdbadm could be built
> without it. Although I succeeded, the result is a hideous mess of
> methods and callbacks, functions with side effects, and a bunch
> of poorly laid out state machines. In places I cannot wrap my own
> head around what's going on without the help of pencil and paper.
>
> So, while working, it's not ready for going in. Still, I'm going
> to throw it here in case I get hit by a bus, or if anyone wants
> an example of using db4 replication early.

Based on a quick read, it seems straightforward, and looks like 
something I can try tomorrow...

Very excited to try this :)

	Jeff



