From: Adam Kellas <Adam.Kellas@gmail.com>
To: git@vger.kernel.org
Subject: using git as a distributed content management tool
Date: Wed, 01 Dec 2010 20:47:34 -0500
Message-ID: <id6tpm$sb4$1@dough.gmane.org>

I'm thinking of using git in a slightly unusual way and am looking for 
advice or pointers. I'm developing a (GPL) tool which can be thought of 
as a generic content management system. The only thing that matters here 
is that it generates a related set of potentially binary files. My 
current "persistence solution" is a typical Java server backed by a 
relational database but while reading about git in another context, and 
in particular about how it's really a general content-addressable 
filesystem which just happens to be tuned for SCM, I started thinking 
about replacing that back end with something based on git. Essentially 
this is 
the same old DVCS-vs-centralized question except the tool is not exactly 
a VCS. The results would be shared via the usual push/pull git semantics.

From what I see of git plumbing this looks eminently doable, and I 
think it will almost certainly work if each user is given their own repo 
as in normal git usage. What I'm trying to figure out is whether an 
optimization is possible: I'd like to support a model where a repo is 
shared between a number of instances of my tool running on the same 
machine or even on the same network as long as the repo is available via 
NFS, which raises issues like race conditions and locking that don't 
come up for a developer using his/her own repo.

Specifically, here's what I'm thinking: each instance would set GIT_DIR 
to point to the common repo, and GIT_INDEX_FILE to a per-instance temp 
file. Then, as artifacts are generated, they can be stored as blobs using 
hash-object -w and recorded in the index using update-index. Note here 
that blobs would go directly into the shared repo while the pending tree 
would live in the private index file. Then at the appropriate time the 
index can be written out as a tree with write-tree and "published" via 
commit-tree plus a ref update. It would be available immediately to 
other users of the shared repo and could be pushed or pulled 
asynchronously from there.
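
To make the sequence concrete, here is roughly what I have in mind for a 
single instance. This is only a sketch: the temp directory, the identity 
values, the path artifacts/a.bin, and the branch name "published" are all 
made up for illustration, and using update-index and write-tree to build 
the tree is my assumption about the right plumbing for that step.

```shell
#!/bin/sh
# Sketch of one tool instance publishing into a shared repo.
# All names and paths below are hypothetical.
set -eu

work=$(mktemp -d)
git init --quiet --bare "$work/shared.git"   # stand-in for the shared repo

export GIT_DIR="$work/shared.git"            # common repo
export GIT_INDEX_FILE="$work/index.$$"       # private, per-instance index

# commit-tree needs an identity; set one explicitly for this sketch.
export GIT_AUTHOR_NAME="Tool Instance" GIT_AUTHOR_EMAIL="tool@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Store an artifact as a blob directly in the shared object store.
blob=$(printf 'artifact contents\n' | git hash-object -w --stdin)

# Record the blob under a path in the private index.
git update-index --add --cacheinfo 100644 "$blob" artifacts/a.bin

# Write the private index out as a tree object in the shared repo.
tree=$(git write-tree)

# "Publish": wrap the tree in a commit and point a branch at it.
commit=$(echo 'publish artifacts' | git commit-tree "$tree")
git update-ref refs/heads/published "$commit"
```

Until the final update-ref, nothing another instance looks at by name 
has changed; only loose objects have been added to the shared store.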

So, assuming permissions are set to allow it, is there a problem with 
this? I see a possible race in creating blobs iff they happen to be 
identical and another at commit-tree time. Does git do any locking of 
its own or would I have to implement my own around these?
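
For the commit side, one thing I could try is a compare-and-swap on the 
branch, assuming the optional old-value argument of update-ref behaves 
the way its documentation describes (the update is refused if the ref no 
longer holds the value I read). The sketch below simulates two instances 
racing in a throwaway repo; the branch name, identity values, and commit 
messages are hypothetical.

```shell
#!/bin/sh
# Sketch: two instances race to publish on top of the same tip.
# All names below are hypothetical.
set -eu

work=$(mktemp -d)
git init --quiet --bare "$work/shared.git"
export GIT_DIR="$work/shared.git"
export GIT_AUTHOR_NAME="Tool Instance" GIT_AUTHOR_EMAIL="tool@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

ref=refs/heads/published

# Seed the branch with an initial commit on an empty tree.
empty_tree=$(git hash-object -w -t tree /dev/null)
first=$(echo 'first' | git commit-tree "$empty_tree")
git update-ref "$ref" "$first"

# Both instances read the same tip and build a commit on top of it.
seen=$(git rev-parse "$ref")
a=$(echo 'from instance A' | git commit-tree "$empty_tree" -p "$seen")
b=$(echo 'from instance B' | git commit-tree "$empty_tree" -p "$seen")

# Instance A publishes first; the ref still holds the value it saw.
git update-ref "$ref" "$a" "$seen"

# Instance B publishes second with a stale old value.
if git update-ref "$ref" "$b" "$seen" 2>/dev/null; then
    result=overwrote
else
    result=rejected   # B would need to re-read the tip and retry
fi
echo "instance B: $result"
```

If that holds up over NFS as well, the losing instance would just 
re-read the tip, rebuild its commit with the new parent, and try again.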

Thanks,
AK
