From: "George Spelvin" <linux@horizon.com>
To: jon.seymour@gmail.com
Cc: git@vger.kernel.org, linux@horizon.com, spearce@spearce.org
Subject: Re: is hosting a read-mostly git repo on a distributed file system practical?
Date: 12 Apr 2011 23:47:15 -0400
Message-ID: <20110413034715.15553.qmail@science.horizon.com>
> All clients, including the client that occasionally updates the
> read-mostly repo would be mounting the DFS as a local file system. My
> environment is one where DFS is easy, but establishing a shared server
> is more complicated (ie. bureaucratic).
> I guess I am prepared to put up with a slow initial clone (my developer
> pool will be relatively stable and pulling from a peer via git: or ssh:
> will usually be acceptable for this occasional need).
> What I am most interested in is the incremental performance. Can my
> integrator, who occasionally updates the shared repo, avoid automatically
> repacking it (and hence taking the whole of repo latency hit) and can
> my developers who are pulling the updates do so reliably without a whole
> of repo scan?
I think the answers are yes, but I have to make a couple of things clear:
* You can *definitely* control repack behaviour. .keep files are the
simplest way to prevent repacking.
* Are you talking about hosting only a "bare" repository, or one with
the unpacked source tree as well? If you try to run git commands on
a large network-mounted source tree, things can get more than a bit
sluggish; git recursively stats the whole tree fairly frequently.
(There are ways to prevent that, notably core.ignoreStat, but they
make it less friendly.)
* You can clone from a repository mounted on the file system just as
easily as you can from a network server. So there's no need to set
up a server if you find it inconvenient.
* Normally, the developers will clone from the integrator's repository
before doing anything, so the source tree, and any changes they make,
will be local.
* A local clone will try to hard link to the object directory. I think
it will copy them if it fails, or you can force that with "git clone
--no-hardlinks". For a more space-saving version, try "git clone
-s", which will make a sort of soft link to the upstream repository.
It's a git concept, so repacking upstream won't do any harm, but you
Must Not delete objects from the upstream repository or you'll create
dangling references in the downstream.
* If using the objects on the DFS mount turns out to be slow, you can
just do the initial clone with --no-hardlinks. Then the developers'
day-to-day work is all local.
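To make the two clone variants above concrete, here is a rough sketch. The
paths are made up: a scratch directory stands in for the DFS mount, and a
throwaway repository stands in for the integrator's. Treat it as an
illustration rather than a recipe.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: $work/dfs plays the role of the DFS mount.
work=$(mktemp -d)
upstream="$work/dfs/project.git"

# Build a tiny upstream repository so there is something to clone.
git init -q "$work/seed"
git -C "$work/seed" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"
git clone -q --bare "$work/seed" "$upstream"

# Full local copy: objects are copied instead of hard linked.
git clone -q --no-hardlinks "$upstream" "$work/dev-copy"

# Shared clone: objects are borrowed from upstream through an
# alternates file, so nothing upstream may be deleted afterwards.
git clone -q -s "$upstream" "$work/dev-shared"

# Show where the shared clone looks for borrowed objects.
cat "$work/dev-shared/.git/objects/info/alternates"
```

The --no-hardlinks clone has no further dependency on the mount; the -s
clone keeps reading objects from it.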
Indeed, you could easily do everything via DFS. Give everyone a personal
"public" repo to push to, which is read-only to everyone else, and let
the integrator pull from those.
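A sketch of that arrangement, with invented names (the "alice" directory,
the "topic" branch) and scratch directories standing in for the DFS mount
and the local disks. Making each public repo read-only to everyone else is
a matter of file system permissions and is not shown.

```shell
#!/bin/sh
set -eu

# Hypothetical locations: $dfs stands in for the DFS mount,
# $home for each person's local disk.
dfs=$(mktemp -d)
home=$(mktemp -d)

# The developer's personal "public" repo lives on the DFS.
git init -q --bare "$dfs/alice/project.git"

# The developer works locally and pushes finished work to it.
git init -q "$home/alice-work"
cd "$home/alice-work"
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git push -q "$dfs/alice/project.git" HEAD:refs/heads/topic

# The integrator fetches straight from the mounted path; no server needed.
git init -q "$home/integrator"
cd "$home/integrator"
git fetch -q "$dfs/alice/project.git" topic
git log --oneline -1 FETCH_HEAD
```

A "git pull" from the same path would do this fetch and then merge the
result into the integrator's current branch.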
> I understand that avoiding repacking for an extended period brings its
> own problems, so I guess I could live with a local repack followed by
> an rsync transfer to re-initial the shared remote, if this was
> warranted.
Normally, you do a generational garbage collection thing. You repack the
current work frequently (which is fast to do, and to share, because
it's small), and the larger, slower, older packs less frequently.
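One way to get that effect with .keep files, sketched on a throwaway
repository (the file names and commit messages are invented). "git repack
-a -d" leaves any pack that has a matching .keep file alone, so only the
newer objects get repacked.

```shell
#!/bin/sh
set -eu

# Scratch repository standing in for the shared one.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# Older generation: commit some history and pack it.
echo "old history" > old.txt
git add old.txt
git commit -q -m "old work"
git repack -a -d -q

# Mark every existing pack as "keep"; later repacks will skip it.
for p in .git/objects/pack/pack-*.pack; do
    touch "${p%.pack}.keep"
done

# Newer generation: more work, then another repack.
echo "recent change" > new.txt
git add new.txt
git commit -q -m "recent work"
git repack -a -d -q

# The kept pack is untouched; the recent objects sit in their own pack.
ls .git/objects/pack/
```

To fold everything into one big pack again later, remove the .keep files
and rerun the repack.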
Anyway, I hope this helps!