From: Jeff King <peff@peff.net>
To: Johan Herland <johan@herland.net>
Cc: Fredrik Gustafsson <iveqy@iveqy.com>,
	Olaf Hering <olaf@aepfle.de>,
	Git mailing list <git@vger.kernel.org>
Subject: Re: how to reduce disk usage for large .git dirs?
Date: Thu, 13 Nov 2014 15:15:42 -0500
Message-ID: <20141113201542.GC3869@peff.net>
In-Reply-To: <CALKQrgeZYs9A-GZLuRczwzRWWapmfrjFvcvR8GN+YNKxajjDRw@mail.gmail.com>

On Thu, Nov 13, 2014 at 05:08:19PM +0100, Johan Herland wrote:

> Can you not do this much simpler with --reference? Like this:
> 
>   $ git clone --bare git://host/repo.git repo-master
>   $ git clone -b branchA --reference repo-master git://host/repo.git repo-branchA
>   $ git clone -b branchB --reference repo-master git://host/repo.git repo-branchB
> 
> All three repos now push/pull directly to/from git://host/repo.git,
> but repo-branchA and repo-branchB reference objects from within the
> bare repo-master. You have to make sure to never delete objects from
> repo-master.

I think the "never delete" part is why we usually warn people off of
using alternates. At the least you would have to "git config gc.auto 0"
in the bare repository (otherwise an auto-gc triggered by your nightly
fetches risks pruning objects the children still borrow). Of course you'd
probably want to repack eventually for performance reasons, so setting
gc.pruneExpire (to something like "20.years.ago") may be a better option.
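
As a rough sketch of those two settings, assuming a scratch bare
repository created under a temporary directory stands in for repo-master
(the path is hypothetical, and you would normally pick one setting
rather than both):

```shell
# Scratch bare repository standing in for repo-master (hypothetical path).
master=$(mktemp -d)/repo-master
git init --quiet --bare "$master"

# Option 1: turn off automatic gc, so a fetch never triggers pruning.
git -C "$master" config gc.auto 0

# Option 2: keep gc (and repacking), but treat only very old
# unreachable objects as candidates for pruning.
git -C "$master" config gc.pruneExpire 20.years.ago

# Show what was stored in the repository's config.
git -C "$master" config --get gc.auto
git -C "$master" config --get gc.pruneExpire
```

Both values live in the bare repository's own config file, so they do
not affect the children that borrow from it.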

> If you want to prevent the repos growing in size, you must devise a
> way to add new objects into repo-master before repo-branchA|B. (e.g. a
> nightly cron-job in repo-master that fetches from origin), so that
> when repo-branchA|B pulls, they will find most objects are already
> present in repo-master.

You can also fetch from the children into repo-master periodically.
Like:

  cd repo-master &&
  for i in branchA branchB; do
    # the children are named repo-branchA and repo-branchB in the quoted example
    git fetch "../repo-$i" "+refs/*:refs/remotes/$i/*"
  done

after which it is actually safe to run "git gc" in the master (assuming
there isn't simultaneous activity in the children). This is how we
manage fork networks on GitHub (we take in objects to individual forks
via push, and then migrate them to the master repo via fetch).
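
To see the whole cycle in one place, here is a small local sketch. It
assumes a throwaway directory, a local repository named "origin"
standing in for git://host/repo.git, a single child, and a demo
identity for the commits; all of those names are made up for
illustration.

```shell
work=$(mktemp -d)
cd "$work"

# Stand-in for the upstream repository, with one commit on branchA.
git init --quiet origin
git -C origin symbolic-ref HEAD refs/heads/branchA
git -C origin -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"

# Bare master plus one child that borrows objects from it.
git clone --quiet --bare origin repo-master
git clone --quiet -b branchA --reference repo-master origin repo-branchA

# New work that exists only in the child so far.
git -C repo-branchA -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "work in child"
new=$(git -C repo-branchA rev-parse HEAD)

# Pull the child's refs (and their objects) into the master, then gc.
cd repo-master
for i in branchA; do
  git fetch --quiet "../repo-$i" "+refs/*:refs/remotes/$i/*"
done
git gc --quiet

# The child's commit is now reachable from a ref in repo-master.
git cat-file -e "$new" && echo "child commit is stored in repo-master"
```

Because every child ref is copied under refs/remotes/<child>/ in the
master, the objects the children rely on stay reachable there when gc
runs.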

-Peff
