git.vger.kernel.org archive mirror
From: Nicolas Pitre <nico@fluxnic.net>
To: demerphq <demerphq@gmail.com>
Cc: Git <git@vger.kernel.org>
Subject: Re: Dealing with many many git repos in a /home directory
Date: Thu, 04 Feb 2010 12:35:11 -0500 (EST)
Message-ID: <alpine.LFD.2.00.1002041207330.1681@xanadu.home>
In-Reply-To: <9b18b3111002040029x1c7de0afw4a5ef883588f7a18@mail.gmail.com>

On Thu, 4 Feb 2010, demerphq wrote:

> At $work we have a host where we have about 50-100 users each with
> their own private copies of the same repos. These are cloned from a
> remote via git/ssh and thus are not automatically hardlinking their
> object stores.
> 
> This is starting to take a lot of space.

You should keep a pristine copy of that common repository on that host 
and make it readable to everyone, and then ask your users to use the 
--reference argument with 'git clone' to borrow as much as possible from 
that common repository.
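
As a rough illustration of the suggestion above, the clone command could 
be assembled as follows.  The helper names, and any paths or URLs passed 
to them, are hypothetical; only 'git clone --reference <repository>' 
itself comes from Git.

```python
import os
import subprocess


def build_reference_clone(reference_repo, remote_url, dest):
    """Return the argv for a clone that borrows objects from reference_repo."""
    return [
        "git", "clone",
        "--reference", os.fspath(reference_repo),
        remote_url,
        os.fspath(dest),
    ]


def clone_with_reference(reference_repo, remote_url, dest):
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(build_reference_clone(reference_repo, remote_url, dest),
                   check=True)
```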

For those who already cloned the repository in full, i.e. without the 
--reference switch, it is possible to fix the situation simply by 
adding the full path to the common repository's .git/objects directory 
to their own .git/objects/info/alternates (create it if it doesn't 
exist) and then running 'git gc'.  That's what the --reference argument 
to the clone command does: it sets up that .git/objects/info/alternates 
file.
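
A minimal sketch of that manual fix for one existing clone, assuming the 
alternates file holds one absolute path per line.  The function name 
'add_alternate' is made up for illustration and is not part of Git.

```python
from pathlib import Path


def add_alternate(repo_git_dir, shared_objects_dir):
    """Append shared_objects_dir to <repo_git_dir>/objects/info/alternates.

    Returns True if the file was changed, False if the entry was already
    listed.
    """
    objects_dir = Path(repo_git_dir) / "objects"
    if not objects_dir.is_dir():
        raise ValueError(f"{repo_git_dir} does not look like a .git directory")

    # The alternates entry must be the full path to the shared objects dir.
    entry = str(Path(shared_objects_dir).resolve())

    info_dir = objects_dir / "info"
    info_dir.mkdir(exist_ok=True)  # create it if it doesn't exist
    alternates = info_dir / "alternates"

    existing = alternates.read_text().splitlines() if alternates.exists() else []
    if entry in existing:
        return False

    existing.append(entry)
    alternates.write_text("\n".join(existing) + "\n")
    return True
```

Running 'git gc' in the repository afterwards is still needed, as 
described above.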

> I was thinking it should be possible to hardlink all of the objects in
> the different repos to a canonical single copy.
> 
> Would I be correct in thinking that if I have two repos with an
> equivalent .git/objects/../..... file in them that the files are
> necessarily identical and one can be replaced by a hardlink to the
> other?

Yes, you could do that.  However, you'll save very little by doing so, 
as the bulk of a repository's content is normally stored in pack files, 
and those may differ from one repository to another depending on what 
exactly each pack contains.  The alternates mechanism is more powerful: 
it lets Git fetch objects from the canonical repository whether they are 
packed or not and, more importantly, it avoids creating a local copy of 
new objects if they already exist in that canonical copy.  That means 
you don't have to constantly search every user's repository for 
potential new objects to hardlink.
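
To check how much of a given repository sits in loose objects versus 
packs, something like the sketch below could be used.  It assumes the 
usual layout (loose objects in two-hex-digit subdirectories of 
.git/objects, packs in .git/objects/pack); 'object_store_sizes' is a 
hypothetical helper name.  'git count-objects -v' reports similar 
figures.

```python
import re
from pathlib import Path

# Loose objects live in subdirectories named after the first two hex
# digits of the object name, e.g. objects/ab/.
_FANOUT_DIR = re.compile(r"^[0-9a-f]{2}$")


def object_store_sizes(objects_dir):
    """Return (loose_bytes, pack_dir_bytes) for a .git/objects directory.

    pack_dir_bytes counts every file in objects/pack, index files included.
    """
    objects_dir = Path(objects_dir)
    loose = 0
    for child in objects_dir.iterdir():
        if child.is_dir() and _FANOUT_DIR.match(child.name):
            loose += sum(f.stat().st_size for f in child.iterdir() if f.is_file())

    packed = 0
    pack_dir = objects_dir / "pack"
    if pack_dir.is_dir():
        packed = sum(f.stat().st_size for f in pack_dir.iterdir() if f.is_file())
    return loose, packed
```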

> If this is correct then is there some tool known to the list that
> already does this?  I whipped this together:

The "tool" exists in Git already and is what I describe above.  The 
actual tool you might need is probably a script to populate that 
.git/objects/info/alternates file in all your users' repositories and 
maybe run 'git gc' on their behalf.
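
Such a script might start by finding which clones still need the entry.  
The sketch below assumes, purely for illustration, that every user keeps 
the clone at <home_root>/<user>/<repo_name>; that layout and the 
function name are assumptions, not something stated in this thread.

```python
from pathlib import Path


def repos_missing_alternate(home_root, shared_objects_dir, repo_name):
    """Yield .git directories whose alternates file lacks the shared store.

    Assumes each clone lives at <home_root>/<user>/<repo_name>/.git.
    """
    entry = str(Path(shared_objects_dir).resolve())
    for user_dir in sorted(Path(home_root).iterdir()):
        git_dir = user_dir / repo_name / ".git"
        if not (git_dir / "objects").is_dir():
            continue  # this user has no clone at the assumed location
        alternates = git_dir / "objects" / "info" / "alternates"
        listed = alternates.read_text().splitlines() if alternates.exists() else []
        if entry not in listed:
            yield git_dir
```

Each directory it yields would then get the alternates entry added and 
a 'git gc' run, as described above.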


Nicolas

