git.vger.kernel.org archive mirror
From: Chad Dombrova <chadrik@gmail.com>
To: git@vger.kernel.org
Subject: read-only working copies using links
Date: Sat, 24 Jan 2009 01:17:19 -0800
Message-ID: <3EE64C92-CB4C-47BD-9C48-E369AED4B82F@gmail.com>

hi all,

there's a major feature for working with large binaries that has not  
yet been addressed by git:  the ability to check out a file as a  
symbolic/hard link to a blob in the repository, instead of duplicating  
the file into the working copy.

imagine a scenario where one user is putting large binary files into a  
git repo on a networked server.  100 other users on the server need  
read-only access to this repo.  they clone the repo using --shared or  
--local, which saves disk space for the object files, but each of  
these 100 working copies also creates copies of all the binary files  
at the HEAD revision.  it would be roughly 100x as efficient in disk  
space, and much faster to check out, if, in place of these files,  
symbolic or hard links were made to the blob files in .git/objects.
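
to make the hard-link idea concrete, here is a minimal sketch in python. it is not something git does today: `link_checkout` and `shares_storage` are hypothetical helpers, and the "blob" is just an ordinary file standing in for an uncompressed, header-less copy of the data.

```python
import os
import tempfile


def link_checkout(blob_path, worktree_path):
    """Hypothetical helper: 'check out' a file as a hard link to stored data."""
    os.link(blob_path, worktree_path)


def shares_storage(path_a, path_b):
    """True if both paths name the same inode, i.e. the data exists on disk once."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a raw blob file; a real git object is not stored this way.
        blob = os.path.join(tmp, "blob.raw")
        with open(blob, "wb") as f:
            f.write(b"\0" * (1024 * 1024))

        checked_out = os.path.join(tmp, "big.bin")
        link_checkout(blob, checked_out)

        print("same storage:", shares_storage(blob, checked_out))
        print("link count:", os.stat(blob).st_nlink)
```

one consequence of this design: a write through either name changes the shared data, which is why the working copy would have to be treated as read-only. hard links also only work within a single filesystem.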

the crux of the issue is that the blob objects would have to be stored  
as exact copies of the original files.  it would seem there are two  
things that currently prevent this from happening.  1) blobs are  
stored with compression and 2) they include a small header.   
compression can be disabled by setting core.loosecompression to 0, so  
that seems like less of an issue.  as for the header, wouldn't it be  
possible to store it separately?  in other words, store two files per  
blob directory, a small stub file with the header info and the  
unaltered file data.
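
for reference, the two obstacles can be sketched in python. this assumes the loose-object layout of a "blob <size>" header, a NUL byte, then the file data, all passed through zlib; the function names are made up for illustration. note that zlib at level 0 still adds its own framing around the data, so even an "uncompressed" stream is not byte-identical to the original file, and the same may well apply to what git writes with core.loosecompression set to 0.

```python
import hashlib
import zlib


def loose_object_bytes(data: bytes) -> bytes:
    """Header plus content: the bytes that get hashed and then deflated."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return header + data


def object_id(data: bytes) -> str:
    """SHA-1 over header + content, as a hex string."""
    return hashlib.sha1(loose_object_bytes(data)).hexdigest()


def stored_bytes(data: bytes, level: int) -> bytes:
    """What would land on disk at a given zlib compression level."""
    return zlib.compress(loose_object_bytes(data), level)


if __name__ == "__main__":
    data = b"hello\n"
    print(object_id(data))

    raw = stored_bytes(data, 0)
    # Level 0 stores the bytes verbatim, but inside a zlib wrapper.
    print(raw == data)                        # False: header and framing remain
    print(loose_object_bytes(data) in raw)    # True: content is embedded as-is
```

because the object id covers the header as well as the data, a split layout (stub file plus unaltered data file) would still need to hash both parts together to produce the same ids.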

what are the caveats to a system like this?  has anyone looked into  
this before?

-chad

p.s.
i tried submitting a post through nabble a few days ago and it said  
that it was still pending, so i thought i'd try submitting directly to  
the mailing list.  sorry if i end up double-posting.

Thread overview: 6+ messages
2009-01-24  9:17 Chad Dombrova [this message]
2009-01-24 11:02 ` read-only working copies using links Sverre Rabbelier
2009-01-24 18:39   ` Chad Dombrova
2009-01-24 18:43     ` Sverre Rabbelier
2009-01-24 19:35       ` Jeff King
2009-01-24 19:34     ` Jeff King
