git.vger.kernel.org archive mirror
From: Avery Pennarun <apenwarr@gmail.com>
To: Asger Ottar Alstrup <asger@area9.dk>
Cc: git@vger.kernel.org, Alexander Gavrilov <angavrilov@gmail.com>
Subject: Re: git subtree as a solution to partial cloning?
Date: Mon, 25 May 2009 15:18:50 -0400
Message-ID: <32541b130905251218w10e43b85v489f6018366058d4@mail.gmail.com>
In-Reply-To: <8873ae500905251128h1921895dp6ef227e0e0bbec49@mail.gmail.com>

On Mon, May 25, 2009 at 2:28 PM, Asger Ottar Alstrup <asger@area9.dk> wrote:
> On Mon, May 25, 2009 at 7:54 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
>> On Mon, May 25, 2009 at 1:35 PM, Asger Ottar Alstrup <asger@area9.dk> wrote:
>>> So a poor man's system could work like this:
>>>
>>> - A reduced repository is defined by a list of paths in a file, I
>>> guess with a format similar to .gitignore
>>
>> Are you sure you want to define the list with exclusions instead of
>> inclusions?  I don't really know your use case.
>
> Since the .gitignore format supports !, I believe that should not make
> much of a difference.
>
>> Anyway, if you're using git filter-branch, it'll be up to you to fix
>> the index to contain the list of files you want. (See man
>> git-filter-branch)
>
> Yes, sure, and that is why I asked whether there is some tool in git
> that can give a list of concrete files surviving a .gitignore list of
> patterns.

Well, the problem here is with the definition of "concrete file."  If
you're using git filter-branch --index-filter (which is much faster
than --tree-filter), then your trees won't be checked out at all, so
there are no files on disk to match patterns against, only the paths
recorded in the index.  That leaves open the question of exactly which
list of files you want to use.  It's unlikely that any existing tool
will do it exactly the way you want (although I could be wrong).

In any case, what you'd probably do is something like git ls-files
--cached | perlscript, where your perlscript does whatever you want to
the file list.
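
As a rough sketch of that pipeline (using awk in place of a Perl
script), the filter could look something like the following.  The
helper name paths_to_drop and the file of "keep" prefixes are
assumptions made up for illustration, and it only does plain prefix
matching, not .gitignore-style patterns.

```shell
# Hypothetical helper: read paths on stdin and print those that do NOT
# start with any prefix listed in the given file (one prefix per line;
# blank lines and lines starting with '#' are ignored).
paths_to_drop() {
    keep_file=$1
    awk -v keep="$keep_file" '
        BEGIN {
            # Load the prefixes to keep.
            while ((getline line < keep) > 0)
                if (line != "" && line !~ /^#/)
                    prefixes[++n] = line
        }
        {
            # Skip any path that begins with a kept prefix.
            for (i = 1; i <= n; i++)
                if (index($0, prefixes[i]) == 1)
                    next
            print
        }'
}
```

Something like "git ls-files --cached | paths_to_drop keep.txt" would
then list the index entries that fall outside every kept prefix.
Paths containing newlines, or characters that git ls-files quotes,
would need the -z form and a different filter.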

> Thanks. OK, I see now that filter-branch will not destroy the original
> repository. That is not at all obvious from reading the man page, when
> the very first sentence says that it will rewrite history. But the
> main point of this exercise is to reduce the size of the reduced
> repository so that it can be transferred effectively. So after
> filter-branch, I guess I would run clone afterwards to make the new,
> smaller repository, and then the question becomes: Will clone reuse
> and prune packs intelligently?

filter-branch will rewrite, and thus effectively destroy, the history
of the current branch.  But if you make a new branch first and filter
that one, the original branch is left alone.
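
A minimal sketch of that "branch first" approach, assuming a clean
working tree.  The function name and its two arguments are
hypothetical, and the index filter shown is the plain
"git rm --cached --ignore-unmatch" idiom from the git-filter-branch
man page, not a pattern-based filter.

```shell
# Sketch only: create a new branch and rewrite just that branch,
# dropping one path from every commit on it.
strip_on_new_branch() {
    new_branch=$1   # branch to create and rewrite (hypothetical name)
    drop_path=$2    # path to remove; must not contain a single quote
    git branch "$new_branch" &&
    # The environment variable only silences the warning that newer
    # git versions print; older versions ignore it.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
        --index-filter "git rm -r -q --cached --ignore-unmatch -- '$drop_path'" \
        -- "$new_branch"
}
```

For example, "strip_on_new_branch stripped big-assets" would leave the
branch you are on as it was and produce a rewritten branch named
"stripped" without the big-assets directory.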

You seem to be giving the concept of "packs" a bit too much weight.
Packs are just groups of objects.  AFAIK, cloning and fetching
generally produces entirely new packs for each client.

clone is quite intelligent; in fact, if you clone the repository on
your local machine, it's so intelligent that it'll hardlink the packs
instead of copying them and it'll take virtually no space at all!
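
To compare the two behaviours yourself, here is a small sketch (the
function name and directory arguments are hypothetical).  The first
clone uses a plain local path, which hardlinks objects when it can;
the second adds --no-local, which makes git use its regular transport
and build a new pack, much as it would for a remote client.

```shell
# Hypothetical helper: clone one local repository two ways.
clone_both_ways() {
    src=$1
    # Plain path: objects are hardlinked when possible.
    git clone -q "$src" "$2" &&
    # --no-local: regular transport, so the clone receives a new pack.
    git clone -q --no-local "$src" "$3"
}
```

Running "git count-objects -v" inside each result is one way to see
how the objects ended up being stored.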

But you don't need to copy the whole repository unless you want to.
On a client, you can retrieve just the one stripped-down branch with
something like this:

   mkdir myproj
   cd myproj
   git init
   git fetch server:whatever.git my-stripped-down-branchname
   git checkout -b master FETCH_HEAD

Have fun,

Avery
