From: Elijah Newren <newren@gmail.com>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: git <git@vger.kernel.org>
Subject: Re: [RFC PATCH 01/15] README-sparse-clone: Add a basic writeup of my ideas for sparse clones
Date: Sat, 4 Sep 2010 21:13:13 -0600
Message-ID: <AANLkTikxCwSOBVREuRc7sShJahKR5FXWdaW79f_K36bU@mail.gmail.com>
In-Reply-To: <AANLkTikx89M+JcOcabU3TazGB=k8x39QLbVe7sH7Vvaa@mail.gmail.com>
On Sat, Sep 4, 2010 at 9:01 PM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> On Sun, Sep 5, 2010 at 10:13 AM, Elijah Newren <newren@gmail.com> wrote:
>> +To ensure minimum necessary connectivity, we also download basic
>> +information from otherwise excluded commits
>> + * parents of these commits
>> + * trees matching the specified sparse path(s)
>> +but, for security and space reasons, do not download
>> + * author
>> + * author date
>> + * committer
>> + * committer date
>> + * log message
>> +Such commits are still considered "missing" (see item I4 for more
>> +details about how we handle "missing" commits).
>
> Just an observation. When I ran pack-objects with irrelevant commits
> removed (i.e. try_to_simplify_commit) on Documentation/, I got a 6MB
> pack. When I ran it without commit simplification, I got a 16MB pack.
> That's 10MB larger.
Hmm. I get a 22 MB pack for a full clone of git.git and a 13 MB pack
for a sparse clone with Documentation/; that includes all commits too.
I wonder why our numbers differ by 3 MB (your 16 MB versus my 13 MB).
Weird.
> Now I don't know how much of that 10MB share is commit messages, authors,
> committers and trees but I suspect trees take a large part in it.
> Maybe you can just fake the trees in those fake commits as well, to
> avoid downloading more trees.
I originally planned to do that, but that makes working with tags and
branches difficult. For example, documenters could clone a repository
but be unable to make new commits on top of maint or master since we
probably wouldn't have trees for the tips of those branches.
So I really think trees are needed. Of course, someone making a
sparse clone of "just" the Documentation directory will still need the
toplevel trees, but they won't need trees under t/, for example. So
they do get some savings.
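To make the tree argument concrete, here is a toy sketch in Python. It is not Git code: the repository layout, the file names, and the helper trees_needed() are all hypothetical, and a tree is modelled as a plain nested dict. It only illustrates the point above, that a sparse clone limited to Documentation/ still needs the toplevel tree (so new commits can be built on top of a branch tip) plus the trees under Documentation/, while trees under t/ can be skipped.

```python
# Toy model only: directories are dicts, files (blobs) are None.
# The layout and file names below are made up for illustration.
repo = {
    "Makefile": None,
    "Documentation": {
        "git.txt": None,
        "howto": {"a.txt": None},
        "technical": {"b.txt": None},
    },
    "t": {"test-lib.sh": None, "t0000": {"x": None}},
    "builtin": {"clone.c": None},
}


def trees_needed(root, sparse_path):
    """Return the directory paths (trees) needed for one sparse path.

    That is: every tree on the way from the toplevel down to
    sparse_path, plus every tree beneath sparse_path. The toplevel
    tree is represented by the empty string.
    """
    needed = {""}  # the toplevel tree is always required
    node, prefix = root, ""
    for part in (p for p in sparse_path.split("/") if p):
        node = node.get(part)
        if not isinstance(node, dict):
            return needed  # path is absent or names a blob
        prefix = f"{prefix}{part}/"
        needed.add(prefix)

    # Collect every subtree below the sparse path.
    stack = [(prefix, node)]
    while stack:
        base, tree = stack.pop()
        for name, child in tree.items():
            if isinstance(child, dict):
                child_path = f"{base}{name}/"
                needed.add(child_path)
                stack.append((child_path, child))
    return needed


sparse = trees_needed(repo, "Documentation/")
full = trees_needed(repo, "")  # empty path: every tree in the toy repo
print(sorted(sparse))
print(f"{len(sparse)} of {len(full)} trees needed")
```

In this toy layout the sparse set leaves out t/, t/t0000/ and builtin/, which is where the (modest) savings come from; the actual numbers for git.git depend on the real history and are not modelled here.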