From: Maaartin <grajcar1@seznam.cz>
To: git@vger.kernel.org
Subject: Re: Resumable clone/Gittorrent (again)
Date: Wed, 5 Jan 2011 23:28:11 +0000 (UTC)
Message-ID: <loom.20110105T222915-261@post.gmane.org>
In-Reply-To: AANLkTinUV9Z_w85Gz13J+bm8xqnxJ9jBJXJm9bn5Y2ec@mail.gmail.com
Nguyen Thai Ngoc Duy <pclouds <at> gmail.com> writes:
> I've been analyzing bittorrent protocol and come up with this. The
> last idea about a similar thing [1], gittorrent, was given by Nicolas.
> This keeps close to that idea (i.e the transfer protocol must be around git
> objects, not file chunks) with a bit difference.
>
> The idea is to transfer a chain of objects (trees or blobs), including
> base object and delta chain. Objects are chained in according to
> worktree layout, e.g. all objects of path/to/any/blob will form a
> chain, from a commit tip down to the root commits. Chains can have
> gaps, and don't need to start from commit tip. The transfer is
> resumable because if a delta chain is corrupt at some point, we can
> just request another chain from where it stops. Base object is
> obviously resumable.
I may be talking nonsense, please bear with me.
I'm not sure this works well, since chains defined this way change over time.
I may request commits A and B while declaring that I possess commits C and D.
One server may be ahead of A; should it send me more data, or repack the chain
so that the non-requested versions get excluded? At the same time, the server
may be missing B and possess only some of its ancestors. Should it send me only
a part of the chain, or should I rather ask a different server?
Moreover, if a directory gets renamed, its content may get transferred
needlessly. This is probably not a big problem.
I haven't read the whole other thread yet, but what about going the other way
round? Use a single commit as a chain, and create deltas assuming that all
ancestors are already available. The packs may arrive out of order, so the
decompression may have to wait. The number of commits may be an order of
magnitude larger than the number of paths (there are currently 2254 paths
and 24235 commits in git.git), so grouping consecutive commits into one larger
pack may be useful.
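
To illustrate the "decompression may have to wait" point, here is a minimal
sketch of a receiver that buffers per-commit packs until all of their parents
have been unpacked. It is only a toy model under stated assumptions: the
function name apply_when_ready, the commit ids and the parents mapping are
made up for the example and are not part of git.

```python
def apply_when_ready(arrivals, parents):
    """Return (order, waiting): the order in which per-commit packs can be
    unpacked, and the packs still blocked on missing ancestors.

    arrivals -- commit ids in the order their packs arrive (hypothetical ids)
    parents  -- dict mapping each commit id to a list of its parent ids
    """
    applied = set()   # commits whose packs have been unpacked
    waiting = set()   # packs received but blocked on a missing parent
    order = []
    for commit in arrivals:
        waiting.add(commit)
        progress = True
        while progress:
            progress = False
            # sorted() copies the set, so removing inside the loop is safe
            for c in sorted(waiting):
                if all(p in applied for p in parents[c]):
                    waiting.remove(c)
                    applied.add(c)
                    order.append(c)
                    progress = True
    return order, waiting


# Toy history: a is the root, b and c branch off a, d merges b and c.
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
order, waiting = apply_when_ready(["d", "c", "a", "b"], parents)
print(order, waiting)
```

In this toy run the pack for d arrives first but is unpacked last, because it
has to wait for both of its parents.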
The advantage is that the packs stay stable over time: you may create them
using the most aggressive and time-consuming settings and store them forever.
You could create packs for single commits, packs for non-overlapping
consecutive pairs of them, for non-overlapping pairs of pairs, etc. I mean, with
commits numbered 0, 1, 2, ..., create packs [0,1], [2,3], ..., [0,3], [4,7],
etc. The reason for this is to allow reading groups of commits from different
servers so that they fit together (similar to buddy memory allocation). Of
course, things like branches bring chaos into this simple scheme, but I'm sure
this can be solved somehow.
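
The numbering scheme above can be sketched as follows, assuming a strictly
linear history with commits numbered from 0 (so branches are ignored). The
helper names aligned_packs and cover are hypothetical, chosen only for this
illustration. The first one lists the packs of a given level, the second one
splits an arbitrary range of commits into the largest aligned packs, much like
a buddy allocator would.

```python
def aligned_packs(num_commits, level):
    """List the packs of one level as inclusive (first, last) ranges.

    A pack of level k covers 2**k consecutive commits and starts at a
    multiple of 2**k; a trailing incomplete group gets no pack.
    """
    size = 1 << level
    return [(lo, lo + size - 1)
            for lo in range(0, num_commits - size + 1, size)]


def cover(first, last):
    """Split the inclusive range [first, last] into the largest aligned
    power-of-two blocks, each of which corresponds to one stable pack."""
    blocks = []
    pos = first
    while pos <= last:
        size = 1
        # Double the block while it stays aligned at pos and fits the range.
        while pos % (size * 2) == 0 and pos + size * 2 - 1 <= last:
            size *= 2
        blocks.append((pos, pos + size - 1))
        pos += size
    return blocks


print(aligned_packs(8, 1))  # pairs
print(aligned_packs(8, 2))  # pairs of pairs
print(cover(3, 10))         # packs needed for commits 3..10
```

Since every server would cut the history at the same boundaries, a client
could fetch [4,7] from one server and [8,9] from another and the pieces would
still fit together.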
Another problem is the client requesting commits A and B while declaring to
possess commits C and D. When both C and D are ancestors of either A or B, you
can ignore it (as you assume this while packing, anyway). The other case is
less probable, unless e.g. C is the master and A is a development branch.
Currently, I have no idea how to optimize this or whether it could be
important.
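
The easy case above (every declared commit is an ancestor of some requested
commit) can be checked with a plain graph walk. This is a minimal sketch on a
toy history, not how git negotiates; the names ancestors and
haves_are_covered, the parents mapping and the commit labels are assumptions
made for the example.

```python
def ancestors(commit, parents):
    """Return the set of all proper ancestors of commit.

    parents -- dict mapping each commit id to a list of its parent ids
    """
    seen = set()
    stack = list(parents.get(commit, []))
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def haves_are_covered(wants, haves, parents):
    """True if every commit the client has is an ancestor of some commit
    it wants, i.e. the case where the declaration can be ignored."""
    reachable = set()
    for w in wants:
        reachable |= ancestors(w, parents)
    return all(h in reachable for h in haves)


# Toy history: root <- C <- D <- A, C <- B, and an unrelated branch root <- X.
parents = {"root": [], "C": ["root"], "D": ["C"],
           "A": ["D"], "B": ["C"], "X": ["root"]}
print(haves_are_covered(["A", "B"], ["C", "D"], parents))
print(haves_are_covered(["A"], ["X"], parents))
```

The second call corresponds to the less probable case, where the client holds
a commit that is not part of the requested history.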
I see no disadvantage compared to path-based chains, but I am probably
overlooking something obvious.