From: Luke Kenneth Casson Leighton <luke.leighton@gmail.com>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: Nicolas Pitre <nico@fluxnic.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Resumable clone/Gittorrent (again)
Date: Sun, 9 Jan 2011 13:55:04 +0000
Message-ID: <AANLkTinwb8orMBjcQjK0ogXd6rMEtRwT8SV41k8D3AXL@mail.gmail.com>
In-Reply-To: <AANLkTi=KPVMEviQhyJeWHynPa2q6NJpQ2VyAhbRcmQ1D@mail.gmail.com>
On Sun, Jan 9, 2011 at 3:34 AM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> On Sun, Jan 9, 2011 at 12:21 AM, Luke Kenneth Casson Leighton
> <luke.leighton@gmail.com> wrote:
>> ok - you haven't answered the question: are the chains perfectly
>> fixed identical sizes?
>
> No.
>
>> if so they can be slotted into the bittorrent protocol by simply
>> pre-selecting the size to match. with the downside that if there are
>> a million such "chains" you now pretty much overwhelm the peers with
>> the amount of processing, network traffic and memory requirements to
>> maintain the "pieces" map.
>
> No, there are thousands of them only (less than 100k for repos I
> examined). It's precisely the reason I stay away from commits as
> pieces because commits can potentially go up to millions.
ok - thousands is still a lot. i recommend that you examine:
* the heuristics algorithm in bittorrent for piece-selection
* large repositories such as webkit (1.2GB) and the linux kernel (600MB)
you still have to come up with a mapping from "chains" to "pieces".
in the bittorrent protocol the mapping is done *entirely* implicitly
and algorithmically. the "meta" info in the .torrent contains
filenames and file lengths. stack the files one after the other in
one big long data block, get a chopper and just go "whack, whack,
whack" at regular piece-length intervals: those are your "pieces".
a piece can therefore straddle file boundaries, so reassembly is a
complete bitch, and picking just _one_ file to download rather than
the whole lot becomes a total pain.
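
a rough sketch of that implicit mapping, for illustration only (this
is not taken from any bittorrent client; the names piece_spans and
pieces_for_file and the example file lengths are made up): given just
the file lengths and the piece length, it works out which slices of
which files each piece covers, and which pieces you have to fetch to
get one file.

```python
from bisect import bisect_right
from itertools import accumulate


def piece_spans(file_lengths, piece_length):
    """For each piece, list the (file_index, offset_in_file, nbytes) slices it covers."""
    # starts[i] is where file i begins in the virtual concatenated block;
    # starts[-1] is the total length of all files.
    starts = [0] + list(accumulate(file_lengths))
    total = starts[-1]
    pieces = []
    for begin in range(0, total, piece_length):
        end = min(begin + piece_length, total)  # only the last piece may be short
        spans = []
        pos = begin
        while pos < end:
            idx = bisect_right(starts, pos) - 1   # file that contains byte `pos`
            take = min(end, starts[idx + 1]) - pos
            spans.append((idx, pos - starts[idx], take))
            pos += take
        pieces.append(spans)
    return pieces


def pieces_for_file(file_lengths, piece_length, file_index):
    """Indices of every piece that holds at least one byte of the given file."""
    return [
        n
        for n, spans in enumerate(piece_spans(file_lengths, piece_length))
        if any(idx == file_index for idx, _, _ in spans)
    ]


if __name__ == "__main__":
    lengths = [5, 12, 3]  # hypothetical file lengths, in bytes
    for n, spans in enumerate(piece_spans(lengths, 8)):
        print(n, spans)
    print(pieces_for_file(lengths, 8, 1))
```

with these made-up lengths the middle file is only 12 bytes long, but
it touches all three pieces, so fetching it alone still means pulling
down all 20 bytes and throwing away the parts that belong to its
neighbours.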
why the bloody hell the bittorrent protocol doesn't just have a file
id i _really_ don't know, it would have made things a damn sight
easier. anyway - if you're going to modify and "be inspired by" the
bittorrent protocol, you really should look at adding some sort of
"chain" identification - f*** the "chains"-to-"pieces" algorithm, just
add a unique chain id to the relevant bittorrent[-like] command.
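
to make the suggestion concrete, here is a sketch of what such a
command could look like. the first function packs the standard
bittorrent "request" message (message id 6, then piece index, begin
and length as 4-byte big-endian integers). everything about the second
one is an assumption made up for this example: the message id 0x40,
the name CHAIN_REQUEST, and the idea that a chain is named by a
20-byte id such as a SHA-1. nothing like it exists in any real
protocol.

```python
import hashlib
import struct

BT_REQUEST = 6        # "request" message id in the standard peer wire protocol
CHAIN_REQUEST = 0x40  # HYPOTHETICAL message id, invented for this sketch


def pack_bt_request(index, begin, length):
    """Standard request: the peer must derive files from the piece index itself."""
    payload = struct.pack(">BIII", BT_REQUEST, index, begin, length)
    return struct.pack(">I", len(payload)) + payload


def pack_chain_request(chain_id, begin, length):
    """Hypothetical request that names the chain directly by a 20-byte id."""
    if len(chain_id) != 20:
        raise ValueError("chain_id must be exactly 20 bytes")
    payload = struct.pack(">B20sII", CHAIN_REQUEST, chain_id, begin, length)
    return struct.pack(">I", len(payload)) + payload


def unpack_chain_request(msg):
    """Inverse of pack_chain_request; returns (chain_id, begin, length)."""
    (size,) = struct.unpack(">I", msg[:4])
    msg_id, chain_id, begin, length = struct.unpack(">B20sII", msg[4:4 + size])
    if msg_id != CHAIN_REQUEST:
        raise ValueError("not a chain request")
    return chain_id, begin, length


if __name__ == "__main__":
    # stand-in id; how a real chain would be named is left open here
    chain_id = hashlib.sha1(b"example chain").digest()
    msg = pack_chain_request(chain_id, 0, 16384)
    print(len(msg), unpack_chain_request(msg) == (chain_id, 0, 16384))
```

the point of the explicit id is that chains can then be any length:
the receiver looks the chain up by name instead of computing which
files a fixed-size piece index happens to land on.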
>> if not then you now need to modify the bittorrent protocol to cope
>> with variable-length block sizes: the protocol only allows for the
>> last block to be of variable-length.
>
> Ah I see. I do not reuse bittorrent code out there. Just its ideas,
> adapted to git model.
that's hard work and you're now into "unproven" territory. in the
successful R&D proof-of-concept code that i wrote, i _deliberately_
stayed away from "adapting" the proven bittorrent protocol, and as a
result managed to get that proof-of-concept up and running within ...
i think it was... 3 days. most of the time was spent arseing about
adding a VFS layer to bittornado, in order to library-ise it.
i mention that just to give you something to think about. if you're
up to the challenge of writing your own p2p protocol, however, GREAT!
you'll become a world expert on _both_ peer-to-peer protocols _and_
git :)
l.