git.vger.kernel.org archive mirror
From: "V.Krishn" <vkrishn4@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: Stalled git cloning and possible solutions
Date: Wed, 4 Sep 2013 06:36:48 +0530
Message-ID: <201309040636.48290.vkrishn4@gmail.com>
In-Reply-To: <xmqqvc2o16pn.fsf@gitster.dls.corp.google.com>

On Friday, August 30, 2013 03:48:44 AM you wrote:
> "V.Krishn" <vkrishn4@gmail.com> writes:
> > On Friday, August 30, 2013 02:40:34 AM you wrote:
> >> V.Krishn wrote:
> >> > Quite sometimes when cloning a large repo stalls, hitting Ctrl+c
> >> > cleans what been downloaded, and process needs re-start.
> >> > 
> >> > Is there a way to recover or continue from already downloaded files
> >> > during cloning ?
> >> 
> >> No, sadly.  The pack sent for a clone is generated dynamically, so
> >> there's no easy way to support the equivalent of an HTTP Range request
> >> to resume.  Someone might implement an appropriate protocol extension
> >> to tackle this (e.g., peff's seed-with-clone.bundle hack) some day,
> >> but for now it doesn't exist.
> > 
> > This is what I tried but then realized something more is needed:
> > 
> > During stalled clone avoid  Ctrl+c.
> > 1. Copy the content .i.e .git folder some other place.
> > 2. cd <new dir>
> > 3. git config fetch.unpackLimit 999999
> > 4. git config transfer.unpackLimit 999999
> 
> These two steps will not help, as negotiation between the sender and
> the receiver is based on the commits that are known to be complete,
> and an earlier failed "fetch" will not (and should not) update refs
> on the receiver's side.
> 
> >> What you *can* do today is create a bundle from the large repo
> >> somewhere with a reliable connection and then grab that using a
> >> resumable transport such as HTTP.
> 
> Yes.
> 
> Another possibility is, if the project being cloned has a tag (or a
> branch) that points at a commit back when it was smaller, do this
> 
> 	git init x &&
> 	cd x &&
> 	git fetch $that_repository $that_tag:refs/tags/back_then_i_was_small
> 
> to prime the object store of a temporary repository 'x' with a
> hopefully smaller transfer, and then use it as a "--reference"
> repository to the real clone.
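
The two-step approach quoted above can be sketched end to end as follows. This is only an illustration under stated assumptions: the "upstream" repository is a local stand-in created in a temporary directory, and the names `upstream`, `real`, `v0.1` and the helper `g` are hypothetical. With a real project, `$work/upstream` would be the remote URL and `v0.1` an early tag.

```shell
#!/bin/sh
# Sketch: prime a temporary repository with old history, then use it
# as a --reference for the real clone. All names here are hypothetical.
set -e
work=$(mktemp -d)

# Stand-in for the large upstream repository: an old tag plus newer history.
git init -q "$work/upstream"
g() { git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com "$@"; }
g commit -q --allow-empty -m "early history"
g tag v0.1
g commit -q --allow-empty -m "later history"

# Step 1: fetch only the old tag into a temporary repository 'x'.
git init -q "$work/x"
git -C "$work/x" fetch -q "$work/upstream" v0.1:refs/tags/back_then_i_was_small

# Step 2: do the real clone, borrowing objects already present in 'x'.
git clone -q --reference "$work/x" "$work/upstream" "$work/real"

# The clone records 'x' as an alternate object store.
cat "$work/real/.git/objects/info/alternates"
```

The resulting clone keeps borrowing objects from `x` through `objects/info/alternates`, so `x` has to stay in place unless the clone is later made self-contained, for example with `git repack -a -d` followed by removing the alternates file.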

What additional files or information would be needed?
I noticed that tmp_pack_xxxxxx may not contain any objects of type commit or tree.
Do I need to manually create .git/refs?

I was wondering whether the following would further help with recovery.

A
1. If the pack file were created in commit-history (date) order, i.e.
   blob+commit+tree....tags...+blob+commit+tree,
   and if an idx, or at least a tmp idx, were created in parallel.
2. Update the other files in the .git dir before the pack process
   (as stated in my previous email).
3. Name objects as datestamp(epoch)+sha1
   and store them in an epoch directory (date format can be yymmdd).
   (This might break backward compatibility.)
4. Add "git fsck --defrag [1..4]".
   # This could take another parameter, such as a level,
   # applying various heuristic optimizations.

B
Another option would be:
git clone <url> --use-method=rsync
This would transfer the necessary files in the .git dir as they are,
and run `git gc` or any other housekeeping upon completion.
This method would allow resuming.
Cons:
  Any change to a pack file on the server during the download becomes a potential issue.

Clone resume may not be a priority, but if minor changes can help with
recovery, that would be nice.

I still like the bundle method, if git services made it easy.
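
For reference, the bundle method mentioned earlier in the thread might look roughly like this. It is a sketch under assumptions, not a description of any hosting service: the "server" repository is a local stand-in, the names `big`, `big.bundle` and `clone` are hypothetical, and the resumable HTTP download is shown only as a comment with a made-up URL.

```shell
#!/bin/sh
# Sketch: clone via a bundle file, then switch to the real repository
# for incremental fetches. All paths and names are hypothetical.
set -e
work=$(mktemp -d)

# Stand-in for the large repository on a machine with a reliable connection.
git init -q "$work/big"
g() { git -C "$work/big" -c user.name=demo -c user.email=demo@example.com "$@"; }
g commit -q --allow-empty -m "initial commit"
branch=$(git -C "$work/big" symbolic-ref --short HEAD)

# Pack all refs into one static file that any web server can host.
git -C "$work/big" bundle create -q "$work/big.bundle" --all

# In real use the file would be downloaded with a resumable tool, e.g.
#   curl -C - -O https://example.com/big.bundle   # hypothetical URL
# Here the local file is used directly.
git clone -q "$work/big.bundle" "$work/clone"

# Upstream moves on after the bundle was made.
g commit -q --allow-empty -m "newer commit"

# Point origin at the real repository; only the missing history is fetched.
git -C "$work/clone" remote set-url origin "$work/big"
git -C "$work/clone" fetch -q origin
git -C "$work/clone" log --oneline "origin/$branch"
```

Because the bundle is an ordinary static file, an interrupted download can be continued instead of restarted, and the later `git fetch` transfers only what the bundle did not already contain.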

-- 
Regards.
V.Krishn

Thread overview: 8+ messages
2013-08-29 19:48 Stalled git cloning and possible solutions V.Krishn
2013-08-29 21:10 ` Jonathan Nieder
2013-08-29 21:35   ` V.Krishn
2013-08-29 22:18     ` Junio C Hamano
2013-08-29 22:28       ` V.Krishn
2013-09-04  1:06       ` V.Krishn [this message]
2013-08-30 12:17   ` Duy Nguyen
2013-08-30 12:41     ` Duy Nguyen
