From: "V.Krishn" <vkrishn4@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: Stalled git cloning and possible solutions
Date: Wed, 4 Sep 2013 06:36:48 +0530
Message-ID: <201309040636.48290.vkrishn4@gmail.com>
In-Reply-To: <xmqqvc2o16pn.fsf@gitster.dls.corp.google.com>

On Friday, August 30, 2013 03:48:44 AM you wrote:
> "V.Krishn" <vkrishn4@gmail.com> writes:
> > On Friday, August 30, 2013 02:40:34 AM you wrote:
> >> V.Krishn wrote:
> >> > Quite sometimes when cloning a large repo stalls, hitting Ctrl+c
> >> > cleans what been downloaded, and process needs re-start.
> >> > 
> >> > Is there a way to recover or continue from already downloaded files
> >> > during cloning ?
> >> 
> >> No, sadly.  The pack sent for a clone is generated dynamically, so
> >> there's no easy way to support the equivalent of an HTTP Range request
> >> to resume.  Someone might implement an appropriate protocol extension
> >> to tackle this (e.g., peff's seed-with-clone.bundle hack) some day,
> >> but for now it doesn't exist.
> > 
> > This is what I tried but then realized something more is needed:
> > 
> > During stalled clone avoid  Ctrl+c.
> > 1. Copy the content .i.e .git folder some other place.
> > 2. cd <new dir>
> > 3. git config fetch.unpackLimit 999999
> > 4. git config transfer.unpackLimit 999999
> 
> These two steps will not help, as negotiation between the sender and
> the receiver is based on the commits that are known to be complete,
> and an earlier failed "fetch" will not (and should not) update refs
> on the receiver's side.
> 
> >> What you *can* do today is create a bundle from the large repo
> >> somewhere with a reliable connection and then grab that using a
> >> resumable transport such as HTTP.
> 
> Yes.
> 
> Another possibility is, if the project being cloned has a tag (or a
> branch) that points at a commit back when it was smaller, do this
> 
> 	git init x &&
> 	cd x &&
> 	git fetch $that_repository $that_tag:refs/tags/back_then_i_was_small
> 
> to prime the object store of a temporary repository 'x' with a
> hopefully smaller transfer, and then use it as a "--reference"
> repository to the real clone.
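
The suggestion quoted above can be tried end to end on a local machine. The
following is only a minimal sketch under stated assumptions: the directory
names `upstream` and `real`, the tag `v0.1`, and the throwaway commits are
hypothetical stand-ins for a real large project and its early tag.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the large upstream repository.
git init -q upstream
cd upstream
git config user.email "you@example.com"
git config user.name "Example"
echo one > file && git add file && git commit -q -m "early history"
git tag v0.1                       # the tag from "back when it was small"
echo two >> file && git commit -q -a -m "later history"
cd ..

# Step 1: prime the temporary repository 'x' with the smaller transfer.
git init -q x
(cd x && git fetch -q ../upstream v0.1:refs/tags/back_then_i_was_small)

# Step 2: do the real clone, borrowing the objects already present in 'x'.
git clone -q --reference x "file://$work/upstream" real
```

The clone records 'x' in `real/.git/objects/info/alternates`, so 'x' has to
stay in place for as long as the real clone depends on it.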

What additional files or information would be needed?
I noticed that tmp_pack_xxxxxx may not contain any objects of type commit or tree.
Do I need to create .git/refs.. manually?

I was wondering whether the following would further help with recovery.

A
1. The pack file could be created in commit-history (date) order, i.e.
   blob+commit+tree....tags...+blob+commit+tree,
   with the idx (or at least a temporary idx) created in parallel.
2. Update the other files in the .git dir before the pack process
   (as stated in my previous email).
3. Objects could be named datestamp(epoch)+sha1
   and stored in a per-epoch directory (the date format could be yymmdd).
   (This might break backward compatibility.)
4. Add "git fsck --defrag [1..4]",
   where the extra parameter is a level that selects
   various heuristic optimizations.

B
Another option would be:
git clone <url> --use-method=rsync
This would transfer the necessary files in the .git dir as they are,
and run `git gc` or other housekeeping upon completion.
This method would allow resuming.
Cons:
  Any change to a pack file on the server during the download becomes a potential issue.

Resuming a clone may not be a priority, but if minor changes can help with
recovery, that would be nice.

I would still prefer the bundle method if git services made it easy.
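
For reference, the bundle method mentioned earlier in the thread could look
roughly like this. It is a sketch only: the names `big`, `big.bundle` and
`restored` are made up for illustration, and the download step is skipped
because everything here is local.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the large repository.
git init -q big
cd big
git config user.email "you@example.com"
git config user.name "Example"
echo data > file && git add file && git commit -q -m "initial"

# On a machine with a reliable connection: write every ref into one file
# and check that the result is a valid bundle.
git bundle create ../big.bundle --all
git bundle verify ../big.bundle
cd ..

# The bundle is a static file, so over HTTP it could be fetched with a
# resumable downloader (for example `wget -c`). Here it is already local.

# Clone from the bundle, then point origin at the real repository so that
# later fetches only transfer what the bundle did not contain.
git clone -q big.bundle restored
git -C restored remote set-url origin "file://$work/big"
git -C restored fetch -q origin
```

Because the bundle is an ordinary file, an interrupted download can continue
from where it stopped, which is not possible with the dynamically generated
pack of a normal clone.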

-- 
Regards.
V.Krishn

