From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: James Pickens <jepicken@gmail.com>, Git ML <git@vger.kernel.org>
Subject: Re: 'git clone' doesn't use alternates automatically?
Date: Mon, 2 Feb 2009 08:07:55 -0500
Message-ID: <20090202130755.GA8487@sigio.peff.net>
In-Reply-To: <7v7i4b2bto.fsf@gitster.siamese.dyndns.org>

On Sat, Jan 31, 2009 at 05:19:31PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >   - without either, copy alternates from origin, but _don't_ use
> >     alternates while cloning
> 
> Are you talking about a local clone optimization that does hardlink from
> the source repository?

Sorry, I was wrong about what was happening. From reading James' posts
and not doing any experimenting or looking, I had the impression that
doing this:

  # plain repo
  mkdir repo1 &&
    (cd repo1 && git init &&
     echo content >file && git add . && git commit -m one)

  # repo with alternates, but extra content
  git clone -s repo1 repo2 &&
    (cd repo2 &&
     echo content >>file && git commit -a -m two)

  # clone of repo w/ alternates
  git clone repo2 repo3

would cause the final clone to set up the alternate to repo1, but still
pull in the objects. But that isn't the case, of course. Either:

  1. It is a local hardlink clone, in which case we just pull in the
     objects from repo2.

  2. It isn't, in which case we don't copy over the alternates.

> I am fairly certain that copying alternates from the source repository was
> not an intended behaviour but was a consequence of lazy coding of how we
> copy (or link) everything from it.  The original was literally the simple
> matter of:
> 
>     find objects ! -type d -print | cpio $cpio_quiet_flag -pumd$l "$GIT_DIR/"
> 
> whose intention was to copy objects/?? and objects/pack/. and it wasn't
> even part of the design consideration to worry about what would happen to
> the alternates the source repository might have in objects/info/.

Right, I think that is what is going on. And what I was suggesting in my
other email is that it is actively harmful to have this behavior,
because now repo3 depends on repo1, without the user having explicitly
asked for such a relationship (and they might not even be aware of
repo1).

I was tempted to suggest avoiding copying the alternates from repo2
to repo3. But you can't do that: repo2 is itself _missing_ the objects
it borrows from repo1, so repo3 won't have them either. Without the
alternates file pointing to repo1, repo3 is corrupt. So simply avoiding
copying the alternates file doesn't work; one would have to actually
pull the missing objects in from the alternate before doing so.
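
As a rough sketch of "pull the missing objects in first", and not
something proposed in this thread: the git-clone documentation for
"-s" suggests "git repack -a" as the way to copy borrowed objects into
a repository. The helper name "dissociate" below is made up for
illustration, and it assumes nothing else is using the repository
while it runs.

```shell
# Hypothetical helper: make the repository at "$1" self-contained.
# Repack every reachable object (including ones borrowed through
# alternates) into the repository's own pack, then drop the pointer.
dissociate () {
    (
        cd "$1" &&
        git repack -a -d -q &&
        rm -f "$(git rev-parse --git-dir)/objects/info/alternates"
    )
}
```

Only after a step like this would it be safe to copy or hardlink the
object directory without its alternates file.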

But actually, I think there is even more breakage in hardlinking the
alternates file: alternates files can be relative paths. So if repo2
points to "../../../repo1/.git/objects" (which it doesn't in the example
above, as "clone -s" uses absolute paths -- but it is easy enough to
construct a broken case), then repo3 will gain that alternate pointer,
but may be in a totally different directory where that relative path is
broken. And then repo3 is corrupt. So the alternates must be copied and
any relative paths munged for it to work reliably.
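
To make the "munging" concrete, here is a minimal sketch that rewrites
relative alternates entries as absolute paths. It relies on relative
entries being interpreted relative to the object directory that holds
the alternates file. The function name "absolute_alternates" is
hypothetical; this is not how git itself does it.

```shell
# Hypothetical helper: print the alternates of the object directory
# "$1" (e.g. repo2/.git/objects), with relative entries made absolute.
absolute_alternates () {
    objdir=$(cd "$1" && pwd) || return 1
    alt="$objdir/info/alternates"
    test -f "$alt" || return 0    # no alternates: nothing to print
    while IFS= read -r line || test -n "$line"
    do
        case "$line" in
        '') ;;                            # skip blank lines
        /*) printf '%s\n' "$line" ;;      # already absolute
        *)
            # relative entry: resolve it against the object directory
            (cd "$objdir/$line" && pwd) ||
                echo "broken alternate: $line" >&2
            ;;
        esac
    done <"$alt"
}
```

Writing the output of "absolute_alternates repo2/.git/objects" into
repo3's objects/info/alternates, rather than linking the original
file, would keep the pointer valid wherever repo3 lives.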

The hardlink code operates by default because it was thought to be a
safe optimization that couldn't bite people. But it interacts badly with
the concept of alternates. So I think a sane fix would be to disable
hardlinking if the parent repo is using alternates at all. Then a
vanilla "git clone repo2 repo3" will do the safe but more costly
behavior of actually copying the objects. If the user wants to accept
the risks of alternates, then he can give "-s" explicitly, and git will
track the alternates recursively through repo2 to repo1 at runtime.
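
Until clone does something like that by itself, a user-side
approximation might look like the sketch below. It assumes a git whose
clone understands "--no-local" (use the regular transport even for a
local path); the helper name "clone_option_for" is made up for
illustration.

```shell
# Hypothetical helper: given a source object directory, print the
# "git clone" option that avoids inheriting its alternates.
clone_option_for () {
    if test -s "$1/info/alternates"
    then
        # source borrows objects: use the normal transport, so every
        # object is copied and no alternates file is carried over
        echo --no-local
    else
        # self-contained source: the local optimization is fine
        echo --local
    fi
}

# Usage (assumes a non-bare source with a .git directory):
#   git clone $(clone_option_for repo2/.git/objects) repo2 repo3
```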

-Peff