From: Mike Hommey <mh@glandium.org>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: fast-import deltas
Date: Tue, 1 Apr 2014 22:07:03 +0900
Message-ID: <20140401130703.GA1479@glandium.org>
In-Reply-To: <20140401114502.GA15549@sigill.intra.peff.net>

On Tue, Apr 01, 2014 at 07:45:03AM -0400, Jeff King wrote:
> On Tue, Apr 01, 2014 at 07:25:54PM +0900, Mike Hommey wrote:
> 
> > I am currently prototyping a "native" mercurial remote handler for git,
> 
> For my own curiosity, how does this differ from what is in
> contrib/remote-helpers/git-remote-hg?

contrib/remote-helpers/git-remote-hg does a local Mercurial clone before
doing the git conversion. While this is not really a problem for most
Mercurial projects, it tends to be slow with big ones, like the Firefox
source code. What I'm aiming at is something that can talk directly to a
remote Mercurial server.

> > Would adding a fast-import command to handle deltas be considered useful
> > for git? If so, what kind of format would be suitable?
> 
> It breaks fast-import's "lowest common denominator" data model that is
> just passing commits and their contents over the stream. But we already
> do that in other cases for the sake of performance. I think the
> important thing is that the alternate formats are optional and enabled
> by the caller with command-line options.
> 
> That being said, I foresee a few complications:
> 
>   1. Git needs to know the sha1 of the full object. So unless the
>      generating script knows that ahead of time, git has to expand the
>      delta immediately anyway (this could still be a win if we end up
>      using a good delta from elsewhere rather than doing our own delta
>      search, but I suspect it's not so big a win as if we can just blit
>      the delta straight to disk).

Good point. That could quickly become a problem with long delta chains.
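
To illustrate why the full object is needed: a blob's name is the SHA-1 of
a short header followed by the complete content, so a delta on its own does
not determine it. A minimal sketch in Python, assuming SHA-1 object names;
`git_blob_id` is a name chosen here for illustration, not a git API:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object name git assigns to a blob with this content.

    The hash covers "blob <size>\\0" plus the whole content, so every
    delta in a chain has to be expanded before its id can be computed.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Should match `printf 'hello\n' | git hash-object --stdin`.
    print(git_blob_id(b"hello\n"))
```

With a chain of N deltas, this means N full expansions just to name the
objects, unless the generating side already knows the ids.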

>   2. Git does not store on-disk deltas between objects that are not in
>      the same packfile. So you'd only be able to delta against an object
>      that came in the same stream (or you'd have to "fix" the packfile
>      on disk by adding an extra copy of the delta base, but that
>      probably eliminates any savings).

Arguably, this would make the most difference on the initial clone of big
projects, or on large incremental updates (say, after a few weeks), which
would use a single pack anyway.

> As for format, I believe that git is basically xdelta under the hood, so
> you'd want either that or something that can be trivially converted to
> it.

It seems to me fast-import keeps a kind of human-readable format for its
protocol, so I wonder whether the xdelta format would fit the bill. That
being said, I also wonder if I shouldn't just try to write a pack on my own...
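
For reference, the delta encoding git uses inside packfiles is a binary
copy/insert format: two size varints, then a sequence of "copy from base"
and "insert literal" instructions. The sketch below, in Python, applies such
a delta to a base. The helper names `read_varint` and `apply_delta` are
made up for this example, and this is only an illustration of the encoding,
not a proposal for what a fast-import command would accept:

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read a little-endian base-128 size; return (value, next position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Expand a pack-style delta against its base object."""
    base_size, pos = read_varint(delta, 0)
    result_size, pos = read_varint(delta, pos)
    if base_size != len(base):
        raise ValueError("delta was made against a different base")

    out = bytearray()
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:
            # Copy instruction: low 4 bits select offset bytes,
            # next 3 bits select size bytes (both little-endian).
            offset = size = 0
            for i in range(4):
                if cmd & (0x01 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if cmd & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000  # an omitted size means 64 KiB
            if offset + size > len(base):
                raise ValueError("copy runs past the end of the base")
            out += base[offset:offset + size]
        elif cmd:
            # Insert instruction: the next `cmd` bytes are literal data.
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("opcode 0 is reserved")

    if len(out) != result_size:
        raise ValueError("result size mismatch")
    return bytes(out)


if __name__ == "__main__":
    base = b"hello world"
    # sizes 11 -> 9, copy 6 bytes from offset 0, insert b"git"
    delta = bytes([11, 9, 0x90, 6, 3]) + b"git"
    print(apply_delta(base, delta))
```

Since the instructions are raw bytes, carrying them in the fast-import
stream would presumably need a length-prefixed payload like the existing
`data` command rather than anything line-oriented.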

Cheers,

Mike
