From: Mike Hommey <mh@glandium.org>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: fast-import deltas
Date: Tue, 1 Apr 2014 22:07:03 +0900
Message-ID: <20140401130703.GA1479@glandium.org>
In-Reply-To: <20140401114502.GA15549@sigill.intra.peff.net>

On Tue, Apr 01, 2014 at 07:45:03AM -0400, Jeff King wrote:
> On Tue, Apr 01, 2014 at 07:25:54PM +0900, Mike Hommey wrote:
> 
> > I am currently prototyping a "native" mercurial remote handler for git,
> 
> For my own curiosity, how does this differ from what is in
> contrib/remote-helpers/git-remote-hg?

contrib/remote-helpers/git-remote-hg makes a local Mercurial clone before
doing the Git conversion. That is not really a problem for most
Mercurial projects, but it tends to be slow with big ones, like the
Firefox source code. What I'm aiming for is something that can talk
directly to a remote Mercurial server.

> > Would adding a fast-import command to handle deltas be considered useful
> > for git? If so, what kind of format would be suitable?
> 
> It breaks fast-import's "lowest common denominator" data model that is
> just passing commits and their contents over the stream. But we already
> do that in other cases for the sake of performance. I think the
> important thing is that the alternate formats are optional and enabled
> by the caller with command-line options.
> 
> That being said, I foresee a few complications:
> 
>   1. Git needs to know the sha1 of the full object. So unless the
>      generating script knows that ahead of time, git has to expand the
>      delta immediately anyway (this could still be a win if we end up
>      using a good delta from elsewhere rather than doing our own delta
>      search, but I suspect it's not so big a win as if we can just blit
>      the delta straight to disk).

Good point. That could quickly become a problem with long delta chains.
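
To illustrate why a delta alone is not enough: Git's id for a blob is the
SHA-1 of a short header followed by the full content, so every link of a
delta chain has to be expanded before it can be named. Below is a minimal
sketch of that. The hunk format `(start, end, replacement)` and the
function names are hypothetical, chosen only for this example; they are
neither Mercurial's nor Git's actual delta encoding.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    # Git hashes "blob <size>" + NUL + content, so the full content is needed.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def apply_toy_delta(base: bytes, delta) -> bytes:
    # Hypothetical delta: sorted, non-overlapping (start, end, replacement)
    # hunks, each replacing base[start:end] with `replacement`.
    out = bytearray()
    cursor = 0
    for start, end, replacement in delta:
        out += base[cursor:start]   # unchanged bytes before the hunk
        out += replacement          # new bytes for the hunk
        cursor = end
    out += base[cursor:]            # unchanged tail
    return bytes(out)


def ids_for_chain(base: bytes, deltas) -> list:
    # Each revision must be fully rebuilt before its id can be computed,
    # so the work grows with the length of the chain.
    ids = [git_blob_id(base)]
    current = base
    for delta in deltas:
        current = apply_toy_delta(current, delta)
        ids.append(git_blob_id(current))
    return ids
```

If the generating script already knew each object's id, the expansion step
in `ids_for_chain` is the part that could be skipped.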

>   2. Git does not store on-disk deltas between objects that are not in
>      the same packfile. So you'd only be able to delta against an object
>      that came in the same stream (or you'd have to "fix" the packfile
>      on disk by adding an extra copy of the delta base, but that
>      probably eliminates any savings).

Arguably, this would make the most difference on the initial clone of big
projects, or on large incremental updates (say, after a few weeks), which
would use a single pack anyway.

> As for format, I believe that git is basically xdelta under the hood, so
> you'd want either that or something that can be trivially converted to
> it.

It seems to me that fast-import keeps its protocol more or less human
readable, so I wonder whether the xdelta format would fit the bill. That
being said, I also wonder if I shouldn't just try to write a pack on my
own...
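
For reference, here is a rough sketch of how a delta in Git's pack
encoding gets applied, which shows how binary the format is compared to
the rest of the fast-import stream. It assumes the layout as I understand
it from Git's pack-format documentation: two size varints, then copy
instructions (high bit set) and insert instructions (1 to 127 literal
bytes). The function names are made up for the example, and this is not
code taken from Git.

```python
def read_size(buf: bytes, pos: int):
    # Little-endian base-128 varint: 7 bits per byte, high bit = "more".
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_pack_delta(base: bytes, delta: bytes) -> bytes:
    base_size, pos = read_size(delta, 0)
    result_size, pos = read_size(delta, pos)
    if base_size != len(base):
        raise ValueError("delta was made against a different base")
    out = bytearray()
    while pos < len(delta):
        op = delta[pos]
        pos += 1
        if op & 0x80:
            # Copy from base: bits 0-3 flag which offset bytes follow,
            # bits 4-6 flag which size bytes follow.
            offset = 0
            size = 0
            for i in range(4):
                if op & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if op & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000  # an absent size means 64 KiB
            out += base[offset:offset + size]
        elif op:
            # Insert: the next `op` bytes are literal data.
            out += delta[pos:pos + op]
            pos += op
        else:
            raise ValueError("opcode 0 is reserved")
    if len(out) != result_size:
        raise ValueError("result size mismatch")
    return bytes(out)
```

A delta that copies the 6 bytes "hello " from the base and then inserts
"git\n" would be `bytes([0x0c, 0x0a, 0x90, 0x06, 0x04]) + b"git\n"` against
the base `b"hello world\n"`.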

Cheers,

Mike
