git.vger.kernel.org archive mirror
* git-fast-export hg mutt (24M vs 184M)
@ 2007-05-03 18:56 Thomas Glanzmann
  2007-05-03 19:17 ` Thomas Glanzmann
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Glanzmann @ 2007-05-03 18:56 UTC (permalink / raw)
  To: GIT

Hello,
I just tried to vendor track / import the mutt hg repository into git.
git-fast-export-hg is quite amazing, but the resulting git repository
blows up in size and I haven't the slightest clue why.

        git clone git://repo.or.cz/fast-export.git
        hg clone http://dev.mutt.org/hg/mutt

Could someone have a look at this? I used Debian Etch with a git version
I built myself using an ugly script. I had to upgrade Mercurial as well
because the installed version didn't have cmdlog.py. I used the unstable
Debian package for that.

        (thinkpad) [~/work/mutt] git-init-db
        Initialized empty Git repository in .git/
        (thinkpad) [~/work/mutt] ~/work/fast-export/hg-fast-export.sh -r /tmp/mutt

git version 1.5.2.rc0.56.g6169a

        Thomas


* Re: git-fast-export hg mutt (24M vs 184M)
  2007-05-03 18:56 git-fast-export hg mutt (24M vs 184M) Thomas Glanzmann
@ 2007-05-03 19:17 ` Thomas Glanzmann
  2007-05-03 21:01   ` Pierre Habouzit
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Glanzmann @ 2007-05-03 19:17 UTC (permalink / raw)
  To: GIT

Hello,
git-repack -a -d -f got it down to 19M. I missed the -f parameter
before. Sorry for the noise.

        Thomas


* Re: git-fast-export hg mutt (24M vs 184M)
  2007-05-03 19:17 ` Thomas Glanzmann
@ 2007-05-03 21:01   ` Pierre Habouzit
  2007-05-03 21:18     ` Shawn O. Pearce
  0 siblings, 1 reply; 6+ messages in thread
From: Pierre Habouzit @ 2007-05-03 21:01 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: GIT


On Thu, May 03, 2007 at 09:17:16PM +0200, Thomas Glanzmann wrote:
> Hello,
> git-repack -a -d -f got it down to 19M. I missed the -f parameter
> before. Sorry for the noise.

  You may want to use git gc that does that (and a bit more) for you.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



* Re: git-fast-export hg mutt (24M vs 184M)
  2007-05-03 21:01   ` Pierre Habouzit
@ 2007-05-03 21:18     ` Shawn O. Pearce
  2007-05-03 22:29       ` Pierre Habouzit
  0 siblings, 1 reply; 6+ messages in thread
From: Shawn O. Pearce @ 2007-05-03 21:18 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Thomas Glanzmann, GIT

Pierre Habouzit <madcoder@debian.org> wrote:
> On Thu, May 03, 2007 at 09:17:16PM +0200, Thomas Glanzmann wrote:
> > Hello,
> > git-repack -a -d -f got it down to 19M. I missed the -f parameter
> > before. Sorry for the noise.
> 
>   You may want to use git gc that does that (and a bit more) for you.

Actually, in this case, no.

git-gc by default doesn't use the -f option.  -f to git-repack
means "no reuse deltas".  That particular feature of git-repack is
basically required to be used after running git-fast-import with
anything sizeable.

The reason you need -f is git-fast-import does not write optimally
compressed blobs (file revisions) when it creates the packfile.
Instead it does a reasonable best effort while using a minimum
amount of memory.  The Git packfiles get most of their compression
benefits from being able to see all of a project's data at once;
this is impossible in fast-import as we're only seeing a small part
of the incoming data stream at any single point in time.

If you had a lot of tags imported you might want to also use `git
pack-refs` (one of the chores that git-gc does), or `git pack-refs
--all` if you have a lot of dangling branches imported.  The other
chores in git-gc aren't actually useful after running fast-import
(reflog expire, prune, rerere gc).
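
The cleanup described above could be sketched as a small shell function.
This is only a sketch under assumptions: the function name
`cleanup_after_import` and the `GIT` override variable (handy for a dry run
with `GIT=echo`) are made up for illustration, and it uses the
`git repack` / `git pack-refs` spellings rather than the dashed forms.

```shell
#!/bin/sh
# Sketch only: cleanup_after_import and the GIT override are
# illustrative names, not part of git or fast-export.

cleanup_after_import() {
    # $1: path to the freshly imported repository (default: current dir)
    repo=${1:-.}
    git_cmd=${GIT:-git}

    # -a -d: put everything into one pack and delete the old packs.
    # -f:    do not reuse the deltas fast-import wrote; recompute them.
    (cd "$repo" && "$git_cmd" repack -a -d -f) || return 1

    # Only worthwhile when a lot of tags or branches were imported.
    (cd "$repo" && "$git_cmd" pack-refs --all) || return 1
}
```

Running `GIT=echo cleanup_after_import /path/to/repo` should just print the
two commands instead of executing them.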

-- 
Shawn.


* Re: git-fast-export hg mutt (24M vs 184M)
  2007-05-03 21:18     ` Shawn O. Pearce
@ 2007-05-03 22:29       ` Pierre Habouzit
  2007-05-04  1:11         ` Nicolas Pitre
  0 siblings, 1 reply; 6+ messages in thread
From: Pierre Habouzit @ 2007-05-03 22:29 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Thomas Glanzmann, GIT


On Thu, May 03, 2007 at 05:18:24PM -0400, Shawn O. Pearce wrote:
> Pierre Habouzit <madcoder@debian.org> wrote:
> > On Thu, May 03, 2007 at 09:17:16PM +0200, Thomas Glanzmann wrote:
> > > Hello,
> > > git-repack -a -d -f got it down to 19M. I missed the -f parameter
> > > before. Sorry for the noise.
> > 
> >   You may want to use git gc that does that (and a bit more) for you.
> 
> Actually, in this case, no.
> 
> git-gc by default doesn't use the -f option.  -f to git-repack
> means "no reuse deltas".  That particular feature of git-repack is
> basically required to be used after running git-fast-import with
> anything sizeable.

  okay, so why doesn't git fast-import leave a note somewhere (to be
picked up by git gc later) saying "a fast-import has been run, use -f
for the next repack if you want best compression"?

  I'd think that would make a lot of sense, and users who now naively
(like me) think git-gc is always enough would not be dramatically
wrong? :)

  I mean it's nothing *very* important, but some
  `touch $GIT_DIR/info/unpacked-fast-import` in fast-import, then:
  if test -f "$GIT_DIR/info/unpacked-fast-import"; then
      REPACK_OPTIONS="$REPACK_OPTIONS -f"
  fi
  # do the repack
  rm -f "$GIT_DIR/info/unpacked-fast-import"

  would do the trick, wouldn't it?

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



* Re: git-fast-export hg mutt (24M vs 184M)
  2007-05-03 22:29       ` Pierre Habouzit
@ 2007-05-04  1:11         ` Nicolas Pitre
  0 siblings, 0 replies; 6+ messages in thread
From: Nicolas Pitre @ 2007-05-04  1:11 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Shawn O. Pearce, Thomas Glanzmann, GIT

On Fri, 4 May 2007, Pierre Habouzit wrote:

> On Thu, May 03, 2007 at 05:18:24PM -0400, Shawn O. Pearce wrote:
> > Pierre Habouzit <madcoder@debian.org> wrote:
> > > On Thu, May 03, 2007 at 09:17:16PM +0200, Thomas Glanzmann wrote:
> > > > Hello,
> > > > git-repack -a -d -f got it down to 19M. I missed the -f parameter
> > > > before. Sorry for the noise.
> > > 
> > >   You may want to use git gc that does that (and a bit more) for you.
> > 
> > Actually, in this case, no.
> > 
> > git-gc by default doesn't use the -f option.  -f to git-repack
> > means "no reuse deltas".  That particular feature of git-repack is
> > basically required to be used after running git-fast-import with
> > anything sizeable.
> 
>   okay, so why doesn't git fast-import leave a note somewhere (to be
> picked up by git gc later) saying "a fast-import has been run, use -f
> for the next repack if you want best compression"?

Nah.

The conversion script should do it itself directly after it is done with 
fast-import.
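
A conversion script following that advice might end with something like the
sketch below. The function name `import_then_repack` and the `GIT` override
variable are hypothetical, and the exporter is assumed to write a fast-import
stream to stdout; this is not what hg-fast-export.sh actually does.

```shell
#!/bin/sh
# Sketch only: import_then_repack and GIT are illustrative names.

import_then_repack() {
    # Reads a fast-import stream on stdin and loads it into the
    # repository in the current directory.
    git_cmd=${GIT:-git}

    "$git_cmd" fast-import || {
        echo "fast-import failed, skipping repack" >&2
        return 1
    }

    # Recompute deltas right away so the user never has to know
    # about -f.
    "$git_cmd" repack -a -d -f
}
```

Usage would look like `my-exporter | import_then_repack`, where `my-exporter`
stands in for whatever produces the stream.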


Nicolas


