git.vger.kernel.org archive mirror
* [PATCH 0/4] Teach "git clone" to pack refs
@ 2008-06-15 14:02 Johan Herland
From: Johan Herland @ 2008-06-15 14:02 UTC
  To: git; +Cc: Junio C Hamano, Daniel Barkalow

(This is meant for AFTER v1.5.6.)

This is a re-post of the series I posted while builtin-clone was still
under development. The rationale for this series can be found in earlier
threads, but it roughly boils down to the following:

1. "git clone" currently creates "loose" refs for every ref in the cloned
   repo. A subsequent "git gc" will pack these refs into a packed-refs file.
   Having "git clone" produce the "packed" refs in the first place seems
   more efficient.

2. For repos with few refs the performance difference between writing loose
   refs and packed refs is negligible. However, for repos with thousands of
   refs [1], the difference between writing one packed-refs file and
   thousands of "loose" refs files is definitely noticeable. Even more so
   on Windows.

3. When the user updates a ref, a "loose" ref is written, and the
   corresponding packed ref (if any) is left unused. By making "git clone"
   write packed refs, we increase the overhead of unused packed refs
   (proportionally to the number of refs updated by the user). However,
   the number of refs updated by the user is typically small. If the user
   updates tens - or even hundreds - of refs, I still expect this overhead
   to be negligible, and in any case outweighed by the added performance
   when cloning repos with many refs.
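
To make the three points above concrete, here is a minimal Python sketch of the two on-disk layouts and of how a lookup treats them. It is an illustration only, not git's actual C code: the function names (write_loose_refs, write_packed_refs, resolve_ref) are hypothetical, and the packed-refs layout shown (a "# pack-refs with:" header followed by "<sha1> <refname>" lines) is simplified and ignores peeled "^" entries, symbolic refs and locking.

```python
import os
import tempfile

def write_loose_refs(git_dir, refs):
    """Write one small file per ref (the "loose" layout)."""
    for name, sha in refs.items():
        path = os.path.join(git_dir, *name.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(sha + "\n")

def write_packed_refs(git_dir, refs):
    """Write every ref into a single packed-refs file."""
    with open(os.path.join(git_dir, "packed-refs"), "w") as f:
        f.write("# pack-refs with: peeled \n")
        for name in sorted(refs):
            f.write(f"{refs[name]} {name}\n")

def resolve_ref(git_dir, name):
    """Look up a ref; a loose file, if present, shadows the packed entry."""
    loose = os.path.join(git_dir, *name.split("/"))
    if os.path.isfile(loose):
        with open(loose) as f:
            return f.read().strip()
    packed = os.path.join(git_dir, "packed-refs")
    if os.path.isfile(packed):
        with open(packed) as f:
            for line in f:
                # Skip the header and any peeled-tag lines.
                if line.startswith("#") or line.startswith("^"):
                    continue
                sha, _, ref = line.rstrip("\n").partition(" ")
                if ref == name:
                    return sha
    return None

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as git_dir:
        # "Clone": all refs land in one packed-refs file (points 1 and 2).
        write_packed_refs(git_dir, {"refs/heads/master": "a" * 40,
                                    "refs/tags/v1.0": "b" * 40})
        # Later update: a loose ref is written and the packed entry for
        # the same name is left behind, unused (point 3).
        write_loose_refs(git_dir, {"refs/heads/master": "c" * 40})
        print(resolve_ref(git_dir, "refs/heads/master"))
        print(resolve_ref(git_dir, "refs/tags/v1.0"))
```

The stale packed entry for an updated ref costs one extra line in packed-refs, which is the overhead that point 3 argues is negligible.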

The series is based on current 'next'.

Johan Herland (4):
  Incorporate fetched packs in future object traversal
  Move pack_refs() and friends into libgit
  Prepare testsuite for a "git clone" that packs refs
  Teach "git clone" to pack refs

 Makefile                     |    2 +
 builtin-clone.c              |    8 ++-
 builtin-fetch-pack.c         |    1 +
 builtin-pack-refs.c          |  121 +-----------------------------------------
 pack-refs.c                  |  117 ++++++++++++++++++++++++++++++++++++++++
 pack-refs.h                  |   18 ++++++
 t/t5515-fetch-merge-logic.sh |   19 +++++++
 7 files changed, 164 insertions(+), 122 deletions(-)
 create mode 100644 pack-refs.c
 create mode 100644 pack-refs.h


Have fun! :)

...Johan


[1]: At $dayjob I'm converting old CVS modules with up to ~10 years of
     history, and some of the resulting repos have ~30000 refs (mostly
     build tags). When cloning these repos, the speedup of this series
     can be measured in seconds on Linux; minutes on Windows.

* [RFC/PATCH 0/3] Teach builtin-clone to pack refs
@ 2008-03-22  1:10 Johan Herland
From: Johan Herland @ 2008-03-22  1:10 UTC
  To: Daniel Barkalow; +Cc: git

The following series builds on top of Barkalow's existing builtin-clone work, available in the "builtin-clone" branch at:
	git://iabervon.org/~barkalow/git.git

This patch series teaches builtin-clone to create packed refs. Creating packed refs directly in clone, instead of creating loose refs and having the next "git gc" (re)pack them, makes cloning considerably faster on repos with many refs. In a test repo with 11000 refs (1000 branches, 10000 tags) I get the following numbers (Core 2 Quad, 4GB RAM running Gentoo Linux):

- Current "next": ~24.8 seconds
- Barkalow's "builtin-clone" branch: 1.47 seconds
- "builtin-clone" plus this series: 0.31 seconds

Although most of the speedup from current "next" is achieved by the builtin-clone work, there is still a considerable additional improvement from writing all refs to a single file instead of writing one file per ref. I expect the performance improvement to be much bigger on platforms with slower filesystems (i.e. Windows).
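
For anyone who wants a rough feel for the "one file versus one file per ref" cost on their own filesystem, the following Python sketch times both write patterns for a synthetic ref set. It is not how the numbers above were obtained: it only measures raw file creation, while real git does more work per ref (for example lock files), and all names in it (make_refs, time_loose, time_packed) are hypothetical.

```python
import hashlib
import os
import tempfile
import time

def make_refs(branches, tags):
    """Build synthetic ref names mapped to fake 40-hex object ids."""
    names = [f"refs/heads/branch-{i}" for i in range(branches)]
    names += [f"refs/tags/tag-{i}" for i in range(tags)]
    return {n: hashlib.sha1(n.encode()).hexdigest() for n in names}

def time_loose(refs):
    """Seconds spent writing one file per ref into a scratch directory."""
    with tempfile.TemporaryDirectory() as git_dir:
        start = time.perf_counter()
        for name, sha in refs.items():
            path = os.path.join(git_dir, *name.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as f:
                f.write(sha + "\n")
        return time.perf_counter() - start

def time_packed(refs):
    """Seconds spent writing all refs into a single packed-refs file."""
    with tempfile.TemporaryDirectory() as git_dir:
        start = time.perf_counter()
        with open(os.path.join(git_dir, "packed-refs"), "w") as f:
            f.write("# pack-refs with: peeled \n")
            for name in sorted(refs):
                f.write(f"{refs[name]} {name}\n")
        return time.perf_counter() - start

if __name__ == "__main__":
    # Same shape as the test repo described above: 1000 branches, 10000 tags.
    refs = make_refs(branches=1000, tags=10000)
    print(f"loose:  {time_loose(refs):.3f} s")
    print(f"packed: {time_packed(refs):.3f} s")
```

Absolute timings will vary with the filesystem and OS cache state, so only the ratio between the two figures is meaningful.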

A side-effect of this series is that the cloned refs will not get reflog entries. I don't know how important these "clone: from $URL" entries are to people; I, for one, wouldn't miss them at all.


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net


end of thread (newest message ~2008-06-16  9:58 UTC)

Thread overview: 14+ messages
2008-06-15 14:02 [PATCH 0/4] Teach "git clone" to pack refs Johan Herland
2008-06-15 14:04 ` [PATCH 1/4] Incorporate fetched packs in future object traversal Johan Herland
2008-06-15 14:05 ` [PATCH 2/4] Move pack_refs() and friends into libgit Johan Herland
2008-06-15 17:52   ` Jeff King
2008-06-15 21:27     ` [PATCH 2/4 v2] " Johan Herland
2008-06-15 14:05 ` [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs Johan Herland
2008-06-15 17:54   ` Jeff King
2008-06-15 18:04     ` Jakub Narebski
2008-06-15 23:16       ` [PATCH 3/4 v2] " Johan Herland
2008-06-15 14:06 ` [PATCH 4/4] Teach "git clone" to pack refs Johan Herland
2008-06-15 17:56   ` Jeff King
2008-06-15 22:03     ` Johan Herland
2008-06-16  9:57       ` Jeff King
  -- strict thread matches above, loose matches on Subject: below --
2008-03-22  1:10 [RFC/PATCH 0/3] Teach builtin-clone " Johan Herland
2008-04-14  6:10 ` [RFC/PATCH 2/3] Prepare testsuite for a "git clone" that packs refs Daniel Barkalow
2008-04-14  8:00   ` Johan Herland
2008-04-14  8:03     ` [PATCH 3/4] " Johan Herland
