git.vger.kernel.org archive mirror
* Really slow 'git gc'
@ 2009-02-19 20:24 Linus Torvalds
  2009-02-19 21:14 ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2009-02-19 20:24 UTC
  To: Junio C Hamano, Git Mailing List


Ok, so I was wondering why doing a 'git gc' on my kernel backup on one of 
the linux-foundation machines was taking so long, and I think I've found a 
performance problem.

The way I do kernel back-ups is that I just push to two different sites 
every once in a while (read: multiple times a day when I do lots of 
merging), and one of them is master.kernel.org that then gets published to 
others.

The other one is a linux-foundation machine that I have a login on, and 
that's my "secondary" back-up, in case both kernel.org and my own home 
machines were to be corrupted somehow. And because it's my secondary, I 
seldom then log in and gc anything, so it's a mess.

But it was _really_ slow when I finally did so today. The whole "Counting 
objects" phase was counting by hundreds, which it really shouldn't do on a 
fast machine.

The reason? Tons and tons of pack-files. But just the existence of the 
pack-files is not what killed it: things were _much_ faster if I just did 
a "git pack-objects" by hand. 

The real reason _seems_ to be the "--unpacked=pack-....pack" arguments. I 
literally had 232 pack-files, and it looks like a lot of the time was 
spent in that silly loop over 'ignore_packed' in find_pack_entry(), when 
revision.c does that "has_sha1_pack()" thing. You get an O(n**2) effect in 
the number of pack-files: for each commit we look over every pack-file, and 
for every pack-file we look at, we look over each ignore_packed entry.
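
To make the quadratic cost concrete, here is a minimal Python model of the nested loop described above. It is an illustrative sketch only, not git's actual C code: the names find_pack_entry and ignore_packed come from the message, while compares_per_lookup, the stats counter, and the pack file names are hypothetical. It assumes the worst case, where every pack is also on the ignore list and the object is found in none of them.

```python
def find_pack_entry(obj, packs, ignore_packed, stats):
    """Return the name of the first non-ignored pack holding obj, else None.

    Simplified model: `packs` is a list of (name, set_of_objects) pairs and
    `ignore_packed` is a plain list of pack names, scanned linearly.
    """
    for name, objects in packs:              # outer loop: every pack-file
        ignored = False
        for ignored_name in ignore_packed:   # inner loop: every ignore entry
            stats["compares"] += 1
            if ignored_name == name:
                ignored = True
                break
        if ignored:
            continue
        if obj in objects:
            return name
    return None


def compares_per_lookup(n_packs):
    """Count name comparisons for one lookup when all n_packs are ignored."""
    # Hypothetical pack names; real ones are "pack-<hash>.pack".
    names = [f"pack-{i:04d}.pack" for i in range(n_packs)]
    packs = [(name, set()) for name in names]
    stats = {"compares": 0}
    find_pack_entry("some-object-id", packs, names, stats)
    return stats["compares"]


if __name__ == "__main__":
    # With 232 packs, one failed lookup costs 232 * 233 / 2 comparisons.
    print(compares_per_lookup(232))  # 27028
```

In this model a single lookup costs n(n+1)/2 name comparisons for n packs, and that cost is paid again for every object looked up. Replacing the ignore list with a set, or marking each pack as ignored once up front, would bring a lookup back to one check per pack.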

I didn't really analyze this a lot, and now the thing is packed and much 
faster, but I thought I'd throw this out there..

		Linus


end of thread, other threads:[~2009-03-20  4:06 UTC | newest]

Thread overview: 21+ messages
2009-02-19 20:24 Really slow 'git gc' Linus Torvalds
2009-02-19 21:14 ` Junio C Hamano
2009-02-19 21:25   ` Linus Torvalds
2009-02-28  9:15     ` [PATCH 0/6] "git repack -a -d" improvements Junio C Hamano
2009-02-28  9:15       ` [PATCH 1/6] git-repack: resist stray environment variable Junio C Hamano
2009-02-28  9:15       ` [PATCH 2/6] has_sha1_pack(): refactor "pretend these packs do not exist" interface Junio C Hamano
2009-02-28  9:15       ` [PATCH 3/6] has_sha1_kept_pack(): take "struct rev_info" Junio C Hamano
2009-02-28  9:15       ` [PATCH 4/6] Consolidate ignore_packed logic more Junio C Hamano
2009-02-28  9:15       ` [PATCH 5/6] Simplify is_kept_pack() Junio C Hamano
2009-02-28  9:15       ` [PATCH 6/6] is_kept_pack(): final clean-up Junio C Hamano
2009-02-28 12:29       ` [PATCH 0/6] "git repack -a -d" improvements Kjetil Barvik
2009-02-28 17:41         ` Junio C Hamano
     [not found]       ` <7Vazs5mFk91IKAarOd0wrBNmYj7eSJxVIcR0PEQxJl8R0aQmQDEqSJMphMrXhmVu570fijupQ34@cipher.nrlssc.navy.mil>
2009-03-18 20:59         ` [PATCH] t7700-repack: repack -a now works properly, expect success from test Brandon Casey
2009-03-20  3:47           ` [PATCH 0/5] repack improvements Brandon Casey
2009-03-20  3:47             ` [PATCH 1/5] t7700-repack: add two new tests demonstrating repacking flaws Brandon Casey
2009-03-20  3:47               ` [PATCH 2/5] git-repack.sh: don't use --kept-pack-only option to pack-objects Brandon Casey
2009-03-20  3:47                 ` [PATCH 3/5] pack-objects: only repack or loosen objects residing in "local" packs Brandon Casey
2009-03-20  3:47                   ` [PATCH 4/5] t7700-repack: repack -a now works properly, expect success from test Brandon Casey
2009-03-20  3:47                     ` [PATCH 5/5] Remove --kept-pack-only option and associated infrastructure Brandon Casey
2009-03-20  4:05             ` [PATCH 0/5] repack improvements Brandon Casey
2009-02-19 21:34   ` Really slow 'git gc' Junio C Hamano
