git.vger.kernel.org archive mirror
* 'git log FILE' slow
@ 2007-07-11 20:33 Yakov Lerner
  2007-07-11 21:03 ` Linus Torvalds
From: Yakov Lerner @ 2007-07-11 20:33 UTC
  To: Git Mailing List

[git version 1.5.1.3]

'git-log FILE' takes 10-13 sec.  What can I do to identify
the reason? 'git log >/dev/null' takes 0.1 sec (cached).
On the cloned copy, the times are approximately the same.

The 'git-count-objects -v' shows:

count: 9830
size: 241412
in-pack: 12080
packs: 18
prune-packable: 188
garbage: 0

The strace shows only thousands of sbrk calls during the 10-13 sec
(after some initial I/O). I was not able to complete an ltrace run; it
takes too long.

Yakov


* Re: 'git log FILE' slow
  2007-07-11 20:33 'git log FILE' slow Yakov Lerner
@ 2007-07-11 21:03 ` Linus Torvalds
From: Linus Torvalds @ 2007-07-11 21:03 UTC
  To: Yakov Lerner; +Cc: Git Mailing List



On Wed, 11 Jul 2007, Yakov Lerner wrote:
> 
> 'git-log FILE' takes 10-13 sec.  What can I do to identify
> the reason? 'git log >/dev/null' takes 0.1 sec (cached).

"git log FILE" is simply *fundamentally* much more expensive than "git 
log".

There's nothing to "identify". Both go through the whole log of the 
project, but "git log file" has to look at every tree, and see where the 
file actually changed.
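
To make the difference concrete, here is a minimal sketch (not part of the
original message). The helper name log_cost is hypothetical. It counts the
commits a plain walk visits and the commits that survive path limiting; the
second number only comes out after git has compared trees along the history.

```shell
# Hypothetical helper: compare a plain history walk with a path-limited one.
# Usage: log_cost <path>   (run inside a git work tree)
log_cost() {
    path=$1
    # Every commit reachable from HEAD.
    total=$(git rev-list HEAD | wc -l)
    # Only commits where <path> changed; git must diff trees to find them.
    touching=$(git rev-list HEAD -- "$path" | wc -l)
    # $((...)) strips any padding some wc implementations emit.
    echo "commits walked: $((total)), commits touching $path: $((touching))"
}
```

A large gap between the two numbers means most of the tree comparisons end up
finding no change to the file, yet the work still has to be done.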

However, "fundamentally more expensive" doesn't actually mean that it 
should be that slow. I suspect that your archive is not packed, so you 
have probably thousands of individual objects in the filesystem, and are 
slowing down your git usage totally needlessly.

So do

	git gc

on the archive, and you'll probably be happy.

That said, 10-13 seconds *can* be valid for a really big archive, ie 
that's the kind of time you might eventually expect for something like 
the full KDE archive (if they don't split the subprojects up).

I doubt that's it.

> On the cloned copy, the times are approximately the same.

This is a big clue. Cloning will generate a new pack.

> The 'git-count-objects -v' shows:
> 
> count: 9830
> size: 241412
> in-pack: 12080
> packs: 18
> prune-packable: 188
> garbage: 0

Tons of packs, and lots of unpacked objects.

Just get used to doing "git gc" once a week (or maybe once a month - I 
guess you've not done it at all?)
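
As an illustration only (not part of the original message), the check could
be scripted. The function name needs_gc and the default thresholds (1000
loose objects, 5 packs) are assumptions chosen for this sketch, not git
defaults. It reads "git count-objects -v" output on stdin and succeeds when
either number is over its limit.

```shell
# Hypothetical helper: decide from `git count-objects -v` output whether a
# repack looks worthwhile. Thresholds are illustrative assumptions.
# Usage: git count-objects -v | needs_gc [max_loose] [max_packs]
needs_gc() {
    awk -F': ' -v max_loose="${1:-1000}" -v max_packs="${2:-5}" '
        $1 == "count" { loose = $2 }   # loose (unpacked) objects
        $1 == "packs" { packs = $2 }   # number of pack files
        # Exit status 0 (success) when either limit is exceeded.
        END { exit !(loose > max_loose || packs > max_packs) }
    '
}
```

It could then be used as "git count-objects -v | needs_gc && git gc". With
the numbers quoted above (9830 loose objects, 18 packs) the check succeeds
under these assumed thresholds.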

> The strace shows only thousands of sbrk calls during the 10-13 sec
> (after some initial I/O). I was not able to complete an ltrace run; it
> takes too long.

Hmm. I'd have expected to see some "stat()/open()" calls if it was really 
just about packing, so I'm a bit surprised, but I really do think you 
should just garbage collect your packs. Having 12k objects in 18 packs is 
ridiculous - each pack must be pitifully small.
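
A rough way to put a number on "pitifully small" (a sketch added for
illustration; the helper name objects_per_pack is hypothetical) is to divide
the in-pack count by the number of packs:

```shell
# Hypothetical helper: average packed objects per pack, computed from
# `git count-objects -v` output on stdin (truncated to an integer).
objects_per_pack() {
    awk -F': ' '
        $1 == "in-pack" { objs = $2 }
        $1 == "packs"   { packs = $2 }
        # Print nothing when there are no packs, to avoid dividing by zero.
        END { if (packs > 0) printf "%d\n", objs / packs }
    '
}
```

For the figures quoted above (12080 objects in 18 packs) this works out to
about 671 objects per pack.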

Here's my kernel archive:

	[torvalds@woody linux]$ git count-objects -v
	count: 364
	size: 2328
	in-pack: 506495
	packs: 12
	prune-packable: 5
	garbage: 0

ie I have forty times the objects, in fewer packs than you do (and most of 
it is in one big one). After a "git gc", it looks like

	[torvalds@woody linux]$ git count-objects -v
	count: 0
	size: 0
	in-pack: 506090
	packs: 1
	prune-packable: 0
	garbage: 0

and everything is happier (not that it was unhappy before either, but 
mine was much better packed than yours was).

		Linus
