From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Nicolas Pitre <nico@cam.org>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Some git performance measurements..
Date: Thu, 29 Nov 2007 22:03:00 -0800 (PST)
Message-ID: <alpine.LFD.0.9999.0711292145110.8458@woody.linux-foundation.org>
In-Reply-To: <7v3auos4yi.fsf@gitster.siamese.dyndns.org>
On Thu, 29 Nov 2007, Junio C Hamano wrote:
>
> I am hoping that "probably 10s of those 17s" can actually be measured
> with the patch I sent out last night. Has anybody taken a look at it?
Sorry, I missed it. But I just did timings.
Your patch helps the
    git read-tree -m -u --exclude-per-directory=.gitignore HEAD HEAD
timings enormously: it's now down to 3s for me (which is the same
speed as it is without any per-directory excludes). That's a big
improvement from the ~10s I see without your patch. (I've repacked my
tree, and I have to admit that I don't even know if it's in the new or
the old order, but I can state that 7s for me was just those .gitignore
files.)
Sadly, the full "git checkout" itself is not actually improved, due to the
    git update-index --refresh
there, which will end up populating the whole directory cache anyway.
I wonder why I didn't see that as the expensive operation when I timed
"git checkout". Probably because I narrowed down on the "git read-tree" as
the operation that actually accesses the pack-file and the object
directory, while the "git update-index" never touches the actual objects.
Anyway, I think your patch is great. It just doesn't help the full case of
a "git checkout", only the read-tree portion of it ;(
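
A rough way to reproduce this kind of comparison is to time each step on its own. The sketch below is only an illustration: `time_cmd` is a hypothetical helper (not part of git), it assumes a shell where `date +%s` is available, and its whole-second resolution is only adequate for multi-second runs like the ones quoted here. Running read-tree without `--exclude-per-directory` as the "no excludes" baseline is an assumption about how the comparison was made.

```shell
# time_cmd LABEL CMD [ARGS...]
# Hypothetical helper: run CMD once, discard its output, and print the
# elapsed wall-clock time in whole seconds plus the exit status.
time_cmd() {
    label=$1
    shift
    start=$(date +%s)
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    end=$(date +%s)
    echo "$label: $((end - start))s (exit $status)"
}

# Usage, from the top of a work tree (uncomment to run):
#   time_cmd "read-tree, per-directory excludes" \
#       git read-tree -m -u --exclude-per-directory=.gitignore HEAD HEAD
#   time_cmd "read-tree, no excludes" \
#       git read-tree -m -u HEAD HEAD
#   time_cmd "update-index --refresh" \
#       git update-index --refresh
```

The absolute numbers will depend on the repository, the git version and whether the filesystem cache is warm, so only the relative differences between the three lines are meaningful.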
As to partitioning the data according to types:
> When I do archaeology, I think I often run blame first to see which
> change made the block of text into the current shape first, and then run
> a path limited "git log -p" either starting or ending at that revision.
> In that workflow, the initial blame may get slower with the new layout,
> but I suspect it would help by speeding up the latter "git log -p" step.
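
The two-step workflow quoted above can be sketched as follows. The function names are hypothetical wrappers chosen for illustration; only `git blame -L` and path-limited `git log -p` are real commands, and the file, line range and revision are whatever the user is investigating.

```shell
# blame_lines FILE START END
# Step 1: show which commit last touched each line in START..END.
blame_lines() {
    git blame -L "$2,$3" -- "$1"
}

# log_ending_at REV FILE
# Step 2a: path-limited history with patches, ending at REV.
log_ending_at() {
    git log -p "$1" -- "$2"
}

# log_starting_at REV FILE
# Step 2b: path-limited history with patches, from REV up to HEAD.
# "REV^" needs REV to have a parent, so this fails on a root commit.
log_starting_at() {
    git log -p "$1^..HEAD" -- "$2"
}
```

Step 1 mostly walks commits, trees and blobs for a single path, while step 2 repeatedly diffs trees and blobs along that path, which is why the two steps may react differently to a change in pack layout.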
I really cannot convince myself one way or the other. I have a suspicion
that sometimes it helps to have objects (regardless of type) close to each
other, and sometimes it helps to have the trees packed densely. A lot of
operations *do* work on both blobs and trees (a *raw* diff doesn't, but
those are fairly rare), so this is not at all as clear-cut as the commit
case.
So sorting the commits together is a no-brainer, since a lot of really
important ops only look at them. But blobs and trees? The numbers
certainly go both ways, and I suspect we are probably better off not
messing with the sort order unless we have some unambiguous real results.
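
One way to see how object types are actually laid out in an existing pack is to list them in pack-offset order and collapse consecutive objects of the same type into runs. The sketch below assumes the usual `git verify-pack -v` line format (object name, type, size, size in pack, offset); `type_runs` is a hypothetical helper, not a git command.

```shell
# type_runs
# Read `git verify-pack -v` output on stdin and print, in pack-offset
# order, the length and type of each run of consecutive same-type objects.
type_runs() {
    # Keep only per-object lines and reduce them to "offset type".
    awk '$2 ~ /^(commit|tree|blob|tag)$/ { print $5, $2 }' |
    sort -n |
    awk '
        $2 != prev { if (prev != "") print n, prev; prev = $2; n = 0 }
        { n++ }
        END { if (prev != "") print n, prev }
    '
}

# Usage, inside a repository with at least one pack (uncomment to run):
#   for idx in .git/objects/pack/pack-*.idx; do
#       git verify-pack -v "$idx" | type_runs | head -n 20
#   done
```

A pack with commits sorted together shows one long commit run; many short alternating tree and blob runs indicate that the two types are interleaved rather than partitioned.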
Oh, well. I was hoping that I'd have a number of cases that showed good
improvements, with perhaps the bulk of it not showing much difference at
all. But while I saw the good improvements, the very first try at "git
blame" also showed considerably worse numbers, so I think we should
consider it an interesting idea, but probably shelve it.
Linus