From: Linus Torvalds <torvalds@linux-foundation.org>
To: Daniel Berlin <dberlin@dberlin.org>
Cc: git@vger.kernel.org
Subject: Re: git annotate runs out of memory
Date: Tue, 11 Dec 2007 10:40:36 -0800 (PST)
Message-ID: <alpine.LFD.0.9999.0712111018540.25032@woody.linux-foundation.org>
In-Reply-To: <4aca3dc20712110933i636342fbifb15171d3e3cafb3@mail.gmail.com>
On Tue, 11 Dec 2007, Daniel Berlin wrote:
>
> This seems to be a common problem with git. It seems to use a lot of
> memory to perform common operations on the gcc repository (even though
> it is faster in some cases than hg).
The thing is, git has a very different notion of "common operations" than
you do.
To git, "git annotate" is just about the *last* thing you ever want to do.
It's not a common operation, it's a "last resort" operation. In git, the
whole workflow is designed for "git log -p <pathnamepattern>" rather than
annotate/blame.
In fact, we didn't support annotate at all for the first year or so of
git.
The reason for git being relatively slow is exactly that git doesn't have
"file history" at all, and only tracks full snapshots. So "git blame" is
really a very complex operation that basically looks at the global history
(because nothing else exists) and will basically generate a totally
different "view" of local history from that one.
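
To make the snapshot point concrete, here is a minimal sketch of how a per-line history can be derived from nothing but full snapshots. It is a toy model, not git's implementation: the `blame` function and the sample data are hypothetical, the history is assumed to be linear, and it walks forward from the oldest snapshot, whereas git walks backward from the commit being blamed.

```python
import difflib


def blame(snapshots):
    """Attribute each line of the newest snapshot to the commit that introduced it.

    `snapshots` is a list of (commit_id, list_of_lines) pairs, oldest first.
    Only whole-file snapshots are given; per-line history is reconstructed
    by diffing each snapshot against the one before it.
    """
    commit_id, lines = snapshots[0]
    origin = [commit_id] * len(lines)
    for new_id, new_lines in snapshots[1:]:
        matcher = difflib.SequenceMatcher(a=lines, b=new_lines, autojunk=False)
        # Assume every line is new, then carry attribution over for the
        # runs of lines that are unchanged between the two snapshots.
        new_origin = [new_id] * len(new_lines)
        for a_start, b_start, size in matcher.get_matching_blocks():
            new_origin[b_start:b_start + size] = origin[a_start:a_start + size]
        lines, origin = new_lines, new_origin
    return list(zip(origin, lines))


history = [
    ("c1", ["int a;", "int b;"]),
    ("c2", ["int a;", "int x;", "int b;"]),
    ("c3", ["int a;", "int x;", "int c;"]),
]
for commit, line in blame(history):
    print(commit, line)
```

Every step diffs two complete snapshots, so the cost grows with the number of commits walked. That is the expense described above.
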
The disadvantage is that it's much slower and much more costly than just
having a local history view to begin with.
However, the absolutely *huge* advantage is that it isn't then limited to
local history.
So where git shines is when you actually use the global history, and do
merges or when you track more than one file (which others find hard, but
git finds much more natural).
An example of this is content that actually comes from multiple files.
File-based systems simply cannot do this at all. They aren't just slower,
they are totally unable to do it sanely. For git, it's all the same: it
never really cares about file boundaries in the first place.
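
As a rough illustration of ignoring file boundaries, the sketch below looks up where each line of a new file existed in the previous snapshot, across all files at once. The `line_sources` helper and the file names are hypothetical, and real tools match blocks of lines rather than single lines, so treat this as a toy model only.

```python
def line_sources(previous_tree, new_lines):
    """For each new line, list the files in the previous snapshot containing it.

    `previous_tree` maps a file path to that file's list of lines.
    """
    index = {}
    for path, lines in previous_tree.items():
        for line in lines:
            index.setdefault(line, set()).add(path)
    # A line with no source files is new content.
    return [(line, sorted(index.get(line, ()))) for line in new_lines]


previous_tree = {"a.c": ["foo();"], "b.c": ["bar();"]}
merged = ["foo();", "bar();", "baz();"]
for line, sources in line_sources(previous_tree, merged):
    print(line, sources)
```

Because the lookup runs over the whole snapshot, a file assembled from two others is handled the same way as a file edited in place.
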
The other example is doing things like "git log -p drivers/char", where
you don't ask for the log of a single file, but a general file pattern,
and get (still atomic!) commits as the result.
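
A minimal sketch of that idea, under stated assumptions: commits are modeled as full snapshots in a linear history, and the "pattern" is a plain path prefix rather than git's real pathspec matching. The `log_for_path` function and the sample paths are hypothetical.

```python
def log_for_path(commits, prefix):
    """Return ids of commits whose snapshot differs from the previous one under `prefix`.

    `commits` is a list of (commit_id, tree) pairs, oldest first, where
    `tree` maps a file path to that file's full content in the snapshot.
    """
    def subtree(tree):
        return {path: content for path, content in tree.items()
                if path.startswith(prefix)}

    selected = []
    previous = {}
    for commit_id, tree in commits:
        current = subtree(tree)
        # The commit is selected whole, whichever files under the prefix changed.
        if current != previous:
            selected.append(commit_id)
        previous = current
    return selected


commits = [
    ("c1", {"drivers/char/tty.c": "v1", "fs/inode.c": "v1"}),
    ("c2", {"drivers/char/tty.c": "v1", "fs/inode.c": "v2"}),
    ("c3", {"drivers/char/tty.c": "v2", "drivers/char/mem.c": "v1",
            "fs/inode.c": "v2"}),
]
print(log_for_path(commits, "drivers/char/"))
```

The result is a list of whole commits, not per-file revisions, which is what keeps them atomic.
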
And perhaps the best example is just tracking code when you have two files
that merge into one (possibly because the "same" file was created
independently in two different branches). git gets things like that right
without even thinking about it. Others tend to just flounder about and
can't do anything at all about it.
That said, I'll see if I can speed up "git blame" on the gcc repository.
It _is_ a fundamentally much more expensive operation than it is for
systems that do single-file things.
Linus