git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Daniel Berlin <dberlin@dberlin.org>, Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: git annotate runs out of memory
Date: Tue, 11 Dec 2007 13:14:18 -0800 (PST)
Message-ID: <alpine.LFD.0.9999.0712111300440.25032@woody.linux-foundation.org>
In-Reply-To: <alpine.LFD.0.9999.0712111122400.25032@woody.linux-foundation.org>



On Tue, 11 Dec 2007, Linus Torvalds wrote:
> 
> PS. I also do agree that we seem to use an excessive amount of memory 
> there. As to whether it's the same issue or not, I'd not go as far as Nico 
> and say "yes" yet. But it's interesting.

I think the answer here is that git-annotate is a totally different issue.

The blame machinery keeps around all the blobs it has ever needed to do a 
diff, which explains why something like gcc/ChangeLog blows up badly.

Try this trivial patch.

It will cause us to potentially re-generate some blobs much more often, but 
that's a reasonably cheap operation, and our delta base cache will get the 
expensive cases.

It's still not a free operation, but I get

	[torvalds@woody gcc]$ /usr/bin/time ~/git/git-blame gcc/ChangeLog > /dev/null
	20.68user 1.25system 0:21.94elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+599833minor)pagefaults 0swaps

so it took 22s and I never saw it grow very large either (it grew to 72M 
resident, but I don't know how much of that was the mmap of the 
pack-file, so that number is pretty meaningless). Valgrind reports that 
it used a maximum heap of about 24M, and almost all of that seems to have 
been in the delta cache (which is all good).

		Linus

----
 builtin-blame.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/builtin-blame.c b/builtin-blame.c
index c158d31..18f9924 100644
--- a/builtin-blame.c
+++ b/builtin-blame.c
@@ -87,6 +87,14 @@ struct origin {
 	char path[FLEX_ARRAY];
 };
 
+static void drop_origin_blob(struct origin *o)
+{
+	if (o->file.ptr) {
+		free(o->file.ptr);
+		o->file.ptr = NULL;
+	}
+}
+
 /*
  * Given an origin, prepare mmfile_t structure to be used by the
  * diff machinery
@@ -558,6 +566,8 @@ static struct patch *get_patch(struct origin *parent, struct origin *origin)
 	if (!file_p.ptr || !file_o.ptr)
 		return NULL;
 	patch = compare_buffer(&file_p, &file_o, 0);
+	drop_origin_blob(parent);
+	drop_origin_blob(origin);
 	num_get_patch++;
 	return patch;
 }
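
The drop-after-diff idea in the patch above can also be looked at in isolation. What follows is only a minimal sketch under stated assumptions, not the builtin-blame.c code: `struct sketch_origin`, `fill_blob()`, `drop_blob()` and `blobs_differ()` are hypothetical names, the `source` string stands in for the object database, and a plain `memcmp()` stands in for the real diff machinery. The point it illustrates is that a buffer freed right after the comparison is simply re-generated the next time it is needed, trading some repeated work for a bounded heap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's "struct origin": just the cached blob. */
struct sketch_origin {
	const char *source;	/* assumption: stands in for the object store */
	char *ptr;		/* cached blob contents, NULL once dropped */
	size_t size;
	int loads;		/* how often the blob had to be re-generated */
};

/* Return the cached blob, re-generating it from "source" if it was dropped. */
static char *fill_blob(struct sketch_origin *o)
{
	assert(o && o->source);
	if (!o->ptr) {
		o->size = strlen(o->source);
		o->ptr = malloc(o->size + 1);
		if (!o->ptr)
			return NULL;
		memcpy(o->ptr, o->source, o->size + 1);
		o->loads++;
	}
	return o->ptr;
}

/* Same idea as drop_origin_blob() in the patch: free and forget the buffer. */
static void drop_blob(struct sketch_origin *o)
{
	free(o->ptr);		/* free(NULL) is a no-op */
	o->ptr = NULL;
	o->size = 0;
}

/*
 * Compare two blobs, then drop both buffers so they do not stay resident.
 * Returns 1 if they differ, 0 if identical, -1 on allocation failure.
 */
static int blobs_differ(struct sketch_origin *parent, struct sketch_origin *origin)
{
	char *p = fill_blob(parent);
	char *o = fill_blob(origin);
	int differ = -1;

	if (p && o)
		differ = parent->size != origin->size ||
			 memcmp(p, o, parent->size) != 0;
	drop_blob(parent);
	drop_blob(origin);
	return differ;
}
```

Calling `blobs_differ()` twice on the same origin bumps its `loads` counter twice, which is the "re-generate some blobs" cost described in the message; in git itself the delta base cache is what keeps the expensive cases cheap, and this sketch does not model that.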
