git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@osdl.org>
To: Alex Riesen <raa.lkml@gmail.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: [PATCH] Speedup recursive by flushing index only once for all entries
Date: Thu, 11 Jan 2007 08:38:51 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0701110823300.3594@woody.osdl.org>
In-Reply-To: <81b0412b0701110102m5264696dg68a573e9d5f2a17c@mail.gmail.com>



On Thu, 11 Jan 2007, Alex Riesen wrote:
> On 1/11/07, Linus Torvalds <torvalds@osdl.org> wrote:
> > >
> > > Yep. Tried the monster merge on it: 1m15sec on that small laptop.
> > 
> > Is that supposed to be good? That still sounds really slow to me. What
> > kind of nasty project are you doing? Is this the 44k file project, and
> > under cygwin? Or is it that bad even under Linux?
> 
> It is that "bad" on a 384MB linux laptop and 1.2GHz Celeron.
> Yes, it is that 44k files project. The previous code finishes
> that merge on that laptop in about 20 minutes, so it's definitely
> an improvement. My cygwin machine has a lot more memory (2GB),
> so I can't really compare them here.

Ok. Junio, I'd suggest putting it into 1.5.0, then - it's a fairly simple 
thing, after all, and if it's the difference between 20 minutes and just 
over one minute, it clearly matters.

With 384MB of memory, and 44 thousand files, I bet the problem is just 
that the working set doesn't fit entirely in RAM. It probably caches 
*most* of it, but inodes and directories are spread out on disk (and I 
assume there are more files in the actual working tree), so writing out 
a 6MB index file (or whatever) and then reading it back several times 
ends up generating IO simply because 6MB is actually a noticeable chunk 
of memory in that situation.

(It also generates a ton of tree objects early, so the effect at run-time 
is probably much more than 6MB).

That said, I think we actually have another problem entirely:

Look at "write_cache()", Junio: isn't it leaking memory like mad?

Shouldn't we have something like this?

It's entirely possible that the _real_ problem with the "flush the index 
all the time" was that it just caused this bug: tons and tons of lost 
memory, causing git-merge-recursive to grow explosively (~6MB per 
cache flush, and a _lot_ of cache flushes), which on a 384MB machine 
quickly uses up memory and causes totally unnecessary swapping.

Of course, it's also entirely possible that I'm a complete retard, and 
just didn't see where the data buffer is still used or freed.

"Linus - complete retard or hero in shining armor? You decide!"

		Linus

---
diff --git a/read-cache.c b/read-cache.c
index 8ecd826..c54a611 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1010,7 +1010,7 @@ int write_cache(int newfd, struct cache_entry **cache, int entries)
 		if (data &&
 		    !write_index_ext_header(&c, newfd, CACHE_EXT_TREE, sz) &&
 		    !ce_write(&c, newfd, data, sz))
-			;
+			free(data);
 		else {
 			free(data);
 			return -1;
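
For readers who want to see the effect in isolation, here is a minimal 
sketch of the same control flow outside of git. It is not the code from 
read-cache.c: tracked_alloc(), tracked_free(), fake_write(), 
flush_leaky(), flush_fixed() and leaked_after() are all hypothetical 
names made up for illustration, and the allocation counter only stands 
in for what a real leak checker would report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter: number of buffers allocated but not yet freed. */
static int live_buffers;

static void *tracked_alloc(size_t sz)
{
	void *p = malloc(sz);
	if (p)
		live_buffers++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_buffers--;
		free(p);
	}
}

/* Stand-in for the write calls: 0 on success, -1 on failure. */
static int fake_write(const void *data, size_t sz, int fail)
{
	(void)data;
	(void)sz;
	return fail ? -1 : 0;
}

/* Pre-patch shape: the buffer is freed only on the error path. */
int flush_leaky(size_t sz, int fail)
{
	void *data = tracked_alloc(sz);

	if (data && !fake_write(data, sz, fail))
		;	/* success: buffer is dropped without free() */
	else {
		tracked_free(data);
		return -1;
	}
	return 0;
}

/* Post-patch shape: the buffer is freed on both paths. */
int flush_fixed(size_t sz, int fail)
{
	void *data = tracked_alloc(sz);

	if (data && !fake_write(data, sz, fail))
		tracked_free(data);
	else {
		tracked_free(data);
		return -1;
	}
	return 0;
}

/* Run 'flushes' successful flushes and report how many buffers remain.
 * With flush_leaky this intentionally leaks the buffers it counts. */
int leaked_after(int (*flush)(size_t, int), int flushes)
{
	int i;

	assert(flushes >= 0);
	live_buffers = 0;
	for (i = 0; i < flushes; i++)
		flush(64, 0);
	return live_buffers;
}
```

In this sketch every successful call to flush_leaky() leaves one buffer 
behind, while flush_fixed() leaves none. Taking the rough ~6MB-per-flush 
figure from above at face value, 64 leaked flushes would already add up 
to 384MB.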
