git.vger.kernel.org archive mirror
* [PATCH] pack-objects: never deltify objects bigger than window_memory_limit.
From: Avery Pennarun @ 2010-09-22 10:25 UTC
  To: git, gitster, nico; +Cc: Avery Pennarun

With very large objects, just loading them into the delta window wastes a
huge amount of memory.  In one repo, I have some objects around 1GB in size,
and git-pack-objects seems to require about 8x that in order to deltify them,
even when the window memory limit is small (e.g. --window-memory=100M).  With
this patch, the maximum memory usage is about halved.

Perhaps more importantly, however, disabling deltification for large objects
seems to reduce memory thrashing when you can't fit multiple large objects
into physical RAM at once.  It seems to be the difference between "never
finishes" and "finishes eventually" for me.

Test:

I created a test repo with 10 sequential commits containing a bunch of
nearly-identical 110MB files (just appending a line each time).

Without this patch:

    $ /usr/bin/time git repack -a --window-memory=100M

    Counting objects: 43, done.
    warning: suboptimal pack - out of memory
    Compressing objects: 100% (43/43), done.
    Writing objects: 100% (43/43), done.
    Total 43 (delta 14), reused 0 (delta 0)
    42.79user 1.07system 0:44.53elapsed 98%CPU (0avgtext+0avgdata
      866736maxresident)k
      0inputs+2752outputs (0major+718471minor)pagefaults 0swaps

With this patch:

    $ /usr/bin/time git repack -a --window-memory=100M

    Counting objects: 43, done.
    Compressing objects: 100% (30/30), done.
    Writing objects: 100% (43/43), done.
    Total 43 (delta 14), reused 0 (delta 0)
    35.86user 0.65system 0:36.30elapsed 100%CPU (0avgtext+0avgdata
      437568maxresident)k
      0inputs+2768outputs (0major+366137minor)pagefaults 0swaps

It apparently still uses about 4x the memory of the largest object, which is
about twice as good as before, though still kind of awful.  (Ideally, we
wouldn't even load the entire large object into memory even once.)
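
As a rough cross-check of the "8x", "4x" and "about halved" figures, here is a
small sketch that divides the two maxresident values above by the 110MB file
size.  It assumes GNU time reports maxresident in kilobytes and that 110MB
means 110 * 1024 KB; the function names are made up for illustration and are
not part of git.

```c
#include <stdio.h>

/* Assumption: GNU time reports "maxresident" in kilobytes. */
#define KB_PER_MB 1024.0

/* Peak resident memory as a multiple of the largest object's size. */
double rss_multiple(long maxresident_kb, double object_mb)
{
	return (double)maxresident_kb / (object_mb * KB_PER_MB);
}

/* Print the multiples for the two test runs quoted in this message. */
void report_runs(void)
{
	const double object_mb = 110.0;  /* size of each test file */
	const long before_kb = 866736;   /* maxresident without the patch */
	const long after_kb = 437568;    /* maxresident with the patch */

	printf("without patch: %.1fx\n", rss_multiple(before_kb, object_mb));
	printf("with patch:    %.1fx\n", rss_multiple(after_kb, object_mb));
	printf("reduction:     %.2fx\n", (double)before_kb / (double)after_kb);
}
```

Calling report_runs() from a main() should give multiples a little under 8x
and 4x for the 110MB test files, and a reduction factor just under 2.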

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 builtin/pack-objects.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0e81673..9f1a289 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1791,6 +1791,9 @@ static void prepare_pack(int window, int depth)
 		if (entry->size < 50)
 			continue;
 
+		if (window_memory_limit && entry->size > window_memory_limit)
+			continue;
+
 		if (entry->no_try_delta)
 			continue;
 
-- 
1.7.3.1.gca9d1
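
For readers who want to poke at the condition outside of git, here is a
minimal stand-alone sketch of the three checks visible in the hunk above.
The function name should_try_delta() and its flat parameters are hypothetical
(the real code works on an object entry inside the prepare_pack() loop, which
has other checks not shown in this hunk), and the unsigned long types are an
assumption.

```c
/*
 * Hypothetical stand-alone version of the checks shown in the hunk.
 * Returns 1 if an object of the given size would still be considered
 * for deltification, 0 if it would be skipped.
 */
int should_try_delta(unsigned long size,
		     unsigned long window_memory_limit,
		     int no_try_delta)
{
	/* Tiny objects are not worth deltifying. */
	if (size < 50)
		return 0;

	/* New check: a limit of 0 means "no limit", so only skip when set. */
	if (window_memory_limit && size > window_memory_limit)
		return 0;

	/* Object was already marked as not suitable for delta attempts. */
	if (no_try_delta)
		return 0;

	return 1;
}
```

Note that the comparison is strict, so an object exactly as large as the
limit is still tried, and with no limit configured the behaviour is the same
as before the patch.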

Thread overview: 3+ messages
2010-09-22 10:25 [PATCH] pack-objects: never deltify objects bigger than window_memory_limit Avery Pennarun
2010-09-22 12:00 ` Nicolas Pitre
2010-09-23  5:01   ` Avery Pennarun
