From: Avery Pennarun <apenwarr@gmail.com>
To: git@vger.kernel.org, gitster@pobox.com, nico@fluxnic.net
Cc: Avery Pennarun <apenwarr@gmail.com>
Subject: [PATCH] pack-objects: never deltify objects bigger than window_memory_limit.
Date: Wed, 22 Sep 2010 03:25:05 -0700
Message-ID: <1285151105-32454-1-git-send-email-apenwarr@gmail.com>
With very large objects, just loading them into the delta window wastes a
huge amount of memory. In one repo, I have some objects around 1GB in size,
and git-pack-objects seems to require about 8x that much memory in order to
deltify them, even when the window memory limit is small (e.g.
--window-memory=100M). With this patch, the maximum memory usage is about
halved.
Perhaps more importantly, however, disabling deltification for large objects
seems to reduce memory thrashing when you can't fit multiple large objects
into physical RAM at once. It seems to be the difference between "never
finishes" and "finishes eventually" for me.
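The skip rule described above can be sketched in isolation as follows. This
is only an illustration, not the code in builtin/pack-objects.c: the struct
entry_sketch type and the is_delta_candidate() helper are hypothetical names
invented here, and only the three checks visible in the hunk are modeled. A
limit of 0 is treated as "no limit", matching the condition in the patch.

```c
/* Hypothetical stand-in for git's object entry; only the fields used here. */
struct entry_sketch {
	unsigned long size;	/* uncompressed object size, in bytes */
	int no_try_delta;	/* nonzero if deltification is disabled */
};

/*
 * Return 1 if the entry should be offered to the delta search,
 * 0 if it should be skipped. window_memory_limit == 0 means no limit.
 */
static int is_delta_candidate(const struct entry_sketch *e,
			      unsigned long window_memory_limit)
{
	/* Tiny objects are not worth deltifying. */
	if (e->size < 50)
		return 0;

	/* New check: an object larger than the whole window budget is skipped. */
	if (window_memory_limit && e->size > window_memory_limit)
		return 0;

	if (e->no_try_delta)
		return 0;

	return 1;
}
```

With a 100M limit, a 110MB entry is rejected by the second check, while the
same entry is still accepted when no limit is configured.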
Test:
I created a test repo with 10 sequential commits containing a bunch of
nearly-identical 110MB files (just appending a line each time).
Without this patch:
$ /usr/bin/time git repack -a --window-memory=100M
Counting objects: 43, done.
warning: suboptimal pack - out of memory
Compressing objects: 100% (43/43), done.
Writing objects: 100% (43/43), done.
Total 43 (delta 14), reused 0 (delta 0)
42.79user 1.07system 0:44.53elapsed 98%CPU (0avgtext+0avgdata 866736maxresident)k
0inputs+2752outputs (0major+718471minor)pagefaults 0swaps
With this patch:
$ /usr/bin/time -a git repack -a --window-memory=100M
Counting objects: 43, done.
Compressing objects: 100% (30/30), done.
Writing objects: 100% (43/43), done.
Total 43 (delta 14), reused 0 (delta 0)
35.86user 0.65system 0:36.30elapsed 100%CPU (0avgtext+0avgdata 437568maxresident)k
0inputs+2768outputs (0major+366137minor)pagefaults 0swaps
It apparently still uses about 4x the memory of the largest object, which is
about twice as good as before, though still kind of awful. (Ideally, we
wouldn't even load the entire large object into memory even once.)
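As a rough check of the "8x" and "4x" figures, the ratio can be worked out
from the maxresident values in the two runs. This sketch assumes that
maxresident is reported in KiB (as the trailing "k" in the time output
suggests) and that "110MB" means 110 MiB; rss_ratio() is a hypothetical
helper name used only for this illustration.

```c
/*
 * Ratio of peak resident memory to the size of the largest object.
 * maxresident_kib: the "maxresident" figure from /usr/bin/time, in KiB.
 * object_mib:      size of the largest object, in MiB.
 */
static double rss_ratio(unsigned long maxresident_kib, unsigned long object_mib)
{
	return (double)maxresident_kib / 1024.0 / (double)object_mib;
}
```

Under those assumptions, rss_ratio(866736, 110) comes to roughly 7.7 and
rss_ratio(437568, 110) to roughly 3.9, which is where the approximate 8x and
4x numbers come from.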
Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
builtin/pack-objects.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0e81673..9f1a289 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1791,6 +1791,9 @@ static void prepare_pack(int window, int depth)
 		if (entry->size < 50)
 			continue;
 
+		if (window_memory_limit && entry->size > window_memory_limit)
+			continue;
+
 		if (entry->no_try_delta)
 			continue;
 
--
1.7.3.1.gca9d1