* [PATCH] pack-objects: only throw away data during memory pressure
From: Martin Koegler @ 2008-02-11 7:26 UTC
To: Nicolas Pitre, Johannes Schindelin; +Cc: git, Martin Koegler
If pack-objects hits the memory limit, it deletes objects from the delta
window.
This patch makes it delete only the data, which is recomputed if needed again.
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
---
What about this not-really-tested patch for dealing with memory pressure in git-pack-objects?
It will slow down the repack under memory pressure, but a lack of memory will no longer affect the results.
builtin-pack-objects.c | 13 +++++++++++--
1 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 6f8f388..231d65f 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1464,7 +1464,7 @@ static unsigned int check_delta_limit(struct object_entry *me, unsigned int n)
         return m;
 }
 
-static unsigned long free_unpacked(struct unpacked *n)
+static unsigned long free_unpacked_data(struct unpacked *n)
 {
         unsigned long freed_mem = sizeof_delta_index(n->index);
         free_delta_index(n->index);
@@ -1474,6 +1474,12 @@ static unsigned long free_unpacked(struct unpacked *n)
                 free(n->data);
                 n->data = NULL;
         }
+        return freed_mem;
+}
+
+static unsigned long free_unpacked(struct unpacked *n)
+{
+        unsigned long freed_mem = free_unpacked_data(n);
         n->entry = NULL;
         n->depth = 0;
         return freed_mem;
@@ -1514,7 +1520,7 @@ static void find_deltas(struct object_entry **list, unsigned *list_size,
                        mem_usage > window_memory_limit &&
                        count > 1) {
                         uint32_t tail = (idx + window - count) % window;
-                        mem_usage -= free_unpacked(array + tail);
+                        mem_usage -= free_unpacked_data(array + tail);
                         count--;
                 }
 
@@ -1547,6 +1553,9 @@ static void find_deltas(struct object_entry **list, unsigned *list_size,
                 if (!m->entry)
                         break;
                 ret = try_delta(n, m, max_depth, &mem_usage);
+                if (window_memory_limit &&
+                    mem_usage > window_memory_limit)
+                        mem_usage -= free_unpacked_data(m);
                 if (ret < 0)
                         break;
                 else if (ret > 0)
--
1.5.4.g42f90
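The shape of the change, reduced to a self-contained sketch (the struct and the size fields below are simplified stand-ins invented for illustration; they are not git's actual definitions): eviction is split in two, so the window can drop an entry's rebuildable buffers without forgetting the entry itself.

    #include <stdlib.h>

    struct unpacked {
            void *index;              /* delta index, rebuildable from data */
            void *data;               /* object contents, reloadable on demand */
            void *entry;              /* identifies the object in the window */
            unsigned depth;
            unsigned long index_size; /* stand-in for sizeof_delta_index() */
            unsigned long data_size;
    };

    /* Free only what can be recomputed; the entry stays in the window
     * and the data is reloaded the next time a delta attempt needs it. */
    static unsigned long free_unpacked_data(struct unpacked *n)
    {
            unsigned long freed_mem = n->index_size + n->data_size;
            free(n->index); n->index = NULL; n->index_size = 0;
            free(n->data);  n->data = NULL;  n->data_size = 0;
            return freed_mem;
    }

    /* Evict the entry from the window entirely (the pre-patch behavior). */
    static unsigned long free_unpacked(struct unpacked *n)
    {
            unsigned long freed_mem = free_unpacked_data(n);
            n->entry = NULL;
            n->depth = 0;
            return freed_mem;
    }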
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Johannes Schindelin @ 2008-02-11 15:20 UTC
To: Martin Koegler; +Cc: Nicolas Pitre, git
Hi,
On Mon, 11 Feb 2008, Martin Koegler wrote:
> What about this not-really-tested patch for dealing with memory pressure
> in git-pack-objects?
>
> It will slow down the repack under memory pressure, but a lack of memory
> will no longer affect the results.
It almost helped:
$ /usr/bin/time git repack -a -d -f --window=250 --depth=250
Counting objects: 2477715, done.
fatal: Out of memory, malloc failed411764)
Command exited with non-zero status 1
10050.12user 240.63system 2:53:37elapsed 98%CPU (0avgtext+0avgdata
0maxresident)k
0inputs+0outputs (29555major+94032945minor)pagefaults 0swaps
So, it ran longer before it ran out of memory.
Ciao,
Dscho
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Nicolas Pitre @ 2008-02-11 16:00 UTC
To: Johannes Schindelin; +Cc: Martin Koegler, git
On Mon, 11 Feb 2008, Johannes Schindelin wrote:
> Hi,
>
> On Mon, 11 Feb 2008, Martin Koegler wrote:
>
> > What about this not-really-tested patch for dealing with memory
> > pressure in git-pack-objects?
> >
> > It will slow down the repack under memory pressure, but a lack of
> > memory will no longer affect the results.
>
> It almost helped:
>
> $ /usr/bin/time git repack -a -d -f --window=250 --depth=250
> Counting objects: 2477715, done.
> fatal: Out of memory, malloc failed411764)
> Command exited with non-zero status 1
> 10050.12user 240.63system 2:53:37elapsed 98%CPU (0avgtext+0avgdata
> 0maxresident)k
> 0inputs+0outputs (29555major+94032945minor)pagefaults 0swaps
>
> So, it ran longer before it ran out of memory.
What it can do for you is let you limit the window memory usage much more
aggressively, say to 128MB, without affecting the end result. Of course the
repack will then progress much more slowly if active purging of the window
memory is involved.
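Concretely, that would be an invocation along these lines (the exact cap is
just an example; setting pack.windowMemory in the config has the same effect):

    git repack -a -d -f --window=250 --depth=250 --window-memory=128m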
If you still run out of memory at that point, then there is not much more
to do besides using a memory allocator that doesn't suffer as much from
memory fragmentation, or finding the possible memory leak that no one else
has found so far.
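For experimenting with a different allocator there is no need to rebuild git;
preloading one should do, e.g. (the library path is an assumption and depends
on where tcmalloc is installed on your system):

    LD_PRELOAD=/usr/lib/libtcmalloc.so git repack -a -d -f --window=250 --depth=250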
Nicolas
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Nicolas Pitre @ 2008-02-11 16:00 UTC
To: Martin Koegler; +Cc: Johannes Schindelin, git
On Mon, 11 Feb 2008, Martin Koegler wrote:
> If pack-objects hits the memory limit, it deletes objects from the delta
> window.
>
> This patch makes it delete only the data, which is recomputed if needed again.
>
> Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Looks fine.
Acked-by: Nicolas Pitre <nico@cam.org>
Nicolas
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Johannes Schindelin @ 2008-02-11 16:08 UTC
To: Nicolas Pitre; +Cc: Martin Koegler, git
Hi,
On Mon, 11 Feb 2008, Nicolas Pitre wrote:
> On Mon, 11 Feb 2008, Johannes Schindelin wrote:
>
> > On Mon, 11 Feb 2008, Martin Koegler wrote:
> >
> > > What about this not-really-tested patch for dealing with memory
> > > pressure in git-pack-objects?
> > >
> > > It will slow down the repack under memory pressure, but a lack of
> > > memory will no longer affect the results.
> >
> > It almost helped:
> >
> > $ /usr/bin/time git repack -a -d -f --window=250 --depth=250
> > Counting objects: 2477715, done.
> > fatal: Out of memory, malloc failed411764)
> > Command exited with non-zero status 1
> > 10050.12user 240.63system 2:53:37elapsed 98%CPU (0avgtext+0avgdata
> > 0maxresident)k
> > 0inputs+0outputs (29555major+94032945minor)pagefaults 0swaps
> >
> > So, it ran longer before it ran out of memory.
>
> What it can do for you is let you limit the window memory usage much more
> aggressively, say to 128MB, without affecting the end result. Of course the
> repack will then progress much more slowly if active purging of the window
> memory is involved.
Oh, well. I did not think of setting a smaller windowMemory. Will try
that next.
> If you still run out of memory at that point, then there is not much more
> to do besides using a memory allocator that doesn't suffer as much from
> memory fragmentation, or finding the possible memory leak that no one else
> has found so far.
Completely forgot to say: I am running with ememoa (that is the Google
allocator, right?).
Ciao,
Dscho
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Brian Downing @ 2008-02-12 8:22 UTC
To: Nicolas Pitre; +Cc: Martin Koegler, Johannes Schindelin, git
On Mon, Feb 11, 2008 at 11:00:33AM -0500, Nicolas Pitre wrote:
> On Mon, 11 Feb 2008, Martin Koegler wrote:
> > If pack-objects hits the memory limit, it deletes objects from the
> > delta window.
> >
> > This patch makes it delete only the data, which is recomputed if
> > needed again.
> >
> > Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
>
> Looks fine.
>
> Acked-by: Nicolas Pitre <nico@cam.org>
Unfortunately this patch (if I understand what it's doing correctly)
basically defeats the use-case for which I wrote the memory limiter. I
have a repository with files of very mixed size. I want the window to be
very large for small files, for good archival repacking, but I don't want
it to be very large for my 20+MB files with hundreds of revisions, because
I want it to finish someday.
Also, I've gotten into the habit of just doing:
git repack --window=100000 --window-memory=256m
for archival repacks and just letting the memory limit automatically
size the window. Basically, I don't really want to specify a window size;
I just want it to use 256MB of RAM (and go at the speed that size of window
would entail). While this is slow, it tends to run at a relatively constant
speed, and it tends to find some very interesting deltas in my trees that I
wouldn't otherwise have expected.
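(For what it's worth, the same policy can be made the repository default so
that a plain git repack picks it up:

    git config pack.window 100000
    git config pack.windowMemory 256m
)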
If this patch is accepted, I'd really like a way to maintain the old
behavior as an option.
-bcd
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Nicolas Pitre @ 2008-02-12 14:26 UTC
To: Brian Downing; +Cc: Martin Koegler, Johannes Schindelin, git
On Tue, 12 Feb 2008, Brian Downing wrote:
> On Mon, Feb 11, 2008 at 11:00:33AM -0500, Nicolas Pitre wrote:
> > On Mon, 11 Feb 2008, Martin Koegler wrote:
> > > If pack-objects hits the memory limit, it deletes objects from the
> > > delta window.
> > >
> > > This patch makes it delete only the data, which is recomputed if
> > > needed again.
> > >
> > > Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
> >
> > Looks fine.
> >
> > Acked-by: Nicolas Pitre <nico@cam.org>
>
> Unfortunately this patch (if I understand what it's doing correctly)
> basically defeats the use-case for which I wrote the memory limiter. I
> have a repository with files of very mixed size. I want the window to be
> very large for small files, for good archival repacking, but I don't want
> it to be very large for my 20+MB files with hundreds of revisions, because
> I want it to finish someday.
>
> Also, I've gotten into the habit of just doing:
> git repack --window=100000 --window-memory=256m
> for archival repacks and just letting the memory limit automatically
> size the window. Basically, I don't really want to specify a window size;
> I just want it to use 256MB of RAM (and go at the speed that size of window
> would entail). While this is slow, it tends to run at a relatively constant
> speed, and it tends to find some very interesting deltas in my trees that I
> wouldn't otherwise have expected.
>
> If this patch is accepted, I'd really like a way to maintain the old
> behavior as an option.
I think your use case has merits, but the previous behavior had semantic
problems. We have always had a constant window size with dynamic memory
usage, and now we have a constant window size with bounded memory usage.
If what you want is really a dynamic window size with constant memory
usage, then it needs a different and coherent way to be specified.
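(Purely to illustrate the distinction, one hypothetical shape for such an
interface, where the "auto" value is invented here and does not exist in
any git version:

    git repack --window=auto --window-memory=256m

i.e. grow the window until the memory cap is hit, instead of using a fixed
entry count.)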
Nicolas
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Brian Downing @ 2008-02-12 17:14 UTC
To: Nicolas Pitre; +Cc: Martin Koegler, Johannes Schindelin, git
On Tue, Feb 12, 2008 at 09:26:25AM -0500, Nicolas Pitre wrote:
> I think your use case has merits, but the previous behavior had semantic
> problems. We have always had a constant window size with dynamic memory
> usage, and now we have a constant window size with bounded memory usage.
>
> If what you want is really a dynamic window size with constant memory
> usage, then it needs a different and coherent way to be specified.
Sometimes I want a bounded window size with bounded memory usage; i.e. a
maximum of 50 entries OR 256 megs worth. That's for everyday repacking
of my troublesome repository; without the window going down to less than
10 or so for the large files, it still takes way too long, but doing the
whole thing at 10 makes for very poor packing.
So that gives four options:
1. No memory limit, constant entry depth. Original Git behavior.
2. The above with an additional memory-based depth limit. This is
what was added with --window-memory.
3. Constant entry depth with a memory-usage limit. This is what the
proposed patch does.
4. Dynamic entry depth, with a memory-based limit. This is I believe
what you are proposing above, and what I emulate by setting
--window=$bignum --window-memory=x.
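In terms of today's flags this maps out roughly as follows (a sketch; note
that cases 2 and 3 share the same invocation, and only the git version
decides which behavior you get, which is exactly the missing knob):

    git repack --window=50                           # 1: fixed window, unbounded memory
    git repack --window=50 --window-memory=256m      # 2 or 3, depending on version
    git repack --window=100000 --window-memory=256m  # 4: window effectively sized by memory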
I'm willing to try and make all of those work. (Though frankly I don't
care much about #3; setting the window entry size to something "large
enough" seems a simple enough work-around for me, and it prevents what's
probably some truly ridiculous behavior if you have a gigantic number of
tiny, say, tree objects. Having a cap on window depth stops that case
from taking a truly inordinate amount of time.)
However, I can't figure out what sensible command-line and/or config
parameters would be for the cases above. Any ideas?
-bcd
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Brian Downing @ 2008-02-12 17:17 UTC
To: Nicolas Pitre; +Cc: Martin Koegler, Johannes Schindelin, git
On Tue, Feb 12, 2008 at 11:14:03AM -0600, Brian Downing wrote:
> Sometimes I want a bounded window size with bounded memory usage; i.e. a
> maximum of 50 entries OR 256 megs worth. That's for everyday repacking
> of my troublesome repository; without the window going down to less than
> 10 or so for the large files, it still takes way too long, but doing the
> whole thing at 10 makes for very poor packing.
(Yeah, the default depth is 10. I admit to shooting from the hip a bit
in my description above, but I do really use that mode of operation, and
wouldn't like to lose it.)
-bcd
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Brian Downing @ 2008-02-12 17:28 UTC
To: Nicolas Pitre; +Cc: Martin Koegler, Johannes Schindelin, git
On Tue, Feb 12, 2008 at 11:14:03AM -0600, Brian Downing wrote:
> 3. Constant entry depth with a memory-usage limit. This is what the
> proposed patch does.
> 4. Dynamic entry depth, with a memory-based limit. This is I believe
> what you are proposing above, and what I emulate by setting
> --window=$bignum --window-memory=x.
[...]
> (Though frankly I don't care much about #3; setting the window entry
> size to something "large enough" seems a simple enough work-around for
> me...)
I meant #4 of course, I renumbered one case but not the other. I need
to not post right after I wake up. :)
-bcd
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Nicolas Pitre @ 2008-02-12 17:39 UTC
To: Brian Downing; +Cc: Martin Koegler, Johannes Schindelin, git
On Tue, 12 Feb 2008, Brian Downing wrote:
> So that gives four options:
>
> 1. No memory limit, constant entry depth. Original Git behavior.
> 2. The above with an additional memory-based depth limit. This is
> what was added with --window-memory.
> 3. Constant entry depth with a memory-usage limit. This is what the
> proposed patch does.
... and it was merged already.
> 4. Dynamic entry depth, with a memory-based limit. This is I believe
> what you are proposing above, and what I emulate by setting
> --window=$bignum --window-memory=x.
>
> I'm willing to try and make all of those work. (Though frankly I don't
> care much about #3; setting the window entry size to something "large
> enough" seems a simple enough work-around for me, and it prevents what's
> probably some truly ridiculous behavior if you have a gigantic number of
> tiny, say, tree objects. Having a cap on window depth stops that case
> from taking a truly inordinate amount of time.)
>
> However, I can't figure out what sensible command-line and/or config
> parameters would be for the cases above. Any ideas?
I explicitly avoided suggesting something in my previous email exactly
because I don't have a clear idea myself for a sensible parameter. ;-)
Maybe using a negative window size could mean it is dynamic, but capped
at the provided absolute value? Although a bit strange, this is still a
specialized mode of operation, and an odd argument for it shouldn't deter
those who understand the implications and are willing to use it. The more
natural mode of operation with a memory cap is not to change the end result
but possibly to experience a slowdown (which is what the merged patch does),
and that is why I considered the patch good.
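A rough sketch of that interpretation at the option-parsing level (purely
illustrative; the names are invented and this is not git code):

    /* Hypothetical: window < 0 means 'dynamic window, capped at |window|
     * entries'; the memory limit then sizes the window below that cap. */
    static int normalize_window(int window, int *dynamic)
    {
            *dynamic = (window < 0);
            return *dynamic ? -window : window;
    }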
Nicolas
* Re: [PATCH] pack-objects: only throw away data during memory pressure
From: Nicolas Pitre @ 2008-02-13 1:48 UTC
To: Martin Koegler, Junio C Hamano; +Cc: Johannes Schindelin, git, Brian Downing
On Mon, 11 Feb 2008, Nicolas Pitre wrote:
> On Mon, 11 Feb 2008, Martin Koegler wrote:
>
> > If pack-objects hits the memory limit, it deletes objects from the delta
> > window.
> >
> > This patch makes it delete only the data, which is recomputed if needed again.
> >
> > Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
>
> Looks fine.
>
> Acked-by: Nicolas Pitre <nico@cam.org>
Well, I take that back.
Some testing on the OOO repository shows this to be completely unusable.
By the time the limit kicks in and data is actively thrown away, performance
simply goes down the drain, with the data constantly being reloaded over and
over and over again, to the point of making virtually no progress at all.
So this change is not actually helping anything. The previous behavior
of enforcing the memory limit by dynamically shrinking the window size
at least had the effect of allowing some kind of progress, even if the
end result wouldn't be optimal.
And that's the whole point behind this memory limiting feature: allowing
some progress to be made when resources are too limited to let the
repack go unbounded.
Therefore I think commit 9c2174350cc0ae0f6bad126e15fe1f9f044117ab should
be reverted.
Nicolas