git.vger.kernel.org archive mirror
* [PATCH 0/5] Memory-limited pack-object window support
@ 2007-07-12  3:14 Brian Downing
  2007-07-12  3:14 ` [PATCH 1/5] Don't try to delta if target is much smaller than source Brian Downing
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Brian Downing @ 2007-07-12  3:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Brian Downing

This patch series implements a memory limit on the window size for
pack-objects and repack.  Basically, the window will temporarily
shrink below what the --window option specifies whenever its total
memory usage exceeds the specified limit.

With this series I can run even with --window=1000 on my troublesome
repository (described in an email a couple of days ago; basically, it
has a 20MB file with around 200 revisions, plus a bunch of normal sized
files) with memory usage limited to 512MB and it will happily scale down
the window to accommodate the file without blowing out my machine.

The --window option still specifies the size of the window and is still
required; the window will not grow to reach the memory limit, it will
only shrink to fit.  I think this may be a feature, as running with a
very large window and a memory limit basically means that you will
pack at an approximately constant slow speed, rather than rushing ahead
as it does now for very small objects.

I took the easy way out and expire objects out of the window after
allocation has occurred, rather than figuring out how much needs to
be cleared before allocating.  This made the logic much simpler,
though.
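
As a rough illustration of the "allocate first, then expire" approach, here
is a minimal standalone sketch.  It is not the code from this series: the
names toy_window, toy_insert and toy_evict and the WINDOW constant are
hypothetical, each slot tracks only a byte count, and the oldest slot is
located with the modular form suggested in the review of patch 3.

```c
#include <assert.h>

#define WINDOW 4	/* illustrative slot count, standing in for --window */

struct toy_window {
	unsigned long size[WINDOW];	/* bytes held by each slot; 0 if empty */
	unsigned long usage;		/* running total of size[] */
	unsigned long limit;		/* 0 means "no memory limit" */
	unsigned int head;		/* next slot to fill */
	unsigned int count;		/* number of populated slots */
};

static void toy_evict(struct toy_window *w, unsigned int slot)
{
	assert(w->usage >= w->size[slot]);	/* accounting must stay consistent */
	w->usage -= w->size[slot];
	w->size[slot] = 0;
}

/* Adds one object of `bytes` bytes; returns the populated slot count. */
static unsigned int toy_insert(struct toy_window *w, unsigned long bytes)
{
	/* When the ring is full, the head slot holds the oldest entry. */
	if (w->count == WINDOW) {
		toy_evict(w, w->head);
		w->count--;
	}
	w->size[w->head] = bytes;
	w->usage += bytes;
	w->count++;
	w->head = (w->head + 1) % WINDOW;

	/* The allocation already happened; trim oldest-first, keep at least one. */
	while (w->limit && w->usage > w->limit && w->count > 1) {
		unsigned int tail = (w->head + WINDOW - w->count) % WINDOW;
		toy_evict(w, tail);
		w->count--;
	}
	return w->count;
}
```

With a limit of 100, inserting objects of 30, 30, 30 and then 50 bytes
leaves two entries (80 bytes); a following 500-byte object leaves only
itself, over the limit, because the last entry is never evicted.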

I chose --window-bytes=N and pack.windowBytes=N as my option and
configuration names.  I'm not in love with them, though; no other
configuration in Git mentions bytes in the name as far as I can see,
but I wanted to clearly differentiate it from the window "size" that
--window gets you.

The first patch in this series is optional, but I recommend it.  It makes
it so that files that are much smaller than a potential delta source
don't even try to delta with it.  This is handy for running in a mixed
file size repository with a large window, as it means that when you get
to small files again you start moving fast without having to wait for
the large objects to trickle out of the window.  The cut-off is currently
1/32 of the source's size, but that number is completely arbitrary.
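
To make the cut-off concrete, here is a small sketch of the check in
isolation.  The helper name target_too_small and its ratio parameter are
hypothetical; the patch itself inlines the comparison in try_delta() with
32 hard-coded.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper for illustration; the patch inlines this test in
 * try_delta() with the ratio hard-coded to 32.
 */
static bool target_too_small(unsigned long trg_size, unsigned long src_size,
			     unsigned long ratio)
{
	assert(ratio > 0);	/* guard the division below */
	/* Integer division: a source smaller than `ratio` bytes never skips. */
	return trg_size < src_size / ratio;
}
```

For a 20MB source and a ratio of 32, any target under 655360 bytes is
skipped, while a 1MB target is still tried.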

 [PATCH 1/5] Don't try to delta if target is much smaller than source
 [PATCH 2/5] Support fetching the memory usage of a delta index
 [PATCH 3/5] Add pack-objects window memory usage limit
 [PATCH 4/5] Add --window-bytes option to git-repack
 [PATCH 5/5] Add documentation for --window-bytes, pack.windowBytes

 Documentation/config.txt           |    5 +++
 Documentation/git-pack-objects.txt |    8 +++++
 Documentation/git-repack.txt       |    8 +++++
 builtin-pack-objects.c             |   58 +++++++++++++++++++++++++++++++----
 delta.h                            |    7 ++++
 diff-delta.c                       |   10 ++++++
 git-repack.sh                      |    3 +-
 7 files changed, 91 insertions(+), 8 deletions(-)

-bcd


* [PATCH 1/5] Don't try to delta if target is much smaller than source
  2007-07-12  3:14 [PATCH 0/5] Memory-limited pack-object window support Brian Downing
@ 2007-07-12  3:14 ` Brian Downing
  2007-07-12  3:14 ` [PATCH 2/5] Support fetching the memory usage of a delta index Brian Downing
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Brian Downing @ 2007-07-12  3:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Brian Downing

Add a new try_delta heuristic:  Don't bother trying to make a delta if
the target object size is much smaller (currently 1/32) than the source,
as it's very likely not going to get a match.  Even if it does, you will
have to read at least 32x the size of the new file to reassemble it,
which isn't such a good deal.  This leads to a considerable performance
improvement when deltifying a mix of small and large files with a very
large window, because you don't have to wait for the large files to
percolate out of the window before things start going fast again.

Signed-off-by: Brian Downing <bdowning@lavos.net>
---
 builtin-pack-objects.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 54b9d26..132ce96 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1342,6 +1342,8 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 	sizediff = src_size < trg_size ? trg_size - src_size : 0;
 	if (sizediff >= max_size)
 		return 0;
+	if (trg_size < src_size / 32)
+		return 0;
 
 	/* Load data if not already done */
 	if (!trg->data) {
-- 
1.5.2.GIT


* [PATCH 2/5] Support fetching the memory usage of a delta index
  2007-07-12  3:14 [PATCH 0/5] Memory-limited pack-object window support Brian Downing
  2007-07-12  3:14 ` [PATCH 1/5] Don't try to delta if target is much smaller than source Brian Downing
@ 2007-07-12  3:14 ` Brian Downing
  2007-07-12  3:14 ` [PATCH 3/5] Add pack-objects window memory usage limit Brian Downing
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Brian Downing @ 2007-07-12  3:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Brian Downing

Delta indexes, at least on 64-bit platforms, tend to be larger than
the actual uncompressed data.  As such, keeping track of this storage
is important if you want to successfully limit the memory size of your
pack window.

Squirrel away the total allocation size inside the delta_index struct,
and add an accessor "sizeof_delta_index" to access it.

Signed-off-by: Brian Downing <bdowning@lavos.net>
---
 delta.h      |    7 +++++++
 diff-delta.c |   10 ++++++++++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/delta.h b/delta.h
index 7b3f86d..40ccf5a 100644
--- a/delta.h
+++ b/delta.h
@@ -24,6 +24,13 @@ create_delta_index(const void *buf, unsigned long bufsize);
 extern void free_delta_index(struct delta_index *index);
 
 /*
+ * sizeof_delta_index: returns memory usage of delta index
+ *
+ * Given pointer must be what create_delta_index() returned, or NULL.
+ */
+extern unsigned long sizeof_delta_index(struct delta_index *index);
+
+/*
  * create_delta: create a delta from given index for the given buffer
  *
  * This function may be called multiple times with different buffers using
diff --git a/diff-delta.c b/diff-delta.c
index faf96e4..3af5835 100644
--- a/diff-delta.c
+++ b/diff-delta.c
@@ -119,6 +119,7 @@ struct index_entry {
 };
 
 struct delta_index {
+	unsigned long memsize;
 	const void *src_buf;
 	unsigned long src_size;
 	unsigned int hash_mask;
@@ -159,6 +160,7 @@ struct delta_index * create_delta_index(const void *buf, unsigned long bufsize)
 	mem = hash + hsize;
 	entry = mem;
 
+	index->memsize = memsize;
 	index->src_buf = buf;
 	index->src_size = bufsize;
 	index->hash_mask = hmask;
@@ -228,6 +230,14 @@ void free_delta_index(struct delta_index *index)
 	free(index);
 }
 
+unsigned long sizeof_delta_index(struct delta_index *index)
+{
+	if (index)
+		return index->memsize;
+	else
+		return 0;
+}
+
 /*
  * The maximum size for any opcode sequence, including the initial header
  * plus Rabin window plus biggest copy.
-- 
1.5.2.GIT


* [PATCH 3/5] Add pack-objects window memory usage limit
  2007-07-12  3:14 [PATCH 0/5] Memory-limited pack-object window support Brian Downing
  2007-07-12  3:14 ` [PATCH 1/5] Don't try to delta if target is much smaller than source Brian Downing
  2007-07-12  3:14 ` [PATCH 2/5] Support fetching the memory usage of a delta index Brian Downing
@ 2007-07-12  3:14 ` Brian Downing
  2007-07-12  4:25   ` Nicolas Pitre
  2007-07-12  3:14 ` [PATCH 4/5] Add --window-bytes option to git-repack Brian Downing
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Brian Downing @ 2007-07-12  3:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Brian Downing

This adds an option (--window-bytes=N) and configuration variable
(pack.windowBytes = N) to limit the memory size of the pack-objects
delta search window.  This works by removing the oldest unpacked objects
whenever the total size goes above the limit.  It will always leave
at least one object, though, so as not to completely eliminate the
possibility of computing deltas.

This is an extra limit on top of the normal window size (--window=N);
the window will not dynamically grow above the fixed number of entries
specified to fill the memory limit.

With this, repacking a repository with a mix of large and small objects
is possible even with a very large window.

Signed-off-by: Brian Downing <bdowning@lavos.net>
---
 builtin-pack-objects.c |   56 ++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 132ce96..6e441b7 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -16,8 +16,9 @@
 #include "progress.h"
 
 static const char pack_usage[] = "\
-git-pack-objects [{ -q | --progress | --all-progress }] [--max-pack-size=N] \n\
-	[--local] [--incremental] [--window=N] [--depth=N] \n\
+git-pack-objects [{ -q | --progress | --all-progress }] \n\
+	[--max-pack-size=N] [--local] [--incremental] \n\
+	[--window=N] [--window-bytes=N] [--depth=N] \n\
 	[--no-reuse-delta] [--no-reuse-object] [--delta-base-offset] \n\
 	[--non-empty] [--revs [--unpacked | --all]*] [--reflog] \n\
 	[--stdout | base-name] [<ref-list | <object-list]";
@@ -79,6 +80,9 @@ static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 0;
 static unsigned long cache_max_small_delta_size = 1000;
 
+static unsigned long window_memory_usage = 0;
+static unsigned long window_memory_limit = 0;
+
 /*
  * The object names in objects array are hashed with this hashtable,
  * to help looking up the entry by object name.
@@ -1351,12 +1355,14 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 		if (sz != trg_size)
 			die("object %s inconsistent object length (%lu vs %lu)",
 			    sha1_to_hex(trg_entry->idx.sha1), sz, trg_size);
+		window_memory_usage += sz;
 	}
 	if (!src->data) {
 		src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz);
 		if (sz != src_size)
 			die("object %s inconsistent object length (%lu vs %lu)",
 			    sha1_to_hex(src_entry->idx.sha1), sz, src_size);
+		window_memory_usage += sz;
 	}
 	if (!src->index) {
 		src->index = create_delta_index(src->data, src_size);
@@ -1366,6 +1372,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 				warning("suboptimal pack - out of memory");
 			return 0;
 		}
+		window_memory_usage += sizeof_delta_index(src->index);
 	}
 
 	delta_buf = create_delta(src->index, trg->data, trg_size, &delta_size, max_size);
@@ -1408,9 +1415,22 @@ static unsigned int check_delta_limit(struct object_entry *me, unsigned int n)
 	return m;
 }
 
+static void free_unpacked(struct unpacked *n)
+{
+	window_memory_usage -= sizeof_delta_index(n->index);
+	free_delta_index(n->index);
+	n->index = NULL;
+	if (n->data) {
+		free(n->data);
+		n->data = NULL;
+		window_memory_usage -= n->entry->size;
+	}
+	n->entry = NULL;
+}
+
 static void find_deltas(struct object_entry **list, int window, int depth)
 {
-	uint32_t i = nr_objects, idx = 0, processed = 0;
+	uint32_t i = nr_objects, idx = 0, count = 0, processed = 0;
 	unsigned int array_size = window * sizeof(struct unpacked);
 	struct unpacked *array;
 	int max_depth;
@@ -1445,12 +1465,21 @@ static void find_deltas(struct object_entry **list, int window, int depth)
 		if (entry->no_try_delta)
 			continue;
 
-		free_delta_index(n->index);
-		n->index = NULL;
-		free(n->data);
-		n->data = NULL;
+		free_unpacked(n);
 		n->entry = entry;
 
+		while (window_memory_limit &&
+		       window_memory_usage > window_memory_limit &&
+		       count > 1) {
+			uint32_t tail = idx - count;
+			if (tail > idx) {
+				tail += window + 1;
+				tail %= window;
+			}
+			free_unpacked(array + tail);
+			count--;
+		}
+
 		/*
 		 * If the current object is at pack edge, take the depth the
 		 * objects that depend on the current object into account
@@ -1485,6 +1514,8 @@ static void find_deltas(struct object_entry **list, int window, int depth)
 
 		next:
 		idx++;
+		if (count < window)
+			count++;
 		if (idx >= window)
 			idx = 0;
 	} while (i > 0);
@@ -1523,6 +1554,10 @@ static int git_pack_config(const char *k, const char *v)
 		window = git_config_int(k, v);
 		return 0;
 	}
+	if(!strcmp(k, "pack.windowbytes")) {
+		window_memory_limit = git_config_int(k, v);
+		return 0;
+	}
 	if(!strcmp(k, "pack.depth")) {
 		depth = git_config_int(k, v);
 		return 0;
@@ -1699,6 +1734,13 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 				usage(pack_usage);
 			continue;
 		}
+		if (!prefixcmp(arg, "--window-bytes=")) {
+			char *end;
+			window_memory_limit = strtoul(arg+15, &end, 0);
+			if (!arg[15] || *end)
+				usage(pack_usage);
+			continue;
+		}
 		if (!prefixcmp(arg, "--depth=")) {
 			char *end;
 			depth = strtoul(arg+8, &end, 0);
-- 
1.5.2.GIT


* [PATCH 4/5] Add --window-bytes option to git-repack
  2007-07-12  3:14 [PATCH 0/5] Memory-limited pack-object window support Brian Downing
                   ` (2 preceding siblings ...)
  2007-07-12  3:14 ` [PATCH 3/5] Add pack-objects window memory usage limit Brian Downing
@ 2007-07-12  3:14 ` Brian Downing
  2007-07-12  3:14 ` [PATCH 5/5] Add documentation for --window-bytes, pack.windowBytes Brian Downing
  2007-07-12  4:38 ` [PATCH 0/5] Memory-limited pack-object window support Nicolas Pitre
  5 siblings, 0 replies; 10+ messages in thread
From: Brian Downing @ 2007-07-12  3:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Brian Downing

Signed-off-by: Brian Downing <bdowning@lavos.net>
---
 git-repack.sh |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index b5c6671..4cff812 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -3,7 +3,7 @@
 # Copyright (c) 2005 Linus Torvalds
 #
 
-USAGE='[-a] [-d] [-f] [-l] [-n] [-q] [--max-pack-size=N] [--window=N] [--depth=N]'
+USAGE='[-a] [-d] [-f] [-l] [-n] [-q] [--max-pack-size=N] [--window=N] [--window-bytes=N] [--depth=N]'
 SUBDIRECTORY_OK='Yes'
 . git-sh-setup
 
@@ -20,6 +20,7 @@ do
 	-l)	local=--local ;;
 	--max-pack-size=*) extra="$extra $1" ;;
 	--window=*) extra="$extra $1" ;;
+	--window-bytes=*) extra="$extra $1" ;;
 	--depth=*) extra="$extra $1" ;;
 	*)	usage ;;
 	esac
-- 
1.5.2.GIT


* [PATCH 5/5] Add documentation for --window-bytes, pack.windowBytes
  2007-07-12  3:14 [PATCH 0/5] Memory-limited pack-object window support Brian Downing
                   ` (3 preceding siblings ...)
  2007-07-12  3:14 ` [PATCH 4/5] Add --window-bytes option to git-repack Brian Downing
@ 2007-07-12  3:14 ` Brian Downing
  2007-07-12  4:35   ` Nicolas Pitre
  2007-07-12  4:38 ` [PATCH 0/5] Memory-limited pack-object window support Nicolas Pitre
  5 siblings, 1 reply; 10+ messages in thread
From: Brian Downing @ 2007-07-12  3:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Brian Downing

Signed-off-by: Brian Downing <bdowning@lavos.net>
---
 Documentation/config.txt           |    5 +++++
 Documentation/git-pack-objects.txt |    8 ++++++++
 Documentation/git-repack.txt       |    8 ++++++++
 3 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index aeece84..83c7dc1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -592,6 +592,11 @@ pack.depth::
 	The maximum delta depth used by gitlink:git-pack-objects[1] when no
 	maximum depth is given on the command line. Defaults to 50.
 
+pack.windowBytes::
+	This option provides an additional limit on top of `pack.window`;
+	the window size will dynamically scale down so as to not take
+	up more than N bytes in memory.
+
 pack.compression::
 	An integer -1..9, indicating the compression level for objects
 	in a pack file. -1 is the zlib default. 0 means no
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index e3549b5..21ed198 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -85,6 +85,14 @@ base-name::
 	times to get to the necessary object.
 	The default value for --window is 10 and --depth is 50.
 
+--window-bytes=[N]::
+	This option provides an additional limit on top of `--window`;
+	the window size will dynamically scale down so as to not take
+	up more than N bytes in memory.  This is useful in
+	repositories with a mix of large and small objects to not run
+	out of memory with a large window, but still be able to take
+	advantage of the large window for the smaller objects.
+
 --max-pack-size=<n>::
 	Maximum size of each output packfile, expressed in MiB.
 	If specified,  multiple packfiles may be created.
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 2894939..805d930 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -68,6 +68,14 @@ OPTIONS
 	to be applied that many times to get to the necessary object.
 	The default value for --window is 10 and --depth is 50.
 
+--window-bytes=[N]::
+	This option provides an additional limit on top of `--window`;
+	the window size will dynamically scale down so as to not take
+	up more than N bytes in memory.  This is useful in
+	repositories with a mix of large and small objects to not run
+	out of memory with a large window, but still be able to take
+	advantage of the large window for the smaller objects.
+
 --max-pack-size=<n>::
 	Maximum size of each output packfile, expressed in MiB.
 	If specified,  multiple packfiles may be created.
-- 
1.5.2.GIT


* Re: [PATCH 3/5] Add pack-objects window memory usage limit
  2007-07-12  3:14 ` [PATCH 3/5] Add pack-objects window memory usage limit Brian Downing
@ 2007-07-12  4:25   ` Nicolas Pitre
  2007-07-12 10:02     ` Brian Downing
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Pitre @ 2007-07-12  4:25 UTC (permalink / raw)
  To: Brian Downing; +Cc: git, Junio C Hamano

On Wed, 11 Jul 2007, Brian Downing wrote:

> +		while (window_memory_limit &&
> +		       window_memory_usage > window_memory_limit &&
> +		       count > 1) {
> +			uint32_t tail = idx - count;
> +			if (tail > idx) {
> +				tail += window + 1;
> +				tail %= window;
> +			}
> +			free_unpacked(array + tail);
> +			count--;
> +		}

This is bogus.  Suppose window = 10 and only array entries 8, 9, 0, 1 
and 2 are populated.  In that case idx = 2 and count should be 4 (not 
counting the current entry yet).  You want to evict entry 8.

	-- tail = 2 - 4 = -2 (or a big uint32_t value)

	-- tail > idx is true

	-- tail += window + 1 -> -2 + 10 + 1 = 9

	-- tail %= window is useless

	-- you free entry 9 instead of entry 8.

Instead, you should do:

	tail = idx - count;
	if (tail > idx)
		tail += window;

or even:

	tail = (idx + window - count) % window;
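
The arithmetic above can be checked with a short standalone sketch.  The
function names tail_as_posted and tail_modular are hypothetical wrappers
around the two computations; nothing else from pack-objects is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* The computation from the posted patch, unchanged. */
static uint32_t tail_as_posted(uint32_t idx, uint32_t count, uint32_t window)
{
	uint32_t tail = idx - count;	/* wraps around when count > idx */
	if (tail > idx) {
		tail += window + 1;
		tail %= window;
	}
	return tail;
}

/* The single-expression alternative: no wraparound test needed. */
static uint32_t tail_modular(uint32_t idx, uint32_t count, uint32_t window)
{
	assert(window > 0 && idx < window && count <= window);
	return (idx + window - count) % window;
}
```

With window = 10, idx = 2 and count = 4, tail_as_posted() yields 9 and
tail_modular() yields 8.  If count is 5 instead (that is, it already
includes the current slot), tail_as_posted() also yields 8.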

>  		next:
>  		idx++;
> +		if (count < window)
> +			count++;

And of course you want:

	if (count + 1 < window)
		count++;

So as not to count the new entry when the window gets full.


Nicolas


* Re: [PATCH 5/5] Add documentation for --window-bytes, pack.windowBytes
  2007-07-12  3:14 ` [PATCH 5/5] Add documentation for --window-bytes, pack.windowBytes Brian Downing
@ 2007-07-12  4:35   ` Nicolas Pitre
  0 siblings, 0 replies; 10+ messages in thread
From: Nicolas Pitre @ 2007-07-12  4:35 UTC (permalink / raw)
  To: Brian Downing; +Cc: git, Junio C Hamano

On Wed, 11 Jul 2007, Brian Downing wrote:

> Signed-off-by: Brian Downing <bdowning@lavos.net>
> ---
>  Documentation/config.txt           |    5 +++++
>  Documentation/git-pack-objects.txt |    8 ++++++++
>  Documentation/git-repack.txt       |    8 ++++++++
>  3 files changed, 21 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index aeece84..83c7dc1 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -592,6 +592,11 @@ pack.depth::
>  	The maximum delta depth used by gitlink:git-pack-objects[1] when no
>  	maximum depth is given on the command line. Defaults to 50.
>  
> +pack.windowBytes::
> +	This option provides an additional limit on top of `pack.window`;
> +	the window size will dynamically scale down so as to not take
> +	up more than N bytes in memory.
> +

This doesn't say what the default (unlimited) is.

>  pack.compression::
>  	An integer -1..9, indicating the compression level for objects
>  	in a pack file. -1 is the zlib default. 0 means no
> diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
> index e3549b5..21ed198 100644
> --- a/Documentation/git-pack-objects.txt
> +++ b/Documentation/git-pack-objects.txt
> @@ -85,6 +85,14 @@ base-name::
>  	times to get to the necessary object.
>  	The default value for --window is 10 and --depth is 50.
>  
> +--window-bytes=[N]::
> +	This option provides an additional limit on top of `--window`;
> +	the window size will dynamically scale down so as to not take
> +	up more than N bytes in memory.  This is useful in
> +	repositories with a mix of large and small objects to not run
> +	out of memory with a large window, but still be able to take
> +	advantage of the large window for the smaller objects.

Ditto here.

Also it is a bit awkward to specify a size in bytes when you probably 
want to specify a limit which is in the megabyte range.  I'd call them 
--window_mem and pack.windowmemory, and allow for unit suffixes of 'k', 
'm', or 'g' to be supported if not already.
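
A minimal sketch of what such suffix handling could look like follows.
The function parse_mem_limit is hypothetical and is not an existing Git
helper; it assumes binary multiples (1k = 1024) and decimal input, whereas
the posted patch calls strtoul() with base 0.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not an existing Git function: parse values such
 * as "512m" into bytes.  Returns 0 on success, -1 on malformed input or
 * overflow.
 */
static int parse_mem_limit(const char *s, unsigned long *out)
{
	char *end;
	unsigned long val, factor = 1;

	assert(s && out);
	if (!isdigit((unsigned char)*s))
		return -1;	/* rejects empty strings, signs, leading spaces */
	errno = 0;
	val = strtoul(s, &end, 10);
	if (errno == ERANGE)
		return -1;

	switch (tolower((unsigned char)*end)) {
	case 'k': factor = 1024UL; end++; break;
	case 'm': factor = 1024UL * 1024; end++; break;
	case 'g': factor = 1024UL * 1024 * 1024; end++; break;
	}
	if (*end)
		return -1;	/* trailing garbage after the number or suffix */
	if (val > ULONG_MAX / factor)
		return -1;	/* product would overflow unsigned long */
	*out = val * factor;
	return 0;
}
```

With this sketch, "512m" parses to 536870912 bytes and a bare "4096" is
taken as a byte count, while "12x" and "1mb" are rejected.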


Nicolas


* Re: [PATCH 0/5] Memory-limited pack-object window support
  2007-07-12  3:14 [PATCH 0/5] Memory-limited pack-object window support Brian Downing
                   ` (4 preceding siblings ...)
  2007-07-12  3:14 ` [PATCH 5/5] Add documentation for --window-bytes, pack.windowBytes Brian Downing
@ 2007-07-12  4:38 ` Nicolas Pitre
  5 siblings, 0 replies; 10+ messages in thread
From: Nicolas Pitre @ 2007-07-12  4:38 UTC (permalink / raw)
  To: Brian Downing; +Cc: git, Junio C Hamano

On Wed, 11 Jul 2007, Brian Downing wrote:

> This patch series implements a memory limit on the window size for
> pack-objects and repack.  Basically, the window will temporarily
> shrink below what the --window option specifies whenever its total
> memory usage exceeds the specified limit.

Besides the small issues I've pointed out already, I think this is a 
very good thing.


Nicolas


* Re: [PATCH 3/5] Add pack-objects window memory usage limit
  2007-07-12  4:25   ` Nicolas Pitre
@ 2007-07-12 10:02     ` Brian Downing
  0 siblings, 0 replies; 10+ messages in thread
From: Brian Downing @ 2007-07-12 10:02 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git, Junio C Hamano

On Thu, Jul 12, 2007 at 12:25:54AM -0400, Nicolas Pitre wrote:
> On Wed, 11 Jul 2007, Brian Downing wrote:
> 
> > +		while (window_memory_limit &&
> > +		       window_memory_usage > window_memory_limit &&
> > +		       count > 1) {
> > +			uint32_t tail = idx - count;
> > +			if (tail > idx) {
> > +				tail += window + 1;
> > +				tail %= window;
> > +			}
> > +			free_unpacked(array + tail);
> > +			count--;
> > +		}
> 
> This is bogus.  Suppose window = 10 and only array entries 8, 9, 0, 1 
> and 2 are populated.  In that case idx = 2 and count should be 4 (not 
> counting the current entry yet).  You want to evict entry 8.

The current idx has already been depopulated by the time that code is
run, and count is probably one higher than you are expecting, so this
does actually work.

However, looking at it again, I think that if the window hasn't been
saturated yet, count in my current code will be what you expect in this
situation, and it will screw up as you describe.

Besides, it is admittedly clumsy as hell (a common affliction when
dealing with circular buffers for me it seems).  I'll see if I can get
something better that works.

Thanks,
-bcd

