git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* pack.packSizeLimit and --max-pack-size not working?
@ 2008-11-12 15:12 Jon Nelson
  2008-11-12 16:17 ` [PATCH] fix pack.packSizeLimit and --max-pack-size handling Nicolas Pitre
  0 siblings, 1 reply; 7+ messages in thread
From: Jon Nelson @ 2008-11-12 15:12 UTC (permalink / raw)
  To: git

I'm using 1.6.0.4 and I've found some weird behavior with
pack.packSizeLimit and/or --max-pack-size.

Initially, I thought I could just use pack.packSizeLimit and set it to
(say) 1 to try to limit the size of individual packfiles to 1MiB or
less. That does not appear to be working.

In one case I performed the following set of commands:

# set pack.packSizeLimit to 20
git config --global pack.packSizeLimit 20

# verify that it's 20
git config --get pack.packSizeLimit # verify it's 20

# run gc --prune
git gc --prune

# show the packfiles
# I find a *single* 65MB packfile, not a series
# of 20MB (or less) packfiles.
ls -la .git/objects/pack/*.pack

# try repack -ad
git repack -ad

# I find a *single* 65MB packfile, not a series
# of 20MB (or less) packfiles.
ls -la .git/objects/pack/*.pack


So it would appear that the pack.packSizeLimit param
is just being ignored??

Then I tested using --max-pack-size explicitly. This works, to a degree.

git repack -ad --max-pack-size 20

# the following shows *4* pack files none larger
# than (about) 20MB
ls -la .git/objects/pack/*.pack

# try again with 3MB. This also works.
git repack -ad --max-pack-size 3
find .git/objects/pack -name '*.pack' -size +3M -ls # nothing

# try again with 1MB. This does NOT work.
git repack -ad --max-pack-size 1

# here, I find a *single* 65MB pack file again:
find .git/objects/pack -name '*.pack' -size +1M -ls

Am I doing something completely wrong with pack.packSizeLimit?
What is going on with --max-pack-size in the 1MB case?


-- 
Jon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] fix pack.packSizeLimit and --max-pack-size handling
  2008-11-12 15:12 pack.packSizeLimit and --max-pack-size not working? Jon Nelson
@ 2008-11-12 16:17 ` Nicolas Pitre
  2008-11-12 17:46   ` Junio C Hamano
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Nicolas Pitre @ 2008-11-12 16:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jon Nelson, git

First, pack.packSizeLimit and --max-pack-size didn't use the same base
unit which was confusing.  They both use MiB now.

Also, if the limit was sufficiently low, having a single object written
could bust the limit (by design), but caused the remaining allowed size 
to go negative for subsequent objects, which for an unsigned variable is 
a rather huge limit.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---

On Wed, 12 Nov 2008, Jon Nelson wrote:

> I'm using 1.6.0.4 and I've found some weird behavior with
> pack.packSizeLimit and/or --max-pack-size.
> 
> Initially, I thought I could just use pack.packSizeLimit and set it to
> (say) 1 to try to limit the size of individual packfiles to 1MiB or
> less. That does not appear to be working.
> 
> In one case I performed the following set of commands:
> 
> # set pack.packSizeLimit to 20
> git config --global pack.packSizeLimit 20
> 
> # verify that it's 20
> git config --get pack.packSizeLimit # verify it's 20
> 
> # run gc --prune
> git gc --prune
> 
> # show the packfiles
> # I find a *single* 65MB packfile, not a series
> # of 20MB (or less) packfiles.
> ls -la .git/objects/pack/*.pack
> 
> # try repack -ad
> git repack -ad
> 
> # I find a *single* 65MB packfile, not a series
> # of 20MB (or less) packfiles.
> ls -la .git/objects/pack/*.pack
> 
> 
> So it would appear that the pack.packSizeLimit param
> is just being ignored??
> 
> Then I tested using --max-pack-size explicitly. This works, to a degree.
> 
> git repack -ad --max-pack-size 20
> 
> # the following shows *4* pack files none larger
> # than (about) 20MB
> ls -la .git/objects/pack/*.pack
> 
> # try again with 3MB. This also works.
> git repack -ad --max-pack-size 3
> find .git/objects/pack -name '*.pack' -size +3M -ls # nothing
> 
> # try again with 1MB. This does NOT work.
> git repack -ad --max-pack-size 1
> 
> # here, I find a *single* 65MB pack file again:
> find .git/objects/pack -name '*.pack' -size +1M -ls
> 
> Am I doing something completely wrong with pack.packSizeLimit?
> What is going on with --max-pack-size in the 1MB case?

Does this fix it for you?

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 32dcd64..e7808b8 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1036,9 +1036,9 @@ you can use linkgit:git-index-pack[1] on the *.pack file to regenerate
 the `{asterisk}.idx` file.
 
 pack.packSizeLimit::
-	The default maximum size of a pack.  This setting only affects
-	packing to a file, i.e. the git:// protocol is unaffected.  It
-	can be overridden by the `\--max-pack-size` option of
+	The default maximum size of a pack, expressed in MiB.  This
+	setting only affects packing to a file, i.e. the git:// protocol is
+	unaffected. It can be overridden by the `\--max-pack-size` option of
 	linkgit:git-repack[1].
 
 pager.<cmd>::
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 0c4649c..fdee9c6 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -245,8 +245,12 @@ static unsigned long write_object(struct sha1file *f,
 	type = entry->type;
 
 	/* write limit if limited packsize and not first object */
-	limit = pack_size_limit && nr_written ?
-			pack_size_limit - write_offset : 0;
+	if (!pack_size_limit || !nr_written)
+		limit = 0;
+	else if (pack_size_limit <= write_offset)
+		limit = 1;
+	else
+		limit = pack_size_limit - write_offset;
 
 	if (!entry->delta)
 		usable_delta = 0;	/* no delta */
@@ -1844,7 +1848,7 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		return 0;
 	}
 	if (!strcmp(k, "pack.packsizelimit")) {
-		pack_size_limit_cfg = git_config_ulong(k, v);
+		pack_size_limit_cfg = git_config_ulong(k, v) * 1024 * 1024;
 		return 0;
 	}
 	return git_default_config(k, v, cb);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] fix pack.packSizeLimit and --max-pack-size handling
  2008-11-12 16:17 ` [PATCH] fix pack.packSizeLimit and --max-pack-size handling Nicolas Pitre
@ 2008-11-12 17:46   ` Junio C Hamano
  2008-11-12 18:23     ` Nicolas Pitre
  2008-11-12 17:58   ` Jon Nelson
  2008-11-12 21:42   ` Nanako Shiraishi
  2 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2008-11-12 17:46 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jon Nelson, git

Nicolas Pitre <nico@cam.org> writes:

> First, pack.packSizeLimit and --max-pack-size didn't use the same base
> unit which was confusing.  They both use MiB now.
>
> Also, if the limit was sufficiently low, having a single object written
> could bust the limit (by design), but caused the remaining allowed size 
> to go negative for subsequent objects, which for an unsigned variable is 
> a rather huge limit.
>
> Signed-off-by: Nicolas Pitre <nico@cam.org>
> ---

> @@ -1844,7 +1848,7 @@ static int git_pack_config(const char *k, const char *v, void *cb)
>  		return 0;
>  	}
>  	if (!strcmp(k, "pack.packsizelimit")) {
> -		pack_size_limit_cfg = git_config_ulong(k, v);
> +		pack_size_limit_cfg = git_config_ulong(k, v) * 1024 * 1024;

The fix to tweak the limit for subsequent split pack is a good thing to
have, but this change would break existing repositories where people
specified 20971520 (or worse yet "20m") to limit the size to 20MB.

I think --max-pack-size is what should be fixed to use git_parse_ulong()
to match the configuration, if you find the discrepancy disturbing.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fix pack.packSizeLimit and --max-pack-size handling
  2008-11-12 16:17 ` [PATCH] fix pack.packSizeLimit and --max-pack-size handling Nicolas Pitre
  2008-11-12 17:46   ` Junio C Hamano
@ 2008-11-12 17:58   ` Jon Nelson
  2008-11-12 21:42   ` Nanako Shiraishi
  2 siblings, 0 replies; 7+ messages in thread
From: Jon Nelson @ 2008-11-12 17:58 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

On Wed, Nov 12, 2008 at 10:17 AM, Nicolas Pitre <nico@cam.org> wrote:
> First, pack.packSizeLimit and --max-pack-size didn't use the same base
> unit which was confusing.  They both use MiB now.
>
> Also, if the limit was sufficiently low, having a single object written
> could bust the limit (by design), but caused the remaining allowed size
> to go negative for subsequent objects, which for an unsigned variable is
> a rather huge limit.
>
> Signed-off-by: Nicolas Pitre <nico@cam.org>

The patch does appear to resolve the issue!

-- 
Jon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] fix pack.packSizeLimit and --max-pack-size handling
  2008-11-12 17:46   ` Junio C Hamano
@ 2008-11-12 18:23     ` Nicolas Pitre
  2008-11-12 21:23       ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Nicolas Pitre @ 2008-11-12 18:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jon Nelson, git

First, pack.packSizeLimit and --max-pack-size didn't use the same base
unit which was confusing.  They both use suffixable value arguments now.

Also, if the limit was sufficiently low, having a single object written
could bust the limit (by design), but caused the remaining allowed size
to go negative for subsequent objects, which for an unsigned variable is
a rather huge limit.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---

On Wed, 12 Nov 2008, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > First, pack.packSizeLimit and --max-pack-size didn't use the same base
> > unit which was confusing.  They both use MiB now.
> >
> > Also, if the limit was sufficiently low, having a single object written
> > could bust the limit (by design), but caused the remaining allowed size 
> > to go negative for subsequent objects, which for an unsigned variable is 
> > a rather huge limit.
> >
> > Signed-off-by: Nicolas Pitre <nico@cam.org>
> > ---
> 
> > @@ -1844,7 +1848,7 @@ static int git_pack_config(const char *k, const char *v, void *cb)
> >  		return 0;
> >  	}
> >  	if (!strcmp(k, "pack.packsizelimit")) {
> > -		pack_size_limit_cfg = git_config_ulong(k, v);
> > +		pack_size_limit_cfg = git_config_ulong(k, v) * 1024 * 1024;
> 
> The fix to tweak the limit for subsequent split pack is a good thing to
> have, but this change would break existing repositories where people
> specified 20971520 (or worse yet "20m") to limit the size to 20MB.
> 
> I think --max-pack-size is what should be fixed to use git_parse_ulong()
> to match the configuration, if you find the discrepancy disturbing.

Fair enough.

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 1c4fb43..d60409a 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -104,10 +104,11 @@ base-name::
 	default.
 
 --max-pack-size=<n>::
-	Maximum size of each output packfile, expressed in MiB.
+	Maximum size of each output packfile.
 	If specified,  multiple packfiles may be created.
 	The default is unlimited, unless the config variable
-	`pack.packSizeLimit` is set.
+	`pack.packSizeLimit` is set. The size can be suffixed with
+	"k", "m", or "g".
 
 --incremental::
 	This flag causes an object already in a pack ignored
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 0c4649c..9c4333b 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -75,7 +75,7 @@ static int allow_ofs_delta;
 static const char *base_name;
 static int progress = 1;
 static int window = 10;
-static uint32_t pack_size_limit, pack_size_limit_cfg;
+static unsigned long pack_size_limit, pack_size_limit_cfg;
 static int depth = 50;
 static int delta_search_threads = 1;
 static int pack_to_stdout;
@@ -245,8 +245,12 @@ static unsigned long write_object(struct sha1file *f,
 	type = entry->type;
 
 	/* write limit if limited packsize and not first object */
-	limit = pack_size_limit && nr_written ?
-			pack_size_limit - write_offset : 0;
+	if (!pack_size_limit || !nr_written)
+		limit = 0;
+	else if (pack_size_limit <= write_offset)
+		limit = 1;
+	else
+		limit = pack_size_limit - write_offset;
 
 	if (!entry->delta)
 		usable_delta = 0;	/* no delta */
@@ -2103,11 +2107,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 		if (!prefixcmp(arg, "--max-pack-size=")) {
-			char *end;
-			pack_size_limit_cfg = 0;
-			pack_size_limit = strtoul(arg+16, &end, 0) * 1024 * 1024;
-			if (!arg[16] || *end)
+			if (!git_parse_ulong(arg+16, &pack_size_limit))
 				usage(pack_usage);
+			else
+				pack_size_limit_cfg = 0;
 			continue;
 		}
 		if (!prefixcmp(arg, "--window=")) {

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] fix pack.packSizeLimit and --max-pack-size handling
  2008-11-12 18:23     ` Nicolas Pitre
@ 2008-11-12 21:23       ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2008-11-12 21:23 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jon Nelson, git

Nicolas Pitre <nico@cam.org> writes:

> @@ -2103,11 +2107,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>  			continue;
>  		}
>  		if (!prefixcmp(arg, "--max-pack-size=")) {
> -			char *end;
> -			pack_size_limit_cfg = 0;
> -			pack_size_limit = strtoul(arg+16, &end, 0) * 1024 * 1024;
> -			if (!arg[16] || *end)
> +			if (!git_parse_ulong(arg+16, &pack_size_limit))
>  				usage(pack_usage);
> +			else
> +				pack_size_limit_cfg = 0;
>  			continue;
>  		}
>  		if (!prefixcmp(arg, "--window=")) {

Hmm, an unrelated funniness here is why the code even needs to futz with
the value read from the configuration.  Later in the code we have:

	if (!pack_to_stdout && !pack_size_limit)
		pack_size_limit = pack_size_limit_cfg;

and the intent seems to be:

 - if there is nothing on the command line, use config;
 - if there is something on the command line, ignore config.

But if you have a configured limit and would want to override from the
command line, this won't work.

I will apply the wraparound avoidance as a separate "pure fix" patch to
'maint' first.

Besides, --max-pack-size has been defined as megabytes for the entirety of
its existence, and I am a bit worried about changing the semantics at this
point without any warning.  I realize that I am worried too much for
people with a script that give an explicit --max-pack-size=1 to obtain a
set of split packs that would fit on floppies, who probably would not
exist ;-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fix pack.packSizeLimit and --max-pack-size handling
  2008-11-12 16:17 ` [PATCH] fix pack.packSizeLimit and --max-pack-size handling Nicolas Pitre
  2008-11-12 17:46   ` Junio C Hamano
  2008-11-12 17:58   ` Jon Nelson
@ 2008-11-12 21:42   ` Nanako Shiraishi
  2 siblings, 0 replies; 7+ messages in thread
From: Nanako Shiraishi @ 2008-11-12 21:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, Jon Nelson, git

Quoting Junio C Hamano <gitster@pobox.com>:

> I think --max-pack-size is what should be fixed to use git_parse_ulong()
> to match the configuration, if you find the discrepancy disturbing.

But doesn't that break existing scripts and habits?

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-11-12 21:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-12 15:12 pack.packSizeLimit and --max-pack-size not working? Jon Nelson
2008-11-12 16:17 ` [PATCH] fix pack.packSizeLimit and --max-pack-size handling Nicolas Pitre
2008-11-12 17:46   ` Junio C Hamano
2008-11-12 18:23     ` Nicolas Pitre
2008-11-12 21:23       ` Junio C Hamano
2008-11-12 17:58   ` Jon Nelson
2008-11-12 21:42   ` Nanako Shiraishi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).