* [PATCH 1/3] fix multiple issues with t5300 @ 2010-02-04 3:48 Nicolas Pitre 2010-02-04 3:48 ` [PATCH 2/3] pack-objects: fix pack generation when using pack_size_limit Nicolas Pitre 2010-02-04 3:48 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes Nicolas Pitre 0 siblings, 2 replies; 16+ messages in thread From: Nicolas Pitre @ 2010-02-04 3:48 UTC (permalink / raw) To: Junio C Hamano; +Cc: git First of all, trying to run 'git verify-pack' on packs produced by the tests using pack.packSizeLimit always failed. After lots of digging and head scratching, it turns out that the preceeding test simulating a SHA1 collision did leave the repository quite confused, impacting subsequent tests. So let's move that destructive test last, and add tests to run verify-pack on the output from those packSizeLimit tests to catch such goofage. Finally, group those packSizeLimit tests together. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> --- t/t5300-pack-object.sh | 54 ++++++++++++++++++++++++++++++----------------- 1 files changed, 34 insertions(+), 20 deletions(-) diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index e2aa254..ac81114 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -280,26 +280,8 @@ test_expect_success \ :' -test_expect_success \ - 'fake a SHA1 hash collision' \ - 'test -f .git/objects/c8/2de19312b6c3695c0c18f70709a6c535682a67 && - cp -f .git/objects/9d/235ed07cd19811a6ceb342de82f190e49c9f68 \ - .git/objects/c8/2de19312b6c3695c0c18f70709a6c535682a67' - -test_expect_success \ - 'make sure index-pack detects the SHA1 collision' \ - 'test_must_fail git index-pack -o bad.idx test-3.pack 2>msg && - grep "SHA1 COLLISION FOUND" msg' - -test_expect_success \ - 'honor pack.packSizeLimit' \ - 'git config pack.packSizeLimit 200 && - packname_4=$(git pack-objects test-4 <obj-list) && - test 3 = $(ls test-4-*.pack | wc -l)' - test_expect_success 'unpacking with --strict' ' - git config --unset pack.packsizelimit && for j in a b c d e f g do for i in 0 1 2 3 4 5 6 7 8 9 @@ -392,10 +374,42 @@ test_expect_success 'index-pack with --strict' ' ) ' +test_expect_success 'honor pack.packSizeLimit' ' + git config pack.packSizeLimit 200 && + packname_10=$(git pack-objects test-10 <obj-list) && + test 3 = $(ls test-10-*.pack | wc -l) +' + +test_expect_success 'verify resulting packs' ' + git verify-pack test-10-*.pack +' + test_expect_success 'tolerate absurdly small packsizelimit' ' git config pack.packSizeLimit 2 && - packname_9=$(git pack-objects test-9 <obj-list) && - test $(wc -l <obj-list) = $(ls test-9-*.pack | wc -l) + packname_11=$(git pack-objects test-11 <obj-list) && + test $(wc -l <obj-list) = $(ls test-11-*.pack | wc -l) ' +test_expect_success 'verify resulting packs' ' + git verify-pack test-11-*.pack +' + +# +# WARNING! +# +# The following test is destructive. Please keep the next +# two tests at the end of this file. +# + +test_expect_success \ + 'fake a SHA1 hash collision' \ + 'test -f .git/objects/c8/2de19312b6c3695c0c18f70709a6c535682a67 && + cp -f .git/objects/9d/235ed07cd19811a6ceb342de82f190e49c9f68 \ + .git/objects/c8/2de19312b6c3695c0c18f70709a6c535682a67' + +test_expect_success \ + 'make sure index-pack detects the SHA1 collision' \ + 'test_must_fail git index-pack -o bad.idx test-3.pack 2>msg && + grep "SHA1 COLLISION FOUND" msg' + test_done -- 1.7.0.rc1.149.g0b0b7 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/3] pack-objects: fix pack generation when using pack_size_limit 2010-02-04 3:48 [PATCH 1/3] fix multiple issues with t5300 Nicolas Pitre @ 2010-02-04 3:48 ` Nicolas Pitre 2010-02-04 3:48 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes Nicolas Pitre 1 sibling, 0 replies; 16+ messages in thread From: Nicolas Pitre @ 2010-02-04 3:48 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Current handling of pack_size_limit is quite suboptimal. Let's consider a list of objects to pack which contain alternatively big and small objects (which pretty matches reality when big blobs are interlaced with tree objects). Currently, the code simply close the pack and opens a new one when the next object in line breaks the size limit. The current code may degenerate into: - small tree object => store into pack #1 - big blob object busting the pack size limit => store into pack #2 - small blob but pack #2 is over the limit already => pack #3 - big blob busting the size limit => pack #4 - small tree but pack #4 is over the limit => pack #5 - big blob => pack #6 - small tree => pack #7 - ... and so on. The reality is that the content of packs 1, 3, 5 and 7 could well be stored more efficiently (and delta compressed) together in pack #1 if the big blobs were not forcing an immediate transition to a new pack. Incidentally this can be fixed pretty easily by simply skipping over those objects that are too big to fit in the current pack while trying the whole list of unwritten objects, and then that list considered from the beginning again when a new pack is opened. This creates much fewer smallish pack files and help making more predictable test cases for the test suite. This change made one of the self sanity checks useless so it is removed as well. That check was rather redundant already anyway. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> --- builtin-pack-objects.c | 37 +++++++++++++------------------------ 1 files changed, 13 insertions(+), 24 deletions(-) diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index 4a41547..3186035 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -246,7 +246,7 @@ static unsigned long write_object(struct sha1file *f, type = entry->type; - /* write limit if limited packsize and not first object */ + /* apply size limit if limited packsize and not first object */ if (!pack_size_limit || !nr_written) limit = 0; else if (pack_size_limit <= write_offset) @@ -443,11 +443,15 @@ static int write_one(struct sha1file *f, /* offset is non zero if object is written already. */ if (e->idx.offset || e->preferred_base) - return 1; + return -1; - /* if we are deltified, write out base object first. */ - if (e->delta && !write_one(f, e->delta, offset)) - return 0; + /* + * If we are deltified, attempt to write out base object first. + * If that fails due to the pack size limit then the current + * object might still possibly fit undeltified within that limit. + */ + if (e->delta) + write_one(f, e->delta, offset); e->idx.offset = *offset; size = write_object(f, e, *offset); @@ -501,11 +505,9 @@ static void write_pack_file(void) sha1write(f, &hdr, sizeof(hdr)); offset = sizeof(hdr); nr_written = 0; - for (; i < nr_objects; i++) { - if (!write_one(f, objects + i, &offset)) - break; - display_progress(progress_state, written); - } + for (i = 0; i < nr_objects; i++) + if (write_one(f, objects + i, &offset) == 1) + display_progress(progress_state, written); /* * Did we write the wrong # entries in the header? @@ -580,26 +582,13 @@ static void write_pack_file(void) written_list[j]->offset = (off_t)-1; } nr_remaining -= nr_written; - } while (nr_remaining && i < nr_objects); + } while (nr_remaining); free(written_list); stop_progress(&progress_state); if (written != nr_result) die("wrote %"PRIu32" objects while expecting %"PRIu32, written, nr_result); - /* - * We have scanned through [0 ... i). Since we have written - * the correct number of objects, the remaining [i ... nr_objects) - * items must be either already written (due to out-of-order delta base) - * or a preferred base. Count those which are neither and complain if any. - */ - for (j = 0; i < nr_objects; i++) { - struct object_entry *e = objects + i; - j += !e->idx.offset && !e->preferred_base; - } - if (j) - die("wrote %"PRIu32" objects as expected but %"PRIu32 - " unwritten", written, j); } static int locate_object_entry_hash(const unsigned char *sha1) -- 1.7.0.rc1.149.g0b0b7 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 3:48 [PATCH 1/3] fix multiple issues with t5300 Nicolas Pitre 2010-02-04 3:48 ` [PATCH 2/3] pack-objects: fix pack generation when using pack_size_limit Nicolas Pitre @ 2010-02-04 3:48 ` Nicolas Pitre 2010-02-04 4:00 ` Shawn O. Pearce 1 sibling, 1 reply; 16+ messages in thread From: Nicolas Pitre @ 2010-02-04 3:48 UTC (permalink / raw) To: Junio C Hamano; +Cc: git The value passed to --max-pack-size used to count in MiB which was inconsistent with the corresponding configuration variable as well as other command arguments which are defined to count in bytes with an optional unit suffix. This brings --max-pack-size in line with the rest of Git. Also, in order not to cause havoc with people used to the previous megabyte scale, and because this is a sane thing to do anyway, a minimum size of 1 MiB is enforced to avoid an explosion of pack files. Adjust and extend test suite accordingly. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> --- Documentation/RelNotes-1.7.0.txt | 8 +++++++- Documentation/config.txt | 11 +++++++---- Documentation/git-pack-objects.txt | 5 +++-- Documentation/git-repack.txt | 8 +++++--- builtin-pack-objects.c | 11 ++++++----- t/t5300-pack-object.sh | 14 ++++++++------ 6 files changed, 36 insertions(+), 21 deletions(-) diff --git a/Documentation/RelNotes-1.7.0.txt b/Documentation/RelNotes-1.7.0.txt index 323ae54..e66945c 100644 --- a/Documentation/RelNotes-1.7.0.txt +++ b/Documentation/RelNotes-1.7.0.txt @@ -38,7 +38,7 @@ Notes on behaviour change whitespaces is reported with zero exit status when run with --exit-code, and there is no "diff --git" header for such a change. - * external diff and textconv helpers are now executed using the shell. + * External diff and textconv helpers are now executed using the shell. This makes them consistent with other programs executed by git, and allows you to pass command-line parameters to the helpers. Any helper paths containing spaces or other metacharacters now need to be @@ -46,6 +46,12 @@ Notes on behaviour change environment, and diff.*.command and diff.*.textconv in the config file. + * The --max-pack-size argument to 'git repack' and 'git pack-objects' was + assuming the provided size to be expressed in MiB, unlike the + corresponding config variable and other similar options accepting a size + value. It is now expecting a size expressed in bytes, with a possible + unit suffix of 'k', 'm', or 'g'. + Updates since v1.6.6 -------------------- diff --git a/Documentation/config.txt b/Documentation/config.txt index 36ad992..4c36aa9 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1368,10 +1368,13 @@ you can use linkgit:git-index-pack[1] on the *.pack file to regenerate the `{asterisk}.idx` file. pack.packSizeLimit:: - The default maximum size of a pack. This setting only affects - packing to a file, i.e. the git:// protocol is unaffected. It - can be overridden by the `\--max-pack-size` option of - linkgit:git-repack[1]. + The maximum size of a pack. This setting only affects + packing to a file when repacking, i.e. the git:// protocol + is unaffected. It can be overridden by the `\--max-pack-size` + option of linkgit:git-repack[1]. The minimum size allowed is + limited to 1 MiB. The default is unlimited. + Common unit suffixes of 'k', 'm', or 'g' are + supported. pager.<cmd>:: Allows turning on or off pagination of the output of a diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index 097a147..ffd5025 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -105,8 +105,9 @@ base-name:: `--window-memory=0` makes memory usage unlimited, which is the default. ---max-pack-size=<n>:: - Maximum size of each output packfile, expressed in MiB. +--max-pack-size=[N]:: + Maximum size of each output pack file. The size can be suffixed with + "k", "m", or "g". The minimum size allowed is limited to 1 MiB. If specified, multiple packfiles may be created. The default is unlimited, unless the config variable `pack.packSizeLimit` is set. diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 538895c..e2f2fa2 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -98,10 +98,12 @@ other objects in that pack they already have locally. `--window-memory=0` makes memory usage unlimited, which is the default. ---max-pack-size=<n>:: - Maximum size of each output packfile, expressed in MiB. +--max-pack-size=[N]:: + Maximum size of each output pack file. The size can be suffixed with + "k", "m", or "g". The minimum size allowed is limited to 1 MiB. If specified, multiple packfiles may be created. - The default is unlimited. + The default is unlimited, unless the config variable + `pack.packSizeLimit` is set. Configuration diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index 3186035..dcfe62a 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -77,7 +77,7 @@ static int allow_ofs_delta; static const char *base_name; static int progress = 1; static int window = 10; -static uint32_t pack_size_limit, pack_size_limit_cfg; +static unsigned long pack_size_limit, pack_size_limit_cfg; static int depth = 50; static int delta_search_threads; static int pack_to_stdout; @@ -2192,10 +2192,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) continue; } if (!prefixcmp(arg, "--max-pack-size=")) { - char *end; pack_size_limit_cfg = 0; - pack_size_limit = strtoul(arg+16, &end, 0) * 1024 * 1024; - if (!arg[16] || *end) + if (!git_parse_ulong(arg+16, &pack_size_limit)) usage(pack_usage); continue; } @@ -2335,9 +2333,12 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) if (!pack_to_stdout && !pack_size_limit) pack_size_limit = pack_size_limit_cfg; - if (pack_to_stdout && pack_size_limit) die("--max-pack-size cannot be used to build a pack for transfer."); + if (pack_size_limit && pack_size_limit < 1024*1024) { + warning("minimum pack size limit is 1 MiB"); + pack_size_limit = 1024*1024; + } if (!pack_to_stdout && thin) die("--thin cannot be used to build an indexable pack."); diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index ac81114..c80f3a3 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -16,7 +16,9 @@ test_expect_success \ perl -e "print \"a\" x 4096;" > a && perl -e "print \"b\" x 4096;" > b && perl -e "print \"c\" x 4096;" > c && - git update-index --add a b c && + test-genrandom "seed a" 2097152 > a_big && + test-genrandom "seed b" 2097152 > b_big && + git update-index --add a a_big b b_big c && cat c >d && echo foo >>d && git update-index --add d && tree=`git write-tree` && commit=`git commit-tree $tree </dev/null` && { @@ -375,19 +377,19 @@ test_expect_success 'index-pack with --strict' ' ' test_expect_success 'honor pack.packSizeLimit' ' - git config pack.packSizeLimit 200 && + git config pack.packSizeLimit 3m && packname_10=$(git pack-objects test-10 <obj-list) && - test 3 = $(ls test-10-*.pack | wc -l) + test 2 = $(ls test-10-*.pack | wc -l) ' test_expect_success 'verify resulting packs' ' git verify-pack test-10-*.pack ' -test_expect_success 'tolerate absurdly small packsizelimit' ' - git config pack.packSizeLimit 2 && +test_expect_success 'tolerate packsizelimit smaller than biggest object' ' + git config pack.packSizeLimit 1 && packname_11=$(git pack-objects test-11 <obj-list) && - test $(wc -l <obj-list) = $(ls test-11-*.pack | wc -l) + test 3 = $(ls test-11-*.pack | wc -l) ' test_expect_success 'verify resulting packs' ' -- 1.7.0.rc1.149.g0b0b7 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 3:48 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes Nicolas Pitre @ 2010-02-04 4:00 ` Shawn O. Pearce 2010-02-04 4:38 ` Junio C Hamano 0 siblings, 1 reply; 16+ messages in thread From: Shawn O. Pearce @ 2010-02-04 4:00 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git Nicolas Pitre <nico@fluxnic.net> wrote: > The value passed to --max-pack-size used to count in MiB which was > inconsistent with the corresponding configuration variable as well as > other command arguments which are defined to count in bytes with an > optional unit suffix. This brings --max-pack-size in line with the > rest of Git. ... > Documentation/RelNotes-1.7.0.txt | 8 +++++++- > Documentation/config.txt | 11 +++++++---- > Documentation/git-pack-objects.txt | 5 +++-- > Documentation/git-repack.txt | 8 +++++--- > builtin-pack-objects.c | 11 ++++++----- > t/t5300-pack-object.sh | 14 ++++++++------ > 6 files changed, 36 insertions(+), 21 deletions(-) Shouldn't we also change fast-import.c ? -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 4:00 ` Shawn O. Pearce @ 2010-02-04 4:38 ` Junio C Hamano 2010-02-04 17:11 ` Junio C Hamano 0 siblings, 1 reply; 16+ messages in thread From: Junio C Hamano @ 2010-02-04 4:38 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git "Shawn O. Pearce" <spearce@spearce.org> writes: > Nicolas Pitre <nico@fluxnic.net> wrote: >> The value passed to --max-pack-size used to count in MiB which was >> inconsistent with the corresponding configuration variable as well as >> other command arguments which are defined to count in bytes with an >> optional unit suffix. This brings --max-pack-size in line with the >> rest of Git. > ... >> Documentation/RelNotes-1.7.0.txt | 8 +++++++- >> Documentation/config.txt | 11 +++++++---- >> Documentation/git-pack-objects.txt | 5 +++-- >> Documentation/git-repack.txt | 8 +++++--- >> builtin-pack-objects.c | 11 ++++++----- >> t/t5300-pack-object.sh | 14 ++++++++------ >> 6 files changed, 36 insertions(+), 21 deletions(-) > > Shouldn't we also change fast-import.c ? Surely; could you do the honors? I cannot really decide how big the deal would be to break backward compatibility for max-pack-size myself. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 4:38 ` Junio C Hamano @ 2010-02-04 17:11 ` Junio C Hamano 2010-02-04 17:24 ` Shawn O. Pearce 2010-02-04 17:49 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count " Nicolas Pitre 0 siblings, 2 replies; 16+ messages in thread From: Junio C Hamano @ 2010-02-04 17:11 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Nicolas Pitre, git Junio C Hamano <gitster@pobox.com> writes: > "Shawn O. Pearce" <spearce@spearce.org> writes: > ... >> Shouldn't we also change fast-import.c ? > > Surely; could you do the honors? I cannot really decide how big the deal > would be to break backward compatibility for max-pack-size myself. I'll queue the "fix mismerge" one I got "Yup, looks good to me." from you, with this: commit 76ea93ccb5df138eb57b2e8f2aee61dd1ca666ea Author: Junio C Hamano <gitster@pobox.com> Date: Wed Feb 3 18:27:08 2010 -0800 fast-import.c: Fix big-file-threshold parsing bug Manual merge made at 844ad3d (Merge branch 'sp/maint-fast-import-large-blob' into sp/fast-import-large-blob, 2010-02-01) did not correctly reflect the change of unit in which this variable's value is counted from its previous version. Now it counts in bytes, not in megabytes. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> I am debating myself if we want to have another one (see below) on top; I am not a heavy user of fast-import and I don't know if existing users will be hurt by such a change and if so how much. -- >8 -- Subject: [PATCH] fast-import: count --max-pack-size in bytes Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git pack-object' count in bytes, 2010-02-03) which made the option by the same name to pack-objects, this counts the pack size limit in bytes. In order not to cause havoc with people used to the previous megabyte scale, and because this is a sane thing to do anyway, a minimum size of 1 MiB is enforced to avoid an explosion of pack files. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Documentation/RelNotes-1.7.0.txt | 8 ++++---- Documentation/git-fast-import.txt | 4 ++-- fast-import.c | 14 ++++++++------ 3 files changed, 14 insertions(+), 12 deletions(-) diff --git a/Documentation/RelNotes-1.7.0.txt b/Documentation/RelNotes-1.7.0.txt index e66945c..255666f 100644 --- a/Documentation/RelNotes-1.7.0.txt +++ b/Documentation/RelNotes-1.7.0.txt @@ -46,10 +46,10 @@ Notes on behaviour change environment, and diff.*.command and diff.*.textconv in the config file. - * The --max-pack-size argument to 'git repack' and 'git pack-objects' was - assuming the provided size to be expressed in MiB, unlike the - corresponding config variable and other similar options accepting a size - value. It is now expecting a size expressed in bytes, with a possible + * The --max-pack-size argument to 'git repack', 'git pack-objects', and + 'git fast-import' was assuming the provided size to be expressed in MiB, + unlike the corresponding config variable and other similar options accepting + a size value. It is now expecting a size expressed in bytes, with a possible unit suffix of 'k', 'm', or 'g'. Updates since v1.6.6 diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index 2691114..6764ff1 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -44,8 +44,8 @@ OPTIONS not contain the old commit). --max-pack-size=<n>:: - Maximum size of each output packfile, expressed in MiB. - The default is 4096 (4 GiB) as that is the maximum allowed + Maximum size of each output packfile. + The default is 4 GiB as that is the maximum allowed packfile size (due to file format limitations). Some importers may wish to lower this, such as to ensure the resulting packfiles fit on CDs. diff --git a/fast-import.c b/fast-import.c index a6730d0..09c0817 100644 --- a/fast-import.c +++ b/fast-import.c @@ -2764,11 +2764,6 @@ static void option_date_format(const char *fmt) die("unknown --date-format argument %s", fmt); } -static void option_max_pack_size(const char *packsize) -{ - max_packsize = strtoumax(packsize, NULL, 0) * 1024 * 1024; -} - static void option_depth(const char *depth) { max_depth = strtoul(depth, NULL, 0); @@ -2798,7 +2793,14 @@ static void option_export_pack_edges(const char *edges) static int parse_one_option(const char *option) { if (!prefixcmp(option, "max-pack-size=")) { - option_max_pack_size(option + 14); + unsigned long v; + if (!git_parse_ulong(option + 14, &v)) + return 0; + if (v < 1024 * 1024) { + warning("minimum max-pack-size is 1 MiB"); + v = 1024 * 1024; + } + max_packsize = v; } else if (!prefixcmp(option, "big-file-threshold=")) { unsigned long v; if (!git_parse_ulong(option + 19, &v)) -- 1.7.0.rc1.199.g9253ab ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 17:11 ` Junio C Hamano @ 2010-02-04 17:24 ` Shawn O. Pearce 2010-02-04 17:58 ` Nicolas Pitre 2010-02-04 17:49 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count " Nicolas Pitre 1 sibling, 1 reply; 16+ messages in thread From: Shawn O. Pearce @ 2010-02-04 17:24 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nicolas Pitre, git Junio C Hamano <gitster@pobox.com> wrote: > Subject: [PATCH] fast-import: count --max-pack-size in bytes > if (!prefixcmp(option, "max-pack-size=")) { > - option_max_pack_size(option + 14); > + unsigned long v; > + if (!git_parse_ulong(option + 14, &v)) > + return 0; > + if (v < 1024 * 1024) { > + warning("minimum max-pack-size is 1 MiB"); > + v = 1024 * 1024; > + } > + max_packsize = v; How about for a transition period we do: if (v < 8192) { warning("max-pack-size is now in bytes, assuming %dm", v); v *= 1024 * 1024; } So that existing users won't be completely broken if they are relying on this flag, and have some time to adjust. Given the huge magnitude between the old sane value range, and the new sane value range, we can safely assume anything below a small number like 8192 is an old user, warn them, and assume old behavior. A local pack smaller than 1 MiB is mostly pointless coming out of a tool like git repack or git fast-import, unless its a complete copy of the repository. So the old style calling convention of 4096 for 4 GiB would now imply a pack so small, we probably can't get more than 1 object per pack. -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 17:24 ` Shawn O. Pearce @ 2010-02-04 17:58 ` Nicolas Pitre 2010-02-04 17:59 ` Shawn O. Pearce 0 siblings, 1 reply; 16+ messages in thread From: Nicolas Pitre @ 2010-02-04 17:58 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Junio C Hamano, git On Thu, 4 Feb 2010, Shawn O. Pearce wrote: > How about for a transition period we do: > > if (v < 8192) { > warning("max-pack-size is now in bytes, assuming %dm", v); > v *= 1024 * 1024; > } > > So that existing users won't be completely broken if they are > relying on this flag, and have some time to adjust. For 'git fast-import' which is not meant to be directly user operated this makes sense. I don't think this is a good idea for 'git repack' though. In the later case it is best if the user simply adjust right away. Also the warning text above could be less ambigous. Something like: "max-pack-size is now in bytes, assuming --max-pack-size=%dm" Nicolas ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 17:58 ` Nicolas Pitre @ 2010-02-04 17:59 ` Shawn O. Pearce 2010-02-04 19:10 ` [PATCH] fast-import: count --max-pack-size " Junio C Hamano 0 siblings, 1 reply; 16+ messages in thread From: Shawn O. Pearce @ 2010-02-04 17:59 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git Nicolas Pitre <nico@fluxnic.net> wrote: > On Thu, 4 Feb 2010, Shawn O. Pearce wrote: > > > How about for a transition period we do: > > > > if (v < 8192) { > > warning("max-pack-size is now in bytes, assuming %dm", v); > > v *= 1024 * 1024; > > } > > > > So that existing users won't be completely broken if they are > > relying on this flag, and have some time to adjust. > > For 'git fast-import' which is not meant to be directly user operated > this makes sense. I don't think this is a good idea for 'git repack' > though. In the later case it is best if the user simply adjust right > away. > > Also the warning text above could be less ambigous. Something like: > > "max-pack-size is now in bytes, assuming --max-pack-size=%dm" ACK on the fixed warning text. I lacked enough coffee at the time to write intelligent text. -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] fast-import: count --max-pack-size in bytes 2010-02-04 17:59 ` Shawn O. Pearce @ 2010-02-04 19:10 ` Junio C Hamano 2010-02-04 19:13 ` Shawn O. Pearce 2010-02-04 20:03 ` Ilari Liusvaara 0 siblings, 2 replies; 16+ messages in thread From: Junio C Hamano @ 2010-02-04 19:10 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Nicolas Pitre, git Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git pack-object' count in bytes, 2010-02-03) which made the option by the same name to pack-objects, this counts the pack size limit in bytes. In order not to cause havoc with people used to the previous megabyte scale an integer smaller than 8092 is interpreted in megabytes but the user gets a warning. Also a minimum size of 1 MiB is enforced to avoid an explosion of pack files. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Acked-by: Nicolas Pitre <nico@fluxnic.net> --- Ok, third-time lucky? Knock wood... Documentation/RelNotes-1.7.0.txt | 8 ++++---- Documentation/git-fast-import.txt | 4 ++-- fast-import.c | 17 +++++++++++------ 3 files changed, 17 insertions(+), 12 deletions(-) diff --git a/Documentation/RelNotes-1.7.0.txt b/Documentation/RelNotes-1.7.0.txt index e66945c..255666f 100644 --- a/Documentation/RelNotes-1.7.0.txt +++ b/Documentation/RelNotes-1.7.0.txt @@ -46,10 +46,10 @@ Notes on behaviour change environment, and diff.*.command and diff.*.textconv in the config file. - * The --max-pack-size argument to 'git repack' and 'git pack-objects' was - assuming the provided size to be expressed in MiB, unlike the - corresponding config variable and other similar options accepting a size - value. It is now expecting a size expressed in bytes, with a possible + * The --max-pack-size argument to 'git repack', 'git pack-objects', and + 'git fast-import' was assuming the provided size to be expressed in MiB, + unlike the corresponding config variable and other similar options accepting + a size value. It is now expecting a size expressed in bytes, with a possible unit suffix of 'k', 'm', or 'g'. Updates since v1.6.6 diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index 2691114..6764ff1 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -44,8 +44,8 @@ OPTIONS not contain the old commit). --max-pack-size=<n>:: - Maximum size of each output packfile, expressed in MiB. - The default is 4096 (4 GiB) as that is the maximum allowed + Maximum size of each output packfile. + The default is 4 GiB as that is the maximum allowed packfile size (due to file format limitations). Some importers may wish to lower this, such as to ensure the resulting packfiles fit on CDs. diff --git a/fast-import.c b/fast-import.c index a6730d0..b477dc6 100644 --- a/fast-import.c +++ b/fast-import.c @@ -2764,11 +2764,6 @@ static void option_date_format(const char *fmt) die("unknown --date-format argument %s", fmt); } -static void option_max_pack_size(const char *packsize) -{ - max_packsize = strtoumax(packsize, NULL, 0) * 1024 * 1024; -} - static void option_depth(const char *depth) { max_depth = strtoul(depth, NULL, 0); @@ -2798,7 +2793,17 @@ static void option_export_pack_edges(const char *edges) static int parse_one_option(const char *option) { if (!prefixcmp(option, "max-pack-size=")) { - option_max_pack_size(option + 14); + unsigned long v; + if (!git_parse_ulong(option + 14, &v)) + return 0; + if (v < 8192) { + warning("max-pack-size is now in bytes, assuming --max-pack-size=%lum", v); + v *= 1024 * 1024; + } else if (v < 1024 * 1024) { + warning("minimum max-pack-size is 1 MiB"); + v = 1024 * 1024; + } + max_packsize = v; } else if (!prefixcmp(option, "big-file-threshold=")) { unsigned long v; if (!git_parse_ulong(option + 19, &v)) -- 1.7.0.rc1.199.g9253ab ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] fast-import: count --max-pack-size in bytes 2010-02-04 19:10 ` [PATCH] fast-import: count --max-pack-size " Junio C Hamano @ 2010-02-04 19:13 ` Shawn O. Pearce 2010-02-04 20:03 ` Ilari Liusvaara 1 sibling, 0 replies; 16+ messages in thread From: Shawn O. Pearce @ 2010-02-04 19:13 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nicolas Pitre, git Junio C Hamano <gitster@pobox.com> wrote: > Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git > pack-object' count in bytes, 2010-02-03) which made the option by the same > name to pack-objects, this counts the pack size limit in bytes. > > In order not to cause havoc with people used to the previous megabyte > scale an integer smaller than 8092 is interpreted in megabytes but the > user gets a warning. Also a minimum size of 1 MiB is enforced to avoid an > explosion of pack files. > > Signed-off-by: Junio C Hamano <gitster@pobox.com> > Acked-by: Shawn O. Pearce <spearce@spearce.org> > Acked-by: Nicolas Pitre <nico@fluxnic.net> > --- > > Ok, third-time lucky? Knock wood... Yea, I like it. Thanks Junio. -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] fast-import: count --max-pack-size in bytes 2010-02-04 19:10 ` [PATCH] fast-import: count --max-pack-size " Junio C Hamano 2010-02-04 19:13 ` Shawn O. Pearce @ 2010-02-04 20:03 ` Ilari Liusvaara 1 sibling, 0 replies; 16+ messages in thread From: Ilari Liusvaara @ 2010-02-04 20:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Shawn O. Pearce, Nicolas Pitre, git On Thu, Feb 04, 2010 at 11:10:44AM -0800, Junio C Hamano wrote: > Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git > pack-object' count in bytes, 2010-02-03) which made the option by the same > name to pack-objects, this counts the pack size limit in bytes. > > In order not to cause havoc with people used to the previous megabyte > scale an integer smaller than 8092 is interpreted in megabytes but the Typo? Shouldn't it be 8192? > user gets a warning. Also a minimum size of 1 MiB is enforced to avoid an > explosion of pack files. -Ilari ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 17:11 ` Junio C Hamano 2010-02-04 17:24 ` Shawn O. Pearce @ 2010-02-04 17:49 ` Nicolas Pitre 2010-02-04 18:00 ` Shawn O. Pearce 1 sibling, 1 reply; 16+ messages in thread From: Nicolas Pitre @ 2010-02-04 17:49 UTC (permalink / raw) To: Junio C Hamano; +Cc: Shawn O. Pearce, git On Thu, 4 Feb 2010, Junio C Hamano wrote: > -- >8 -- > Subject: [PATCH] fast-import: count --max-pack-size in bytes > > Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git > pack-object' count in bytes, 2010-02-03) which made the option by the same > name to pack-objects, this counts the pack size limit in bytes. > > In order not to cause havoc with people used to the previous megabyte > scale, and because this is a sane thing to do anyway, a minimum size of 1 > MiB is enforced to avoid an explosion of pack files. > > Signed-off-by: Junio C Hamano <gitster@pobox.com> ACK. > --max-pack-size=<n>:: > - Maximum size of each output packfile, expressed in MiB. > - The default is 4096 (4 GiB) as that is the maximum allowed > + Maximum size of each output packfile. > + The default is 4 GiB as that is the maximum allowed > packfile size (due to file format limitations). Some > importers may wish to lower this, such as to ensure the > resulting packfiles fit on CDs. What file format limitation is alluded to here? It has been a while since the 4GB limit on pack file format has been removed. If this is a limitation of fast-import only then maybe this should be explained more explicitly. Nicolas ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 17:49 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count " Nicolas Pitre @ 2010-02-04 18:00 ` Shawn O. Pearce 2010-02-04 18:11 ` Nicolas Pitre 0 siblings, 1 reply; 16+ messages in thread From: Shawn O. Pearce @ 2010-02-04 18:00 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git Nicolas Pitre <nico@fluxnic.net> wrote: > > --max-pack-size=<n>:: > > - Maximum size of each output packfile, expressed in MiB. > > - The default is 4096 (4 GiB) as that is the maximum allowed > > + Maximum size of each output packfile. > > + The default is 4 GiB as that is the maximum allowed > > packfile size (due to file format limitations). Some > > importers may wish to lower this, such as to ensure the > > resulting packfiles fit on CDs. > > What file format limitation is alluded to here? It has been a while > since the 4GB limit on pack file format has been removed. The pack index v1 32 bit offset thing. Which you fixed. > If this is a > limitation of fast-import only then maybe this should be explained more > explicitly. Damn. It is. fast-import can't write a v2 index. Ugh. -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 18:00 ` Shawn O. Pearce @ 2010-02-04 18:11 ` Nicolas Pitre 2010-02-04 21:20 ` Shawn O. Pearce 0 siblings, 1 reply; 16+ messages in thread From: Nicolas Pitre @ 2010-02-04 18:11 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Junio C Hamano, git On Thu, 4 Feb 2010, Shawn O. Pearce wrote: > Nicolas Pitre <nico@fluxnic.net> wrote: > > > --max-pack-size=<n>:: > > > - Maximum size of each output packfile, expressed in MiB. > > > - The default is 4096 (4 GiB) as that is the maximum allowed > > > + Maximum size of each output packfile. > > > + The default is 4 GiB as that is the maximum allowed > > > packfile size (due to file format limitations). Some > > > importers may wish to lower this, such as to ensure the > > > resulting packfiles fit on CDs. > > > > What file format limitation is alluded to here? It has been a while > > since the 4GB limit on pack file format has been removed. > > The pack index v1 32 bit offset thing. Which you fixed. > > > If this is a > > limitation of fast-import only then maybe this should be explained more > > explicitly. > > Damn. It is. fast-import can't write a v2 index. Ugh. Isn't it using write_idx_file()? That function would do it all for you already. Nicolas ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes 2010-02-04 18:11 ` Nicolas Pitre @ 2010-02-04 21:20 ` Shawn O. Pearce 0 siblings, 0 replies; 16+ messages in thread From: Shawn O. Pearce @ 2010-02-04 21:20 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git Nicolas Pitre <nico@fluxnic.net> wrote: > On Thu, 4 Feb 2010, Shawn O. Pearce wrote: > > Nicolas Pitre <nico@fluxnic.net> wrote: > > > > --max-pack-size=<n>:: > > > > - Maximum size of each output packfile, expressed in MiB. > > > > - The default is 4096 (4 GiB) as that is the maximum allowed > > > > + Maximum size of each output packfile. > > > > + The default is 4 GiB as that is the maximum allowed > > > > packfile size (due to file format limitations). Some > > > > importers may wish to lower this, such as to ensure the > > > > resulting packfiles fit on CDs. > > > > > > What file format limitation is alluded to here? It has been a while > > > since the 4GB limit on pack file format has been removed. > > > > The pack index v1 32 bit offset thing. Which you fixed. > > > > > If this is a > > > limitation of fast-import only then maybe this should be explained more > > > explicitly. > > > > Damn. It is. fast-import can't write a v2 index. Ugh. > > Isn't it using write_idx_file()? That function would do it all for you > already. Nope. Its still using its own code. I never ported it to write_idx_file(). And I don't have the CRC32 data available for each object (it failed to compute/save it as it wrote). I mean to port to write_idx_file() but didn't... :-( -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2010-02-04 21:20 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-02-04 3:48 [PATCH 1/3] fix multiple issues with t5300 Nicolas Pitre 2010-02-04 3:48 ` [PATCH 2/3] pack-objects: fix pack generation when using pack_size_limit Nicolas Pitre 2010-02-04 3:48 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count in bytes Nicolas Pitre 2010-02-04 4:00 ` Shawn O. Pearce 2010-02-04 4:38 ` Junio C Hamano 2010-02-04 17:11 ` Junio C Hamano 2010-02-04 17:24 ` Shawn O. Pearce 2010-02-04 17:58 ` Nicolas Pitre 2010-02-04 17:59 ` Shawn O. Pearce 2010-02-04 19:10 ` [PATCH] fast-import: count --max-pack-size " Junio C Hamano 2010-02-04 19:13 ` Shawn O. Pearce 2010-02-04 20:03 ` Ilari Liusvaara 2010-02-04 17:49 ` [PATCH 3/3] make --max-pack-size argument to 'git pack-object' count " Nicolas Pitre 2010-02-04 18:00 ` Shawn O. Pearce 2010-02-04 18:11 ` Nicolas Pitre 2010-02-04 21:20 ` Shawn O. Pearce
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).