From: Jiang Xin <worldhello.net@gmail.com>
To: "Junio C Hamano" <gitster@pobox.com>,
"Git List" <git@vger.kernel.org>,
"SZEDER Gábor" <szeder.dev@gmail.com>
Cc: Jiang Xin <worldhello.net@gmail.com>, Sun Chao <sunchao9@huawei.com>
Subject: [PATCH v6 0/5] pack-redundant: new algorithm to find min packs
Date: Sat, 12 Jan 2019 17:17:49 +0800 [thread overview]
Message-ID: <20190112091754.30985-1-worldhello.net@gmail.com> (raw)
In-Reply-To: <20190110120142.22271-1-worldhello.net@gmail.com>
> Sun Chao (my former colleague at Huawei) found a bug of
> git-pack-redundant. If there are too many packs and many of them
> overlap each other, running `git pack-redundant --all` will
> exhaust all memories and the process will be killed by kernel.
>
> There is a script in commit log of commit 2/5, which can be used to
> create a repository with lots of redundant packs. Running `git
> pack-redundant --all` in it can reproduce this issue.
Junio C Hamano <gitster@pobox.com> 于2019年1月12日周六 上午2:00写道:
> >> Yikes. Can't "git pack-objects" get the input directly without
> >> overlong printf, something along the lines of...
> >>
> >> P1=$(git -C .git/objects/pack pack-objects pack <<-EOF
> >> $A
> >> $B
> >> $C
> >> ...
> >> $R
> >> EOF
> >> )
> >
> > Find that no space before <OID>, because git-pack-objects not allow that,
> > and mached parentheses should in the same line.
> > So Will write like this:
> >
> > create_pack_1() {
> > P1=$(git -C .git/objects/pack pack-objects pack <<-EOF) &&
> > $T
>
> Isn't the whole point of <<-EOF (notice the leading dash) to allow
> us to indent the here-doc with horizontal tab?
The reason that indents are not stripped even with `<<-EOF` is I mixed
tabs and spaces to make a better align.
If put the heredoc outside the parentheses, it will failed on MacOS, so
use the syntax Junio previously suggested.
SZEDER Gábor <szeder.dev@gmail.com> 于2019年1月11日周五 上午9:19写道:
> I see that the last patch in this series removes those three
> unused functions, but that patch should be squashed into this one to
> keep Git buildable with '-Werror' or DEVELOPER=1.
>
> Furthermore, after building this patch (without '-Werror'), several
> tests in 't5323-pack-redundant.sh' fail. To avoid the test failure I
> think the fourth patch ensuring a consistent sort order should be
> squashed in as well.
Patch 3/5 to 5/5 can be squashed to patch 2/5.
## Changes since reroll v5
1: 40fea5d67f ! 1: 7e4e703083 t5323: test cases for git-pack-redundant
@@ -22,8 +22,7 @@
+
+. ./test-lib.sh
+
-+create_commits()
-+{
++create_commits() {
+ parent=
+ for name in A B C D E F G H I J K L M N O P Q R
+ do
@@ -39,54 +38,98 @@
+ parent=$oid ||
+ return 1
+ done
-+ git update-ref refs/heads/master $M
++ git update-ref refs/heads/master $R
+}
+
-+create_pack_1()
-+{
-+ P1=$(cd .git/objects/pack; printf "$T\n$A\n$B\n$C\n$D\n$E\n$F\n$R\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_1() {
++ P1=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $T
++ $A
++ $B
++ $C
++ $D
++ $E
++ $F
++ $R
++ EOF
++ ) &&
+ eval P$P1=P1:$P1
+}
+
-+create_pack_2()
-+{
-+ P2=$(cd .git/objects/pack; printf "$B\n$C\n$D\n$E\n$G\n$H\n$I\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_2() {
++ P2=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $B
++ $C
++ $D
++ $E
++ $G
++ $H
++ $I
++ EOF
++ ) &&
+ eval P$P2=P2:$P2
+}
+
-+create_pack_3()
-+{
-+ P3=$(cd .git/objects/pack; printf "$F\n$I\n$J\n$K\n$L\n$M\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_3() {
++ P3=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $F
++ $I
++ $J
++ $K
++ $L
++ $M
++ EOF
++ ) &&
+ eval P$P3=P3:$P3
+}
+
-+create_pack_4()
-+{
-+ P4=$(cd .git/objects/pack; printf "$J\n$K\n$L\n$M\n$P\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_4() {
++ P4=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $J
++ $K
++ $L
++ $M
++ $P
++ EOF
++ ) &&
+ eval P$P4=P4:$P4
+}
+
-+create_pack_5()
-+{
-+ P5=$(cd .git/objects/pack; printf "$G\n$H\n$N\n$O\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_5() {
++ P5=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $G
++ $H
++ $N
++ $O
++ EOF
++ ) &&
+ eval P$P5=P5:$P5
+}
+
-+create_pack_6()
-+{
-+ P6=$(cd .git/objects/pack; printf "$N\n$O\n$Q\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_6() {
++ P6=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $N
++ $O
++ $Q
++ EOF
++ ) &&
+ eval P$P6=P6:$P6
+}
+
-+create_pack_7()
-+{
-+ P7=$(cd .git/objects/pack; printf "$P\n$Q\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_7() {
++ P7=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $P
++ $Q
++ EOF
++ ) &&
+ eval P$P7=P7:$P7
+}
+
-+create_pack_8()
-+{
-+ P8=$(cd .git/objects/pack; printf "$A\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_8() {
++ P8=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $A
++ EOF
++ ) &&
+ eval P$P8=P8:$P8
+}
+
@@ -110,10 +153,12 @@
+
+test_expect_success 'one of pack-2/pack-3 is redundant' '
+ git pack-redundant --all >out &&
-+ sed -E -e "s#.*/pack-(.*)\.(idx|pack)#\1#" out | \
-+ sort -u | \
-+ while read p; do eval echo "\${P$p}"; done | \
-+ sort >actual && \
++ sed \
++ -e "s#.*/pack-\(.*\)\.idx#\1#" \
++ -e "s#.*/pack-\(.*\)\.pack#\1#" out |
++ sort -u |
++ while read p; do eval echo "\${P$p}"; done |
++ sort >actual &&
+ test_cmp expected actual
+'
+
@@ -121,6 +166,7 @@
+ create_pack_6 && create_pack_7
+'
+
++# Only after calling create_pack_6, we can use $P6 variable.
+cat >expected <<EOF
+P2:$P2
+P4:$P4
@@ -129,10 +175,12 @@
+
+test_expect_success 'pack 2, 4, and 6 are redundant' '
+ git pack-redundant --all >out &&
-+ sed -E -e "s#.*/pack-(.*)\.(idx|pack)#\1#" out | \
-+ sort -u | \
-+ while read p; do eval echo "\${P$p}"; done | \
-+ sort >actual && \
++ sed \
++ -e "s#.*/pack-\(.*\)\.idx#\1#" \
++ -e "s#.*/pack-\(.*\)\.pack#\1#" out |
++ sort -u |
++ while read p; do eval echo "\${P$p}"; done |
++ sort >actual &&
+ test_cmp expected actual
+'
+
@@ -147,24 +195,26 @@
+P8:$P8
+EOF
+
-+test_expect_success 'pack-8, subset of pack-1, is also redundant' '
++test_expect_success 'pack-8 (subset of pack-1) is also redundant' '
+ git pack-redundant --all >out &&
-+ sed -E -e "s#.*/pack-(.*)\.(idx|pack)#\1#" out | \
-+ sort -u | \
-+ while read p; do eval echo "\${P$p}"; done | \
-+ sort >actual && \
++ sed \
++ -e "s#.*/pack-\(.*\)\.idx#\1#" \
++ -e "s#.*/pack-\(.*\)\.pack#\1#" out |
++ sort -u |
++ while read p; do eval echo "\${P$p}"; done |
++ sort >actual &&
+ test_cmp expected actual
+'
+
-+test_expect_success 'clear loose objects' '
++test_expect_success 'clean loose objects' '
+ git prune-packed &&
+ find .git/objects -type f | sed -e "/objects\/pack\//d" >out &&
+ test_must_be_empty out
+'
+
-+test_expect_success 'remove redundant packs' '
++test_expect_success 'remove redundant packs and pass fsck' '
+ git pack-redundant --all | xargs rm &&
-+ git fsck &&
++ git fsck --no-progress &&
+ git pack-redundant --all >out &&
+ test_must_be_empty out
+'
2: 50cd5a5b47 ! 2: 51a9c2d8a5 pack-redundant: new algorithm to find min packs
@@ -67,7 +67,7 @@
Original PR and discussions: https://github.com/jiangxin/git/pull/25
Signed-off-by: Sun Chao <sunchao9@huawei.com>
- Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
+ Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
5: b7ccdea1ad ! 3: c5eb21c23c pack-redundant: remove unused functions
@@ -6,14 +6,14 @@
`pll_free`, etc.
Signed-off-by: Sun Chao <sunchao9@huawei.com>
- Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
+ Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@
- size_t all_objects_size;
+ struct llist *all_objects;
} *local_packs = NULL, *altodb_packs = NULL;
-struct pll {
@@ -105,7 +105,7 @@
- diff = llist_copy(list);
-
- while (pl) {
-- llist_sorted_difference_inplace(diff, pl->remaining_objects);
+- llist_sorted_difference_inplace(diff, pl->all_objects);
- if (diff->size == 0) { /* we're done */
- llist_free(diff);
- return 1;
3: 6338c6fad4 ! 4: 1acdd0af1e pack-redundant: rename pack_list.all_objects
@@ -18,16 +18,7 @@
+ struct llist *remaining_objects;
} *local_packs = NULL, *altodb_packs = NULL;
- struct pll {
-@@
- diff = llist_copy(list);
-
- while (pl) {
-- llist_sorted_difference_inplace(diff, pl->all_objects);
-+ llist_sorted_difference_inplace(diff, pl->remaining_objects);
- if (diff->size == 0) { /* we're done */
- llist_free(diff);
- return 1;
+ static struct llist_item *free_nodes;
@@
{
struct pack_list *pl_a = *((struct pack_list **)a);
4: 734f4d8a8b ! 5: 306d515cda pack-redundant: consistent sort method
@@ -26,7 +26,7 @@
+ size_t all_objects_size;
} *local_packs = NULL, *altodb_packs = NULL;
- struct pll {
+ static struct llist_item *free_nodes;
@@
return ret;
}
@@ -42,20 +42,24 @@
- if (sz_a == sz_b)
- return 0;
- else if (sz_a < sz_b)
-+ /* if have the same remaining_objects, big pack first */
-+ if (pl_a->remaining_objects->size == pl_b->remaining_objects->size)
++ if (pl_a->remaining_objects->size == pl_b->remaining_objects->size) {
++ /* have the same remaining_objects, big pack first */
+ if (pl_a->all_objects_size == pl_b->all_objects_size)
+ return 0;
+ else if (pl_a->all_objects_size < pl_b->all_objects_size)
+ return 1;
+ else
+ return -1;
-+
-+ /* sort according to remaining objects, more remaining objects first */
-+ if (pl_a->remaining_objects->size < pl_b->remaining_objects->size)
++ } else if (pl_a->remaining_objects->size < pl_b->remaining_objects->size) {
++ /* sort by remaining objects, more objects first */
return 1;
- else
+- else
++ } else {
return -1;
++ }
+ }
+
+ /* Sort pack_list, greater size of remaining_objects first */
@@
for (n = 0, p = *pl; p; p = p->next)
ary[n++] = p;
## This reroll has the following commits:
Jiang Xin (3):
t5323: test cases for git-pack-redundant
pack-redundant: rename pack_list.all_objects
pack-redundant: consistent sort method
Sun Chao (2):
pack-redundant: new algorithm to find min packs
pack-redundant: remove unused functions
builtin/pack-redundant.c | 221 +++++++++++++++-----------------------
t/t5323-pack-redundant.sh | 207 +++++++++++++++++++++++++++++++++++
2 files changed, 292 insertions(+), 136 deletions(-)
create mode 100755 t/t5323-pack-redundant.sh
--
2.20.0.3.gc45e608566
next prev parent reply other threads:[~2019-01-12 9:19 UTC|newest]
Thread overview: 83+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-18 9:58 [PATCH 1/2] pack-redundant: new algorithm to find min packs Jiang Xin
2018-12-18 9:58 ` [PATCH 2/2] pack-redundant: remove unused functions Jiang Xin
2018-12-19 12:14 ` [PATCH v2 0/3] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-02 4:34 ` [PATCH v3 " Jiang Xin
2019-01-02 4:34 ` [PATCH v3 1/3] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-09 12:56 ` SZEDER Gábor
2019-01-09 16:47 ` SZEDER Gábor
2019-01-10 12:01 ` [PATCH v5 0/5] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-12 9:17 ` Jiang Xin [this message]
2019-01-30 11:47 ` [PATCH v7 0/6] " Jiang Xin
2019-02-01 16:21 ` [PATCH v9 " Jiang Xin
2019-02-01 16:21 ` [PATCH v9 1/6] t5323: test cases for git-pack-redundant Jiang Xin
2019-02-01 19:42 ` Eric Sunshine
2019-02-01 21:03 ` Junio C Hamano
2019-02-01 21:49 ` Eric Sunshine
2019-02-02 13:30 ` [PATCH v10 0/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-02-02 13:30 ` [PATCH v10 1/6] t5323: test cases for git-pack-redundant Jiang Xin
2019-02-02 13:30 ` [PATCH v10 2/6] pack-redundant: delay creation of unique_objects Jiang Xin
2019-02-02 13:30 ` [PATCH v10 3/6] pack-redundant: delete redundant code Jiang Xin
2019-02-02 13:30 ` [PATCH v10 4/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-02-02 13:30 ` [PATCH v10 5/6] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-02-02 13:30 ` [PATCH v10 6/6] pack-redundant: consistent sort method Jiang Xin
2019-02-01 16:21 ` [PATCH v9 2/6] pack-redundant: delay creation of unique_objects Jiang Xin
2019-02-01 16:21 ` [PATCH v9 3/6] pack-redundant: delete redundant code Jiang Xin
2019-02-01 16:21 ` [PATCH v9 4/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-02-01 16:21 ` [PATCH v9 5/6] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-02-01 16:21 ` [PATCH v9 6/6] pack-redundant: consistent sort method Jiang Xin
2019-01-30 11:47 ` [PATCH v7 1/6] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-31 21:44 ` Junio C Hamano
2019-02-01 5:44 ` Jiang Xin
2019-02-01 6:11 ` Eric Sunshine
2019-02-01 7:23 ` Jiang Xin
2019-02-01 7:25 ` Jiang Xin
2019-02-01 9:51 ` Jiang Xin
2019-01-30 11:47 ` [PATCH v7 2/6] pack-redundant: delay creation of unique_objects Jiang Xin
2019-01-30 11:47 ` [PATCH v7 3/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-31 19:30 ` Junio C Hamano
2019-02-01 9:55 ` Jiang Xin
2019-01-30 11:47 ` [PATCH v7 4/6] pack-redundant: remove unused functions Jiang Xin
2019-01-30 15:03 ` [PATCH v8 1/1] pack-redundant: delete redundant code 16657101987
2019-01-30 11:47 ` [PATCH v7 5/6] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-01-30 11:47 ` [PATCH v7 6/6] pack-redundant: consistent sort method Jiang Xin
2019-01-12 9:17 ` [PATCH v6 1/5] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-12 9:17 ` [PATCH v6 2/5] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-12 9:17 ` [PATCH v6 3/5] pack-redundant: remove unused functions Jiang Xin
2019-01-12 9:17 ` [PATCH v6 4/5] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-01-12 9:17 ` [PATCH v6 5/5] pack-redundant: consistent sort method Jiang Xin
2019-01-10 12:01 ` [PATCH v5 1/5] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-10 21:11 ` Junio C Hamano
2019-01-11 1:59 ` Jiang Xin
2019-01-11 18:00 ` Junio C Hamano
2019-01-10 12:01 ` [PATCH v5 2/5] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-11 1:19 ` SZEDER Gábor
2019-01-10 12:01 ` [PATCH v5 3/5] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-01-10 12:01 ` [PATCH v5 4/5] pack-redundant: consistent sort method Jiang Xin
2019-01-10 20:05 ` SZEDER Gábor
2019-01-10 12:01 ` [PATCH v5 5/5] pack-redundant: remove unused functions Jiang Xin
2019-01-10 3:28 ` [PATCH v3 1/3] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-10 7:11 ` Johannes Sixt
2019-01-10 11:57 ` SZEDER Gábor
2019-01-10 12:25 ` Torsten Bögershausen
2019-01-10 17:36 ` Junio C Hamano
2019-01-15 20:30 ` [PATCH/RFC v1 1/1] test-lint: sed -E (or -a, -l) are not portable tboegi
2019-01-15 21:09 ` Eric Sunshine
2019-01-16 11:24 ` Ævar Arnfjörð Bjarmason
2019-01-20 7:53 ` [PATCH/RFC v2 1/1] test-lint: Only use only sed [-n] [-e command] [-f command_file] tboegi
2019-01-22 19:47 ` Junio C Hamano
2019-01-22 20:00 ` Torsten Bögershausen
2019-01-22 21:15 ` Eric Sunshine
2019-01-23 6:35 ` Torsten Bögershausen
2019-01-23 17:54 ` Junio C Hamano
2019-01-25 19:12 ` Torsten Bögershausen
2019-01-27 22:34 ` Junio C Hamano
2019-01-02 4:34 ` [PATCH v3 2/3] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-02 4:34 ` [PATCH v3 3/3] pack-redundant: remove unused functions Jiang Xin
2019-01-08 16:40 ` [PATCH v4 0/1] " 16657101987
2019-01-08 19:30 ` Junio C Hamano
2019-01-09 0:29 ` 16657101987
2019-01-08 16:43 ` [PATCH v4 1/1] " 16657101987
2019-01-08 16:45 ` [PATCH v4 0/1] " 16657101987
2018-12-19 12:14 ` [PATCH v2 1/3] t5322: test cases for git-pack-redundant Jiang Xin
2018-12-19 12:14 ` [PATCH v2 2/3] pack-redundant: new algorithm to find min packs Jiang Xin
2018-12-19 12:14 ` [PATCH v2 3/3] pack-redundant: remove unused functions Jiang Xin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190112091754.30985-1-worldhello.net@gmail.com \
--to=worldhello.net@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=sunchao9@huawei.com \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.