* [PATCH 2/3] Only repack active packs by skipping over kept packs.
@ 2006-10-29 9:37 Shawn Pearce
2006-10-30 19:07 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was " Nicolas Pitre
0 siblings, 1 reply; 17+ messages in thread
From: Shawn Pearce @ 2006-10-29 9:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
During `git repack -a -d` only repack objects which are loose or
which reside in an active (a non-kept) pack. This allows the user
to keep large packs as-is without continuous repacking and can be
very helpful on large repositories. It should also help us resolve
a race condition between `git repack -a -d` and the new pack store
functionality in `git-receive-pack`.
Kept packs are those which have a corresponding .keep file in
$GIT_OBJECT_DIRECTORY/pack. That is pack-X.pack will be kept
(not repacked and not deleted) if pack-X.keep exists in the same
directory when `git repack -a -d` starts.
Currently this feature is not documented and there is no user
interface to keep an existing pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
git-repack.sh | 27 +++++++++++++++++----------
1 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/git-repack.sh b/git-repack.sh
index 17e2452..f150a55 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -45,11 +45,19 @@ case ",$all_into_one," in
args='--unpacked --incremental'
;;
,t,)
- args=
-
- # Redundancy check in all-into-one case is trivial.
- existing=`test -d "$PACKDIR" && cd "$PACKDIR" && \
- find . -type f \( -name '*.pack' -o -name '*.idx' \) -print`
+ if [ -d "$PACKDIR" ]; then
+ for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
+ | sed -e 's/^\.\///' -e 's/\.pack$//'`
+ do
+ if [ -e "$PACKDIR/$e.keep" ]; then
+ : keep
+ else
+ args="$args --unpacked=$e.pack"
+ existing="$existing $e"
+ fi
+ done
+ fi
+ [ -z "$args" ] && args='--unpacked --incremental'
;;
esac
@@ -86,17 +94,16 @@ fi
if test "$remove_redundant" = t
then
- # We know $existing are all redundant only when
- # all-into-one is used.
- if test "$all_into_one" != '' && test "$existing" != ''
+ # We know $existing are all redundant.
+ if [ -n "$existing" ]
then
sync
( cd "$PACKDIR" &&
for e in $existing
do
case "$e" in
- ./pack-$name.pack | ./pack-$name.idx) ;;
- *) rm -f $e ;;
+ pack-$name) ;;
+ *) rm -f "$e.pack" "$e.idx" "$e.keep" ;;
esac
done
)
--
1.4.3.3.g7d63
^ permalink raw reply related [flat|nested] 17+ messages in thread
* WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-29 9:37 [PATCH 2/3] Only repack active packs by skipping over kept packs Shawn Pearce
@ 2006-10-30 19:07 ` Nicolas Pitre
2006-10-30 19:23 ` Shawn Pearce
2006-10-30 20:26 ` Shawn Pearce
0 siblings, 2 replies; 17+ messages in thread
From: Nicolas Pitre @ 2006-10-30 19:07 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Junio C Hamano, git
On Sun, 29 Oct 2006, Shawn Pearce wrote:
> During `git repack -a -d` only repack objects which are loose or
> which reside in an active (a non-kept) pack. This allows the user
> to keep large packs as-is without continuous repacking and can be
> very helpful on large repositories.
Something is really broken here.
Here's how to destroy your GIT's git repository.
WARNING: MAKE A COPY BEFORE TRYING THIS! I'm serious.
First, let's make a single pack just to make things simpler and
reproducible:
$ git-repack -a -f -d
$ git-prune
$ git-fsck-objects --full
So far everything should be fine. It's still time to make a backup copy
of your .git directory if you've not done so.
Now let's create a second pack containing a subset of the existing one.
$ git-rev-list --objects v1.4.3..v1.4.3.3 | \
git-pack-objects --stdout | \
git-index-pack --stdin -v --keep
$ git-fsck-objects --full
At this point the repository is still fine, but the --keep to
git-index-pack above will have created a file called
.git/objects/pack/pack-aceb4c6394c586abaf65d76dd6cf088f50a5b806.keep and
that is the source of all the trouble to come. You still can remove
that file if you don't have a backup yet.
If you still want to give it the coup de grace, just do:
$ git-repack -a -d
And now you've just lost a large amount of objects. To see the extent
of the dammage, just do:
$ git-fsck-objects
So... what is the --unpacked=<pack>.pack switch supposed to mean? It is
not documented anywhere and it certainly doesn't produce the expected
result with a repack.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 19:07 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was " Nicolas Pitre
@ 2006-10-30 19:23 ` Shawn Pearce
2006-10-30 20:26 ` Shawn Pearce
1 sibling, 0 replies; 17+ messages in thread
From: Shawn Pearce @ 2006-10-30 19:23 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, git
Nicolas Pitre <nico@cam.org> wrote:
> On Sun, 29 Oct 2006, Shawn Pearce wrote:
>
> > During `git repack -a -d` only repack objects which are loose or
> > which reside in an active (a non-kept) pack. This allows the user
> > to keep large packs as-is without continuous repacking and can be
> > very helpful on large repositories.
>
> Something is really broken here.
Holy cow. Since this is now in 'next', 'next' is now seriously
broken if you have a .keep file.
> So... what is the --unpacked=<pack>.pack switch supposed to mean? It is
> not documented anywhere and it certainly doesn't produce the expected
> result with a repack.
Junio introduced --unpacked=<pack>.pack a while ago for this
application. What it does is skip an object unless its a loose
object file or it is in the named pack. The idea being that
pack-objects would only consider object files which are loose or
ready to be repacked.
In your example above we should have copied all objects from your
first pack into the new pack during the final destructive repack,
but we didn't. I don't know why.
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 19:07 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was " Nicolas Pitre
2006-10-30 19:23 ` Shawn Pearce
@ 2006-10-30 20:26 ` Shawn Pearce
2006-10-30 20:52 ` Jan Harkes
2006-10-30 21:48 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs Junio C Hamano
1 sibling, 2 replies; 17+ messages in thread
From: Shawn Pearce @ 2006-10-30 20:26 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, git
Nicolas Pitre <nico@cam.org> wrote:
> On Sun, 29 Oct 2006, Shawn Pearce wrote:
>
> > During `git repack -a -d` only repack objects which are loose or
> > which reside in an active (a non-kept) pack. This allows the user
> > to keep large packs as-is without continuous repacking and can be
> > very helpful on large repositories.
>
> Something is really broken here.
>
> Here's how to destroy your GIT's git repository.
>
> WARNING: MAKE A COPY BEFORE TRYING THIS! I'm serious.
>
> First, let's make a single pack just to make things simpler and
> reproducible:
>
> $ git-repack -a -f -d
> $ git-prune
> $ git-fsck-objects --full
Actually the breakage is easier to reproduce without trashing
a repository.
Do the above so you have everything in one pack. Now use rev-list
to simulate the object list construction in pack-objects as though
we were doing a 'git repack -a -d':
git-rev-list --objects --all \
--unpacked=.git/objects/pack/pack-*.pack \
| wc -l
gives me 102 (WRONG WRONG WRONG WRONG!!!!!!)
and
git-rev-list --objects --all | wc -l
gives me 31912 (correct). The --unpacked flag is horribly broken.
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 20:26 ` Shawn Pearce
@ 2006-10-30 20:52 ` Jan Harkes
2006-10-30 21:07 ` Jan Harkes
` (2 more replies)
2006-10-30 21:48 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs Junio C Hamano
1 sibling, 3 replies; 17+ messages in thread
From: Jan Harkes @ 2006-10-30 20:52 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git
On Mon, Oct 30, 2006 at 03:26:11PM -0500, Shawn Pearce wrote:
> Actually the breakage is easier to reproduce without trashing
> a repository.
>
> Do the above so you have everything in one pack. Now use rev-list
> to simulate the object list construction in pack-objects as though
> we were doing a 'git repack -a -d':
>
> git-rev-list --objects --all \
> --unpacked=.git/objects/pack/pack-*.pack \
> | wc -l
>
> gives me 102 (WRONG WRONG WRONG WRONG!!!!!!)
The problem seems to be that as soon as we hit something that is found
in a pack that is not on the ignore list, that object and all it's
parents are marked as uninteresting. So if the kept pack contains a
slice of commits (v1.4.3..v1.4.3.3) the revision walker will only return
the recent stuff (v1.4.3.3..) and drop the older data (..v1.4.3).
The following patch does fix the problem Nicolas reported, but for some
reason I'm still getting only 102 objects (only tags and the commits
they refer to?) with your test.
Jan
----
diff --git a/revision.c b/revision.c
index 280e92b..a69c873 100644
--- a/revision.c
+++ b/revision.c
@@ -418,9 +418,6 @@ static void limit_list(struct rev_info *
if (revs->max_age != -1 && (commit->date < revs->max_age))
obj->flags |= UNINTERESTING;
- if (revs->unpacked &&
- has_sha1_pack(obj->sha1, revs->ignore_packed))
- obj->flags |= UNINTERESTING;
add_parents_to_list(revs, commit, &list);
if (obj->flags & UNINTERESTING) {
mark_parents_uninteresting(commit);
@@ -1149,17 +1146,18 @@ struct commit *get_revision(struct rev_i
* that we'd otherwise have done in limit_list().
*/
if (!revs->limited) {
- if ((revs->unpacked &&
- has_sha1_pack(commit->object.sha1,
- revs->ignore_packed)) ||
- (revs->max_age != -1 &&
- (commit->date < revs->max_age)))
+ if (revs->max_age != -1 &&
+ (commit->date < revs->max_age))
continue;
add_parents_to_list(revs, commit, &revs->commits);
}
if (commit->object.flags & SHOWN)
continue;
+ if (revs->unpacked && has_sha1_pack(commit->object.sha1,
+ revs->ignore_packed))
+ continue;
+
/* We want to show boundary commits only when their
* children are shown. When path-limiter is in effect,
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 20:52 ` Jan Harkes
@ 2006-10-30 21:07 ` Jan Harkes
2006-10-30 21:09 ` Shawn Pearce
2006-10-30 21:59 ` Junio C Hamano
2 siblings, 0 replies; 17+ messages in thread
From: Jan Harkes @ 2006-10-30 21:07 UTC (permalink / raw)
To: Shawn Pearce, Nicolas Pitre, Junio C Hamano, git
On Mon, Oct 30, 2006 at 03:52:00PM -0500, Jan Harkes wrote:
> On Mon, Oct 30, 2006 at 03:26:11PM -0500, Shawn Pearce wrote:
> > Do the above so you have everything in one pack. Now use rev-list
> > to simulate the object list construction in pack-objects as though
> > we were doing a 'git repack -a -d':
> >
> > git-rev-list --objects --all \
> > --unpacked=.git/objects/pack/pack-*.pack \
> > | wc -l
> >
> > gives me 102 (WRONG WRONG WRONG WRONG!!!!!!)
>
...
>
> The following patch does fix the problem Nicolas reported, but for some
> reason I'm still getting only 102 objects (only tags and the commits
> they refer to?) with your test.
Seems to be operator error, I guess the shell can't (won't) expand
--unpacked=.git/objects/pack/pack-*.pack and there is no pack named
pack-*.pack, so rev-list will actually find every object in one of the
packs and skip them.
The following works correctly,
$ git-rev-list --objects --all --unpacked=.git/objects/pack/pack-234f8136e45fb34d118bb346c15267535e80e5f0.pack --unpacked=.git/objects/pack/pack-aceb4c6394c586abaf65d76dd6cf088f50a5b806.pack | wc -l
28713
$ ~/git/git/git-rev-list --objects --all | wc -l
28713
Jan
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 20:52 ` Jan Harkes
2006-10-30 21:07 ` Jan Harkes
@ 2006-10-30 21:09 ` Shawn Pearce
2006-10-30 21:59 ` Junio C Hamano
2 siblings, 0 replies; 17+ messages in thread
From: Shawn Pearce @ 2006-10-30 21:09 UTC (permalink / raw)
To: Jan Harkes; +Cc: Nicolas Pitre, Junio C Hamano, git
Jan Harkes <jaharkes@cs.cmu.edu> wrote:
> On Mon, Oct 30, 2006 at 03:26:11PM -0500, Shawn Pearce wrote:
> > Actually the breakage is easier to reproduce without trashing
> > a repository.
> >
> > Do the above so you have everything in one pack. Now use rev-list
> > to simulate the object list construction in pack-objects as though
> > we were doing a 'git repack -a -d':
> >
> > git-rev-list --objects --all \
> > --unpacked=.git/objects/pack/pack-*.pack \
> > | wc -l
> >
> > gives me 102 (WRONG WRONG WRONG WRONG!!!!!!)
>
> The problem seems to be that as soon as we hit something that is found
> in a pack that is not on the ignore list, that object and all it's
> parents are marked as uninteresting. So if the kept pack contains a
> slice of commits (v1.4.3..v1.4.3.3) the revision walker will only return
> the recent stuff (v1.4.3.3..) and drop the older data (..v1.4.3).
Right - I got that far in my own research and then saw your patch
drop into my inbox. :-)
> The following patch does fix the problem Nicolas reported, but for some
> reason I'm still getting only 102 objects (only tags and the commits
> they refer to?) with your test.
Ack'd.
Your patch fixes both bugs for me. The rev-list test I talked about
above is now turning up 31846 objects which is the correct count.
The repack test Nico crafted works correctly too.
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 20:52 ` Jan Harkes
2006-10-30 21:07 ` Jan Harkes
2006-10-30 21:09 ` Shawn Pearce
@ 2006-10-30 21:59 ` Junio C Hamano
2006-10-30 22:54 ` Junio C Hamano
2006-10-30 22:55 ` Jan Harkes
2 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2006-10-30 21:59 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Nicolas Pitre, git
Jan Harkes <jaharkes@cs.cmu.edu> writes:
> The following patch does fix the problem Nicolas reported, but for some
> reason I'm still getting only 102 objects (only tags and the commits
> they refer to?) with your test.
One potential downside of this is that this makes an obscure but
useful "gitk --unpacked" useless (robs performance).
http://thread.gmane.org/gmane.comp.version-control.git/19197/focus=19207
But other than that, I think it is an Ok change. The original
semantics of --unpacked (with or without "pretend as if objects
in this pack are loose") were, eh, "strange".
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 21:59 ` Junio C Hamano
@ 2006-10-30 22:54 ` Junio C Hamano
2006-10-30 22:55 ` Jan Harkes
1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2006-10-30 22:54 UTC (permalink / raw)
To: Jan Harkes; +Cc: git, Nicolas Pitre, Shawn Pearce
Junio C Hamano <junkio@cox.net> writes:
> Jan Harkes <jaharkes@cs.cmu.edu> writes:
>
>> The following patch does fix the problem Nicolas reported, but for some
>> reason I'm still getting only 102 objects (only tags and the commits
>> they refer to?) with your test.
>
> One potential downside of this is that this makes an obscure but
> useful "gitk --unpacked" useless (robs performance).
>
> http://thread.gmane.org/gmane.comp.version-control.git/19197/focus=19207
>
> But other than that, I think it is an Ok change. The original
> semantics of --unpacked (with or without "pretend as if objects
> in this pack are loose") were, eh, "strange".
I changed my mind.
Even without --unpacked=pretend-this-is-loose nor .keep flag,
the original semantics of --unpacked and git repack do not play
with each other well. You can have a history where you have a
pack in the middle of the history, and would expect "git repack"
without -a to make your .git/objects/??/ directories empty but
it would not because --unpacked has been defined to mean "stop
traversal when we hit a packed commit". That would _not_
corrupt the repository, but is very counter-intuitive.
Unfortunately this is a semantic change in the middle of the
road (and it would change the _output_ not just performance of
"gitk --unpacked"), but I think it is a semantic change of a
good kind.
So I'll take Jan's patch as is. It needs to go all the way down
to "maint", since we have --unpacked= there already.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 21:59 ` Junio C Hamano
2006-10-30 22:54 ` Junio C Hamano
@ 2006-10-30 22:55 ` Jan Harkes
2006-10-30 23:24 ` Junio C Hamano
1 sibling, 1 reply; 17+ messages in thread
From: Jan Harkes @ 2006-10-30 22:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Mon, Oct 30, 2006 at 01:59:03PM -0800, Junio C Hamano wrote:
> Jan Harkes <jaharkes@cs.cmu.edu> writes:
>
> > The following patch does fix the problem Nicolas reported, but for some
> > reason I'm still getting only 102 objects (only tags and the commits
> > they refer to?) with your test.
>
> One potential downside of this is that this makes an obscure but
> useful "gitk --unpacked" useless (robs performance).
>
> http://thread.gmane.org/gmane.comp.version-control.git/19197/focus=19207
If I use 'git fetch' followed later on by a 'git fetch -k', the result
from --unpacked would not include the unpacked objects created by the
first fetch. Although it may have been fast, it seems to be somewhat
counter-intuitive.
> But other than that, I think it is an Ok change. The original
> semantics of --unpacked (with or without "pretend as if objects
> in this pack are loose") were, eh, "strange".
Do you need a resend with a proper 'Signed-Off-By' line?
Jan
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 22:55 ` Jan Harkes
@ 2006-10-30 23:24 ` Junio C Hamano
2006-10-31 1:37 ` [PATCH] Continue traversal when rev-list --unpacked finds a packed commit Jan Harkes
0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2006-10-30 23:24 UTC (permalink / raw)
To: Jan Harkes; +Cc: git
Jan Harkes <jaharkes@cs.cmu.edu> writes:
> On Mon, Oct 30, 2006 at 01:59:03PM -0800, Junio C Hamano wrote:
>> Jan Harkes <jaharkes@cs.cmu.edu> writes:
>>
>> > The following patch does fix the problem Nicolas reported, but for some
>> > reason I'm still getting only 102 objects (only tags and the commits
>> > they refer to?) with your test.
>>
>> One potential downside of this is that this makes an obscure but
>> useful "gitk --unpacked" useless (robs performance).
>>
>> http://thread.gmane.org/gmane.comp.version-control.git/19197/focus=19207
>
> If I use 'git fetch' followed later on by a 'git fetch -k', the result
> from --unpacked would not include the unpacked objects created by the
> first fetch. Although it may have been fast, it seems to be somewhat
> counter-intuitive.
>
>> But other than that, I think it is an Ok change. The original
>> semantics of --unpacked (with or without "pretend as if objects
>> in this pack are loose") were, eh, "strange".
>
> Do you need a resend with a proper 'Signed-Off-By' line?
Oh, I was planning to write a fairly detailed explanation
myself, but the description with S-o-b by the original author
would certainly be more appropriate. Thanks.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] Continue traversal when rev-list --unpacked finds a packed commit.
2006-10-30 23:24 ` Junio C Hamano
@ 2006-10-31 1:37 ` Jan Harkes
2006-10-31 1:47 ` Junio C Hamano
0 siblings, 1 reply; 17+ messages in thread
From: Jan Harkes @ 2006-10-31 1:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
When getting the list of all unpacked objects by walking the commit history,
we would stop traversal whenever we hit a packed commit. However the fact
that we found a packed commit does not guarantee that all previous commits
are also packed. As a result the commit walkers did not show all reachable
unpacked objects.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
---
revision.c | 14 ++++++--------
1 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/revision.c b/revision.c
index 280e92b..a69c873 100644
--- a/revision.c
+++ b/revision.c
@@ -418,9 +418,6 @@ static void limit_list(struct rev_info *
if (revs->max_age != -1 && (commit->date < revs->max_age))
obj->flags |= UNINTERESTING;
- if (revs->unpacked &&
- has_sha1_pack(obj->sha1, revs->ignore_packed))
- obj->flags |= UNINTERESTING;
add_parents_to_list(revs, commit, &list);
if (obj->flags & UNINTERESTING) {
mark_parents_uninteresting(commit);
@@ -1149,17 +1146,18 @@ struct commit *get_revision(struct rev_i
* that we'd otherwise have done in limit_list().
*/
if (!revs->limited) {
- if ((revs->unpacked &&
- has_sha1_pack(commit->object.sha1,
- revs->ignore_packed)) ||
- (revs->max_age != -1 &&
- (commit->date < revs->max_age)))
+ if (revs->max_age != -1 &&
+ (commit->date < revs->max_age))
continue;
add_parents_to_list(revs, commit, &revs->commits);
}
if (commit->object.flags & SHOWN)
continue;
+ if (revs->unpacked && has_sha1_pack(commit->object.sha1,
+ revs->ignore_packed))
+ continue;
+
/* We want to show boundary commits only when their
* children are shown. When path-limiter is in effect,
* rewrite_parents() drops some commits from getting shown,
--
1.4.2.4.gd5de
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] Continue traversal when rev-list --unpacked finds a packed commit.
2006-10-31 1:37 ` [PATCH] Continue traversal when rev-list --unpacked finds a packed commit Jan Harkes
@ 2006-10-31 1:47 ` Junio C Hamano
2006-10-31 2:17 ` Jan Harkes
0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2006-10-31 1:47 UTC (permalink / raw)
To: Jan Harkes; +Cc: git, Linus Torvalds
Jan Harkes <jaharkes@cs.cmu.edu> writes:
> When getting the list of all unpacked objects by walking the commit history,
> we would stop traversal whenever we hit a packed commit. However the fact
> that we found a packed commit does not guarantee that all previous commits
> are also packed. As a result the commit walkers did not show all reachable
> unpacked objects.
>
> Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Thanks.
With this, I think revs->unpacked should not mean "limited", so
I suspect this is also needed, no?
diff --git a/revision.c b/revision.c
index 93f2513..2d7cad9 100644
--- a/revision.c
+++ b/revision.c
@@ -1010,7 +1010,7 @@ int setup_revisions(int argc, const char
add_pending_object(revs, object, def);
}
- if (revs->topo_order || revs->unpacked)
+ if (revs->topo_order)
revs->limited = 1;
if (revs->prune_data) {
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] Continue traversal when rev-list --unpacked finds a packed commit.
2006-10-31 1:47 ` Junio C Hamano
@ 2006-10-31 2:17 ` Jan Harkes
0 siblings, 0 replies; 17+ messages in thread
From: Jan Harkes @ 2006-10-31 2:17 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Linus Torvalds
On Mon, Oct 30, 2006 at 05:47:14PM -0800, Junio C Hamano wrote:
> Jan Harkes <jaharkes@cs.cmu.edu> writes:
>
> > When getting the list of all unpacked objects by walking the commit history,
> > we would stop traversal whenever we hit a packed commit. However the fact
> > that we found a packed commit does not guarantee that all previous commits
> > are also packed. As a result the commit walkers did not show all reachable
> > unpacked objects.
> >
> > Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
>
> Thanks.
>
> With this, I think revs->unpacked should not mean "limited", so
> I suspect this is also needed, no?
I'm not familiar enough with the code to know for sure, but my gut
feeling is that that would be needed. Let me check...
When that flag is set, the code calls limit_list, which no longer stops
traversal when we hit a packed commit. So we end up with a list of all
commits in memory. If the flag is not set, the list is kept minimal and
parents are only traversed as they are encountered.
So it looks like not setting the flag reduces memory usage we traverse
all parents in both cases. Yes, you are correct.
Jan
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 20:26 ` Shawn Pearce
2006-10-30 20:52 ` Jan Harkes
@ 2006-10-30 21:48 ` Junio C Hamano
2006-10-30 21:55 ` Shawn Pearce
1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2006-10-30 21:48 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Nicolas Pitre, git
Shawn Pearce <spearce@spearce.org> writes:
> Do the above so you have everything in one pack. Now use rev-list
> to simulate the object list construction in pack-objects as though
> we were doing a 'git repack -a -d':
>
> git-rev-list --objects --all \
> --unpacked=.git/objects/pack/pack-*.pack \
> | wc -l
>
> gives me 102 (WRONG WRONG WRONG WRONG!!!!!!)
Now I think I know what is going on.
The meaning of "unpacked" (with or without the "pretend as if
all objects in this pack are loose") has always been to stop
traversing once we hit a packed object, not "do not include
already packed object".
So --unpacked=pretend-this-is-loose was wrong to begin with; it
probably should have been --incremental=pretend-this-is-loose.
How about reverting the following:
commit ce8590748b918687abc4c7cd2d432dd23f07ae40
Author: Shawn Pearce <spearce@spearce.org>
Only repack active packs by skipping over kept packs.
commit 106d710bc13f34aec1a15c4cff80f062f384edf6
Author: Junio C Hamano <junkio@cox.net>
pack-objects --unpacked=<existing pack> option.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 21:48 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs Junio C Hamano
@ 2006-10-30 21:55 ` Shawn Pearce
2006-10-30 22:07 ` Junio C Hamano
0 siblings, 1 reply; 17+ messages in thread
From: Shawn Pearce @ 2006-10-30 21:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, git
Junio C Hamano <junkio@cox.net> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
>
> > Do the above so you have everything in one pack. Now use rev-list
> > to simulate the object list construction in pack-objects as though
> > we were doing a 'git repack -a -d':
> >
> > git-rev-list --objects --all \
> > --unpacked=.git/objects/pack/pack-*.pack \
> > | wc -l
> >
> > gives me 102 (WRONG WRONG WRONG WRONG!!!!!!)
>
> Now I think I know what is going on.
>
> The meaning of "unpacked" (with or without the "pretend as if
> all objects in this pack are loose") has always been to stop
> traversing once we hit a packed object, not "do not include
> already packed object".
Did you see Jan Harkes' patch that changes the behavior to be what
it should have been?
> So --unpacked=pretend-this-is-loose was wrong to begin with; it
> probably should have been --incremental=pretend-this-is-loose.
I don't care about what the option name is. If you want to change
it to --incremental we can but the --unpacked actually makes more
sense now... Its saying pretend every object in this pack is
unpacked and therefore should be packed.
> How about reverting the following:
>
> commit ce8590748b918687abc4c7cd2d432dd23f07ae40
> Author: Shawn Pearce <spearce@spearce.org>
>
> Only repack active packs by skipping over kept packs.
>
>
> commit 106d710bc13f34aec1a15c4cff80f062f384edf6
> Author: Junio C Hamano <junkio@cox.net>
>
> pack-objects --unpacked=<existing pack> option.
>
Nah. I think Jan's patch fixes the bug and the --unpacked option
now makes sense as is, so I don't see why we would revert these.
--
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs.
2006-10-30 21:55 ` Shawn Pearce
@ 2006-10-30 22:07 ` Junio C Hamano
0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2006-10-30 22:07 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Nicolas Pitre, git
Shawn Pearce <spearce@spearce.org> writes:
> Did you see Jan Harkes' patch that changes the behavior to be what
> it should have been?
>
>> So --unpacked=pretend-this-is-loose was wrong to begin with; it
>> probably should have been --incremental=pretend-this-is-loose.
>
> I don't care about what the option name is.
It is not about name, but what --unpacked means. See my other
mail.
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2006-10-31 2:17 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-29 9:37 [PATCH 2/3] Only repack active packs by skipping over kept packs Shawn Pearce
2006-10-30 19:07 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was " Nicolas Pitre
2006-10-30 19:23 ` Shawn Pearce
2006-10-30 20:26 ` Shawn Pearce
2006-10-30 20:52 ` Jan Harkes
2006-10-30 21:07 ` Jan Harkes
2006-10-30 21:09 ` Shawn Pearce
2006-10-30 21:59 ` Junio C Hamano
2006-10-30 22:54 ` Junio C Hamano
2006-10-30 22:55 ` Jan Harkes
2006-10-30 23:24 ` Junio C Hamano
2006-10-31 1:37 ` [PATCH] Continue traversal when rev-list --unpacked finds a packed commit Jan Harkes
2006-10-31 1:47 ` Junio C Hamano
2006-10-31 2:17 ` Jan Harkes
2006-10-30 21:48 ` WARNING: THIS PATCH CAN BREAK YOUR REPO, was Re: [PATCH 2/3] Only repack active packs by skipping over kept packs Junio C Hamano
2006-10-30 21:55 ` Shawn Pearce
2006-10-30 22:07 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).