* [BUG] git-rev-list: --topo-order --boundary and --max-count
@ 2007-03-05 10:02 Santi Béjar
2007-03-05 10:39 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Santi Béjar @ 2007-03-05 10:02 UTC (permalink / raw)
To: Git Mailing List
Hi *,
the --topo-order does not play well with --boundary and --max-count.
$ git-rev-list --boundary --max-count=50 5ced0 | wc -l
56
$ git-rev-list --topo-order --boundary --max-count=50 5ced0 | wc -l
8846
(5ced0 is git.git's master). I think it should be 56 for both. It
presents this behaviour since c4025103fa, when was added --boundary
support for git-rev-list --max-count and --max-age.
Santi
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git-rev-list: --topo-order --boundary and --max-count
2007-03-05 10:02 [BUG] git-rev-list: --topo-order --boundary and --max-count Santi Béjar
@ 2007-03-05 10:39 ` Junio C Hamano
2007-03-05 12:15 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-05 10:39 UTC (permalink / raw)
To: Santi Béjar, Johannes Schindelin; +Cc: Git Mailing List
"Santi Béjar" <sbejar@gmail.com> writes:
> the --topo-order does not play well with --boundary and --max-count.
>
> $ git-rev-list --boundary --max-count=50 5ced0 | wc -l
> 56
> $ git-rev-list --topo-order --boundary --max-count=50 5ced0 | wc -l
> 8846
>
> (5ced0 is git.git's master). I think it should be 56 for both. It
> presents this behaviour since c4025103fa, when was added --boundary
> support for git-rev-list --max-count and --max-age.
I think the code that does --boundary when the list is limited
with --max-count is not quite right, even without topo-order.
Only when the traversal is not limited, the code happens to work
correctly because in that case alone we pick up positive commits
one by one up to the specified count, and do not place anything
other than their immediate parents in the list.
It needs to find out commits (be they marked as UNINTERESTING or
not) still in the revs->commits that are _not_ reachable by any
other commits in the list, or something like that.
I suspect that would unfortunately be very expensive. Dscho,
have better ideas?
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git-rev-list: --topo-order --boundary and --max-count
2007-03-05 10:39 ` Junio C Hamano
@ 2007-03-05 12:15 ` Junio C Hamano
2007-03-05 17:05 ` Linus Torvalds
2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin
2 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-05 12:15 UTC (permalink / raw)
To: Santi Béjar; +Cc: Johannes Schindelin, Git Mailing List
Junio C Hamano <junkio@cox.net> writes:
> "Santi Béjar" <sbejar@gmail.com> writes:
>
>> the --topo-order does not play well with --boundary and --max-count.
>>
>> $ git-rev-list --boundary --max-count=50 5ced0 | wc -l
>> 56
>> $ git-rev-list --topo-order --boundary --max-count=50 5ced0 | wc -l
>> 8846
>>
>> (5ced0 is git.git's master). I think it should be 56 for both. It
>> presents this behaviour since c4025103fa, when was added --boundary
>> support for git-rev-list --max-count and --max-age.
>
> I think the code that does --boundary when the list is limited
> with --max-count is not quite right, even without topo-order.
> Only when the traversal is not limited, the code happens to work
> correctly because in that case alone we pick up positive commits
> one by one up to the specified count, and do not place anything
> other than their immediate parents in the list.
This is not even correct. Let's see an extreme example.
Suppose you have something like this:
---o---o---o---x---A
---o---o---o---y---B
and think about what "rev-list --boundary --max-count=1 A B"
should return. It does not matter how branches A and B are
related in the past because we are showing only one.
Without --boundary, it is clear we will show B (time flows from
left to right). With --boundary, the current code would show B,
and show -y and -A as boundaries, and I think that is wrong.
Originally --boundary was invented for the specific purpose of
supporting thin packs. It worked on the set of commits
resulting from a limited traversal (that is, you have at least
one negative, iow UNINTERESTING, commit and one or more positive
commits), and it showed the negative commit that is a parent of
a positive commit.
There are two primary users of --boundary right now. gitk wants
to show where the partial traversal ends (although it can figure
it out itself without help from --boundary), and thin pack
generation wants to have it upfront so that it can see which
trees and blobs can be used as the bases of delta. In both
cases, the semantics desired is to show commits that are _not_
included in the usual (i.e. non --boundary) results that are
immediate parents of the commits that are included in the
result.
So with that definition, the above example should show B and
then -y as boundary, and should not even talk about A nor -x.
This may affect the git-bundle's computation of references
included in the bundle (I think the current code assumes that if
you do "git bundle --max-count=1 A B" the resulting bundle says
its set of tips consists of A and B) but if that is broken it
also needs to be fixed.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git-rev-list: --topo-order --boundary and --max-count
2007-03-05 10:39 ` Junio C Hamano
2007-03-05 12:15 ` Junio C Hamano
@ 2007-03-05 17:05 ` Linus Torvalds
2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin
2 siblings, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2007-03-05 17:05 UTC (permalink / raw)
To: Junio C Hamano
Cc: Santi Béjar, Johannes Schindelin, Git Mailing List
On Mon, 5 Mar 2007, Junio C Hamano wrote:
>
> I think the code that does --boundary when the list is limited
> with --max-count is not quite right, even without topo-order.
Yeah. Sadly, this is a really irritating bug, becuase it means that you
cannot do
gitk -50
to see some random collection of 50 recent commits.
(And yes, I've wanted to do that - I know the commit is fairly recent, so
rather than write "gitk @{1.hour.ag}..", I'd rather just be lazy and say
"gitk -100" to get a smaller slider bar and easier to find the recent
ones).
I never cared enough to fix it, but it's a mis-feature. I agree that it's
probably not entirely trivial to fix.
Linus
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH] revision walker: Fix --boundary when limited
2007-03-05 10:39 ` Junio C Hamano
2007-03-05 12:15 ` Junio C Hamano
2007-03-05 17:05 ` Linus Torvalds
@ 2007-03-05 18:55 ` Johannes Schindelin
2007-03-05 19:00 ` Johannes Schindelin
` (2 more replies)
2 siblings, 3 replies; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-05 18:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Santi Béjar, Git Mailing List
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3547 bytes --]
In case revs->limited == 1, the revision walker really reads
everything into revs->commits. The behaviour introduced in
c4025103fa does not behave correctly in that case.
It used to say: everything which is _still_ in the pipeline
must be a boundary commit.
So, in the case that revs->limited == 1, filter out all commits
which are not dangling, in effect marking only the dangling
ones as boundary commits.
This bug was noticed by Santi Béjar.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---
On Mon, 5 Mar 2007, Junio C Hamano wrote:
> "Santi Béjar" <sbejar@gmail.com> writes:
>
> > the --topo-order does not play well with --boundary and
> > --max-count.
> >
> > $ git-rev-list --boundary --max-count=50 5ced0 | wc -l
> > 56
> > $ git-rev-list --topo-order --boundary --max-count=50 5ced0 \
> > | wc -l
> > 8846
> >
> > (5ced0 is git.git's master). I think it should be 56 for both.
> > It presents this behaviour since c4025103fa, when was added
> > --boundary support for git-rev-list --max-count and --max-age.
>
> I think the code that does --boundary when the list is limited
> with --max-count is not quite right, even without topo-order.
Right.
> It needs to find out commits (be they marked as UNINTERESTING or
> not) still in the revs->commits that are _not_ reachable by any
> other commits in the list, or something like that.
>
> I suspect that would unfortunately be very expensive. Dscho,
> have better ideas?
Unfortunately not. Fortunately, it is nowhere near as expensive as
I originally thought. It just has to iterate through revs->commits
three times, which is way cheaper than sorting the commits in the
first place.
Junio, if you apply this, could you make extra sure that I did not
fsck up Santi's family name (I am running on a peculiar mixture of
UTF-8 and ISO-8859-1 terminals...).
revision.c | 31 +++++++++++++++++++++++++++----
1 files changed, 27 insertions(+), 4 deletions(-)
diff --git a/revision.c b/revision.c
index f5b8ae4..8d47fac 100644
--- a/revision.c
+++ b/revision.c
@@ -1294,6 +1294,26 @@ static struct commit *get_revision_1(struct rev_info *revs)
return NULL;
}
+static struct commit_list *get_dangling_commits(struct commit_list *list)
+{
+ struct commit_list *result = NULL, *iter, *parent;
+
+ for (iter = list; iter; iter = iter->next)
+ for (parent = iter->item->parents;
+ parent;
+ parent = parent->next)
+ parent->item->object.flags |= TMP_MARK;
+ for (iter = list; iter; iter = iter->next)
+ if (!(iter->item->object.flags & TMP_MARK))
+ commit_list_insert(iter->item, &result);
+ for (iter = list; iter; iter = iter->next)
+ for (parent = iter->item->parents;
+ parent;
+ parent = parent->next)
+ parent->item->object.flags &= ~TMP_MARK;
+ return result;
+}
+
struct commit *get_revision(struct rev_info *revs)
{
struct commit *c = NULL;
@@ -1345,12 +1365,15 @@ struct commit *get_revision(struct rev_info *revs)
break;
case 0:
if (revs->boundary) {
- struct commit_list *list = revs->commits;
- while (list) {
+ struct commit_list *list;
+ if (revs->limited) {
+ list = get_dangling_commits(revs->commits);
+ free_commit_list(revs->commits);
+ revs->commits = list;
+ }
+ for (list = revs->commits; list; list = list->next)
list->item->object.flags |=
BOUNDARY_SHOW | BOUNDARY;
- list = list->next;
- }
/* all remaining commits are boundary commits */
revs->max_count = -1;
revs->limited = 1;
--
1.5.0.3.2518.g2f72-dirty
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin
@ 2007-03-05 19:00 ` Johannes Schindelin
2007-03-05 19:39 ` Linus Torvalds
2007-03-05 21:10 ` [PATCH] revision walker: Fix --boundary when limited Junio C Hamano
2 siblings, 0 replies; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-05 19:00 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Santi Béjar, Git Mailing List
[-- Attachment #1: Type: TEXT/PLAIN, Size: 851 bytes --]
Hi,
On Mon, 5 Mar 2007, Johannes Schindelin wrote:
> On Mon, 5 Mar 2007, Junio C Hamano wrote:
>
> > "Santi Béjar" <sbejar@gmail.com> writes:
> >
> > > the --topo-order does not play well with --boundary and
> > > --max-count.
> > >
> > > $ git-rev-list --boundary --max-count=50 5ced0 | wc -l
> > > 56
> > > $ git-rev-list --topo-order --boundary --max-count=50 5ced0 \
> > > | wc -l
> > > 8846
> > >
> > > (5ced0 is git.git's master). I think it should be 56 for both.
Side note: with the patch I am replying to, the latter command returns 51.
This is correct, since the only boundary commit it shows (1db8b60b) has
(all) the other 5 commits as ancestors.
The behaviour of both is correct: in the former case, there is not enough
information to tell that one of the 6 boundary commits reaches all the
others.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin
2007-03-05 19:00 ` Johannes Schindelin
@ 2007-03-05 19:39 ` Linus Torvalds
2007-03-05 19:57 ` Linus Torvalds
2007-03-05 21:10 ` [PATCH] revision walker: Fix --boundary when limited Junio C Hamano
2 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2007-03-05 19:39 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Junio C Hamano, Santi Béjar, Git Mailing List
On Mon, 5 Mar 2007, Johannes Schindelin wrote:
>
> In case revs->limited == 1, the revision walker really reads
> everything into revs->commits. The behaviour introduced in
> c4025103fa does not behave correctly in that case.
>
> It used to say: everything which is _still_ in the pipeline
> must be a boundary commit.
I would suggest this (more invasive) patch instead.
Yours is
revision.c | 31 +++++++++++++++++++++++++++----
1 files changed, 27 insertions(+), 4 deletions(-)
and mine is
revision.c | 86 ++++++++++++++++++++++++++++++-----------------------------
1 files changed, 44 insertions(+), 42 deletions(-)
ie I have bigger changes, but on the whole this patch just adds two lines
total, and I *think* the end result is more readable.
NOTE! Our patches aren't really mutually incompatible, and they attack the
problem from two different directions. You do the separate phase (which is
also correct), and my patch instead tries to clean up the commit walking
so that the commit number limiter works more like the date limiter (which
fundamentally has all the same issues! Including the problem with some
commits possibly being marked as boundary commits when they aren't really,
because the path-limiting or revision-limiting ended up cutting things off
*differently* than the date-limiting).
So I would humbly suggest applying this one first (which makes the
handling of the walk-time commit limiter more uniform and less hacky), and
if we need to, we can *also* add the whole separate phase for the
"revs->limited" case..
Linus
---
commit d3dd7e89c123b644ef199380f4f050226e4df862
Author: Linus Torvalds <torvalds@osdl.org>
Date: Mon Mar 5 10:15:20 2007 -0800
revision list: fix BOUNDARY handling with limiters and commit counts
When we limited the number of commits using "max_count", we would not
correctly handle the combination of various time- and reachability-based
limiting and the use of a commit counting. Everything that was
reachable (but overflowed the commit count) would be marked as a
BOUNDARY commit, resulting in things like "gitk" not being usable
together with a numerical limit on the number of commits.
This largely fixes it by being more careful about how we mark commits
that went over the commit counts.
NOTE! Because the numerical limiting happens without a separate phase as
we traverse the commit list, we still won't do the boundary handling
100% correct when a commit may be reachable from multiple sources, and
under those circumstances, some commits will be marked as boundary
commits even though they strictly aren't.
To fix this, we would need to make rather more invasive changes, with
commit counting being an integral part of the limiting (whuch is
fundamnetally hard, since limiting itself will change the number of
commits!).
So this is the "good enough to be quite usable" approach. The problem
only affects boundary commits, and programs like 'gitk' that uses
boundary commits would be better off just noticing themselves that not
all boundary commits are necessarily useful.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
revision.c | 86 ++++++++++++++++++++++++++++++-----------------------------
1 files changed, 44 insertions(+), 42 deletions(-)
diff --git a/revision.c b/revision.c
index f5b8ae4..f5430d6 100644
--- a/revision.c
+++ b/revision.c
@@ -1213,6 +1213,30 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
commit->buffer, strlen(commit->buffer));
}
+enum walk_action {
+ WALK_PARENTS,
+ WALK_STOP,
+};
+
+/*
+ * When we do the list limiting at commit-walking time, we
+ * need to make sure that we stop walking parenthood when
+ * we hit a commit that isn't interesting any more. This can
+ * be due to max_count or due to date limiters.
+ */
+static enum walk_action walk_commit(struct rev_info *revs, struct commit *commit)
+{
+ if (!revs->max_count)
+ return WALK_STOP;
+
+ if (revs->max_age != -1) {
+ if (commit->date < revs->max_age)
+ return WALK_STOP;
+ }
+
+ return WALK_PARENTS;
+}
+
static struct commit *get_revision_1(struct rev_info *revs)
{
if (!revs->commits)
@@ -1233,17 +1257,19 @@ static struct commit *get_revision_1(struct rev_info *revs)
* the parents here. We also need to do the date-based limiting
* that we'd otherwise have done in limit_list().
*/
- if (!revs->limited) {
- if (revs->max_age != -1 &&
- (commit->date < revs->max_age)) {
- if (revs->boundary)
- commit->object.flags |=
- BOUNDARY_SHOW | BOUNDARY;
- else
- continue;
- } else
- add_parents_to_list(revs, commit,
- &revs->commits);
+ switch (walk_commit(revs, commit)) {
+ case WALK_PARENTS:
+ if (revs->limited)
+ break;
+ add_parents_to_list(revs, commit, &revs->commits);
+ break;
+ case WALK_STOP:
+ if (!revs->boundary)
+ continue;
+ if (!(commit->object.flags & UNINTERESTING))
+ commit->object.flags |= BOUNDARY_SHOW | BOUNDARY | UNINTERESTING;
+ mark_parents_uninteresting(commit);
+ break;
}
if (commit->object.flags & SHOWN)
continue;
@@ -1289,6 +1315,12 @@ static struct commit *get_revision_1(struct rev_info *revs)
if (revs->boundary)
mark_boundary_to_show(commit);
commit->object.flags |= SHOWN;
+ if (revs->skip_count > 0) {
+ revs->skip_count--;
+ continue;
+ }
+ if (revs->max_count > 0)
+ revs->max_count--;
return commit;
} while (revs->commits);
return NULL;
@@ -1296,9 +1328,8 @@ static struct commit *get_revision_1(struct rev_info *revs)
struct commit *get_revision(struct rev_info *revs)
{
- struct commit *c = NULL;
-
if (revs->reverse) {
+ struct commit *c;
struct commit_list *list;
/*
@@ -1332,34 +1363,5 @@ struct commit *get_revision(struct rev_info *revs)
return c;
}
- if (0 < revs->skip_count) {
- while ((c = get_revision_1(revs)) != NULL) {
- if (revs->skip_count-- <= 0)
- break;
- }
- }
-
- /* Check the max_count ... */
- switch (revs->max_count) {
- case -1:
- break;
- case 0:
- if (revs->boundary) {
- struct commit_list *list = revs->commits;
- while (list) {
- list->item->object.flags |=
- BOUNDARY_SHOW | BOUNDARY;
- list = list->next;
- }
- /* all remaining commits are boundary commits */
- revs->max_count = -1;
- revs->limited = 1;
- } else
- return NULL;
- default:
- revs->max_count--;
- }
- if (c)
- return c;
return get_revision_1(revs);
}
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 19:39 ` Linus Torvalds
@ 2007-03-05 19:57 ` Linus Torvalds
2007-03-05 21:10 ` Junio C Hamano
2007-03-05 23:17 ` Johannes Schindelin
0 siblings, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2007-03-05 19:57 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Junio C Hamano, Santi Béjar, Git Mailing List
On Mon, 5 Mar 2007, Linus Torvalds wrote:
>
> NOTE! Our patches aren't really mutually incompatible, and they attack the
> problem from two different directions. You do the separate phase (which is
> also correct), and my patch instead tries to clean up the commit walking
> so that the commit number limiter works more like the date limiter (which
> fundamentally has all the same issues! Including the problem with some
> commits possibly being marked as boundary commits when they aren't really,
> because the path-limiting or revision-limiting ended up cutting things off
> *differently* than the date-limiting).
Side note: the reason you don't *notice* it with the date-limiter is
simply that the date limiter *also* runs at limit-time, rather than just
at the incremental "run at the end" phase. So the date-limiter is much
simpler when done together with other limiters (like path and revision
limiters).
HOWEVER. We can't do the same thing for the numerical one, because we need
to run the other limiters *first*, and the numerical limiter always comes
at the end. And the path-based "dense" limiter actually runs mostly
incrementally, so you cannot do the numerical limiter until after it has
run..
The way to really clean stuff up would be to:
- first phase: limit by date and revision ranges first (both of those are
static and quick, and don't depend on anything else)
We do this already (limit_list)
- second phase: limit by pathname (we don't do this as a phase at all, we
do it incrementally: see "rewrite_parents()")
-third phase: limit by number
HOWEVER. There's a damn good reason why we do things the way we do, namely
simply the fact that we want to do pathname limiting as much at run-time
as possible.. But we *could* do the "rewrite_parents()" thing both in the
non-incremental and in the final phase. However, doing the parent
rewriting is quite nasty and error-prone, so I've been avoiding it.
Anyway, I *suspect* that Dscho's patch might do the wrong thing for
something like
gitk -20 v1.4.4.. t/
exactly because of the subtle interaction between pathname limiting,
static commit limiting *and* commit number limiting. Dscho?
Linus
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 19:57 ` Linus Torvalds
@ 2007-03-05 21:10 ` Junio C Hamano
2007-03-06 1:12 ` Johannes Schindelin
2007-03-05 23:17 ` Johannes Schindelin
1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2007-03-05 21:10 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Johannes Schindelin, Santi Béjar, Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Mon, 5 Mar 2007, Linus Torvalds wrote:
>>
>> NOTE! Our patches aren't really mutually incompatible, and they attack the
>> problem from two different directions. You do the separate phase (which is
>> also correct), and my patch instead tries to clean up the commit walking
>> so that the commit number limiter works more like the date limiter (which
>> fundamentally has all the same issues! Including the problem with some
>> commits possibly being marked as boundary commits when they aren't really,
>> because the path-limiting or revision-limiting ended up cutting things off
>> *differently* than the date-limiting).
I was working on a different approach, which is:
- rip out the boundary logic from the commit walker. Placing
"negative" commits in the revs->commits list was Ok if all we
cared about "boundary" was the UNINTERESTING limiting case,
but conceptually it was wrong.
- make get_revision_1() function to walk the commits and return
the results as if there is no funny postprocessing flags such
as --reverse, --skip nor --max-count.
- make get_revision() function the postprocessing phase:
If reverse is given, wait for get_revision_1() to give
everything that it would normally give, and then reverse it
before consuming.
If skip is given, skip that many before going further.
If max is given, stop when we gave out that many.
Now that we are about to return one positive commit, mark
the parents of that commit to be potential boundaries
before returning, iff we are doing the boundary processing.
Return the commit.
- After get_revision() finishes giving out all the positive
commits, if we are doing the boundary processing, we look at
the parents that we marked as potential boundaries earlier,
see if they are really boundaries, and give them out.
It loses more code than it adds, even when the new gc_boundary()
function, which is purely for early optimization, is counted.
Note that this patch is purely for eyeballing and discussion
only. It breaks git-bundle's verify logic because the logic
does not use BOUNDARY_SHOW flag for its internal computation
anymore. After we correct it not to attempt to affect the
boundary processing by setting the BOUNDARY_SHOW flag, we can
remove BOUNDARY_SHOW from revision.h and use that bit assignment
for the new CHILD_SHOWN flag.
---
revision.c | 208 +++++++++++++++++++++++++++---------------------------------
revision.h | 8 ++-
2 files changed, 100 insertions(+), 116 deletions(-)
diff --git a/revision.c b/revision.c
index f5b8ae4..5d137ea 100644
--- a/revision.c
+++ b/revision.c
@@ -437,36 +437,6 @@ static void limit_list(struct rev_info *revs)
continue;
p = &commit_list_insert(commit, p)->next;
}
- if (revs->boundary) {
- /* mark the ones that are on the result list first */
- for (list = newlist; list; list = list->next) {
- struct commit *commit = list->item;
- commit->object.flags |= TMP_MARK;
- }
- for (list = newlist; list; list = list->next) {
- struct commit *commit = list->item;
- struct object *obj = &commit->object;
- struct commit_list *parent;
- if (obj->flags & UNINTERESTING)
- continue;
- for (parent = commit->parents;
- parent;
- parent = parent->next) {
- struct commit *pcommit = parent->item;
- if (!(pcommit->object.flags & UNINTERESTING))
- continue;
- pcommit->object.flags |= BOUNDARY;
- if (pcommit->object.flags & TMP_MARK)
- continue;
- pcommit->object.flags |= TMP_MARK;
- p = &commit_list_insert(pcommit, p)->next;
- }
- }
- for (list = newlist; list; list = list->next) {
- struct commit *commit = list->item;
- commit->object.flags &= ~TMP_MARK;
- }
- }
revs->commits = newlist;
}
@@ -1193,17 +1163,6 @@ static void rewrite_parents(struct rev_info *revs, struct commit *commit)
}
}
-static void mark_boundary_to_show(struct commit *commit)
-{
- struct commit_list *p = commit->parents;
- while (p) {
- commit = p->item;
- p = p->next;
- if (commit->object.flags & BOUNDARY)
- commit->object.flags |= BOUNDARY_SHOW;
- }
-}
-
static int commit_match(struct commit *commit, struct rev_info *opt)
{
if (!opt->grep_filter)
@@ -1235,15 +1194,9 @@ static struct commit *get_revision_1(struct rev_info *revs)
*/
if (!revs->limited) {
if (revs->max_age != -1 &&
- (commit->date < revs->max_age)) {
- if (revs->boundary)
- commit->object.flags |=
- BOUNDARY_SHOW | BOUNDARY;
- else
- continue;
- } else
- add_parents_to_list(revs, commit,
- &revs->commits);
+ (commit->date < revs->max_age))
+ continue;
+ add_parents_to_list(revs, commit, &revs->commits);
}
if (commit->object.flags & SHOWN)
continue;
@@ -1252,18 +1205,6 @@ static struct commit *get_revision_1(struct rev_info *revs)
revs->ignore_packed))
continue;
- /* We want to show boundary commits only when their
- * children are shown. When path-limiter is in effect,
- * rewrite_parents() drops some commits from getting shown,
- * and there is no point showing boundary parents that
- * are not shown. After rewrite_parents() rewrites the
- * parents of a commit that is shown, we mark the boundary
- * parents with BOUNDARY_SHOW.
- */
- if (commit->object.flags & BOUNDARY_SHOW) {
- commit->object.flags |= SHOWN;
- return commit;
- }
if (commit->object.flags & UNINTERESTING)
continue;
if (revs->min_age != -1 && (commit->date > revs->min_age))
@@ -1286,80 +1227,119 @@ static struct commit *get_revision_1(struct rev_info *revs)
if (revs->parents)
rewrite_parents(revs, commit);
}
- if (revs->boundary)
- mark_boundary_to_show(commit);
commit->object.flags |= SHOWN;
return commit;
} while (revs->commits);
return NULL;
}
+static void gc_boundary(struct object_array *array)
+{
+ unsigned nr = array->nr;
+ unsigned alloc = array->alloc;
+ struct object_array_entry *objects = array->objects;
+
+ if (alloc <= nr) {
+ unsigned i, j;
+ for (i = j = 0; i < nr; i++) {
+ if (objects[i].item->flags & SHOWN)
+ continue;
+ if (i != j)
+ objects[j] = objects[i];
+ j++;
+ }
+ for (i = j; j < nr; j++)
+ objects[i].item = NULL;
+ array->nr = j;
+ }
+}
+
struct commit *get_revision(struct rev_info *revs)
{
struct commit *c = NULL;
-
- if (revs->reverse) {
- struct commit_list *list;
-
- /*
- * rev_info.reverse is used to note the fact that we
- * want to output the list of revisions in reverse
- * order. To accomplish this goal, reverse can have
- * different values:
- *
- * 0 do nothing
- * 1 reverse the list
- * 2 internal use: we have already obtained and
- * reversed the list, now we only need to yield
- * its items.
- */
-
- if (revs->reverse == 1) {
- revs->reverse = 0;
- list = NULL;
- while ((c = get_revision(revs)))
- commit_list_insert(c, &list);
- revs->commits = list;
- revs->reverse = 2;
+ struct commit_list *l;
+
+ if (revs->boundary == 2) {
+ unsigned i;
+ struct object_array *array = &revs->boundary_commits;
+ struct object_array_entry *objects = array->objects;
+ for (i = 0; i < array->nr; i++) {
+ c = (struct commit *)(objects[i].item);
+ if (!c)
+ continue;
+ if (!(c->object.flags & CHILD_SHOWN))
+ continue;
+ if (!(c->object.flags & SHOWN))
+ break;
}
-
- if (!revs->commits)
+ if (array->nr <= i)
return NULL;
- c = revs->commits->item;
- list = revs->commits->next;
- free(revs->commits);
- revs->commits = list;
+
+ c->object.flags |= SHOWN | BOUNDARY;
return c;
}
- if (0 < revs->skip_count) {
- while ((c = get_revision_1(revs)) != NULL) {
- if (revs->skip_count-- <= 0)
- break;
- }
+ if (revs->reverse) {
+ l = NULL;
+ while ((c = get_revision_1(revs)))
+ commit_list_insert(c, &l);
+ revs->commits = l;
+ revs->reverse = 0;
}
- /* Check the max_count ... */
+ /*
+ * Now pick up what they want to give us
+ */
+ c = get_revision_1(revs);
+ while (0 < revs->skip_count) {
+ revs->skip_count--;
+ c = get_revision_1(revs);
+ if (!c)
+ break;
+ }
+
+ /*
+ * Check the max_count.
+ */
switch (revs->max_count) {
case -1:
break;
case 0:
- if (revs->boundary) {
- struct commit_list *list = revs->commits;
- while (list) {
- list->item->object.flags |=
- BOUNDARY_SHOW | BOUNDARY;
- list = list->next;
- }
- /* all remaining commits are boundary commits */
- revs->max_count = -1;
- revs->limited = 1;
- } else
- return NULL;
+ c = NULL;
+ break;
default:
revs->max_count--;
}
- if (c)
+
+ if (!revs->boundary)
return c;
- return get_revision_1(revs);
+
+ if (!c) {
+ /*
+ * get_revision_1() runs out the commits, and
+ * we are done computing the boundaries.
+ * switch to boundary commits output mode.
+ */
+ revs->boundary = 2;
+ return get_revision(revs);
+ }
+
+ /*
+ * boundary commits are the commits that are parents of the
+ * ones we got from get_revision_1() but they themselves are
+ * not returned from get_revision_1(). Before returning
+ * 'c', we need to mark its parents that they could be boundaries.
+ */
+
+ for (l = c->parents; l; l = l->next) {
+ struct object *p;
+ p = &(l->item->object);
+ if (p->flags & CHILD_SHOWN)
+ continue;
+ p->flags |= CHILD_SHOWN;
+ gc_boundary(&revs->boundary_commits);
+ add_object_array(p, NULL, &revs->boundary_commits);
+ }
+
+ return c;
}
diff --git a/revision.h b/revision.h
index 5fec184..6579a44 100644
--- a/revision.h
+++ b/revision.h
@@ -10,6 +10,7 @@
#define BOUNDARY_SHOW (1u<<6)
#define ADDED (1u<<7) /* Parents already parsed and added? */
#define SYMMETRIC_LEFT (1u<<8)
+#define CHILD_SHOWN (1u<<9)
struct rev_info;
struct log_info;
@@ -21,6 +22,9 @@ struct rev_info {
struct commit_list *commits;
struct object_array pending;
+ /* Parents of shown commits */
+ struct object_array boundary_commits;
+
/* Basic information */
const char *prefix;
void *prune_data;
@@ -40,10 +44,10 @@ struct rev_info {
edge_hint:1,
limited:1,
unpacked:1, /* see also ignore_packed below */
- boundary:1,
+ boundary:2,
left_right:1,
parents:1,
- reverse:2;
+ reverse:1;
/* Diff flags */
unsigned int diff:1,
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin
2007-03-05 19:00 ` Johannes Schindelin
2007-03-05 19:39 ` Linus Torvalds
@ 2007-03-05 21:10 ` Junio C Hamano
2 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-05 21:10 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Santi Béjar, Git Mailing List
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> In case revs->limited == 1, the revision walker really reads
> everything into revs->commits. The behaviour introduced in
> c4025103fa does not behave correctly in that case.
>
> It used to say: everything which is _still_ in the pipeline
> must be a boundary commit.
>
> So, in the case that revs->limited == 1, filter out all commits
> which are not dangling, in effect marking only the dangling
> ones as boundary commits.
That's what I suggested initially but I think that is wrong. We
should show only unshown immediate parents of shown commits at
that stage.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 19:57 ` Linus Torvalds
2007-03-05 21:10 ` Junio C Hamano
@ 2007-03-05 23:17 ` Johannes Schindelin
2007-03-06 0:36 ` Junio C Hamano
1 sibling, 1 reply; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-05 23:17 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, Santi Béjar, Git Mailing List
Hi,
On Mon, 5 Mar 2007, Linus Torvalds wrote:
> Anyway, I *suspect* that Dscho's patch might do the wrong thing for
> something like
>
> gitk -20 v1.4.4.. t/
>
> exactly because of the subtle interaction between pathname limiting,
> static commit limiting *and* commit number limiting. Dscho?
Correct. I'll have a look at Junio's patch shortly, which hopefully makes
me forget my shortsighted patch.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 23:17 ` Johannes Schindelin
@ 2007-03-06 0:36 ` Junio C Hamano
2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 0:36 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> On Mon, 5 Mar 2007, Linus Torvalds wrote:
>
>> Anyway, I *suspect* that Dscho's patch might do the wrong thing for
>> something like
>>
>> gitk -20 v1.4.4.. t/
>>
>> exactly because of the subtle interaction between pathname limiting,
>> static commit limiting *and* commit number limiting. Dscho?
>
> Correct. I'll have a look at Junio's patch shortly, which hopefully makes
> me forget my shortsighted patch.
I have a handful updates, including a few 'git-bundle' patches.
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW
2007-03-06 0:36 ` Junio C Hamano
@ 2007-03-06 0:41 ` Junio C Hamano
2007-03-06 2:05 ` Johannes Schindelin
2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano
2007-03-06 0:41 ` [PATCH 4/4] git-bundle: --list-prereqs Junio C Hamano
2 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 0:41 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
This removes the flag internally used by revision traversal to
decide which commits are indeed boundaries and renames it to
CHILD_SHOWN. builtin-bundle uses the symbol for its
verification, but I think the logic it uses it is wrong. The
flag is still useful but it is local to the git-bundle, so it is
renamed to PREREQ_MARK.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
* The earlier one that competed with Johannes's and Linus's is
the [1/4] of this series.
builtin-bundle.c | 6 ++++--
revision.c | 9 ++++++++-
revision.h | 3 +--
3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/builtin-bundle.c b/builtin-bundle.c
index d41a413..0265845 100644
--- a/builtin-bundle.c
+++ b/builtin-bundle.c
@@ -160,6 +160,8 @@ static int fork_with_pipe(const char **argv, int *in, int *out)
return pid;
}
+#define PREREQ_MARK (1u<<16)
+
static int verify_bundle(struct bundle_header *header)
{
/*
@@ -179,7 +181,7 @@ static int verify_bundle(struct bundle_header *header)
struct ref_list_entry *e = p->list + i;
struct object *o = parse_object(e->sha1);
if (o) {
- o->flags |= BOUNDARY_SHOW;
+ o->flags |= PREREQ_MARK;
add_pending_object(&revs, o, e->name);
continue;
}
@@ -202,7 +204,7 @@ static int verify_bundle(struct bundle_header *header)
i = req_nr;
while (i && (commit = get_revision(&revs)))
- if (commit->object.flags & BOUNDARY_SHOW)
+ if (commit->object.flags & PREREQ_MARK)
i--;
for (i = 0; i < req_nr; i++)
diff --git a/revision.c b/revision.c
index 5d137ea..2d27ccf 100644
--- a/revision.c
+++ b/revision.c
@@ -1285,17 +1285,21 @@ struct commit *get_revision(struct rev_info *revs)
commit_list_insert(c, &l);
revs->commits = l;
revs->reverse = 0;
+ c = NULL;
}
/*
* Now pick up what they want to give us
*/
- c = get_revision_1(revs);
+ if (!(c = get_revision_1(revs)))
+ return NULL;
while (0 < revs->skip_count) {
revs->skip_count--;
c = get_revision_1(revs);
if (!c)
break;
+ /* Although we grabbed it, it is not shown. */
+ c->object.flags &= ~SHOWN;
}
/*
@@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs)
case -1:
break;
case 0:
+ /* Although we grabbed it, it is not shown. */
+ if (c)
+ c->object.flags &= ~SHOWN;
c = NULL;
break;
default:
diff --git a/revision.h b/revision.h
index 6579a44..1885f8d 100644
--- a/revision.h
+++ b/revision.h
@@ -7,10 +7,9 @@
#define SHOWN (1u<<3)
#define TMP_MARK (1u<<4) /* for isolated cases; clean after use */
#define BOUNDARY (1u<<5)
-#define BOUNDARY_SHOW (1u<<6)
+#define CHILD_SHOWN (1u<<6)
#define ADDED (1u<<7) /* Parents already parsed and added? */
#define SYMMETRIC_LEFT (1u<<8)
-#define CHILD_SHOWN (1u<<9)
struct rev_info;
struct log_info;
--
1.5.0.3.862.g71037
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 3/4] git-bundle: various fixups
2007-03-06 0:36 ` Junio C Hamano
2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano
@ 2007-03-06 0:41 ` Junio C Hamano
2007-03-06 2:13 ` Johannes Schindelin
2007-03-06 0:41 ` [PATCH 4/4] git-bundle: --list-prereqs Junio C Hamano
2 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 0:41 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
verify_bundle() returned with an error early only when all
prerequisite commits were missing. It should error out much
earlier when some are missing.
When the rev-list is limited in ways other than revision range
(e.g. --max-count or --max-age), create_bundle() listed all
positive refs given from the command line as if they are
available, but resulting pack may not have some of them. Add a
logic to make sure all of them are included, and error out
otherwise.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
| 46 +++++++++++++++++++++++++++++++++-------------
1 files changed, 33 insertions(+), 13 deletions(-)
--git a/builtin-bundle.c b/builtin-bundle.c
index 0265845..9286f3d 100644
--- a/builtin-bundle.c
+++ b/builtin-bundle.c
@@ -189,7 +189,7 @@ static int verify_bundle(struct bundle_header *header)
error(message);
error("%s %s", sha1_to_hex(e->sha1), e->name);
}
- if (revs.pending.nr == 0)
+ if (revs.pending.nr != p->nr)
return ret;
req_nr = revs.pending.nr;
setup_revisions(2, argv, &revs, NULL);
@@ -274,6 +274,7 @@ static int create_bundle(struct bundle_header *header, const char *path,
int pid, in, out, i, status;
char buffer[1024];
struct rev_info revs;
+ struct object_array tips;
bundle_fd = (!strcmp(path, "-") ? 1 :
open(path, O_CREAT | O_WRONLY, 0666));
@@ -311,19 +312,23 @@ static int create_bundle(struct bundle_header *header, const char *path,
argc = setup_revisions(argc, argv, &revs, NULL);
if (argc > 1)
return error("unrecognized argument: %s'", argv[1]);
+
+ memset(&tips, 0, sizeof(tips));
for (i = 0; i < revs.pending.nr; i++) {
struct object_array_entry *e = revs.pending.objects + i;
- if (!(e->item->flags & UNINTERESTING)) {
- unsigned char sha1[20];
- char *ref;
- if (dwim_ref(e->name, strlen(e->name), sha1, &ref) != 1)
- continue;
- write_or_die(bundle_fd, sha1_to_hex(e->item->sha1), 40);
- write_or_die(bundle_fd, " ", 1);
- write_or_die(bundle_fd, ref, strlen(ref));
- write_or_die(bundle_fd, "\n", 1);
- free(ref);
- }
+ unsigned char sha1[20];
+ char *ref;
+
+ if (e->item->flags & UNINTERESTING)
+ continue;
+ if (dwim_ref(e->name, strlen(e->name), sha1, &ref) != 1)
+ continue;
+ write_or_die(bundle_fd, sha1_to_hex(e->item->sha1), 40);
+ write_or_die(bundle_fd, " ", 1);
+ write_or_die(bundle_fd, ref, strlen(ref));
+ write_or_die(bundle_fd, "\n", 1);
+ add_object_array(e->item, e->name, &tips);
+ free(ref);
}
/* end header */
@@ -350,7 +355,22 @@ static int create_bundle(struct bundle_header *header, const char *path,
return -1;
if (!WIFEXITED(status) || WEXITSTATUS(status))
return error ("pack-objects died");
- return 0;
+
+ /*
+ * Make sure the refs we wrote out is correct; --max-count and
+ * other limiting options could have prevented all the tips
+ * from getting output.
+ */
+ status = 0;
+ for (i = 0; i < tips.nr; i++) {
+ if (!(tips.objects[i].item->flags & SHOWN)) {
+ status = 1;
+ error("%s: not included in the resulting pack",
+ tips.objects[i].name);
+ }
+ }
+
+ return status;
}
static int unbundle(struct bundle_header *header, int bundle_fd,
--
1.5.0.3.862.g71037
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 4/4] git-bundle: --list-prereqs
2007-03-06 0:36 ` Junio C Hamano
2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano
2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano
@ 2007-03-06 0:41 ` Junio C Hamano
2 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 0:41 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
This is primarily of interest while debugging the bundle creation,
but there may be other uses.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
* I am not sure if this should be a separate command, but the
current command set did not have a way to review prereqs,
short of:
sed -ne '/^PACK/q' -e 'p' bundle.bdl
It would make sense to have 'git bundle verify' output the
heads and prereqs, maybe with --verbose, but I think doing
that always might be even more helpful.
builtin-bundle.c | 18 +++++++++++++++++-
1 files changed, 17 insertions(+), 1 deletions(-)
diff --git a/builtin-bundle.c b/builtin-bundle.c
index 9286f3d..4fe74a7 100644
--- a/builtin-bundle.c
+++ b/builtin-bundle.c
@@ -241,6 +241,19 @@ static int list_heads(struct bundle_header *header, int argc, const char **argv)
return 0;
}
+static int list_prereqs(struct bundle_header *header, int argc, const char **argv)
+{
+ int i;
+
+ struct ref_list *p = &header->prerequisites;
+ for (i = 0; i < p->nr; i++) {
+ printf("%s %s\n",
+ sha1_to_hex(p->list[i].sha1),
+ p->list[i].name);
+ }
+ return 0;
+}
+
static void show_commit(struct commit *commit)
{
write_or_die(1, sha1_to_hex(commit->object.sha1), 40);
@@ -432,6 +445,10 @@ int cmd_bundle(int argc, const char **argv, const char *prefix)
close(bundle_fd);
return !!list_heads(&header, argc, argv);
}
+ if (!strcmp(cmd, "list-prereqs")) {
+ close(bundle_fd);
+ return !!list_prereqs(&header, argc, argv);
+ }
if (!strcmp(cmd, "create")) {
if (nongit)
die("Need a repository to create a bundle.");
@@ -443,4 +460,3 @@ int cmd_bundle(int argc, const char **argv, const char *prefix)
} else
usage(bundle_usage);
}
-
--
1.5.0.3.862.g71037
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-05 21:10 ` Junio C Hamano
@ 2007-03-06 1:12 ` Johannes Schindelin
2007-03-06 1:32 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-06 1:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
Hi,
On Mon, 5 Mar 2007, Junio C Hamano wrote:
> - rip out the boundary logic from the commit walker. Placing
> "negative" commits in the revs->commits list was Ok if all we
> cared about "boundary" was the UNINTERESTING limiting case,
> but conceptually it was wrong.
I agree.
> It loses more code than it adds, even when the new gc_boundary()
> function, which is purely for early optimization, is counted.
I have the slight suspicion that it might be faster if gc_boundary() was
not called at all... But that's probably a subject for later.
> Note that this patch is purely for eyeballing and discussion
> only. It breaks git-bundle's verify logic because the logic
> does not use BOUNDARY_SHOW flag for its internal computation
> anymore. After we correct it not to attempt to affect the
> boundary processing by setting the BOUNDARY_SHOW flag, we can
> remove BOUNDARY_SHOW from revision.h and use that bit assignment
> for the new CHILD_SHOWN flag.
Yes, git-bundle should be easy enough to fix.
> + struct commit_list *l;
> +
> + if (revs->boundary == 2) {
> + unsigned i;
> + struct object_array *array = &revs->boundary_commits;
> + struct object_array_entry *objects = array->objects;
> + for (i = 0; i < array->nr; i++) {
An easy optimisation would be iterate in the other direction, resetting
array->nr after the loop. Of course, this does not preserve the order...
> + if (revs->reverse) {
> + l = NULL;
> + while ((c = get_revision_1(revs)))
> + commit_list_insert(c, &l);
> + revs->commits = l;
> + revs->reverse = 0;
> }
Clever!
>
> - /* Check the max_count ... */
> + /*
> + * Now pick up what they want to give us
> + */
> + c = get_revision_1(revs);
> + while (0 < revs->skip_count) {
> + revs->skip_count--;
> + c = get_revision_1(revs);
> + if (!c)
> + break;
> + }
> +
> + /*
> + * Check the max_count.
> + */
> switch (revs->max_count) {
> case -1:
> break;
> case 0:
> - if (revs->boundary) {
> - struct commit_list *list = revs->commits;
> - while (list) {
> - list->item->object.flags |=
> - BOUNDARY_SHOW | BOUNDARY;
> - list = list->next;
> - }
> - /* all remaining commits are boundary commits */
> - revs->max_count = -1;
> - revs->limited = 1;
> - } else
> - return NULL;
> + c = NULL;
> + break;
I guess that you want to do this instead:
case 0:
c = NULL;
/* fall through */
> default:
> revs->max_count--;
so that --reverse actually works (otherwise, the while() loop would
deplete max_count, and then max_count would be 0, and nothing would be
returned).
> + /*
> + * boundary commits are the commits that are parents of the
> + * ones we got from get_revision_1() but they themselves are
> + * not returned from get_revision_1(). Before returning
> + * 'c', we need to mark its parents that they could be boundaries.
> + */
> +
> + for (l = c->parents; l; l = l->next) {
AFAICT this changes behaviour: c->parents were possibly rewritten. But I
guess it makes sense showing the rewritten parents as boundary commits,
not the real parents.
> + struct object *p;
> + p = &(l->item->object);
> + if (p->flags & CHILD_SHOWN)
... or
if (p->flags & (CHILD_SHOWN | SHOWN))
?
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-06 1:12 ` Johannes Schindelin
@ 2007-03-06 1:32 ` Junio C Hamano
2007-03-06 1:44 ` Johannes Schindelin
0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 1:32 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> + if (revs->reverse) {
>> + l = NULL;
>> + while ((c = get_revision_1(revs)))
>> + commit_list_insert(c, &l);
>> + revs->commits = l;
>> + revs->reverse = 0;
>> }
>
> Clever!
It is not clever, but just is stupid and WRONG. It just shows
how little I care about --reverse ;-).
revision_1() is to get the full list without non limit limiters,
so the above loop would not even deplete the max_count but
literally grabs everything.
Arguably, you can do "rev-list --reverse -2 master" to get the
first two commits with this, but it was not intended. We would
need to grab (skip + max_count) from get_revision_1() and set
max_count to -1 before leaving the if () {} here, or something
like that.
> I guess that you want to do this instead:
>
> case 0:
> c = NULL;
> /* fall through */
>> default:
>> revs->max_count--;
>
> ...
Actually that is not correct either. The other patch fixes it
(but not the unintended semantic change to --reverse).
>> + for (l = c->parents; l; l = l->next) {
>
> AFAICT this changes behaviour: c->parents were possibly rewritten.
Well, the behaviour of max with boundary in 'master' did the
same thing, as what was in revs->commits are rewritten parents
of commits we already returned, didn't it?
> But I
> guess it makes sense showing the rewritten parents as boundary commits,
> not the real parents.
>
>> + struct object *p;
>> + p = &(l->item->object);
>> + if (p->flags & CHILD_SHOWN)
>
> ... or
> if (p->flags & (CHILD_SHOWN | SHOWN))
"If itself is shown it cannot be a boundary, and if we know we
marked it as potential boundary we do not have to mark it
again"... I think you are right.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-06 1:32 ` Junio C Hamano
@ 2007-03-06 1:44 ` Johannes Schindelin
2007-03-06 1:58 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-06 1:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
Hi,
On Mon, 5 Mar 2007, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> + if (revs->reverse) {
> >> + l = NULL;
> >> + while ((c = get_revision_1(revs)))
> >> + commit_list_insert(c, &l);
> >> + revs->commits = l;
> >> + revs->reverse = 0;
> >> }
> >
> > Clever!
>
> It is not clever, but just is stupid and WRONG. It just shows
> how little I care about --reverse ;-).
>
> revision_1() is to get the full list without non limit limiters,
> so the above loop would not even deplete the max_count but
> literally grabs everything.
Oops. I missed the _1(), _and_ the missing "revs->reverse = 0"...
> >> + for (l = c->parents; l; l = l->next) {
> >
> > AFAICT this changes behaviour: c->parents were possibly rewritten.
>
> Well, the behaviour of max with boundary in 'master' did the same thing,
> as what was in revs->commits are rewritten parents of commits we already
> returned, didn't it?
I missed that, too. Maybe I should get more familiar with the revision
walker first, before continuing to ask for ridicule.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited
2007-03-06 1:44 ` Johannes Schindelin
@ 2007-03-06 1:58 ` Junio C Hamano
0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 1:58 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> I missed that, too. Maybe I should get more familiar with the revision
> walker first,...
No. You are asking the right questions and helped finding a few
bugs already.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW
2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano
@ 2007-03-06 2:05 ` Johannes Schindelin
2007-03-06 2:17 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-06 2:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
Hi,
On Mon, 5 Mar 2007, Junio C Hamano wrote:
> This removes the flag internally used by revision traversal to
> decide which commits are indeed boundaries and renames it to
> CHILD_SHOWN. builtin-bundle uses the symbol for its
> verification, but I think the logic it uses it is wrong. The
> flag is still useful but it is local to the git-bundle, so it is
> renamed to PREREQ_MARK.
The idea was to bail out of revision walking when that prerequisite was
found, _even_ if it happened to be uninteresting.
Just marking the parents uninteresting unless they are prerequisites would
be better, probably.
>
> i = req_nr;
> while (i && (commit = get_revision(&revs)))
> - if (commit->object.flags & BOUNDARY_SHOW)
> + if (commit->object.flags & PREREQ_MARK)
> i--;
The above-mentioned idea, then would be: instead of just i--; do
{
struct commit_list *parent = commit->parents;
i--;
for (; parent; parent = parent->next)
if (!(parent->item->object.flag &
PREREQ_MARK))
parent->item->object.flag |=
UNINTERESTING;
}
This should help performance, as not all reachable commits are traversed
any more.
>
> for (i = 0; i < req_nr; i++)
> diff --git a/revision.c b/revision.c
> index 5d137ea..2d27ccf 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1285,17 +1285,21 @@ struct commit *get_revision(struct rev_info *revs)
> commit_list_insert(c, &l);
> revs->commits = l;
> revs->reverse = 0;
> + c = NULL;
Why? It gets assigned here anyway:
> }
>
> /*
> * Now pick up what they want to give us
> */
> - c = get_revision_1(revs);
> + if (!(c = get_revision_1(revs)))
> + return NULL;
> while (0 < revs->skip_count) {
> revs->skip_count--;
> c = get_revision_1(revs);
> if (!c)
> break;
> + /* Although we grabbed it, it is not shown. */
> + c->object.flags &= ~SHOWN;
Wasn't --skip meant for something like gitweb, where you just want to skip
the first <n> commits from the list, but do not want to change the list
otherwise?
> @@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs)
> case -1:
> break;
> case 0:
> + /* Although we grabbed it, it is not shown. */
> + if (c)
> + c->object.flags &= ~SHOWN;
> c = NULL;
Is this really relevant in practice?
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/4] git-bundle: various fixups
2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano
@ 2007-03-06 2:13 ` Johannes Schindelin
0 siblings, 0 replies; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-06 2:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
Hi,
On Mon, 5 Mar 2007, Junio C Hamano wrote:
> verify_bundle() returned with an error early only when all
> prerequisite commits were missing. It should error out much
> earlier when some are missing.
The idea was to list _all_ missing prerequisites so that git-bundle will
not complain _again_, about different missing prerequisites, when the
first set of missing prerequisites was obtained.
The check only prevented looking for _reachable_ commits, when none of the
prerequisites were found in the object database to begin with.
> When the rev-list is limited in ways other than revision range (e.g.
> --max-count or --max-age), create_bundle() listed all positive refs
> given from the command line as if they are available, but resulting pack
> may not have some of them. Add a logic to make sure all of them are
> included, and error out otherwise.
Good catch!
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW
2007-03-06 2:05 ` Johannes Schindelin
@ 2007-03-06 2:17 ` Junio C Hamano
2007-03-06 2:23 ` SHOWN means shown Junio C Hamano
2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin
0 siblings, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 2:17 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> On Mon, 5 Mar 2007, Junio C Hamano wrote:
>
>> This removes the flag internally used by revision traversal to
>> decide which commits are indeed boundaries and renames it to
>> CHILD_SHOWN. builtin-bundle uses the symbol for its
>> verification, but I think the logic it uses it is wrong. The
>> flag is still useful but it is local to the git-bundle, so it is
>> renamed to PREREQ_MARK.
>
> The idea was to bail out of revision walking when that prerequisite was
> found, _even_ if it happened to be uninteresting.
>
> Just marking the parents uninteresting unless they are prerequisites would
> be better, probably.
>
>>
>> i = req_nr;
>> while (i && (commit = get_revision(&revs)))
>> - if (commit->object.flags & BOUNDARY_SHOW)
>> + if (commit->object.flags & PREREQ_MARK)
>> i--;
>
> The above-mentioned idea, then would be: instead of just i--; do
I explicitly wanted to avoid that. If the generated bundle had
prereqs that are not independent of each other, doing that would
hide some prereqs.
> This should help performance, as not all reachable commits are traversed
> any more.
You prevented that with "while (i &&" part already, didn't you?
> Wasn't --skip meant for something like gitweb, where you just want to skip
> the first <n> commits from the list, but do not want to change the list
> otherwise?
Yes, but does the patch change that?
>
>> @@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs)
>> case -1:
>> break;
>> case 0:
>> + /* Although we grabbed it, it is not shown. */
>> + if (c)
>> + c->object.flags &= ~SHOWN;
>> c = NULL;
>
> Is this really relevant in practice?
Absolutely. But I have further updates on this series.
^ permalink raw reply [flat|nested] 26+ messages in thread
* SHOWN means shown
2007-03-06 2:17 ` Junio C Hamano
@ 2007-03-06 2:23 ` Junio C Hamano
2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin
1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 2:23 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
This moves the code to set SHOWN on the commit from get_revision_1()
back to get_revision(), so that the bit means what it originally
meant: this commit has been given back to the caller.
Also it fixes the --reverse breakage Dscho pointed out.
---
Junio C Hamano <junkio@cox.net> writes:
>> Is this really relevant in practice?
>
> Absolutely. But I have further updates on this series.
... otherwise, "git-bundle create -3 A B" when B is A~3 would
mistakenly say "Ah, B is SHOWN so include it".
revision.c | 26 ++++++++++++++++++--------
1 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/revision.c b/revision.c
index 2d27ccf..35a1711 100644
--- a/revision.c
+++ b/revision.c
@@ -1227,7 +1227,6 @@ static struct commit *get_revision_1(struct rev_info *revs)
if (revs->parents)
rewrite_parents(revs, commit);
}
- commit->object.flags |= SHOWN;
return commit;
} while (revs->commits);
return NULL;
@@ -1280,11 +1279,22 @@ struct commit *get_revision(struct rev_info *revs)
}
if (revs->reverse) {
+ int limit = -1;
+
+ if (0 <= revs->max_count) {
+ limit = revs->max_count;
+ if (0 < revs->skip_count)
+ limit += revs->skip_count;
+ }
l = NULL;
- while ((c = get_revision_1(revs)))
+ while ((c = get_revision_1(revs))) {
commit_list_insert(c, &l);
+ if ((0 < limit) && !--limit)
+ break;
+ }
revs->commits = l;
revs->reverse = 0;
+ revs->max_count = -1;
c = NULL;
}
@@ -1298,8 +1308,6 @@ struct commit *get_revision(struct rev_info *revs)
c = get_revision_1(revs);
if (!c)
break;
- /* Although we grabbed it, it is not shown. */
- c->object.flags &= ~SHOWN;
}
/*
@@ -1310,16 +1318,18 @@ struct commit *get_revision(struct rev_info *revs)
break;
case 0:
/* Although we grabbed it, it is not shown. */
- if (c)
- c->object.flags &= ~SHOWN;
c = NULL;
break;
default:
revs->max_count--;
}
- if (!revs->boundary)
+ if (c)
+ c->object.flags |= SHOWN;
+
+ if (!revs->boundary) {
return c;
+ }
if (!c) {
/*
@@ -1341,7 +1351,7 @@ struct commit *get_revision(struct rev_info *revs)
for (l = c->parents; l; l = l->next) {
struct object *p;
p = &(l->item->object);
- if (p->flags & CHILD_SHOWN)
+ if (p->flags & (CHILD_SHOWN | SHOWN))
continue;
p->flags |= CHILD_SHOWN;
gc_boundary(&revs->boundary_commits);
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW
2007-03-06 2:17 ` Junio C Hamano
2007-03-06 2:23 ` SHOWN means shown Junio C Hamano
@ 2007-03-06 2:29 ` Johannes Schindelin
2007-03-06 11:34 ` Junio C Hamano
1 sibling, 1 reply; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-06 2:29 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
Hi,
On Mon, 5 Mar 2007, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > Just marking the parents uninteresting unless they are prerequisites
> > would be better, probably.
> >
> >>
> >> i = req_nr;
> >> while (i && (commit = get_revision(&revs)))
> >> - if (commit->object.flags & BOUNDARY_SHOW)
> >> + if (commit->object.flags & PREREQ_MARK)
> >> i--;
> >
> > The above-mentioned idea, then would be: instead of just i--; do
>
> I explicitly wanted to avoid that. If the generated bundle had
> prereqs that are not independent of each other, doing that would
> hide some prereqs.
Yes. I wanted to say SHOWN instead of UNINTERESTING, but reading the
source again, it seems that add_parents_to_list() is called in
get_revision_1() even if the commit is marked SHOWN.
So forget about that idea of mine.
> > This should help performance, as not all reachable commits are traversed
> > any more.
>
> You prevented that with "while (i &&" part already, didn't you?
Well, yes. I also wanted to prevent going down all paths, though.
> > Wasn't --skip meant for something like gitweb, where you just want to
> > skip the first <n> commits from the list, but do not want to change
> > the list otherwise?
>
> Yes, but does the patch change that?
I _think_ that yes, it could. If the timestamp of HEAD is older than of
HEAD^, and you say "git log HEAD^ HEAD", then HEAD^ will be actually shown
first, right? If you then say "--skip 1", HEAD will be shown first, and
then HEAD^...
> >> @@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs)
> >> case -1:
> >> break;
> >> case 0:
> >> + /* Although we grabbed it, it is not shown. */
> >> + if (c)
> >> + c->object.flags &= ~SHOWN;
> >> c = NULL;
> >
> > Is this really relevant in practice?
>
> Absolutely. But I have further updates on this series.
Sorry for interrupting your patch spree :-)
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW
2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin
@ 2007-03-06 11:34 ` Junio C Hamano
2007-03-06 15:52 ` Johannes Schindelin
0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2007-03-06 11:34 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> > This should help performance, as not all reachable commits are traversed
>> > any more.
>>
>> You prevented that with "while (i &&" part already, didn't you?
>
> Well, yes. I also wanted to prevent going down all paths, though.
If we wanted to bundle "-8 A B", I think we would make 'x' and
'y' prereqs, as they are the direct parents commits that are
shown, and that themselves are not shown.
.---*---*---*---*---* A
/
---x---y---*---*---* B
If we say upon hitting prereq (x and y) we stop traversal by
marking the parent UNINTERESTING, I suspected that we may not
find out 'x' with get_revision() loop, and that was why I chose
not to. Instead the loop stops by finding y and then x (and by
saying "ok I needed to find two and now I have two".
But this probably needs more optimization.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW
2007-03-06 11:34 ` Junio C Hamano
@ 2007-03-06 15:52 ` Johannes Schindelin
0 siblings, 0 replies; 26+ messages in thread
From: Johannes Schindelin @ 2007-03-06 15:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List
Hi,
On Tue, 6 Mar 2007, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> > This should help performance, as not all reachable commits are
> >> > traversed any more.
> >>
> >> You prevented that with "while (i &&" part already, didn't you?
> >
> > Well, yes. I also wanted to prevent going down all paths, though.
>
> If we wanted to bundle "-8 A B", I think we would make 'x' and
> 'y' prereqs, as they are the direct parents commits that are
> shown, and that themselves are not shown.
>
> .---*---*---*---*---* A
> /
> ---x---y---*---*---* B
>
> If we say upon hitting prereq (x and y) we stop traversal by
> marking the parent UNINTERESTING, I suspected that we may not
> find out 'x' with get_revision() loop, and that was why I chose
> not to. Instead the loop stops by finding y and then x (and by
> saying "ok I needed to find two and now I have two".
Yes, that is correct. What I wanted to optimize for was this:
x--*--*--A
y--*--B
There is no need to traverse the parents of y to find x. I wanted for the
traversal to just stop on paths where a prereq was found.
But setting the parents to UNINTERESTING or SHOWN was wrong, because that
would not stop the traversal: see get_revision_1(). In case revs->limited
is 0, and the cutoff date is not yet reached, the parents are added
always. Even if the current commit is UNINTERESTING or SHOWN. (Which is
correct, of course.)
BTW this behaviour with revs->limited = 1 made me set revs->limited in
case of revs->reverse in the first place...
Ciao,
Dscho
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2007-03-06 15:52 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-03-05 10:02 [BUG] git-rev-list: --topo-order --boundary and --max-count Santi Béjar
2007-03-05 10:39 ` Junio C Hamano
2007-03-05 12:15 ` Junio C Hamano
2007-03-05 17:05 ` Linus Torvalds
2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin
2007-03-05 19:00 ` Johannes Schindelin
2007-03-05 19:39 ` Linus Torvalds
2007-03-05 19:57 ` Linus Torvalds
2007-03-05 21:10 ` Junio C Hamano
2007-03-06 1:12 ` Johannes Schindelin
2007-03-06 1:32 ` Junio C Hamano
2007-03-06 1:44 ` Johannes Schindelin
2007-03-06 1:58 ` Junio C Hamano
2007-03-05 23:17 ` Johannes Schindelin
2007-03-06 0:36 ` Junio C Hamano
2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano
2007-03-06 2:05 ` Johannes Schindelin
2007-03-06 2:17 ` Junio C Hamano
2007-03-06 2:23 ` SHOWN means shown Junio C Hamano
2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin
2007-03-06 11:34 ` Junio C Hamano
2007-03-06 15:52 ` Johannes Schindelin
2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano
2007-03-06 2:13 ` Johannes Schindelin
2007-03-06 0:41 ` [PATCH 4/4] git-bundle: --list-prereqs Junio C Hamano
2007-03-05 21:10 ` [PATCH] revision walker: Fix --boundary when limited Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).