* [BUG] git-rev-list: --topo-order --boundary and --max-count @ 2007-03-05 10:02 Santi Béjar 2007-03-05 10:39 ` Junio C Hamano 0 siblings, 1 reply; 26+ messages in thread From: Santi Béjar @ 2007-03-05 10:02 UTC (permalink / raw) To: Git Mailing List Hi *, the --topo-order does not play well with --boundary and --max-count. $ git-rev-list --boundary --max-count=50 5ced0 | wc -l 56 $ git-rev-list --topo-order --boundary --max-count=50 5ced0 | wc -l 8846 (5ced0 is git.git's master). I think it should be 56 for both. It presents this behaviour since c4025103fa, when was added --boundary support for git-rev-list --max-count and --max-age. Santi ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git-rev-list: --topo-order --boundary and --max-count 2007-03-05 10:02 [BUG] git-rev-list: --topo-order --boundary and --max-count Santi Béjar @ 2007-03-05 10:39 ` Junio C Hamano 2007-03-05 12:15 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-05 10:39 UTC (permalink / raw) To: Santi Béjar, Johannes Schindelin; +Cc: Git Mailing List "Santi Béjar" <sbejar@gmail.com> writes: > the --topo-order does not play well with --boundary and --max-count. > > $ git-rev-list --boundary --max-count=50 5ced0 | wc -l > 56 > $ git-rev-list --topo-order --boundary --max-count=50 5ced0 | wc -l > 8846 > > (5ced0 is git.git's master). I think it should be 56 for both. It > presents this behaviour since c4025103fa, when was added --boundary > support for git-rev-list --max-count and --max-age. I think the code that does --boundary when the list is limited with --max-count is not quite right, even without topo-order. Only when the traversal is not limited, the code happens to work correctly because in that case alone we pick up positive commits one by one up to the specified count, and do not place anything other than their immediate parents in the list. It needs to find out commits (be they marked as UNINTERESTING or not) still in the revs->commits that are _not_ reachable by any other commits in the list, or something like that. I suspect that would unfortunately be very expensive. Dscho, have better ideas? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git-rev-list: --topo-order --boundary and --max-count 2007-03-05 10:39 ` Junio C Hamano @ 2007-03-05 12:15 ` Junio C Hamano 2007-03-05 17:05 ` Linus Torvalds 2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin 2 siblings, 0 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-05 12:15 UTC (permalink / raw) To: Santi Béjar; +Cc: Johannes Schindelin, Git Mailing List Junio C Hamano <junkio@cox.net> writes: > "Santi Béjar" <sbejar@gmail.com> writes: > >> the --topo-order does not play well with --boundary and --max-count. >> >> $ git-rev-list --boundary --max-count=50 5ced0 | wc -l >> 56 >> $ git-rev-list --topo-order --boundary --max-count=50 5ced0 | wc -l >> 8846 >> >> (5ced0 is git.git's master). I think it should be 56 for both. It >> presents this behaviour since c4025103fa, when was added --boundary >> support for git-rev-list --max-count and --max-age. > > I think the code that does --boundary when the list is limited > with --max-count is not quite right, even without topo-order. > Only when the traversal is not limited, the code happens to work > correctly because in that case alone we pick up positive commits > one by one up to the specified count, and do not place anything > other than their immediate parents in the list. This is not even correct. Let's see an extreme example. Suppose you have something like this: ---o---o---o---x---A ---o---o---o---y---B and think about what "rev-list --boundary --max-count=1 A B" should return. It does not matter how branches A and B are related in the past because we are showing only one. Without --boundary, it is clear we will show B (time flows from left to right). With --boundary, the current code would show B, and show -y and -A as boundaries, and I think that is wrong. Originally --boundary was invented for the specific purpose of supporting thin packs. It worked on the set of commits resulting from a limited traversal (that is, you have at least one negative, iow UNINTERESTING, commit and one or more positive commits), and it showed the negative commit that is a parent of a positive commit. There are two primary users of --boundary right now. gitk wants to show where the partial traversal ends (although it can figure it out itself without help from --boundary), and thin pack generation wants to have it upfront so that it can see which trees and blobs can be used as the bases of delta. In both cases, the semantics desired is to show commits that are _not_ included in the usual (i.e. non --boundary) results that are immediate parents of the commits that are included in the result. So with that definition, the above example should show B and then -y as boundary, and should not even talk about A nor -x. This may affect the git-bundle's computation of references included in the bundle (I think the current code assumes that if you do "git bundle --max-count=1 A B" the resulting bundle says its set of tips consists of A and B) but if that is broken it also needs to be fixed. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git-rev-list: --topo-order --boundary and --max-count 2007-03-05 10:39 ` Junio C Hamano 2007-03-05 12:15 ` Junio C Hamano @ 2007-03-05 17:05 ` Linus Torvalds 2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin 2 siblings, 0 replies; 26+ messages in thread From: Linus Torvalds @ 2007-03-05 17:05 UTC (permalink / raw) To: Junio C Hamano Cc: Santi Béjar, Johannes Schindelin, Git Mailing List On Mon, 5 Mar 2007, Junio C Hamano wrote: > > I think the code that does --boundary when the list is limited > with --max-count is not quite right, even without topo-order. Yeah. Sadly, this is a really irritating bug, becuase it means that you cannot do gitk -50 to see some random collection of 50 recent commits. (And yes, I've wanted to do that - I know the commit is fairly recent, so rather than write "gitk @{1.hour.ag}..", I'd rather just be lazy and say "gitk -100" to get a smaller slider bar and easier to find the recent ones). I never cared enough to fix it, but it's a mis-feature. I agree that it's probably not entirely trivial to fix. Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH] revision walker: Fix --boundary when limited 2007-03-05 10:39 ` Junio C Hamano 2007-03-05 12:15 ` Junio C Hamano 2007-03-05 17:05 ` Linus Torvalds @ 2007-03-05 18:55 ` Johannes Schindelin 2007-03-05 19:00 ` Johannes Schindelin ` (2 more replies) 2 siblings, 3 replies; 26+ messages in thread From: Johannes Schindelin @ 2007-03-05 18:55 UTC (permalink / raw) To: Junio C Hamano; +Cc: Santi Béjar, Git Mailing List [-- Attachment #1: Type: TEXT/PLAIN, Size: 3547 bytes --] In case revs->limited == 1, the revision walker really reads everything into revs->commits. The behaviour introduced in c4025103fa does not behave correctly in that case. It used to say: everything which is _still_ in the pipeline must be a boundary commit. So, in the case that revs->limited == 1, filter out all commits which are not dangling, in effect marking only the dangling ones as boundary commits. This bug was noticed by Santi Béjar. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> --- On Mon, 5 Mar 2007, Junio C Hamano wrote: > "Santi Béjar" <sbejar@gmail.com> writes: > > > the --topo-order does not play well with --boundary and > > --max-count. > > > > $ git-rev-list --boundary --max-count=50 5ced0 | wc -l > > 56 > > $ git-rev-list --topo-order --boundary --max-count=50 5ced0 \ > > | wc -l > > 8846 > > > > (5ced0 is git.git's master). I think it should be 56 for both. > > It presents this behaviour since c4025103fa, when was added > > --boundary support for git-rev-list --max-count and --max-age. > > I think the code that does --boundary when the list is limited > with --max-count is not quite right, even without topo-order. Right. > It needs to find out commits (be they marked as UNINTERESTING or > not) still in the revs->commits that are _not_ reachable by any > other commits in the list, or something like that. > > I suspect that would unfortunately be very expensive. Dscho, > have better ideas? Unfortunately not. Fortunately, it is nowhere near as expensive as I originally thought. It just has to iterate through revs->commits three times, which is way cheaper than sorting the commits in the first place. Junio, if you apply this, could you make extra sure that I did not fsck up Santi's family name (I am running on a peculiar mixture of UTF-8 and ISO-8859-1 terminals...). revision.c | 31 +++++++++++++++++++++++++++---- 1 files changed, 27 insertions(+), 4 deletions(-) diff --git a/revision.c b/revision.c index f5b8ae4..8d47fac 100644 --- a/revision.c +++ b/revision.c @@ -1294,6 +1294,26 @@ static struct commit *get_revision_1(struct rev_info *revs) return NULL; } +static struct commit_list *get_dangling_commits(struct commit_list *list) +{ + struct commit_list *result = NULL, *iter, *parent; + + for (iter = list; iter; iter = iter->next) + for (parent = iter->item->parents; + parent; + parent = parent->next) + parent->item->object.flags |= TMP_MARK; + for (iter = list; iter; iter = iter->next) + if (!(iter->item->object.flags & TMP_MARK)) + commit_list_insert(iter->item, &result); + for (iter = list; iter; iter = iter->next) + for (parent = iter->item->parents; + parent; + parent = parent->next) + parent->item->object.flags &= ~TMP_MARK; + return result; +} + struct commit *get_revision(struct rev_info *revs) { struct commit *c = NULL; @@ -1345,12 +1365,15 @@ struct commit *get_revision(struct rev_info *revs) break; case 0: if (revs->boundary) { - struct commit_list *list = revs->commits; - while (list) { + struct commit_list *list; + if (revs->limited) { + list = get_dangling_commits(revs->commits); + free_commit_list(revs->commits); + revs->commits = list; + } + for (list = revs->commits; list; list = list->next) list->item->object.flags |= BOUNDARY_SHOW | BOUNDARY; - list = list->next; - } /* all remaining commits are boundary commits */ revs->max_count = -1; revs->limited = 1; -- 1.5.0.3.2518.g2f72-dirty ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin @ 2007-03-05 19:00 ` Johannes Schindelin 2007-03-05 19:39 ` Linus Torvalds 2007-03-05 21:10 ` [PATCH] revision walker: Fix --boundary when limited Junio C Hamano 2 siblings, 0 replies; 26+ messages in thread From: Johannes Schindelin @ 2007-03-05 19:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Santi Béjar, Git Mailing List [-- Attachment #1: Type: TEXT/PLAIN, Size: 851 bytes --] Hi, On Mon, 5 Mar 2007, Johannes Schindelin wrote: > On Mon, 5 Mar 2007, Junio C Hamano wrote: > > > "Santi Béjar" <sbejar@gmail.com> writes: > > > > > the --topo-order does not play well with --boundary and > > > --max-count. > > > > > > $ git-rev-list --boundary --max-count=50 5ced0 | wc -l > > > 56 > > > $ git-rev-list --topo-order --boundary --max-count=50 5ced0 \ > > > | wc -l > > > 8846 > > > > > > (5ced0 is git.git's master). I think it should be 56 for both. Side note: with the patch I am replying to, the latter command returns 51. This is correct, since the only boundary commit it shows (1db8b60b) has (all) the other 5 commits as ancestors. The behaviour of both is correct: in the former case, there is not enough information to tell that one of the 6 boundary commits reaches all the others. Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin 2007-03-05 19:00 ` Johannes Schindelin @ 2007-03-05 19:39 ` Linus Torvalds 2007-03-05 19:57 ` Linus Torvalds 2007-03-05 21:10 ` [PATCH] revision walker: Fix --boundary when limited Junio C Hamano 2 siblings, 1 reply; 26+ messages in thread From: Linus Torvalds @ 2007-03-05 19:39 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Santi Béjar, Git Mailing List On Mon, 5 Mar 2007, Johannes Schindelin wrote: > > In case revs->limited == 1, the revision walker really reads > everything into revs->commits. The behaviour introduced in > c4025103fa does not behave correctly in that case. > > It used to say: everything which is _still_ in the pipeline > must be a boundary commit. I would suggest this (more invasive) patch instead. Yours is revision.c | 31 +++++++++++++++++++++++++++---- 1 files changed, 27 insertions(+), 4 deletions(-) and mine is revision.c | 86 ++++++++++++++++++++++++++++++----------------------------- 1 files changed, 44 insertions(+), 42 deletions(-) ie I have bigger changes, but on the whole this patch just adds two lines total, and I *think* the end result is more readable. NOTE! Our patches aren't really mutually incompatible, and they attack the problem from two different directions. You do the separate phase (which is also correct), and my patch instead tries to clean up the commit walking so that the commit number limiter works more like the date limiter (which fundamentally has all the same issues! Including the problem with some commits possibly being marked as boundary commits when they aren't really, because the path-limiting or revision-limiting ended up cutting things off *differently* than the date-limiting). So I would humbly suggest applying this one first (which makes the handling of the walk-time commit limiter more uniform and less hacky), and if we need to, we can *also* add the whole separate phase for the "revs->limited" case.. Linus --- commit d3dd7e89c123b644ef199380f4f050226e4df862 Author: Linus Torvalds <torvalds@osdl.org> Date: Mon Mar 5 10:15:20 2007 -0800 revision list: fix BOUNDARY handling with limiters and commit counts When we limited the number of commits using "max_count", we would not correctly handle the combination of various time- and reachability-based limiting and the use of a commit counting. Everything that was reachable (but overflowed the commit count) would be marked as a BOUNDARY commit, resulting in things like "gitk" not being usable together with a numerical limit on the number of commits. This largely fixes it by being more careful about how we mark commits that went over the commit counts. NOTE! Because the numerical limiting happens without a separate phase as we traverse the commit list, we still won't do the boundary handling 100% correct when a commit may be reachable from multiple sources, and under those circumstances, some commits will be marked as boundary commits even though they strictly aren't. To fix this, we would need to make rather more invasive changes, with commit counting being an integral part of the limiting (whuch is fundamnetally hard, since limiting itself will change the number of commits!). So this is the "good enough to be quite usable" approach. The problem only affects boundary commits, and programs like 'gitk' that uses boundary commits would be better off just noticing themselves that not all boundary commits are necessarily useful. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> --- revision.c | 86 ++++++++++++++++++++++++++++++----------------------------- 1 files changed, 44 insertions(+), 42 deletions(-) diff --git a/revision.c b/revision.c index f5b8ae4..f5430d6 100644 --- a/revision.c +++ b/revision.c @@ -1213,6 +1213,30 @@ static int commit_match(struct commit *commit, struct rev_info *opt) commit->buffer, strlen(commit->buffer)); } +enum walk_action { + WALK_PARENTS, + WALK_STOP, +}; + +/* + * When we do the list limiting at commit-walking time, we + * need to make sure that we stop walking parenthood when + * we hit a commit that isn't interesting any more. This can + * be due to max_count or due to date limiters. + */ +static enum walk_action walk_commit(struct rev_info *revs, struct commit *commit) +{ + if (!revs->max_count) + return WALK_STOP; + + if (revs->max_age != -1) { + if (commit->date < revs->max_age) + return WALK_STOP; + } + + return WALK_PARENTS; +} + static struct commit *get_revision_1(struct rev_info *revs) { if (!revs->commits) @@ -1233,17 +1257,19 @@ static struct commit *get_revision_1(struct rev_info *revs) * the parents here. We also need to do the date-based limiting * that we'd otherwise have done in limit_list(). */ - if (!revs->limited) { - if (revs->max_age != -1 && - (commit->date < revs->max_age)) { - if (revs->boundary) - commit->object.flags |= - BOUNDARY_SHOW | BOUNDARY; - else - continue; - } else - add_parents_to_list(revs, commit, - &revs->commits); + switch (walk_commit(revs, commit)) { + case WALK_PARENTS: + if (revs->limited) + break; + add_parents_to_list(revs, commit, &revs->commits); + break; + case WALK_STOP: + if (!revs->boundary) + continue; + if (!(commit->object.flags & UNINTERESTING)) + commit->object.flags |= BOUNDARY_SHOW | BOUNDARY | UNINTERESTING; + mark_parents_uninteresting(commit); + break; } if (commit->object.flags & SHOWN) continue; @@ -1289,6 +1315,12 @@ static struct commit *get_revision_1(struct rev_info *revs) if (revs->boundary) mark_boundary_to_show(commit); commit->object.flags |= SHOWN; + if (revs->skip_count > 0) { + revs->skip_count--; + continue; + } + if (revs->max_count > 0) + revs->max_count--; return commit; } while (revs->commits); return NULL; @@ -1296,9 +1328,8 @@ static struct commit *get_revision_1(struct rev_info *revs) struct commit *get_revision(struct rev_info *revs) { - struct commit *c = NULL; - if (revs->reverse) { + struct commit *c; struct commit_list *list; /* @@ -1332,34 +1363,5 @@ struct commit *get_revision(struct rev_info *revs) return c; } - if (0 < revs->skip_count) { - while ((c = get_revision_1(revs)) != NULL) { - if (revs->skip_count-- <= 0) - break; - } - } - - /* Check the max_count ... */ - switch (revs->max_count) { - case -1: - break; - case 0: - if (revs->boundary) { - struct commit_list *list = revs->commits; - while (list) { - list->item->object.flags |= - BOUNDARY_SHOW | BOUNDARY; - list = list->next; - } - /* all remaining commits are boundary commits */ - revs->max_count = -1; - revs->limited = 1; - } else - return NULL; - default: - revs->max_count--; - } - if (c) - return c; return get_revision_1(revs); } ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 19:39 ` Linus Torvalds @ 2007-03-05 19:57 ` Linus Torvalds 2007-03-05 21:10 ` Junio C Hamano 2007-03-05 23:17 ` Johannes Schindelin 0 siblings, 2 replies; 26+ messages in thread From: Linus Torvalds @ 2007-03-05 19:57 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Santi Béjar, Git Mailing List On Mon, 5 Mar 2007, Linus Torvalds wrote: > > NOTE! Our patches aren't really mutually incompatible, and they attack the > problem from two different directions. You do the separate phase (which is > also correct), and my patch instead tries to clean up the commit walking > so that the commit number limiter works more like the date limiter (which > fundamentally has all the same issues! Including the problem with some > commits possibly being marked as boundary commits when they aren't really, > because the path-limiting or revision-limiting ended up cutting things off > *differently* than the date-limiting). Side note: the reason you don't *notice* it with the date-limiter is simply that the date limiter *also* runs at limit-time, rather than just at the incremental "run at the end" phase. So the date-limiter is much simpler when done together with other limiters (like path and revision limiters). HOWEVER. We can't do the same thing for the numerical one, because we need to run the other limiters *first*, and the numerical limiter always comes at the end. And the path-based "dense" limiter actually runs mostly incrementally, so you cannot do the numerical limiter until after it has run.. The way to really clean stuff up would be to: - first phase: limit by date and revision ranges first (both of those are static and quick, and don't depend on anything else) We do this already (limit_list) - second phase: limit by pathname (we don't do this as a phase at all, we do it incrementally: see "rewrite_parents()") -third phase: limit by number HOWEVER. There's a damn good reason why we do things the way we do, namely simply the fact that we want to do pathname limiting as much at run-time as possible.. But we *could* do the "rewrite_parents()" thing both in the non-incremental and in the final phase. However, doing the parent rewriting is quite nasty and error-prone, so I've been avoiding it. Anyway, I *suspect* that Dscho's patch might do the wrong thing for something like gitk -20 v1.4.4.. t/ exactly because of the subtle interaction between pathname limiting, static commit limiting *and* commit number limiting. Dscho? Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 19:57 ` Linus Torvalds @ 2007-03-05 21:10 ` Junio C Hamano 2007-03-06 1:12 ` Johannes Schindelin 2007-03-05 23:17 ` Johannes Schindelin 1 sibling, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2007-03-05 21:10 UTC (permalink / raw) To: Linus Torvalds; +Cc: Johannes Schindelin, Santi Béjar, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > On Mon, 5 Mar 2007, Linus Torvalds wrote: >> >> NOTE! Our patches aren't really mutually incompatible, and they attack the >> problem from two different directions. You do the separate phase (which is >> also correct), and my patch instead tries to clean up the commit walking >> so that the commit number limiter works more like the date limiter (which >> fundamentally has all the same issues! Including the problem with some >> commits possibly being marked as boundary commits when they aren't really, >> because the path-limiting or revision-limiting ended up cutting things off >> *differently* than the date-limiting). I was working on a different approach, which is: - rip out the boundary logic from the commit walker. Placing "negative" commits in the revs->commits list was Ok if all we cared about "boundary" was the UNINTERESTING limiting case, but conceptually it was wrong. - make get_revision_1() function to walk the commits and return the results as if there is no funny postprocessing flags such as --reverse, --skip nor --max-count. - make get_revision() function the postprocessing phase: If reverse is given, wait for get_revision_1() to give everything that it would normally give, and then reverse it before consuming. If skip is given, skip that many before going further. If max is given, stop when we gave out that many. Now that we are about to return one positive commit, mark the parents of that commit to be potential boundaries before returning, iff we are doing the boundary processing. Return the commit. - After get_revision() finishes giving out all the positive commits, if we are doing the boundary processing, we look at the parents that we marked as potential boundaries earlier, see if they are really boundaries, and give them out. It loses more code than it adds, even when the new gc_boundary() function, which is purely for early optimization, is counted. Note that this patch is purely for eyeballing and discussion only. It breaks git-bundle's verify logic because the logic does not use BOUNDARY_SHOW flag for its internal computation anymore. After we correct it not to attempt to affect the boundary processing by setting the BOUNDARY_SHOW flag, we can remove BOUNDARY_SHOW from revision.h and use that bit assignment for the new CHILD_SHOWN flag. --- revision.c | 208 +++++++++++++++++++++++++++--------------------------------- revision.h | 8 ++- 2 files changed, 100 insertions(+), 116 deletions(-) diff --git a/revision.c b/revision.c index f5b8ae4..5d137ea 100644 --- a/revision.c +++ b/revision.c @@ -437,36 +437,6 @@ static void limit_list(struct rev_info *revs) continue; p = &commit_list_insert(commit, p)->next; } - if (revs->boundary) { - /* mark the ones that are on the result list first */ - for (list = newlist; list; list = list->next) { - struct commit *commit = list->item; - commit->object.flags |= TMP_MARK; - } - for (list = newlist; list; list = list->next) { - struct commit *commit = list->item; - struct object *obj = &commit->object; - struct commit_list *parent; - if (obj->flags & UNINTERESTING) - continue; - for (parent = commit->parents; - parent; - parent = parent->next) { - struct commit *pcommit = parent->item; - if (!(pcommit->object.flags & UNINTERESTING)) - continue; - pcommit->object.flags |= BOUNDARY; - if (pcommit->object.flags & TMP_MARK) - continue; - pcommit->object.flags |= TMP_MARK; - p = &commit_list_insert(pcommit, p)->next; - } - } - for (list = newlist; list; list = list->next) { - struct commit *commit = list->item; - commit->object.flags &= ~TMP_MARK; - } - } revs->commits = newlist; } @@ -1193,17 +1163,6 @@ static void rewrite_parents(struct rev_info *revs, struct commit *commit) } } -static void mark_boundary_to_show(struct commit *commit) -{ - struct commit_list *p = commit->parents; - while (p) { - commit = p->item; - p = p->next; - if (commit->object.flags & BOUNDARY) - commit->object.flags |= BOUNDARY_SHOW; - } -} - static int commit_match(struct commit *commit, struct rev_info *opt) { if (!opt->grep_filter) @@ -1235,15 +1194,9 @@ static struct commit *get_revision_1(struct rev_info *revs) */ if (!revs->limited) { if (revs->max_age != -1 && - (commit->date < revs->max_age)) { - if (revs->boundary) - commit->object.flags |= - BOUNDARY_SHOW | BOUNDARY; - else - continue; - } else - add_parents_to_list(revs, commit, - &revs->commits); + (commit->date < revs->max_age)) + continue; + add_parents_to_list(revs, commit, &revs->commits); } if (commit->object.flags & SHOWN) continue; @@ -1252,18 +1205,6 @@ static struct commit *get_revision_1(struct rev_info *revs) revs->ignore_packed)) continue; - /* We want to show boundary commits only when their - * children are shown. When path-limiter is in effect, - * rewrite_parents() drops some commits from getting shown, - * and there is no point showing boundary parents that - * are not shown. After rewrite_parents() rewrites the - * parents of a commit that is shown, we mark the boundary - * parents with BOUNDARY_SHOW. - */ - if (commit->object.flags & BOUNDARY_SHOW) { - commit->object.flags |= SHOWN; - return commit; - } if (commit->object.flags & UNINTERESTING) continue; if (revs->min_age != -1 && (commit->date > revs->min_age)) @@ -1286,80 +1227,119 @@ static struct commit *get_revision_1(struct rev_info *revs) if (revs->parents) rewrite_parents(revs, commit); } - if (revs->boundary) - mark_boundary_to_show(commit); commit->object.flags |= SHOWN; return commit; } while (revs->commits); return NULL; } +static void gc_boundary(struct object_array *array) +{ + unsigned nr = array->nr; + unsigned alloc = array->alloc; + struct object_array_entry *objects = array->objects; + + if (alloc <= nr) { + unsigned i, j; + for (i = j = 0; i < nr; i++) { + if (objects[i].item->flags & SHOWN) + continue; + if (i != j) + objects[j] = objects[i]; + j++; + } + for (i = j; j < nr; j++) + objects[i].item = NULL; + array->nr = j; + } +} + struct commit *get_revision(struct rev_info *revs) { struct commit *c = NULL; - - if (revs->reverse) { - struct commit_list *list; - - /* - * rev_info.reverse is used to note the fact that we - * want to output the list of revisions in reverse - * order. To accomplish this goal, reverse can have - * different values: - * - * 0 do nothing - * 1 reverse the list - * 2 internal use: we have already obtained and - * reversed the list, now we only need to yield - * its items. - */ - - if (revs->reverse == 1) { - revs->reverse = 0; - list = NULL; - while ((c = get_revision(revs))) - commit_list_insert(c, &list); - revs->commits = list; - revs->reverse = 2; + struct commit_list *l; + + if (revs->boundary == 2) { + unsigned i; + struct object_array *array = &revs->boundary_commits; + struct object_array_entry *objects = array->objects; + for (i = 0; i < array->nr; i++) { + c = (struct commit *)(objects[i].item); + if (!c) + continue; + if (!(c->object.flags & CHILD_SHOWN)) + continue; + if (!(c->object.flags & SHOWN)) + break; } - - if (!revs->commits) + if (array->nr <= i) return NULL; - c = revs->commits->item; - list = revs->commits->next; - free(revs->commits); - revs->commits = list; + + c->object.flags |= SHOWN | BOUNDARY; return c; } - if (0 < revs->skip_count) { - while ((c = get_revision_1(revs)) != NULL) { - if (revs->skip_count-- <= 0) - break; - } + if (revs->reverse) { + l = NULL; + while ((c = get_revision_1(revs))) + commit_list_insert(c, &l); + revs->commits = l; + revs->reverse = 0; } - /* Check the max_count ... */ + /* + * Now pick up what they want to give us + */ + c = get_revision_1(revs); + while (0 < revs->skip_count) { + revs->skip_count--; + c = get_revision_1(revs); + if (!c) + break; + } + + /* + * Check the max_count. + */ switch (revs->max_count) { case -1: break; case 0: - if (revs->boundary) { - struct commit_list *list = revs->commits; - while (list) { - list->item->object.flags |= - BOUNDARY_SHOW | BOUNDARY; - list = list->next; - } - /* all remaining commits are boundary commits */ - revs->max_count = -1; - revs->limited = 1; - } else - return NULL; + c = NULL; + break; default: revs->max_count--; } - if (c) + + if (!revs->boundary) return c; - return get_revision_1(revs); + + if (!c) { + /* + * get_revision_1() runs out the commits, and + * we are done computing the boundaries. + * switch to boundary commits output mode. + */ + revs->boundary = 2; + return get_revision(revs); + } + + /* + * boundary commits are the commits that are parents of the + * ones we got from get_revision_1() but they themselves are + * not returned from get_revision_1(). Before returning + * 'c', we need to mark its parents that they could be boundaries. + */ + + for (l = c->parents; l; l = l->next) { + struct object *p; + p = &(l->item->object); + if (p->flags & CHILD_SHOWN) + continue; + p->flags |= CHILD_SHOWN; + gc_boundary(&revs->boundary_commits); + add_object_array(p, NULL, &revs->boundary_commits); + } + + return c; } diff --git a/revision.h b/revision.h index 5fec184..6579a44 100644 --- a/revision.h +++ b/revision.h @@ -10,6 +10,7 @@ #define BOUNDARY_SHOW (1u<<6) #define ADDED (1u<<7) /* Parents already parsed and added? */ #define SYMMETRIC_LEFT (1u<<8) +#define CHILD_SHOWN (1u<<9) struct rev_info; struct log_info; @@ -21,6 +22,9 @@ struct rev_info { struct commit_list *commits; struct object_array pending; + /* Parents of shown commits */ + struct object_array boundary_commits; + /* Basic information */ const char *prefix; void *prune_data; @@ -40,10 +44,10 @@ struct rev_info { edge_hint:1, limited:1, unpacked:1, /* see also ignore_packed below */ - boundary:1, + boundary:2, left_right:1, parents:1, - reverse:2; + reverse:1; /* Diff flags */ unsigned int diff:1, ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 21:10 ` Junio C Hamano @ 2007-03-06 1:12 ` Johannes Schindelin 2007-03-06 1:32 ` Junio C Hamano 0 siblings, 1 reply; 26+ messages in thread From: Johannes Schindelin @ 2007-03-06 1:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List Hi, On Mon, 5 Mar 2007, Junio C Hamano wrote: > - rip out the boundary logic from the commit walker. Placing > "negative" commits in the revs->commits list was Ok if all we > cared about "boundary" was the UNINTERESTING limiting case, > but conceptually it was wrong. I agree. > It loses more code than it adds, even when the new gc_boundary() > function, which is purely for early optimization, is counted. I have the slight suspicion that it might be faster if gc_boundary() was not called at all... But that's probably a subject for later. > Note that this patch is purely for eyeballing and discussion > only. It breaks git-bundle's verify logic because the logic > does not use BOUNDARY_SHOW flag for its internal computation > anymore. After we correct it not to attempt to affect the > boundary processing by setting the BOUNDARY_SHOW flag, we can > remove BOUNDARY_SHOW from revision.h and use that bit assignment > for the new CHILD_SHOWN flag. Yes, git-bundle should be easy enough to fix. > + struct commit_list *l; > + > + if (revs->boundary == 2) { > + unsigned i; > + struct object_array *array = &revs->boundary_commits; > + struct object_array_entry *objects = array->objects; > + for (i = 0; i < array->nr; i++) { An easy optimisation would be iterate in the other direction, resetting array->nr after the loop. Of course, this does not preserve the order... > + if (revs->reverse) { > + l = NULL; > + while ((c = get_revision_1(revs))) > + commit_list_insert(c, &l); > + revs->commits = l; > + revs->reverse = 0; > } Clever! > > - /* Check the max_count ... */ > + /* > + * Now pick up what they want to give us > + */ > + c = get_revision_1(revs); > + while (0 < revs->skip_count) { > + revs->skip_count--; > + c = get_revision_1(revs); > + if (!c) > + break; > + } > + > + /* > + * Check the max_count. > + */ > switch (revs->max_count) { > case -1: > break; > case 0: > - if (revs->boundary) { > - struct commit_list *list = revs->commits; > - while (list) { > - list->item->object.flags |= > - BOUNDARY_SHOW | BOUNDARY; > - list = list->next; > - } > - /* all remaining commits are boundary commits */ > - revs->max_count = -1; > - revs->limited = 1; > - } else > - return NULL; > + c = NULL; > + break; I guess that you want to do this instead: case 0: c = NULL; /* fall through */ > default: > revs->max_count--; so that --reverse actually works (otherwise, the while() loop would deplete max_count, and then max_count would be 0, and nothing would be returned). > + /* > + * boundary commits are the commits that are parents of the > + * ones we got from get_revision_1() but they themselves are > + * not returned from get_revision_1(). Before returning > + * 'c', we need to mark its parents that they could be boundaries. > + */ > + > + for (l = c->parents; l; l = l->next) { AFAICT this changes behaviour: c->parents were possibly rewritten. But I guess it makes sense showing the rewritten parents as boundary commits, not the real parents. > + struct object *p; > + p = &(l->item->object); > + if (p->flags & CHILD_SHOWN) ... or if (p->flags & (CHILD_SHOWN | SHOWN)) ? Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-06 1:12 ` Johannes Schindelin @ 2007-03-06 1:32 ` Junio C Hamano 2007-03-06 1:44 ` Johannes Schindelin 0 siblings, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 1:32 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> + if (revs->reverse) { >> + l = NULL; >> + while ((c = get_revision_1(revs))) >> + commit_list_insert(c, &l); >> + revs->commits = l; >> + revs->reverse = 0; >> } > > Clever! It is not clever, but just is stupid and WRONG. It just shows how little I care about --reverse ;-). revision_1() is to get the full list without non limit limiters, so the above loop would not even deplete the max_count but literally grabs everything. Arguably, you can do "rev-list --reverse -2 master" to get the first two commits with this, but it was not intended. We would need to grab (skip + max_count) from get_revision_1() and set max_count to -1 before leaving the if () {} here, or something like that. > I guess that you want to do this instead: > > case 0: > c = NULL; > /* fall through */ >> default: >> revs->max_count--; > > ... Actually that is not correct either. The other patch fixes it (but not the unintended semantic change to --reverse). >> + for (l = c->parents; l; l = l->next) { > > AFAICT this changes behaviour: c->parents were possibly rewritten. Well, the behaviour of max with boundary in 'master' did the same thing, as what was in revs->commits are rewritten parents of commits we already returned, didn't it? > But I > guess it makes sense showing the rewritten parents as boundary commits, > not the real parents. > >> + struct object *p; >> + p = &(l->item->object); >> + if (p->flags & CHILD_SHOWN) > > ... or > if (p->flags & (CHILD_SHOWN | SHOWN)) "If itself is shown it cannot be a boundary, and if we know we marked it as potential boundary we do not have to mark it again"... I think you are right. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-06 1:32 ` Junio C Hamano @ 2007-03-06 1:44 ` Johannes Schindelin 2007-03-06 1:58 ` Junio C Hamano 0 siblings, 1 reply; 26+ messages in thread From: Johannes Schindelin @ 2007-03-06 1:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List Hi, On Mon, 5 Mar 2007, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> + if (revs->reverse) { > >> + l = NULL; > >> + while ((c = get_revision_1(revs))) > >> + commit_list_insert(c, &l); > >> + revs->commits = l; > >> + revs->reverse = 0; > >> } > > > > Clever! > > It is not clever, but just is stupid and WRONG. It just shows > how little I care about --reverse ;-). > > revision_1() is to get the full list without non limit limiters, > so the above loop would not even deplete the max_count but > literally grabs everything. Oops. I missed the _1(), _and_ the missing "revs->reverse = 0"... > >> + for (l = c->parents; l; l = l->next) { > > > > AFAICT this changes behaviour: c->parents were possibly rewritten. > > Well, the behaviour of max with boundary in 'master' did the same thing, > as what was in revs->commits are rewritten parents of commits we already > returned, didn't it? I missed that, too. Maybe I should get more familiar with the revision walker first, before continuing to ask for ridicule. Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-06 1:44 ` Johannes Schindelin @ 2007-03-06 1:58 ` Junio C Hamano 0 siblings, 0 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 1:58 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > I missed that, too. Maybe I should get more familiar with the revision > walker first,... No. You are asking the right questions and helped finding a few bugs already. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 19:57 ` Linus Torvalds 2007-03-05 21:10 ` Junio C Hamano @ 2007-03-05 23:17 ` Johannes Schindelin 2007-03-06 0:36 ` Junio C Hamano 1 sibling, 1 reply; 26+ messages in thread From: Johannes Schindelin @ 2007-03-05 23:17 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, Santi Béjar, Git Mailing List Hi, On Mon, 5 Mar 2007, Linus Torvalds wrote: > Anyway, I *suspect* that Dscho's patch might do the wrong thing for > something like > > gitk -20 v1.4.4.. t/ > > exactly because of the subtle interaction between pathname limiting, > static commit limiting *and* commit number limiting. Dscho? Correct. I'll have a look at Junio's patch shortly, which hopefully makes me forget my shortsighted patch. Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 23:17 ` Johannes Schindelin @ 2007-03-06 0:36 ` Junio C Hamano 2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 0:36 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > On Mon, 5 Mar 2007, Linus Torvalds wrote: > >> Anyway, I *suspect* that Dscho's patch might do the wrong thing for >> something like >> >> gitk -20 v1.4.4.. t/ >> >> exactly because of the subtle interaction between pathname limiting, >> static commit limiting *and* commit number limiting. Dscho? > > Correct. I'll have a look at Junio's patch shortly, which hopefully makes > me forget my shortsighted patch. I have a handful updates, including a few 'git-bundle' patches. ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW 2007-03-06 0:36 ` Junio C Hamano @ 2007-03-06 0:41 ` Junio C Hamano 2007-03-06 2:05 ` Johannes Schindelin 2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano 2007-03-06 0:41 ` [PATCH 4/4] git-bundle: --list-prereqs Junio C Hamano 2 siblings, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 0:41 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List This removes the flag internally used by revision traversal to decide which commits are indeed boundaries and renames it to CHILD_SHOWN. builtin-bundle uses the symbol for its verification, but I think the logic it uses it is wrong. The flag is still useful but it is local to the git-bundle, so it is renamed to PREREQ_MARK. Signed-off-by: Junio C Hamano <junkio@cox.net> --- * The earlier one that competed with Johannes's and Linus's is the [1/4] of this series. builtin-bundle.c | 6 ++++-- revision.c | 9 ++++++++- revision.h | 3 +-- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/builtin-bundle.c b/builtin-bundle.c index d41a413..0265845 100644 --- a/builtin-bundle.c +++ b/builtin-bundle.c @@ -160,6 +160,8 @@ static int fork_with_pipe(const char **argv, int *in, int *out) return pid; } +#define PREREQ_MARK (1u<<16) + static int verify_bundle(struct bundle_header *header) { /* @@ -179,7 +181,7 @@ static int verify_bundle(struct bundle_header *header) struct ref_list_entry *e = p->list + i; struct object *o = parse_object(e->sha1); if (o) { - o->flags |= BOUNDARY_SHOW; + o->flags |= PREREQ_MARK; add_pending_object(&revs, o, e->name); continue; } @@ -202,7 +204,7 @@ static int verify_bundle(struct bundle_header *header) i = req_nr; while (i && (commit = get_revision(&revs))) - if (commit->object.flags & BOUNDARY_SHOW) + if (commit->object.flags & PREREQ_MARK) i--; for (i = 0; i < req_nr; i++) diff --git a/revision.c b/revision.c index 5d137ea..2d27ccf 100644 --- a/revision.c +++ b/revision.c @@ -1285,17 +1285,21 @@ struct commit *get_revision(struct rev_info *revs) commit_list_insert(c, &l); revs->commits = l; revs->reverse = 0; + c = NULL; } /* * Now pick up what they want to give us */ - c = get_revision_1(revs); + if (!(c = get_revision_1(revs))) + return NULL; while (0 < revs->skip_count) { revs->skip_count--; c = get_revision_1(revs); if (!c) break; + /* Although we grabbed it, it is not shown. */ + c->object.flags &= ~SHOWN; } /* @@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs) case -1: break; case 0: + /* Although we grabbed it, it is not shown. */ + if (c) + c->object.flags &= ~SHOWN; c = NULL; break; default: diff --git a/revision.h b/revision.h index 6579a44..1885f8d 100644 --- a/revision.h +++ b/revision.h @@ -7,10 +7,9 @@ #define SHOWN (1u<<3) #define TMP_MARK (1u<<4) /* for isolated cases; clean after use */ #define BOUNDARY (1u<<5) -#define BOUNDARY_SHOW (1u<<6) +#define CHILD_SHOWN (1u<<6) #define ADDED (1u<<7) /* Parents already parsed and added? */ #define SYMMETRIC_LEFT (1u<<8) -#define CHILD_SHOWN (1u<<9) struct rev_info; struct log_info; -- 1.5.0.3.862.g71037 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW 2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano @ 2007-03-06 2:05 ` Johannes Schindelin 2007-03-06 2:17 ` Junio C Hamano 0 siblings, 1 reply; 26+ messages in thread From: Johannes Schindelin @ 2007-03-06 2:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List Hi, On Mon, 5 Mar 2007, Junio C Hamano wrote: > This removes the flag internally used by revision traversal to > decide which commits are indeed boundaries and renames it to > CHILD_SHOWN. builtin-bundle uses the symbol for its > verification, but I think the logic it uses it is wrong. The > flag is still useful but it is local to the git-bundle, so it is > renamed to PREREQ_MARK. The idea was to bail out of revision walking when that prerequisite was found, _even_ if it happened to be uninteresting. Just marking the parents uninteresting unless they are prerequisites would be better, probably. > > i = req_nr; > while (i && (commit = get_revision(&revs))) > - if (commit->object.flags & BOUNDARY_SHOW) > + if (commit->object.flags & PREREQ_MARK) > i--; The above-mentioned idea, then would be: instead of just i--; do { struct commit_list *parent = commit->parents; i--; for (; parent; parent = parent->next) if (!(parent->item->object.flag & PREREQ_MARK)) parent->item->object.flag |= UNINTERESTING; } This should help performance, as not all reachable commits are traversed any more. > > for (i = 0; i < req_nr; i++) > diff --git a/revision.c b/revision.c > index 5d137ea..2d27ccf 100644 > --- a/revision.c > +++ b/revision.c > @@ -1285,17 +1285,21 @@ struct commit *get_revision(struct rev_info *revs) > commit_list_insert(c, &l); > revs->commits = l; > revs->reverse = 0; > + c = NULL; Why? It gets assigned here anyway: > } > > /* > * Now pick up what they want to give us > */ > - c = get_revision_1(revs); > + if (!(c = get_revision_1(revs))) > + return NULL; > while (0 < revs->skip_count) { > revs->skip_count--; > c = get_revision_1(revs); > if (!c) > break; > + /* Although we grabbed it, it is not shown. */ > + c->object.flags &= ~SHOWN; Wasn't --skip meant for something like gitweb, where you just want to skip the first <n> commits from the list, but do not want to change the list otherwise? > @@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs) > case -1: > break; > case 0: > + /* Although we grabbed it, it is not shown. */ > + if (c) > + c->object.flags &= ~SHOWN; > c = NULL; Is this really relevant in practice? Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW 2007-03-06 2:05 ` Johannes Schindelin @ 2007-03-06 2:17 ` Junio C Hamano 2007-03-06 2:23 ` SHOWN means shown Junio C Hamano 2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin 0 siblings, 2 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 2:17 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > On Mon, 5 Mar 2007, Junio C Hamano wrote: > >> This removes the flag internally used by revision traversal to >> decide which commits are indeed boundaries and renames it to >> CHILD_SHOWN. builtin-bundle uses the symbol for its >> verification, but I think the logic it uses it is wrong. The >> flag is still useful but it is local to the git-bundle, so it is >> renamed to PREREQ_MARK. > > The idea was to bail out of revision walking when that prerequisite was > found, _even_ if it happened to be uninteresting. > > Just marking the parents uninteresting unless they are prerequisites would > be better, probably. > >> >> i = req_nr; >> while (i && (commit = get_revision(&revs))) >> - if (commit->object.flags & BOUNDARY_SHOW) >> + if (commit->object.flags & PREREQ_MARK) >> i--; > > The above-mentioned idea, then would be: instead of just i--; do I explicitly wanted to avoid that. If the generated bundle had prereqs that are not independent of each other, doing that would hide some prereqs. > This should help performance, as not all reachable commits are traversed > any more. You prevented that with "while (i &&" part already, didn't you? > Wasn't --skip meant for something like gitweb, where you just want to skip > the first <n> commits from the list, but do not want to change the list > otherwise? Yes, but does the patch change that? > >> @@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs) >> case -1: >> break; >> case 0: >> + /* Although we grabbed it, it is not shown. */ >> + if (c) >> + c->object.flags &= ~SHOWN; >> c = NULL; > > Is this really relevant in practice? Absolutely. But I have further updates on this series. ^ permalink raw reply [flat|nested] 26+ messages in thread
* SHOWN means shown 2007-03-06 2:17 ` Junio C Hamano @ 2007-03-06 2:23 ` Junio C Hamano 2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin 1 sibling, 0 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 2:23 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List This moves the code to set SHOWN on the commit from get_revision_1() back to get_revision(), so that the bit means what it originally meant: this commit has been given back to the caller. Also it fixes the --reverse breakage Dscho pointed out. --- Junio C Hamano <junkio@cox.net> writes: >> Is this really relevant in practice? > > Absolutely. But I have further updates on this series. ... otherwise, "git-bundle create -3 A B" when B is A~3 would mistakenly say "Ah, B is SHOWN so include it". revision.c | 26 ++++++++++++++++++-------- 1 files changed, 18 insertions(+), 8 deletions(-) diff --git a/revision.c b/revision.c index 2d27ccf..35a1711 100644 --- a/revision.c +++ b/revision.c @@ -1227,7 +1227,6 @@ static struct commit *get_revision_1(struct rev_info *revs) if (revs->parents) rewrite_parents(revs, commit); } - commit->object.flags |= SHOWN; return commit; } while (revs->commits); return NULL; @@ -1280,11 +1279,22 @@ struct commit *get_revision(struct rev_info *revs) } if (revs->reverse) { + int limit = -1; + + if (0 <= revs->max_count) { + limit = revs->max_count; + if (0 < revs->skip_count) + limit += revs->skip_count; + } l = NULL; - while ((c = get_revision_1(revs))) + while ((c = get_revision_1(revs))) { commit_list_insert(c, &l); + if ((0 < limit) && !--limit) + break; + } revs->commits = l; revs->reverse = 0; + revs->max_count = -1; c = NULL; } @@ -1298,8 +1308,6 @@ struct commit *get_revision(struct rev_info *revs) c = get_revision_1(revs); if (!c) break; - /* Although we grabbed it, it is not shown. */ - c->object.flags &= ~SHOWN; } /* @@ -1310,16 +1318,18 @@ struct commit *get_revision(struct rev_info *revs) break; case 0: /* Although we grabbed it, it is not shown. */ - if (c) - c->object.flags &= ~SHOWN; c = NULL; break; default: revs->max_count--; } - if (!revs->boundary) + if (c) + c->object.flags |= SHOWN; + + if (!revs->boundary) { return c; + } if (!c) { /* @@ -1341,7 +1351,7 @@ struct commit *get_revision(struct rev_info *revs) for (l = c->parents; l; l = l->next) { struct object *p; p = &(l->item->object); - if (p->flags & CHILD_SHOWN) + if (p->flags & (CHILD_SHOWN | SHOWN)) continue; p->flags |= CHILD_SHOWN; gc_boundary(&revs->boundary_commits); ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW 2007-03-06 2:17 ` Junio C Hamano 2007-03-06 2:23 ` SHOWN means shown Junio C Hamano @ 2007-03-06 2:29 ` Johannes Schindelin 2007-03-06 11:34 ` Junio C Hamano 1 sibling, 1 reply; 26+ messages in thread From: Johannes Schindelin @ 2007-03-06 2:29 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List Hi, On Mon, 5 Mar 2007, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > Just marking the parents uninteresting unless they are prerequisites > > would be better, probably. > > > >> > >> i = req_nr; > >> while (i && (commit = get_revision(&revs))) > >> - if (commit->object.flags & BOUNDARY_SHOW) > >> + if (commit->object.flags & PREREQ_MARK) > >> i--; > > > > The above-mentioned idea, then would be: instead of just i--; do > > I explicitly wanted to avoid that. If the generated bundle had > prereqs that are not independent of each other, doing that would > hide some prereqs. Yes. I wanted to say SHOWN instead of UNINTERESTING, but reading the source again, it seems that add_parents_to_list() is called in get_revision_1() even if the commit is marked SHOWN. So forget about that idea of mine. > > This should help performance, as not all reachable commits are traversed > > any more. > > You prevented that with "while (i &&" part already, didn't you? Well, yes. I also wanted to prevent going down all paths, though. > > Wasn't --skip meant for something like gitweb, where you just want to > > skip the first <n> commits from the list, but do not want to change > > the list otherwise? > > Yes, but does the patch change that? I _think_ that yes, it could. If the timestamp of HEAD is older than of HEAD^, and you say "git log HEAD^ HEAD", then HEAD^ will be actually shown first, right? If you then say "--skip 1", HEAD will be shown first, and then HEAD^... > >> @@ -1305,6 +1309,9 @@ struct commit *get_revision(struct rev_info *revs) > >> case -1: > >> break; > >> case 0: > >> + /* Although we grabbed it, it is not shown. */ > >> + if (c) > >> + c->object.flags &= ~SHOWN; > >> c = NULL; > > > > Is this really relevant in practice? > > Absolutely. But I have further updates on this series. Sorry for interrupting your patch spree :-) Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW 2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin @ 2007-03-06 11:34 ` Junio C Hamano 2007-03-06 15:52 ` Johannes Schindelin 0 siblings, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 11:34 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> > This should help performance, as not all reachable commits are traversed >> > any more. >> >> You prevented that with "while (i &&" part already, didn't you? > > Well, yes. I also wanted to prevent going down all paths, though. If we wanted to bundle "-8 A B", I think we would make 'x' and 'y' prereqs, as they are the direct parents commits that are shown, and that themselves are not shown. .---*---*---*---*---* A / ---x---y---*---*---* B If we say upon hitting prereq (x and y) we stop traversal by marking the parent UNINTERESTING, I suspected that we may not find out 'x' with get_revision() loop, and that was why I chose not to. Instead the loop stops by finding y and then x (and by saying "ok I needed to find two and now I have two". But this probably needs more optimization. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW 2007-03-06 11:34 ` Junio C Hamano @ 2007-03-06 15:52 ` Johannes Schindelin 0 siblings, 0 replies; 26+ messages in thread From: Johannes Schindelin @ 2007-03-06 15:52 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List Hi, On Tue, 6 Mar 2007, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> > This should help performance, as not all reachable commits are > >> > traversed any more. > >> > >> You prevented that with "while (i &&" part already, didn't you? > > > > Well, yes. I also wanted to prevent going down all paths, though. > > If we wanted to bundle "-8 A B", I think we would make 'x' and > 'y' prereqs, as they are the direct parents commits that are > shown, and that themselves are not shown. > > .---*---*---*---*---* A > / > ---x---y---*---*---* B > > If we say upon hitting prereq (x and y) we stop traversal by > marking the parent UNINTERESTING, I suspected that we may not > find out 'x' with get_revision() loop, and that was why I chose > not to. Instead the loop stops by finding y and then x (and by > saying "ok I needed to find two and now I have two". Yes, that is correct. What I wanted to optimize for was this: x--*--*--A y--*--B There is no need to traverse the parents of y to find x. I wanted for the traversal to just stop on paths where a prereq was found. But setting the parents to UNINTERESTING or SHOWN was wrong, because that would not stop the traversal: see get_revision_1(). In case revs->limited is 0, and the cutoff date is not yet reached, the parents are added always. Even if the current commit is UNINTERESTING or SHOWN. (Which is correct, of course.) BTW this behaviour with revs->limited = 1 made me set revs->limited in case of revs->reverse in the first place... Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 3/4] git-bundle: various fixups 2007-03-06 0:36 ` Junio C Hamano 2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano @ 2007-03-06 0:41 ` Junio C Hamano 2007-03-06 2:13 ` Johannes Schindelin 2007-03-06 0:41 ` [PATCH 4/4] git-bundle: --list-prereqs Junio C Hamano 2 siblings, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 0:41 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List verify_bundle() returned with an error early only when all prerequisite commits were missing. It should error out much earlier when some are missing. When the rev-list is limited in ways other than revision range (e.g. --max-count or --max-age), create_bundle() listed all positive refs given from the command line as if they are available, but resulting pack may not have some of them. Add a logic to make sure all of them are included, and error out otherwise. Signed-off-by: Junio C Hamano <junkio@cox.net> --- builtin-bundle.c | 46 +++++++++++++++++++++++++++++++++------------- 1 files changed, 33 insertions(+), 13 deletions(-) diff --git a/builtin-bundle.c b/builtin-bundle.c index 0265845..9286f3d 100644 --- a/builtin-bundle.c +++ b/builtin-bundle.c @@ -189,7 +189,7 @@ static int verify_bundle(struct bundle_header *header) error(message); error("%s %s", sha1_to_hex(e->sha1), e->name); } - if (revs.pending.nr == 0) + if (revs.pending.nr != p->nr) return ret; req_nr = revs.pending.nr; setup_revisions(2, argv, &revs, NULL); @@ -274,6 +274,7 @@ static int create_bundle(struct bundle_header *header, const char *path, int pid, in, out, i, status; char buffer[1024]; struct rev_info revs; + struct object_array tips; bundle_fd = (!strcmp(path, "-") ? 1 : open(path, O_CREAT | O_WRONLY, 0666)); @@ -311,19 +312,23 @@ static int create_bundle(struct bundle_header *header, const char *path, argc = setup_revisions(argc, argv, &revs, NULL); if (argc > 1) return error("unrecognized argument: %s'", argv[1]); + + memset(&tips, 0, sizeof(tips)); for (i = 0; i < revs.pending.nr; i++) { struct object_array_entry *e = revs.pending.objects + i; - if (!(e->item->flags & UNINTERESTING)) { - unsigned char sha1[20]; - char *ref; - if (dwim_ref(e->name, strlen(e->name), sha1, &ref) != 1) - continue; - write_or_die(bundle_fd, sha1_to_hex(e->item->sha1), 40); - write_or_die(bundle_fd, " ", 1); - write_or_die(bundle_fd, ref, strlen(ref)); - write_or_die(bundle_fd, "\n", 1); - free(ref); - } + unsigned char sha1[20]; + char *ref; + + if (e->item->flags & UNINTERESTING) + continue; + if (dwim_ref(e->name, strlen(e->name), sha1, &ref) != 1) + continue; + write_or_die(bundle_fd, sha1_to_hex(e->item->sha1), 40); + write_or_die(bundle_fd, " ", 1); + write_or_die(bundle_fd, ref, strlen(ref)); + write_or_die(bundle_fd, "\n", 1); + add_object_array(e->item, e->name, &tips); + free(ref); } /* end header */ @@ -350,7 +355,22 @@ static int create_bundle(struct bundle_header *header, const char *path, return -1; if (!WIFEXITED(status) || WEXITSTATUS(status)) return error ("pack-objects died"); - return 0; + + /* + * Make sure the refs we wrote out is correct; --max-count and + * other limiting options could have prevented all the tips + * from getting output. + */ + status = 0; + for (i = 0; i < tips.nr; i++) { + if (!(tips.objects[i].item->flags & SHOWN)) { + status = 1; + error("%s: not included in the resulting pack", + tips.objects[i].name); + } + } + + return status; } static int unbundle(struct bundle_header *header, int bundle_fd, -- 1.5.0.3.862.g71037 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 3/4] git-bundle: various fixups 2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano @ 2007-03-06 2:13 ` Johannes Schindelin 0 siblings, 0 replies; 26+ messages in thread From: Johannes Schindelin @ 2007-03-06 2:13 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List Hi, On Mon, 5 Mar 2007, Junio C Hamano wrote: > verify_bundle() returned with an error early only when all > prerequisite commits were missing. It should error out much > earlier when some are missing. The idea was to list _all_ missing prerequisites so that git-bundle will not complain _again_, about different missing prerequisites, when the first set of missing prerequisites was obtained. The check only prevented looking for _reachable_ commits, when none of the prerequisites were found in the object database to begin with. > When the rev-list is limited in ways other than revision range (e.g. > --max-count or --max-age), create_bundle() listed all positive refs > given from the command line as if they are available, but resulting pack > may not have some of them. Add a logic to make sure all of them are > included, and error out otherwise. Good catch! Ciao, Dscho ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 4/4] git-bundle: --list-prereqs 2007-03-06 0:36 ` Junio C Hamano 2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano 2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano @ 2007-03-06 0:41 ` Junio C Hamano 2 siblings, 0 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-06 0:41 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Santi Béjar, Git Mailing List This is primarily of interest while debugging the bundle creation, but there may be other uses. Signed-off-by: Junio C Hamano <junkio@cox.net> --- * I am not sure if this should be a separate command, but the current command set did not have a way to review prereqs, short of: sed -ne '/^PACK/q' -e 'p' bundle.bdl It would make sense to have 'git bundle verify' output the heads and prereqs, maybe with --verbose, but I think doing that always might be even more helpful. builtin-bundle.c | 18 +++++++++++++++++- 1 files changed, 17 insertions(+), 1 deletions(-) diff --git a/builtin-bundle.c b/builtin-bundle.c index 9286f3d..4fe74a7 100644 --- a/builtin-bundle.c +++ b/builtin-bundle.c @@ -241,6 +241,19 @@ static int list_heads(struct bundle_header *header, int argc, const char **argv) return 0; } +static int list_prereqs(struct bundle_header *header, int argc, const char **argv) +{ + int i; + + struct ref_list *p = &header->prerequisites; + for (i = 0; i < p->nr; i++) { + printf("%s %s\n", + sha1_to_hex(p->list[i].sha1), + p->list[i].name); + } + return 0; +} + static void show_commit(struct commit *commit) { write_or_die(1, sha1_to_hex(commit->object.sha1), 40); @@ -432,6 +445,10 @@ int cmd_bundle(int argc, const char **argv, const char *prefix) close(bundle_fd); return !!list_heads(&header, argc, argv); } + if (!strcmp(cmd, "list-prereqs")) { + close(bundle_fd); + return !!list_prereqs(&header, argc, argv); + } if (!strcmp(cmd, "create")) { if (nongit) die("Need a repository to create a bundle."); @@ -443,4 +460,3 @@ int cmd_bundle(int argc, const char **argv, const char *prefix) } else usage(bundle_usage); } - -- 1.5.0.3.862.g71037 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] revision walker: Fix --boundary when limited 2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin 2007-03-05 19:00 ` Johannes Schindelin 2007-03-05 19:39 ` Linus Torvalds @ 2007-03-05 21:10 ` Junio C Hamano 2 siblings, 0 replies; 26+ messages in thread From: Junio C Hamano @ 2007-03-05 21:10 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Santi Béjar, Git Mailing List Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > In case revs->limited == 1, the revision walker really reads > everything into revs->commits. The behaviour introduced in > c4025103fa does not behave correctly in that case. > > It used to say: everything which is _still_ in the pipeline > must be a boundary commit. > > So, in the case that revs->limited == 1, filter out all commits > which are not dangling, in effect marking only the dangling > ones as boundary commits. That's what I suggested initially but I think that is wrong. We should show only unshown immediate parents of shown commits at that stage. ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2007-03-06 15:52 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-03-05 10:02 [BUG] git-rev-list: --topo-order --boundary and --max-count Santi Béjar 2007-03-05 10:39 ` Junio C Hamano 2007-03-05 12:15 ` Junio C Hamano 2007-03-05 17:05 ` Linus Torvalds 2007-03-05 18:55 ` [PATCH] revision walker: Fix --boundary when limited Johannes Schindelin 2007-03-05 19:00 ` Johannes Schindelin 2007-03-05 19:39 ` Linus Torvalds 2007-03-05 19:57 ` Linus Torvalds 2007-03-05 21:10 ` Junio C Hamano 2007-03-06 1:12 ` Johannes Schindelin 2007-03-06 1:32 ` Junio C Hamano 2007-03-06 1:44 ` Johannes Schindelin 2007-03-06 1:58 ` Junio C Hamano 2007-03-05 23:17 ` Johannes Schindelin 2007-03-06 0:36 ` Junio C Hamano 2007-03-06 0:41 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Junio C Hamano 2007-03-06 2:05 ` Johannes Schindelin 2007-03-06 2:17 ` Junio C Hamano 2007-03-06 2:23 ` SHOWN means shown Junio C Hamano 2007-03-06 2:29 ` [PATCH 2/4] revision traversal: retire BOUNDARY_SHOW Johannes Schindelin 2007-03-06 11:34 ` Junio C Hamano 2007-03-06 15:52 ` Johannes Schindelin 2007-03-06 0:41 ` [PATCH 3/4] git-bundle: various fixups Junio C Hamano 2007-03-06 2:13 ` Johannes Schindelin 2007-03-06 0:41 ` [PATCH 4/4] git-bundle: --list-prereqs Junio C Hamano 2007-03-05 21:10 ` [PATCH] revision walker: Fix --boundary when limited Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).