* [RFC] Add bad-branch-first option for git-bisect @ 2011-01-24 2:03 Shuang He 2011-01-24 2:05 ` [PATCH] add config option core.bisectbadbranchfirst Shuang He 2011-01-24 9:53 ` [RFC] Add bad-branch-first option for git-bisect Christian Couder 0 siblings, 2 replies; 14+ messages in thread From: Shuang He @ 2011-01-24 2:03 UTC (permalink / raw) To: git; +Cc: shuang.he Hi The default git-bisect algorithm will jump around the commit tree, on the purpose of taking least steps to find the first culprit commit. We may find it sometime would locate a old culprit commit that we're not concerned about anymore. In most software development, there's one or two main branch which is maintained for release, and a bunch of feature branches are created for new feature development or bug fix. For the reason that sometime git-bisect will locate a old culprit commit would be: 1. Quality of those branches may not match the main branch, some functionality are broken at first and fixed later on the feature branch. If git-bisect jump to there by chance, git-bisect will only find that old culprit commit which only exists on that feature branch 2. Some of those branches may not synchronized with main branch in time. Say feature1 is broken when feature2 branch is created, and feature1 is fixed just a moment later after feature2 branch is created, and when feature2's development is done, and developer want to merge feature2 branch back to master branch, feature2 will be firstly synchronized to master branch tip, then merge into master. For the same reason addressed in issue 1, this will also lead git-bisect into wrong direction. In all, we think we do not care about branches that we're not currently working, unless we're sure the regression is caused by that branch. To address those issue, we propose to add a new config option: core.bisectbadbranchfirst:: With this algorithm, git-bisect will always try to select commits that on the same branch current bad commit sits. And will fall back to default git-bisect algorithm when bad-branch-first algorithm does not apply + This setting defaults to "false". The draft patch will be sent out in a later email, so it could be reviewed inline. Any question or suggestion is welcome :-) Thanks --Shuang ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] add config option core.bisectbadbranchfirst 2011-01-24 2:03 [RFC] Add bad-branch-first option for git-bisect Shuang He @ 2011-01-24 2:05 ` Shuang He 2011-01-24 9:53 ` [RFC] Add bad-branch-first option for git-bisect Christian Couder 1 sibling, 0 replies; 14+ messages in thread From: Shuang He @ 2011-01-24 2:05 UTC (permalink / raw) To: git; +Cc: Shuang He which enable recursive bad-branch-first algorithm for git-bisect. With this algorithm, git-bisect will always try to select commits that on the same branch current bad commit sits. And will fall back to default git-bisect algorithm when bad-branch-first algorithm does not apply Signed-off-by: Shuang He <shuang.he@intel.com> --- Documentation/config.txt | 8 ++ bisect.c | 244 +++++++++++++++++++++++++++++++++++++++++----- bisect.h | 2 +- builtin/rev-list.c | 2 +- cache.h | 1 + config.c | 6 + environment.c | 1 + 7 files changed, 238 insertions(+), 26 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index ff7c225..8502859 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -558,6 +558,14 @@ core.sparseCheckout:: Enable "sparse checkout" feature. See section "Sparse checkout" in linkgit:git-read-tree[1] for more information. +core.bisectbadbranchfirst:: + With this algorithm, git-bisect will always try to select commits + that on the same branch current bad commit sits. And will fall back + to default git-bisect algorithm when bad-branch-first algorithm does + not apply ++ +This setting defaults to "false". + add.ignore-errors:: add.ignoreErrors:: Tells 'git add' to continue adding files when some files cannot be diff --git a/bisect.c b/bisect.c index 060c042..57c410d 100644 --- a/bisect.c +++ b/bisect.c @@ -189,6 +189,46 @@ static struct commit_list *best_bisection(struct commit_list *list, int nr) return best; } +static struct commit_list *best_bisection_first_parent(struct commit_list *list, struct commit_list *list2, int nr) +{ + struct commit_list *p, *best; + struct commit_list *p2; + int best_distance = -1; + + best = NULL; + for (p2 = list2; p2; p2 = p2->next) { + int distance; + int on_branch; + + on_branch = 0; + for (p = list; p; p = p->next) { + unsigned flags = p->item->object.flags; + if (flags & TREESAME) + continue; + if (!hashcmp(p->item->object.sha1, p2->item->object.sha1)) { + on_branch = 1; + break; + } + } + + if (!on_branch) + continue; + + weight_set(p2, weight(p)); + distance = weight(p); + if (nr - distance < distance) + distance = nr - distance; + if (distance > best_distance) { + best = p; + best_distance = distance; + } + + } + + return list2; +} + + struct commit_dist { struct commit *commit; int distance; @@ -253,7 +293,7 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n * unknown. After running count_distance() first, they will get zero * or positive distance. */ -static struct commit_list *do_find_bisection(struct commit_list *list, +static struct commit_list *do_find_bisection(struct commit_list *list, struct commit_list *list2, int nr, int *weights, int find_all) { @@ -314,7 +354,7 @@ static struct commit_list *do_find_bisection(struct commit_list *list, clear_distance(list); /* Does it happen to be at exactly half-way? */ - if (!find_all && halfway(p, nr)) + if (!list2 && !find_all && halfway(p, nr)) return p; counted++; } @@ -352,20 +392,27 @@ static struct commit_list *do_find_bisection(struct commit_list *list, weight_set(p, weight(q)); /* Does it happen to be at exactly half-way? */ - if (!find_all && halfway(p, nr)) + if (!list2 && !find_all && halfway(p, nr)) return p; } } show_list("bisection 2 counted all", counted, nr, list); + if (list2) { + struct commit_list *t; + t = best_bisection_first_parent(list, list2, nr); + t = best_bisection_sorted(t, nr); + return t; + } + if (!find_all) return best_bisection(list, nr); else return best_bisection_sorted(list, nr); } -struct commit_list *find_bisection(struct commit_list *list, +struct commit_list *find_bisection(struct commit_list *list, struct commit_list *list2, int *reaches, int *all, int find_all) { @@ -400,7 +447,7 @@ struct commit_list *find_bisection(struct commit_list *list, weights = xcalloc(on_list, sizeof(*weights)); /* Do the real work of finding bisection commit. */ - best = do_find_bisection(list, nr, weights, find_all); + best = do_find_bisection(list, list2, nr, weights, find_all); if (best) { if (!find_all) best->next = NULL; @@ -683,6 +730,33 @@ static void bisect_rev_setup(struct rev_info *revs, const char *prefix, setup_revisions(rev_argv.argv_nr, rev_argv.argv, revs, NULL); } +static void bisect_rev_setup_first_parent(struct rev_info *revs, const char *prefix, + const char *bad_format, const char *good_format, + int read_paths) +{ + struct argv_array rev_argv = { NULL, 0, 0 }; + int i; + + init_revisions(revs, prefix); + revs->abbrev = 0; + revs->commit_format = CMIT_FMT_UNSPECIFIED; + + /* rev_argv.argv[0] will be ignored by setup_revisions */ + argv_array_push(&rev_argv, xstrdup("bisect_rev_setup_first_parent")); + argv_array_push_sha1(&rev_argv, current_bad_sha1, bad_format); + for (i = 0; i < good_revs.sha1_nr; i++) + argv_array_push_sha1(&rev_argv, good_revs.sha1[i], + good_format); + argv_array_push(&rev_argv, xstrdup("--first-parent")); + argv_array_push(&rev_argv, xstrdup("--")); + if (read_paths) + read_bisect_paths(&rev_argv); + argv_array_push(&rev_argv, NULL); + + setup_revisions(rev_argv.argv_nr, rev_argv.argv, revs, NULL); +} + + static void bisect_common(struct rev_info *revs) { if (prepare_revision_walk(revs)) @@ -944,6 +1018,24 @@ static void show_diff_tree(const char *prefix, struct commit *commit) log_tree_commit(&opt, commit); } +int is_merge_commit(struct commit_list *entry) { + int nr = 0; + struct commit *commit; + struct commit_list *p; + + if (entry) { + p = entry->item->parents; + + while (p) { + nr += 1; + p = p->next; + } + } + + return nr - 1; +} + + /* * We use the convention that exiting with an exit code 10 means that * the bisection process finished successfully. @@ -952,41 +1044,145 @@ static void show_diff_tree(const char *prefix, struct commit *commit) int bisect_next_all(const char *prefix) { struct rev_info revs; + struct rev_info first_parent_revs; struct commit_list *tried; int reaches = 0, all = 0, nr, steps; + struct object_array pending_copy; + struct object_array pending_copy2; const unsigned char *bisect_rev; + int i; char bisect_rev_hex[41]; + int bad_branch_first_working = 1; if (read_bisect_refs()) die("reading bisect refs failed"); check_good_are_ancestors_of_bad(prefix); + struct argv_array rev_argv = { NULL, 0, 0 }; + + read_bisect_paths(&rev_argv); + if (rev_argv.argv_nr > 0) + core_bisect_bad_branch_first = 0; + + if (core_bisect_bad_branch_first) { + bisect_rev_setup_first_parent(&first_parent_revs, prefix, "%s", "^%s", 1); + memset(&pending_copy, 0, sizeof(pending_copy)); + for (i = 0; i < first_parent_revs.pending.nr; i++) + add_object_array(first_parent_revs.pending.objects[i].item, + first_parent_revs.pending.objects[i].name, + &pending_copy); + + bisect_common(&first_parent_revs); + + /* Clean up objects used, as they will be reused. */ + for (i = 0; i < pending_copy.nr; i++) { + struct object *o = pending_copy.objects[i].item; + clear_commit_marks((struct commit *)o, ALL_REV_FLAGS); + } + } - bisect_rev_setup(&revs, prefix, "%s", "^%s", 1); - revs.limited = 1; - bisect_common(&revs); - revs.commits = find_bisection(revs.commits, &reaches, &all, - !!skipped_revs.sha1_nr); - revs.commits = managed_skipped(revs.commits, &tried); + if (core_bisect_bad_branch_first) { + bisect_rev_setup(&revs, prefix, "%s", "^%s", 1); + memset(&pending_copy2, 0, sizeof(pending_copy2)); + for (i = 0; i < revs.pending.nr; i++) + add_object_array(revs.pending.objects[i].item, + revs.pending.objects[i].name, + &pending_copy2); + + revs.limited = 1; + + bisect_common(&revs); + + revs.commits = find_bisection(revs.commits, first_parent_revs.commits, &reaches, &all, + !!skipped_revs.sha1_nr); + /* TODO: Check if this is a first bad commit on bad branch, and if it's a merge commit */ + /* If this is a merge commit, we need to try all of its parents, until we found the merged bad branch */ + /* If all parents are good, we just announce it as first bad commit */ + if (!hashcmp(revs.commits->item->object.sha1, current_bad_sha1)) { + printf("%s is the first bad commit\n", sha1_to_hex(current_bad_sha1)); + + if (is_merge_commit(revs.commits)) { + int all_parents_good = 1; + struct commit_list *p; + + printf("%s is a merge commit\n", sha1_to_hex(current_bad_sha1)); + printf("need to try each merged branch\n", sha1_to_hex(current_bad_sha1)); + p = revs.commits->item->parents; + + while (p) { + if (!hashcmp(p->item->object.sha1, current_bad_sha1)) { + fprintf(stderr, "should not get here\n"); + exit(1); + } else if (0 <= lookup_sha1_array(&good_revs, p->item->object.sha1)) { + fprintf(stderr, "good parent %s\n", sha1_to_hex(p->item->object.sha1)); + } else if (0 <= lookup_sha1_array(&skipped_revs, p->item->object.sha1)) { + fprintf(stderr, "skipped parent %s\n", sha1_to_hex(p->item->object.sha1)); + all_parents_good = 0; + } else { + printf("try merged branch %s\n", sha1_to_hex(p->item->object.sha1)); + exit(bisect_checkout(sha1_to_hex(p->item->object.sha1))); + } + p = p->next; + } + + if (all_parents_good) { + show_diff_tree(prefix, revs.commits->item); + exit(10); + } + } + else { + show_diff_tree(prefix, revs.commits->item); + exit(10); + } + } + + revs.commits = managed_skipped(revs.commits, &tried); - if (!revs.commits) { - /* - * We should exit here only if the "bad" - * commit is also a "skip" commit. - */ - exit_if_skipped_commits(tried, NULL); + if (!revs.commits || !all) { + bad_branch_first_working = 0; + fprintf(stderr, "fall back to default git-bisect algorithm\n"); + } + else + fprintf(stderr, "proceed with core_bisect_bad_branch_first algorithm\n"); - printf("%s was both good and bad\n", - sha1_to_hex(current_bad_sha1)); - exit(1); + /* Clean up objects used, as they will be reused. */ + for (i = 0; i < pending_copy2.nr; i++) { + struct object *o = pending_copy2.objects[i].item; + clear_commit_marks((struct commit *)o, ALL_REV_FLAGS); + } } - if (!all) { - fprintf(stderr, "No testable commit found.\n" - "Maybe you started with bad path parameters?\n"); - exit(4); + + if (!core_bisect_bad_branch_first || !bad_branch_first_working) { + bisect_rev_setup(&revs, prefix, "%s", "^%s", 1); + revs.limited = 1; + + bisect_common(&revs); + show_list("fall back", 0, 0, revs.commits); + + revs.commits = find_bisection(revs.commits, NULL, &reaches, &all, + !!skipped_revs.sha1_nr); + revs.commits = managed_skipped(revs.commits, &tried); + + if (!revs.commits) { + /* + * We should exit here only if the "bad" + * commit is also a "skip" commit. + */ + exit_if_skipped_commits(tried, NULL); + + printf("%s was both good and bad\n", + sha1_to_hex(current_bad_sha1)); + exit(1); + } + + if (!all) { + fprintf(stderr, "No testable commit found.\n" + "Maybe you started with bad path parameters?\n"); + exit(4); + } } bisect_rev = revs.commits->item->object.sha1; diff --git a/bisect.h b/bisect.h index 0862ce5..fad4fbe 100644 --- a/bisect.h +++ b/bisect.h @@ -1,7 +1,7 @@ #ifndef BISECT_H #define BISECT_H -extern struct commit_list *find_bisection(struct commit_list *list, +extern struct commit_list *find_bisection(struct commit_list *list, struct commit_list *list2, int *reaches, int *all, int find_all); diff --git a/builtin/rev-list.c b/builtin/rev-list.c index ba27d39..ab56f6b 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -399,7 +399,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) if (bisect_list) { int reaches = reaches, all = all; - revs.commits = find_bisection(revs.commits, &reaches, &all, + revs.commits = find_bisection(revs.commits, NULL, &reaches, &all, bisect_find_all); if (bisect_show_vars) diff --git a/cache.h b/cache.h index d83d68c..bfd4448 100644 --- a/cache.h +++ b/cache.h @@ -559,6 +559,7 @@ extern int read_replace_refs; extern int fsync_object_files; extern int core_preload_index; extern int core_apply_sparse_checkout; +extern int core_bisect_bad_branch_first; enum safe_crlf { SAFE_CRLF_FALSE = 0, diff --git a/config.c b/config.c index 625e051..b37a70c 100644 --- a/config.c +++ b/config.c @@ -660,6 +660,12 @@ static int git_default_core_config(const char *var, const char *value) return 0; } + if (!strcmp(var, "core.bisectbadbranchfirst")) { + core_bisect_bad_branch_first = git_config_bool(var, value); + return 0; + } + + /* Add other config variables here and to Documentation/config.txt. */ return 0; } diff --git a/environment.c b/environment.c index 9564475..cf813af 100644 --- a/environment.c +++ b/environment.c @@ -55,6 +55,7 @@ enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE; char *notes_ref_name; int grafts_replace_parents = 1; int core_apply_sparse_checkout; +int core_bisect_bad_branch_first = 0; struct startup_info *startup_info; /* Parallel index stat data preload? */ -- 1.7.4.rc2.21.g7f7a8 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 2:03 [RFC] Add bad-branch-first option for git-bisect Shuang He 2011-01-24 2:05 ` [PATCH] add config option core.bisectbadbranchfirst Shuang He @ 2011-01-24 9:53 ` Christian Couder 2011-01-24 10:30 ` Shuang He 2011-01-24 20:28 ` Avery Pennarun 1 sibling, 2 replies; 14+ messages in thread From: Christian Couder @ 2011-01-24 9:53 UTC (permalink / raw) To: Shuang He; +Cc: git, apenwarr Hi, On Mon, Jan 24, 2011 at 3:03 AM, Shuang He <shuang.he@intel.com> wrote: > Hi > The default git-bisect algorithm will jump around the commit tree, > on the purpose of taking least steps to find the first culprit commit. > We may find it sometime would locate a old culprit commit that we're not > concerned about anymore. Yes, it can be a problem. > In most software development, there's one or two main branch which > is maintained for release, and a bunch of feature branches are created > for new feature development or bug fix. For the reason that sometime > git-bisect will locate a old culprit commit would be: > 1. Quality of those branches may not match the main branch, > some functionality are broken at first and fixed later on the feature > branch. If git-bisect jump to there by chance, git-bisect will only find > that old > culprit commit which only exists on that feature branch If the quality of these branches is too bad, I think they should not have been merged in the first place. If they are not merged (and not marked as good), then git bisect will not look at them, since it will look only at commits that are ancestors of the bad commit it is given. Or if one is merged but it causes too many problems, then perhaps a replacement commit could be used to unmerge the branch. Another possibility is to have in a file a list of commits that are the last commits on these branches before the merge commits, and do a: git bisect good $(cat good_commits_file.txt) at the beginning of each bisection. So I think the long term solution in this case is not what your are suggesting. > 2. Some of those branches may not synchronized with main > branch in time. Say feature1 is broken when feature2 branch is created, and > feature1 is fixed just a moment later after feature2 branch is created, > and when feature2's development is done, and developer want to merge > feature2 branch back to master branch, feature2 will be firstly > synchronized to master branch tip, then merge into master. For the same > reason addressed in issue 1, this will also lead git-bisect into wrong > direction. I am not sure what you mean by " feature2 will be firstly synchronized to master branch tip", and I think this should mean a rebase that would fix the bug if feature1 has already been merged into the master branch. But anyway in this case, I think that git bisect will find that the first bad commit is the last commit in the branch, just before it was merged. And by looking at the branch graph it should be quite easy to understand what happened. And then the obvious thing to do is to decide to just start a new bisection like it was started the first time but with an added "git bisect good <merge_commit>", where <merge_commit> is the commit that merges the branch. > In all, we think we do not care about branches that we're not > currently working, unless we're sure the regression is caused by that > branch. > > To address those issue, we propose to add a new config option: > core.bisectbadbranchfirst:: > With this algorithm, git-bisect will always try to select > commits > that on the same branch current bad commit sits. And will > fall back > to default git-bisect algorithm when bad-branch-first > algorithm does > not apply > + > This setting defaults to "false". I am not opposed to an option to bisect on the first parents of the bad commit only. And after a very fast look at your patch it seems to be what it does. By the way Avery Pennarun's gitbuilder (https://github.com/apenwarr/gitbuilder) does the same thing. So I know some people are interested in such a feature. But here are some suggestions/comments: - your explanations about why it could be useful should be improved, - the name "bisectbadbranchfirst" seems wrong to me, because git branches are just some special tags; "firstparentsonly" would be a better name, - before having a config option for it, why not have it first as an option to "git bisect start", - if there is a config option, then there should probably be an option to "git bisect start" to use the regular algorithm, - it seems to me that bisecting only on first parents could fail only if some "good" commits are not ancestors of the bad commit, and if the bug is fixed on a branch where such a commit is; so I don't see the point in defaulting to the usual algorithm in this case because in this case the usual algorithm just stops, - perhaps the message given when the first bad commit is found should be changed a little bit when this option is used. Maybe I am missing something, because as I said I didn't read your patch carefully, but in this case please try to give a concrete example. Thanks, Christian. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 9:53 ` [RFC] Add bad-branch-first option for git-bisect Christian Couder @ 2011-01-24 10:30 ` Shuang He 2011-01-24 10:50 ` Johannes Sixt 2011-01-25 9:20 ` Christian Couder 2011-01-24 20:28 ` Avery Pennarun 1 sibling, 2 replies; 14+ messages in thread From: Shuang He @ 2011-01-24 10:30 UTC (permalink / raw) To: Christian Couder; +Cc: git@vger.kernel.org, apenwarr@gmail.com On 2011/1/24 17:53, Christian Couder wrote: > Hi, > > On Mon, Jan 24, 2011 at 3:03 AM, Shuang He<shuang.he@intel.com> wrote: >> Hi >> The default git-bisect algorithm will jump around the commit tree, >> on the purpose of taking least steps to find the first culprit commit. >> We may find it sometime would locate a old culprit commit that we're not >> concerned about anymore. > Yes, it can be a problem. I'm honored to be given so much comment :) Thank you >> In most software development, there's one or two main branch which >> is maintained for release, and a bunch of feature branches are created >> for new feature development or bug fix. For the reason that sometime >> git-bisect will locate a old culprit commit would be: >> 1. Quality of those branches may not match the main branch, >> some functionality are broken at first and fixed later on the feature >> branch. If git-bisect jump to there by chance, git-bisect will only find >> that old >> culprit commit which only exists on that feature branch > If the quality of these branches is too bad, I think they should not > have been merged in the first place. > If they are not merged (and not marked as good), then git bisect will > not look at them, since it will look only at commits that are > ancestors of the bad commit it is given. > > Or if one is merged but it causes too many problems, then perhaps a > replacement commit could be used to unmerge the branch. > > Another possibility is to have in a file a list of commits that are > the last commits on these branches before the merge commits, and do a: > > git bisect good $(cat good_commits_file.txt) > > at the beginning of each bisection. > > So I think the long term solution in this case is not what your are suggesting. Yeah, I agree that the issue I addressed above will not be a problem if all those branches are maintained very well. Actually we've implemented a automated bisect system for Intel Linux Graphics Driver Project, and so we'd like the system helps us to locate issue in an more automatic way when branches are not maintained as good as expected. >> 2. Some of those branches may not synchronized with main >> branch in time. Say feature1 is broken when feature2 branch is created, and >> feature1 is fixed just a moment later after feature2 branch is created, >> and when feature2's development is done, and developer want to merge >> feature2 branch back to master branch, feature2 will be firstly >> synchronized to master branch tip, then merge into master. For the same >> reason addressed in issue 1, this will also lead git-bisect into wrong >> direction. > I am not sure what you mean by " feature2 will be firstly synchronized > to master branch tip", and I think this should mean a rebase that > would fix the bug if feature1 has already been merged into the master > branch. > > But anyway in this case, I think that git bisect will find that the > first bad commit is the last commit in the branch, just before it was > merged. And by looking at the branch graph it should be quite easy to > understand what happened. > > And then the obvious thing to do is to decide to just start a new > bisection like it was started the first time but with an added "git > bisect good<merge_commit>", where<merge_commit> is the commit that > merges the branch. For the same reason, that we're implementing automated bisect system , so we want git-bisect to be able to help with this condition also. >> In all, we think we do not care about branches that we're not >> currently working, unless we're sure the regression is caused by that >> branch. >> >> To address those issue, we propose to add a new config option: >> core.bisectbadbranchfirst:: >> With this algorithm, git-bisect will always try to select >> commits >> that on the same branch current bad commit sits. And will >> fall back >> to default git-bisect algorithm when bad-branch-first >> algorithm does >> not apply >> + >> This setting defaults to "false". > I am not opposed to an option to bisect on the first parents of the > bad commit only. And after a very fast look at your patch it seems to > be what it does. By the way Avery Pennarun's gitbuilder > (https://github.com/apenwarr/gitbuilder) does the same thing. So I > know some people are interested in such a feature. > > But here are some suggestions/comments: > > - your explanations about why it could be useful should be improved, I don't have much data to prove it. I could just say it could help if branches are not maintained very well. > - the name "bisectbadbranchfirst" seems wrong to me, because git > branches are just some special tags; "firstparentsonly" would be a > better name, It's recursively applying bad branch first algorithm, not just constantly stick to first parent. Given this condition: A -> B -> C -> D -> E -> F -> G -> H (master) \ a -> b -> c -> d -> e / (feature 1) \ x -> y -> z/ (feature 2) start with H as bad commit, and A as good commit, if y is the target bad commit. bad-branch-first algorithm will do it like this: 1. In first round stick to master branch, so it will locate G as first bad commit 2. In second round stick to feature1 branch, then it will locate d as first bad commit 3. In third round stick to feature2 branch, then it will finally locate y as first bad commit So you could see, it's always sticking to branch where current bad commit sit > - before having a config option for it, why not have it first as an > option to "git bisect start", Agree, I have thought about it. Haven't done it yet > - if there is a config option, then there should probably be an option > to "git bisect start" to use the regular algorithm, Agree > - it seems to me that bisecting only on first parents could fail only > if some "good" commits are not ancestors of the bad commit, and if the > bug is fixed on a branch where such a commit is; so I don't see the > point in defaulting to the usual algorithm in this case because in > this case the usual algorithm just stops, It should stop as default git-bisect do, when There are a few cases that this algorithm will fall back to default algorithm: when bisect path is specified. Since --first-parent seems not working as expected when file path is specified when bad branch first algorithm could not find a commit to bisect > - perhaps the message given when the first bad commit is found should > be changed a little bit when this option is used. Yeah, agree > Maybe I am missing something, because as I said I didn't read your > patch carefully, but in this case please try to give a concrete > example. > > Thanks, > Christian. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 10:30 ` Shuang He @ 2011-01-24 10:50 ` Johannes Sixt 2011-01-24 11:05 ` Shuang He 2011-01-25 9:20 ` Christian Couder 1 sibling, 1 reply; 14+ messages in thread From: Johannes Sixt @ 2011-01-24 10:50 UTC (permalink / raw) To: Shuang He; +Cc: Christian Couder, git@vger.kernel.org, apenwarr@gmail.com Am 1/24/2011 11:30, schrieb Shuang He: > It's recursively applying bad branch first algorithm, not just constantly > stick to first parent. > Given this condition: > A -> B -> C -> D -> E -> F -> G -> H (master) > \ a -> b -> c -> d -> e / (feature 1) > \ x -> y -> z/ (feature 2) > start with H as bad commit, and A as good commit, if y is the target bad > commit. bad-branch-first algorithm will do it like this: > 1. In first round stick to master branch, so it will locate G as first > bad commit > 2. In second round stick to feature1 branch, then it will locate d as > first bad commit > 3. In third round stick to feature2 branch, then it will finally > locate y as first bad commit > So you could see, it's always sticking to branch where current bad commit sit Ok, so you explain what your algorithm does. But you did not illustrate your problem. The history above is ordinary, somewhat branchy, has *ONE* commit that introduces a regression, and *NO* commit that fixes the regression. But in your rationale you said something about "feature1 is fixed just a moment later after feature2 branch is created". How does this fit into the picture, where is the problem, and how does your algorithm solve it? -- Hannes ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 10:50 ` Johannes Sixt @ 2011-01-24 11:05 ` Shuang He 2011-01-24 20:04 ` Junio C Hamano 0 siblings, 1 reply; 14+ messages in thread From: Shuang He @ 2011-01-24 11:05 UTC (permalink / raw) To: Johannes Sixt; +Cc: Christian Couder, git@vger.kernel.org, apenwarr@gmail.com On 2011/1/24 18:50, Johannes Sixt wrote: > Am 1/24/2011 11:30, schrieb Shuang He: >> It's recursively applying bad branch first algorithm, not just constantly >> stick to first parent. >> Given this condition: >> A -> B -> C -> D -> E -> F -> G -> H (master) >> \ a -> b -> c -> d -> e / (feature 1) >> \ x -> y -> z/ (feature 2) >> start with H as bad commit, and A as good commit, if y is the target bad >> commit. bad-branch-first algorithm will do it like this: >> 1. In first round stick to master branch, so it will locate G as first >> bad commit >> 2. In second round stick to feature1 branch, then it will locate d as >> first bad commit >> 3. In third round stick to feature2 branch, then it will finally >> locate y as first bad commit >> So you could see, it's always sticking to branch where current bad commit sit > Ok, so you explain what your algorithm does. > > But you did not illustrate your problem. The history above is ordinary, > somewhat branchy, has *ONE* commit that introduces a regression, and *NO* > commit that fixes the regression. But in your rationale you said something > about "feature1 is fixed just a moment later after feature2 branch is > created". How does this fit into the picture, where is the problem, and > how does your algorithm solve it? > > -- Hannes If A is bad commit, and C fixed it, and then F is bad again, A -> B -> C -> D -> E -> F -> G -> H (master) \ \ / a -> b... c -> d -> e->f (feature 1) Start with H as bad commit, and D as good commit, it's possible git-bisect would jump to c, and it will lead to wrong direction If bad-branch-first is used, it would be: 1. first round found F 2. end Thanks --Shuang Thanks --Shuang ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 11:05 ` Shuang He @ 2011-01-24 20:04 ` Junio C Hamano 2011-01-25 3:27 ` Shuang He 0 siblings, 1 reply; 14+ messages in thread From: Junio C Hamano @ 2011-01-24 20:04 UTC (permalink / raw) To: Shuang He Cc: Johannes Sixt, Christian Couder, git@vger.kernel.org, apenwarr@gmail.com Shuang He <shuang.he@intel.com> writes: > If A is bad commit, and C fixed it, and then F is bad again, > > A -> B -> C -> D -> E -> F -> G -> H (master) > \ \ / > a -> b... c -> d -> e->f (feature 1) > > Start with H as bad commit, and D as good commit, it's possible git-bisect would jump to c, and it will lead to wrong direction > > If bad-branch-first is used, it would be: > 1. first round found F > 2. end It is unclear from the way you drew the picture if "F" is supposed to be a merge of "E" and "f", but I'd assume that it is. So what you are saying in 1. is "skip from H until you hit a first merge (without testing any intermediate commit), find F and stop to check it, and find that it is broken". What makes you decide "2. end"? The fact that both of its parents "E" and "f" are Ok? IOW, it won't be "2. end" if one of the parents of the merge is broken? What if there is _no_ merge from a side branch but there were breakages in A (fixed in C) and then F in your original picture, i.e. A---B---C---D---E---F---G---H (broken) x o x and you are hunting for the bug starting from H? How does your algorithm help? I grossed over the linear part by saying "skip from H until you hit a first merge", but in general, what is your plan to handle linear part of the history? A totally unacceptable answer is "It does not help linear case, but it helps when there are merges". The a-thru-f side branch in your picture, or any "culprit side branch that was merged" your algorithm finds in general, would eventually have a linear segment, and having x-o-x in the history fundmentally breaks "bisect"---your band-aid will not help. The whole idea behind using "bisect" to gain efficiency in isolating the issue depends on "Once you see a Good commit, you do not have to search beyond its ancestors", as it is to look for a single breakage that persists to the "Bad" commit you give, and as far as "bisect" is concerned, the breakage at A in your example is an unrelated breakage that did not persist through the history to the "Bad" commit H. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 20:04 ` Junio C Hamano @ 2011-01-25 3:27 ` Shuang He 0 siblings, 0 replies; 14+ messages in thread From: Shuang He @ 2011-01-25 3:27 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Sixt, Christian Couder, git@vger.kernel.org, apenwarr@gmail.com On 2011/1/25 4:04, Junio C Hamano wrote: > Shuang He<shuang.he@intel.com> writes: > >> If A is bad commit, and C fixed it, and then F is bad again, >> >> A -> B -> C -> D -> E -> F -> G -> H (master) >> \ \ / >> a -> b... c -> d -> e->f (feature 1) >> >> Start with H as bad commit, and D as good commit, it's possible git-bisect would jump to c, and it will lead to wrong direction >> >> If bad-branch-first is used, it would be: >> 1. first round found F >> 2. end > It is unclear from the way you drew the picture if "F" is supposed to be a > merge of "E" and "f", but I'd assume that it is. Oh, I lost this mail That graph is different from what I meant, when shown in different email client. It's G which is merged from e and F > So what you are saying in 1. is "skip from H until you hit a first merge > (without testing any intermediate commit), find F and stop to check it, > and find that it is broken". > > What makes you decide "2. end"? The fact that both of its parents "E" and > "f" are Ok? IOW, it won't be "2. end" if one of the parents of the merge > is broken? I think the correction above should have answer those two questions. > What if there is _no_ merge from a side branch but there were breakages in > A (fixed in C) and then F in your original picture, i.e. > > A---B---C---D---E---F---G---H (broken) > x o x > > and you are hunting for the bug starting from H? How does your algorithm > help? I grossed over the linear part by saying "skip from H until you hit > a first merge", but in general, what is your plan to handle linear part of > the history? If the history is linear, the new algorithm won't help, it will just behavior like default git-bisect algorithm. > A totally unacceptable answer is "It does not help linear case, but it > helps when there are merges". The a-thru-f side branch in your picture, > or any "culprit side branch that was merged" your algorithm finds in > general, would eventually have a linear segment, and having x-o-x in the > history fundmentally breaks "bisect"---your band-aid will not help. > > The whole idea behind using "bisect" to gain efficiency in isolating the > issue depends on "Once you see a Good commit, you do not have to search > beyond its ancestors", as it is to look for a single breakage that > persists to the "Bad" commit you give, and as far as "bisect" is > concerned, the breakage at A in your example is an unrelated breakage that > did not persist through the history to the "Bad" commit H. In the example above (after we know G is merged from e and F), Those commits are old bad commit: A, B, a, b, ..., c, d (but we don't care about those old bad commits, we cared about latest bad commit that we met which is F) It's possible that default git-bisect would jump to these old bad commits, and will finally find an old first bad commit With bad-branch-first, it could help us to get away from the trouble that old culprit commit exist on feature1 branch for a period of time not fixed Thanks --Shuang ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 10:30 ` Shuang He 2011-01-24 10:50 ` Johannes Sixt @ 2011-01-25 9:20 ` Christian Couder 2011-01-26 7:22 ` Shuang He 1 sibling, 1 reply; 14+ messages in thread From: Christian Couder @ 2011-01-25 9:20 UTC (permalink / raw) To: Shuang He Cc: git@vger.kernel.org, apenwarr@gmail.com, Junio C Hamano, Johannes Sixt On Mon, Jan 24, 2011 at 11:30 AM, Shuang He <shuang.he@intel.com> wrote: > On 2011/1/24 17:53, Christian Couder wrote: >> >> Hi, >> >> On Mon, Jan 24, 2011 at 3:03 AM, Shuang He<shuang.he@intel.com> wrote: >>> >>> Hi >>> The default git-bisect algorithm will jump around the commit tree, >>> on the purpose of taking least steps to find the first culprit commit. >>> We may find it sometime would locate a old culprit commit that we're not >>> concerned about anymore. >> >> Yes, it can be a problem. > > I'm honored to be given so much comment :) > Thank you I am honored by your interest in git bisect and the fact that you provided a patch :-) Thanks! >> If the quality of these branches is too bad, I think they should not >> have been merged in the first place. >> If they are not merged (and not marked as good), then git bisect will >> not look at them, since it will look only at commits that are >> ancestors of the bad commit it is given. >> >> Or if one is merged but it causes too many problems, then perhaps a >> replacement commit could be used to unmerge the branch. >> >> Another possibility is to have in a file a list of commits that are >> the last commits on these branches before the merge commits, and do a: >> >> git bisect good $(cat good_commits_file.txt) >> >> at the beginning of each bisection. >> >> So I think the long term solution in this case is not what your are >> suggesting. > > Yeah, I agree that the issue I addressed above will not be a problem if all > those branches are maintained very well. > Actually we've implemented a automated bisect system for Intel Linux > Graphics Driver Project, and so we'd like the system > helps us to locate issue in an more automatic way when branches are not > maintained as good as expected. I think there is always a price to pay when you bisect if the branches are not well maintained. Maybe your algorithm could help in some cases, but my opinion is that there will probably still be many problems and a human will often have to take a look. >>> 2. Some of those branches may not synchronized with main >>> branch in time. Say feature1 is broken when feature2 branch is created, >>> and >>> feature1 is fixed just a moment later after feature2 branch is created, >>> and when feature2's development is done, and developer want to merge >>> feature2 branch back to master branch, feature2 will be firstly >>> synchronized to master branch tip, then merge into master. For the same >>> reason addressed in issue 1, this will also lead git-bisect into wrong >>> direction. >> >> I am not sure what you mean by " feature2 will be firstly synchronized >> to master branch tip", and I think this should mean a rebase that >> would fix the bug if feature1 has already been merged into the master >> branch. >> >> But anyway in this case, I think that git bisect will find that the >> first bad commit is the last commit in the branch, just before it was >> merged. And by looking at the branch graph it should be quite easy to >> understand what happened. Now I think I was wrong here, as git bisect will probably find that the first commit in the branch (not the last one) is the first bad commit. [...] >> - the name "bisectbadbranchfirst" seems wrong to me, because git >> branches are just some special tags; "firstparentsonly" would be a >> better name, > > It's recursively applying bad branch first algorithm, not just constantly > stick to first parent. > Given this condition: > A -> B -> C -> D -> E -> F -> G -> H (master) > \ a -> b -> c -> d -> e / (feature 1) > \ x -> y -> z/ (feature 2) > start with H as bad commit, and A as good commit, if y is the target bad > commit. bad-branch-first algorithm will do it like this: > 1. In first round stick to master branch, so it will locate G as first > bad commit > 2. In second round stick to feature1 branch, then it will locate d as > first bad commit > 3. In third round stick to feature2 branch, then it will finally locate y > as first bad commit > So you could see, it's always sticking to branch where current bad commit > sit I see. It is interesting, but why not develop a "firstparentsonly" algorithm first? As Avery explains in his email, it is already interesting to have a "firstparentsonly" algorithm because some people are only interested to know from which branch the bug comes from. When they know that, they can just contact the relevant people and be done with it. And when we have a "firstparentsonly" algorithm, then your algorithm could be just a script that repeatedly uses git bisect with the "firstparentsonly" algorithm. And this script might be integrated in the "contrib" directory if it not considered important to be integrated as an algorithm into git bisect. Thanks, Christian. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-25 9:20 ` Christian Couder @ 2011-01-26 7:22 ` Shuang He 2011-01-26 9:44 ` Christian Couder 0 siblings, 1 reply; 14+ messages in thread From: Shuang He @ 2011-01-26 7:22 UTC (permalink / raw) To: Christian Couder Cc: git@vger.kernel.org, apenwarr@gmail.com, Junio C Hamano, Johannes Sixt On 2011/1/25 17:20, Christian Couder wrote: > On Mon, Jan 24, 2011 at 11:30 AM, Shuang He<shuang.he@intel.com> wrote: >> On 2011/1/24 17:53, Christian Couder wrote: >>> Hi, >>> >>> On Mon, Jan 24, 2011 at 3:03 AM, Shuang He<shuang.he@intel.com> wrote: >>>> Hi >>>> The default git-bisect algorithm will jump around the commit tree, >>>> on the purpose of taking least steps to find the first culprit commit. >>>> We may find it sometime would locate a old culprit commit that we're not >>>> concerned about anymore. >>> Yes, it can be a problem. >> I'm honored to be given so much comment :) >> Thank you > I am honored by your interest in git bisect and the fact that you > provided a patch :-) > Thanks! I'm glad to see that git community is so hot. > >>> If the quality of these branches is too bad, I think they should not >>> have been merged in the first place. >>> If they are not merged (and not marked as good), then git bisect will >>> not look at them, since it will look only at commits that are >>> ancestors of the bad commit it is given. >>> >>> Or if one is merged but it causes too many problems, then perhaps a >>> replacement commit could be used to unmerge the branch. >>> >>> Another possibility is to have in a file a list of commits that are >>> the last commits on these branches before the merge commits, and do a: >>> >>> git bisect good $(cat good_commits_file.txt) >>> >>> at the beginning of each bisection. >>> >>> So I think the long term solution in this case is not what your are >>> suggesting. >> Yeah, I agree that the issue I addressed above will not be a problem if all >> those branches are maintained very well. >> Actually we've implemented a automated bisect system for Intel Linux >> Graphics Driver Project, and so we'd like the system >> helps us to locate issue in an more automatic way when branches are not >> maintained as good as expected. > I think there is always a price to pay when you bisect if the branches > are not well maintained. > Maybe your algorithm could help in some cases, but my opinion is that > there will probably still be many problems and a human will often have > to take a look. > Yes, I agree. What we trying to do is just make the machine to do more help for human. >>>> 2. Some of those branches may not synchronized with main >>>> branch in time. Say feature1 is broken when feature2 branch is created, >>>> and >>>> feature1 is fixed just a moment later after feature2 branch is created, >>>> and when feature2's development is done, and developer want to merge >>>> feature2 branch back to master branch, feature2 will be firstly >>>> synchronized to master branch tip, then merge into master. For the same >>>> reason addressed in issue 1, this will also lead git-bisect into wrong >>>> direction. >>> I am not sure what you mean by " feature2 will be firstly synchronized >>> to master branch tip", and I think this should mean a rebase that >>> would fix the bug if feature1 has already been merged into the master >>> branch. >>> >>> But anyway in this case, I think that git bisect will find that the >>> first bad commit is the last commit in the branch, just before it was >>> merged. And by looking at the branch graph it should be quite easy to >>> understand what happened. > Now I think I was wrong here, as git bisect will probably find that > the first commit in the branch (not the last one) is the first bad > commit. > > [...] > >>> - the name "bisectbadbranchfirst" seems wrong to me, because git >>> branches are just some special tags; "firstparentsonly" would be a >>> better name, >> It's recursively applying bad branch first algorithm, not just constantly >> stick to first parent. >> Given this condition: >> A -> B -> C -> D -> E -> F -> G -> H (master) >> \ a -> b -> c -> d -> e / (feature 1) >> \ x -> y -> z/ (feature 2) >> start with H as bad commit, and A as good commit, if y is the target bad >> commit. bad-branch-first algorithm will do it like this: >> 1. In first round stick to master branch, so it will locate G as first >> bad commit >> 2. In second round stick to feature1 branch, then it will locate d as >> first bad commit >> 3. In third round stick to feature2 branch, then it will finally locate y >> as first bad commit >> So you could see, it's always sticking to branch where current bad commit >> sit > I see. It is interesting, but why not develop a "firstparentsonly" > algorithm first? > > As Avery explains in his email, it is already interesting to have a > "firstparentsonly" algorithm because some people are only interested > to know from which branch the bug comes from. > When they know that, they can just contact the relevant people and be > done with it. > > And when we have a "firstparentsonly" algorithm, then your algorithm > could be just a script that repeatedly uses git bisect with the > "firstparentsonly" algorithm. And this script might be integrated in > the "contrib" directory if it not considered important to be > integrated as an algorithm into git bisect. Sorry to reply so late, since I was on a long journey home for Chinese New Year vacation ;) I agree that's also an good option. Is it acceptable to add option to git-bisect stuff, so user could choose which algorithm to use at every step at will. And we have tested previous attached patch with t6002-rev-list-bisect.sh and t6030-bisect-porcelain.sh, and we get: with bad-branch-first disabled (which is the default setting): t6002-rev-list-bisect.sh: # passed all 45 test(s) t6030-bisect-porcelain.sh: # passed all 40 test(s) and with bad-branch-first enabled: t6002-rev-list-bisect.sh: # passed all 45 test(s) t6030-bisect-porcelain.sh: # failed 5 among 40 test(s), and I have spent some time digging into those failures ,and it seems they're all false negative since they're using hard-coded bisect path to validate specific case Thanks --Shuang > Thanks, > Christian. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-26 7:22 ` Shuang He @ 2011-01-26 9:44 ` Christian Couder 2011-01-26 10:40 ` Shuang He 0 siblings, 1 reply; 14+ messages in thread From: Christian Couder @ 2011-01-26 9:44 UTC (permalink / raw) To: Shuang He Cc: git@vger.kernel.org, apenwarr@gmail.com, Junio C Hamano, Johannes Sixt On Wed, Jan 26, 2011 at 8:22 AM, Shuang He <shuang.he@intel.com> wrote: > On 2011/1/25 17:20, Christian Couder wrote: >> >>> >>> Yeah, I agree that the issue I addressed above will not be a problem if >>> all >>> those branches are maintained very well. >>> Actually we've implemented a automated bisect system for Intel Linux >>> Graphics Driver Project, and so we'd like the system >>> helps us to locate issue in an more automatic way when branches are not >>> maintained as good as expected. >> >> I think there is always a price to pay when you bisect if the branches >> are not well maintained. >> Maybe your algorithm could help in some cases, but my opinion is that >> there will probably still be many problems and a human will often have >> to take a look. >> > > Yes, I agree. What we trying to do is just make the machine to do more help > for human. Yeah, this is the way to go. And by the way I am happy to know that you have implemented an automated bisect system. That's great and I hope it already helps. >>>> - the name "bisectbadbranchfirst" seems wrong to me, because git >>>> branches are just some special tags; "firstparentsonly" would be a >>>> better name, >>> >>> It's recursively applying bad branch first algorithm, not just constantly >>> stick to first parent. >>> Given this condition: >>> A -> B -> C -> D -> E -> F -> G -> H (master) >>> \ a -> b -> c -> d -> e / (feature 1) >>> \ x -> y -> z/ (feature 2) >>> start with H as bad commit, and A as good commit, if y is the target bad >>> commit. bad-branch-first algorithm will do it like this: >>> 1. In first round stick to master branch, so it will locate G as first >>> bad commit >>> 2. In second round stick to feature1 branch, then it will locate d as >>> first bad commit >>> 3. In third round stick to feature2 branch, then it will finally >>> locate y >>> as first bad commit >>> So you could see, it's always sticking to branch where current bad commit >>> sit >> >> I see. It is interesting, but why not develop a "firstparentsonly" >> algorithm first? >> >> As Avery explains in his email, it is already interesting to have a >> "firstparentsonly" algorithm because some people are only interested >> to know from which branch the bug comes from. >> When they know that, they can just contact the relevant people and be >> done with it. >> >> And when we have a "firstparentsonly" algorithm, then your algorithm >> could be just a script that repeatedly uses git bisect with the >> "firstparentsonly" algorithm. And this script might be integrated in >> the "contrib" directory if it not considered important to be >> integrated as an algorithm into git bisect. > > Sorry to reply so late, since I was on a long journey home for Chinese New > Year vacation ;) No problem. I am not in a hurry at all. In fact I don't have much time these days so I reply very late too. > I agree that's also an good option. > Is it acceptable to add option to git-bisect stuff, so user could choose > which algorithm to use at every step at will. Are you sure it is needed to be able to change the algorithm at every step? This means that you would like a new "git bisect strategy <strategy>" subcommand ? First I thought that we could just add a "--strategy <strategy>" option to "git bisect start". But anyway, I think it should be easy to add afterward, and it can be done in a separated patch that can be discussed on its own. > And we have tested previous attached patch with t6002-rev-list-bisect.sh and > t6030-bisect-porcelain.sh, and we get: > with bad-branch-first disabled (which is the default setting): > t6002-rev-list-bisect.sh: # passed all 45 test(s) > t6030-bisect-porcelain.sh: # passed all 40 test(s) > and with bad-branch-first enabled: > t6002-rev-list-bisect.sh: # passed all 45 test(s) > t6030-bisect-porcelain.sh: # failed 5 among 40 test(s), and I have > spent some time digging into those failures ,and it seems they're all false > negative since they're using hard-coded bisect path to validate specific > case Yes, there are some hard coded commits that depend on the algorithm. Anyway I did not look in depth at your patch yet, and as I said it would be better if you could split it into a patch series where a "firstparentsonly" algorithm is implemented first. This way it will be easier to review, and we can start to integrate some non controversial features, and then discuss the other ones on their own merit. Thanks in advance, Christian. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-26 9:44 ` Christian Couder @ 2011-01-26 10:40 ` Shuang He 0 siblings, 0 replies; 14+ messages in thread From: Shuang He @ 2011-01-26 10:40 UTC (permalink / raw) To: Christian Couder Cc: git@vger.kernel.org, apenwarr@gmail.com, Junio C Hamano, Johannes Sixt On 2011/1/26 17:44, Christian Couder wrote: > On Wed, Jan 26, 2011 at 8:22 AM, Shuang He<shuang.he@intel.com> wrote: >> On 2011/1/25 17:20, Christian Couder wrote: >>>> Yeah, I agree that the issue I addressed above will not be a problem if >>>> all >>>> those branches are maintained very well. >>>> Actually we've implemented a automated bisect system for Intel Linux >>>> Graphics Driver Project, and so we'd like the system >>>> helps us to locate issue in an more automatic way when branches are not >>>> maintained as good as expected. >>> I think there is always a price to pay when you bisect if the branches >>> are not well maintained. >>> Maybe your algorithm could help in some cases, but my opinion is that >>> there will probably still be many problems and a human will often have >>> to take a look. >>> >> Yes, I agree. What we trying to do is just make the machine to do more help >> for human. > Yeah, this is the way to go. And by the way I am happy to know that > you have implemented an automated bisect system. That's great and I > hope it already helps. > >>>>> - the name "bisectbadbranchfirst" seems wrong to me, because git >>>>> branches are just some special tags; "firstparentsonly" would be a >>>>> better name, >>>> It's recursively applying bad branch first algorithm, not just constantly >>>> stick to first parent. >>>> Given this condition: >>>> A -> B -> C -> D -> E -> F -> G -> H (master) >>>> \ a -> b -> c -> d -> e / (feature 1) >>>> \ x -> y -> z/ (feature 2) >>>> start with H as bad commit, and A as good commit, if y is the target bad >>>> commit. bad-branch-first algorithm will do it like this: >>>> 1. In first round stick to master branch, so it will locate G as first >>>> bad commit >>>> 2. In second round stick to feature1 branch, then it will locate d as >>>> first bad commit >>>> 3. In third round stick to feature2 branch, then it will finally >>>> locate y >>>> as first bad commit >>>> So you could see, it's always sticking to branch where current bad commit >>>> sit >>> I see. It is interesting, but why not develop a "firstparentsonly" >>> algorithm first? >>> >>> As Avery explains in his email, it is already interesting to have a >>> "firstparentsonly" algorithm because some people are only interested >>> to know from which branch the bug comes from. >>> When they know that, they can just contact the relevant people and be >>> done with it. >>> >>> And when we have a "firstparentsonly" algorithm, then your algorithm >>> could be just a script that repeatedly uses git bisect with the >>> "firstparentsonly" algorithm. And this script might be integrated in >>> the "contrib" directory if it not considered important to be >>> integrated as an algorithm into git bisect. >> Sorry to reply so late, since I was on a long journey home for Chinese New >> Year vacation ;) > No problem. I am not in a hurry at all. In fact I don't have much time > these days so I reply very late too. > >> I agree that's also an good option. >> Is it acceptable to add option to git-bisect stuff, so user could choose >> which algorithm to use at every step at will. > Are you sure it is needed to be able to change the algorithm at every step? I don't think it's needed, it would just give user more control over the algorithm. > This means that you would like a new "git bisect strategy<strategy>" > subcommand ? > > First I thought that we could just add a "--strategy<strategy>" > option to "git bisect start". > But anyway, I think it should be easy to add afterward, and it can be > done in a separated patch that can be discussed on its own. Yeah, agree. We could discuss this later >> And we have tested previous attached patch with t6002-rev-list-bisect.sh and >> t6030-bisect-porcelain.sh, and we get: >> with bad-branch-first disabled (which is the default setting): >> t6002-rev-list-bisect.sh: # passed all 45 test(s) >> t6030-bisect-porcelain.sh: # passed all 40 test(s) >> and with bad-branch-first enabled: >> t6002-rev-list-bisect.sh: # passed all 45 test(s) >> t6030-bisect-porcelain.sh: # failed 5 among 40 test(s), and I have >> spent some time digging into those failures ,and it seems they're all false >> negative since they're using hard-coded bisect path to validate specific >> case > Yes, there are some hard coded commits that depend on the algorithm. > Anyway I did not look in depth at your patch yet, and as I said it > would be better if you could split it into a patch series where a > "firstparentsonly" algorithm is implemented first. > This way it will be easier to review, and we can start to integrate > some non controversial features, and then discuss the other ones on > their own merit. > > Thanks in advance, > Christian. Thanks for the good suggestion, I'll start the work soon. Thanks --Shuang ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 9:53 ` [RFC] Add bad-branch-first option for git-bisect Christian Couder 2011-01-24 10:30 ` Shuang He @ 2011-01-24 20:28 ` Avery Pennarun 2011-01-26 7:11 ` Shuang He 1 sibling, 1 reply; 14+ messages in thread From: Avery Pennarun @ 2011-01-24 20:28 UTC (permalink / raw) To: Christian Couder; +Cc: Shuang He, git, Johannes Sixt, Junio C Hamano On Mon, Jan 24, 2011 at 1:53 AM, Christian Couder <christian.couder@gmail.com> wrote: > I am not opposed to an option to bisect on the first parents of the > bad commit only. And after a very fast look at your patch it seems to > be what it does. By the way Avery Pennarun's gitbuilder > (https://github.com/apenwarr/gitbuilder) does the same thing. So I > know some people are interested in such a feature. Just some notes on gitbuilder's algorithm, since I haven't spent the time to fully understand Shuang's proposal. I do understand at least one of his concerns, that is, that people like to do a lot of "messy" development on a branch, and when the branch is done, merge the whole messy branch into the "mainline". The messy branch would then have a lot of commits that break a lot of things before fixing them again later. In a corporate environment, this method allows people to work all day, make frequent commits, pull from other branches at will, and never risk their lives by doing poorly-educated rebases. It works pretty well *until* you try to bisect, at which time all these messy commits start to bite you. gitbuilder's bisection is a total hack around this situation, although it happens to work perfectly in the workflow it was designed for, thus making me feel clever. Basically, we push/fetch *all* the branches from *everybody* into a single repo, and build all of them as frequently as we can. If you think about it, if you have all the branches that someone might have pulled/merged from, then you don't have to think of the git history as a whole complicated DAG; you can just think of it as a whole bunch of separate chunks of linear history. Moreover, as long as people are careful to only pull from a branch when that branch is passing all tests - which you can easily see by looking at the gitbuilder console - then playing inside each of these chunks of linear history can help you figure out where particular bugs were introduced during "messy" branches. It also allows you a nice separation of concerns. The owner of the mainline branch (the "integration manager" person) only really cares about which branch they merged that caused a problem, because that person doesn't want to fix bugs, he/she simply wants to know who owns the failing branch, so that person can fix *their* bug and their branch will merge without breaking things. So this is why gitbuilder uses "git rev-list --first-parent" during its "fake bisection" operation: because a different person is responsible for each "linear chunk" of history. Note that you have to use --no-ff when merging if you want this to work reliably. But the build manager person can just remember to do that. Combining --no-ff and --ff-only (which sound mutually exclusive but aren't) is a way to be extra specially sure. Now, if you aren't using gitbuilder, what we want from "bisection" is not quite the same, but let's imagine that you at least have a similar setup, where people *only* ever merge into the mainline by using --no-ff. In that case, you'd like a bisect operation that *starts* by using --first-parent, which will tell you which merge caused the problem. After that, you might want to bisect into the branch. (I don't actually remember if 'git bisect' understands --first-parent correctly. gitbuilder doesn't exactly bisect either, but that's another story and not relevant right now.) I can actually imagine that there are many more projects that do what I'm talking about - "messy" branches that get broken and fixed over time, then merge into a "clean" mainline - than projects (like the kernel and git.git) that try to keep all branches clean at all times. Thus, I could see some argument that a "--first-parents first" bisection would actually help out a lot of people, and maybe even deserves to be the default. I don't really care though, I just use gitbuilder :) Have fun, Avery ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add bad-branch-first option for git-bisect 2011-01-24 20:28 ` Avery Pennarun @ 2011-01-26 7:11 ` Shuang He 0 siblings, 0 replies; 14+ messages in thread From: Shuang He @ 2011-01-26 7:11 UTC (permalink / raw) To: Avery Pennarun Cc: Christian Couder, git@vger.kernel.org, Johannes Sixt, Junio C Hamano On 2011/1/25 4:28, Avery Pennarun wrote: > On Mon, Jan 24, 2011 at 1:53 AM, Christian Couder > <christian.couder@gmail.com> wrote: >> I am not opposed to an option to bisect on the first parents of the >> bad commit only. And after a very fast look at your patch it seems to >> be what it does. By the way Avery Pennarun's gitbuilder >> (https://github.com/apenwarr/gitbuilder) does the same thing. So I >> know some people are interested in such a feature. > Just some notes on gitbuilder's algorithm, since I haven't spent the > time to fully understand Shuang's proposal. > > I do understand at least one of his concerns, that is, that people > like to do a lot of "messy" development on a branch, and when the > branch is done, merge the whole messy branch into the "mainline". The > messy branch would then have a lot of commits that break a lot of > things before fixing them again later. > > In a corporate environment, this method allows people to work all day, > make frequent commits, pull from other branches at will, and never > risk their lives by doing poorly-educated rebases. It works pretty > well *until* you try to bisect, at which time all these messy commits > start to bite you. > > gitbuilder's bisection is a total hack around this situation, although > it happens to work perfectly in the workflow it was designed for, thus > making me feel clever. > > Basically, we push/fetch *all* the branches from *everybody* into a > single repo, and build all of them as frequently as we can. If you > think about it, if you have all the branches that someone might have > pulled/merged from, then you don't have to think of the git history as > a whole complicated DAG; you can just think of it as a whole bunch of > separate chunks of linear history. Moreover, as long as people are > careful to only pull from a branch when that branch is passing all > tests - which you can easily see by looking at the gitbuilder console > - then playing inside each of these chunks of linear history can help > you figure out where particular bugs were introduced during "messy" > branches. > > It also allows you a nice separation of concerns. The owner of the > mainline branch (the "integration manager" person) only really cares > about which branch they merged that caused a problem, because that > person doesn't want to fix bugs, he/she simply wants to know who owns > the failing branch, so that person can fix *their* bug and their > branch will merge without breaking things. > > So this is why gitbuilder uses "git rev-list --first-parent" during > its "fake bisection" operation: because a different person is > responsible for each "linear chunk" of history. > > Note that you have to use --no-ff when merging if you want this to > work reliably. But the build manager person can just remember to do > that. Combining --no-ff and --ff-only (which sound mutually exclusive > but aren't) is a way to be extra specially sure. > > Now, if you aren't using gitbuilder, what we want from "bisection" is > not quite the same, but let's imagine that you at least have a similar > setup, where people *only* ever merge into the mainline by using > --no-ff. In that case, you'd like a bisect operation that *starts* by > using --first-parent, which will tell you which merge caused the > problem. After that, you might want to bisect into the branch. > > (I don't actually remember if 'git bisect' understands --first-parent > correctly. gitbuilder doesn't exactly bisect either, but that's > another story and not relevant right now.) > > I can actually imagine that there are many more projects that do what > I'm talking about - "messy" branches that get broken and fixed over > time, then merge into a "clean" mainline - than projects (like the > kernel and git.git) that try to keep all branches clean at all times. > Thus, I could see some argument that a "--first-parents first" > bisection would actually help out a lot of people, and maybe even > deserves to be the default. > > I don't really care though, I just use gitbuilder :) > > Have fun, > > Avery Thanks for helping explaining those stuff, and also glad to learn more about gitbuilder :) Thanks --Shuang ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-01-26 10:40 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-01-24 2:03 [RFC] Add bad-branch-first option for git-bisect Shuang He 2011-01-24 2:05 ` [PATCH] add config option core.bisectbadbranchfirst Shuang He 2011-01-24 9:53 ` [RFC] Add bad-branch-first option for git-bisect Christian Couder 2011-01-24 10:30 ` Shuang He 2011-01-24 10:50 ` Johannes Sixt 2011-01-24 11:05 ` Shuang He 2011-01-24 20:04 ` Junio C Hamano 2011-01-25 3:27 ` Shuang He 2011-01-25 9:20 ` Christian Couder 2011-01-26 7:22 ` Shuang He 2011-01-26 9:44 ` Christian Couder 2011-01-26 10:40 ` Shuang He 2011-01-24 20:28 ` Avery Pennarun 2011-01-26 7:11 ` Shuang He
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).