* An idea for "git bisect" and a GSoC enquiry
From: Jacopo Notarstefano @ 2014-02-26 8:28 UTC
To: git

Hey everyone,

my name is Jacopo, a student developer from Italy, and I'm interested
in applying to this year's Google Summer of Code. I set my eyes on the
project called "git-bisect improvements", in particular the subtask
about swapping the "good" and "bad" labels when looking for a
bug-fixing release.

I have a very simple proposal for that: add a new "mark" subcommand.
Here is an example of how it should work:

1) A developer wants to find in which commit a past regression was
fixed. She starts bisecting as usual with "git bisect start".
2) The current HEAD has the bugfix, so she marks it as fixed with "git
bisect mark fixed".
3) She knows that HEAD~100 had the regression, so she marks it as
unfixed with "git bisect mark unfixed".
4) Now that git knows what the two labels are, it starts bisecting as usual.

For compatibility with already written scripts, "git bisect good" and
"git bisect bad" will alias to "git bisect mark good" and "git bisect
mark bad" respectively.

Does this make sense? Did I overlook some details?

There were already several proposals on this topic, among which those
listed at https://git.wiki.kernel.org/index.php/SmallProjectsIdeas#git_bisect_fix.2Funfixed.
I'm interested in contacting the prospective mentor, Christian Couder,
to go over these. What's the proper way to ask for an introduction? I
tried asking on IRC, but had no success.

Cheers,
Jacopo Notarstefano
* Re: An idea for "git bisect" and a GSoC enquiry
From: Junio C Hamano @ 2014-02-26 19:58 UTC
To: Jacopo Notarstefano; +Cc: git

Jacopo Notarstefano <jacopo.notarstefano@gmail.com> writes:

> Does this make sense? Did I overlook some details?

How does this solve the labels shown in "git bisect visualize"?
* Re: An idea for "git bisect" and a GSoC enquiry
From: Jacopo Notarstefano @ 2014-02-28 9:00 UTC
To: Junio C Hamano; +Cc: git

On Wed, Feb 26, 2014 at 8:58 PM, Junio C Hamano <gitster@pobox.com> wrote:
> How does this solve the labels shown in "git bisect visualize"?

Mh. Haven't thought of that.

I have no experience with Tk, so I'm having trouble digging up where
the "good" and "bad" labels in the GUI are generated. I guess that a
solution might involve writing a temporary file in $GIT_DIR called
something like BISECT_LABELS in which the chosen labels are listed and
reused across all tools that require them.

(Sorry for sending this email twice, I thought I had sent it to the
list as well!)
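A minimal sketch of the BISECT_LABELS idea from the message above, written in Python purely for illustration. The file name comes from the message; the two-line file format and the helper names write_labels and read_labels are assumptions made for this sketch.

```python
import os
import tempfile

LABELS_FILE = "BISECT_LABELS"  # name suggested in the message; the format below is assumed


def write_labels(git_dir, first, second):
    """Record the two labels chosen for this bisect session, one per line."""
    if first == second:
        raise ValueError("the two labels must be distinct")
    with open(os.path.join(git_dir, LABELS_FILE), "w") as fh:
        fh.write(f"{first}\n{second}\n")


def read_labels(git_dir, default=("good", "bad")):
    """Return the recorded labels, falling back to good/bad if none were chosen."""
    path = os.path.join(git_dir, LABELS_FILE)
    if not os.path.exists(path):
        return default
    with open(path) as fh:
        labels = tuple(line.strip() for line in fh if line.strip())
    if len(labels) != 2:
        raise ValueError(f"malformed {LABELS_FILE}: expected two labels")
    return labels


if __name__ == "__main__":
    # A throwaway directory stands in for $GIT_DIR.
    with tempfile.TemporaryDirectory() as git_dir:
        print(read_labels(git_dir))
        write_labels(git_dir, "fixed", "unfixed")
        print(read_labels(git_dir))
```

Any tool that needs the labels (the visualizer included) would call something like read_labels instead of hard-coding "good" and "bad". Note that the file only records two names; it says nothing about which of them is the "before" state.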
* Re: An idea for "git bisect" and a GSoC enquiry
From: Michael Haggerty @ 2014-02-27 11:18 UTC
To: Jacopo Notarstefano; +Cc: git, Christian Couder, Junio C Hamano

On 02/26/2014 09:28 AM, Jacopo Notarstefano wrote:
> my name is Jacopo, a student developer from Italy, and I'm interested
> in applying to this year's Google Summer of Code. I set my eyes on the
> project called "git-bisect improvements", in particular the subtask
> about swapping the "good" and "bad" labels when looking for a
> bug-fixing release.

Hello and welcome!

> I have a very simple proposal for that: add a new "mark" subcommand.
> Here is an example of how it should work:
>
> 1) A developer wants to find in which commit a past regression was
> fixed. She starts bisecting as usual with "git bisect start".
> 2) The current HEAD has the bugfix, so she marks it as fixed with "git
> bisect mark fixed".
> 3) She knows that HEAD~100 had the regression, so she marks it as
> unfixed with "git bisect mark unfixed".
> 4) Now that git knows what the two labels are, it starts bisecting as usual.
>
> For compatibility with already written scripts, "git bisect good" and
> "git bisect bad" will alias to "git bisect mark good" and "git bisect
> mark bad" respectively.
>
> Does this make sense? Did I overlook some details?

I don't understand the benefit of adding a new command "mark" rather
than continuing to use "good", "bad", plus new commands "unfixed" and
"fixed". Does this solve any problems?

What happens if the user mixes, say, "good" and "fixed" in a single
bisect session?

I think it would be more convenient if "git bisect" would autodetect
whether the history went from "good" to "bad" or vice versa. The
algorithm could be:

1. Wait until the user has marked one commit "bad" and one commit "good".

2. If a "good" commit is an ancestor of a "bad" one, then "git bisect"
should announce "I will now look for the first bad commit". If
reversed, then announce "I will now look for the first good commit". If
neither commit is an ancestor of the other, then explain the situation
and ask the user to run "git bisect find-first-bad" or "git bisect
find-first-good" or to mark another commit "bad" or "good".

3. If the user marks another commit, go back to step 2, also doing a
consistency check to make sure that all of the ancestry relationships go
in a consistent direction.

4. After the direction is clear, the old bisect algorithm can be used
(though taking account of the direction). Obviously a lot of the output
would have to be adjusted, as would the way that a bisect is visualized.

I can't think of any fundamental problems with a scheme like this, and I
think it would be easier to use than the unfixed/fixed scheme. But that
is only my opinion; other opinions are undoubtedly available :-)

> There were already several proposals on this topic, among which those
> listed at https://git.wiki.kernel.org/index.php/SmallProjectsIdeas#git_bisect_fix.2Funfixed.
> I'm interested in contacting the prospective mentor, Christian Couder,
> to go over these. What's the proper way to ask for an introduction? I
> tried asking on IRC, but had no success.

Just CC Christian on your emails to the mailing list, like I've done
with this email. As a rule of thumb all communications should go to the
mailing list *plus* any people who are likely to be personally
interested in the topic (e.g., because they have participated in the
thread).

By the way, although "git bisect fixed/unfixed" would be a very useful
improvement, and has gone unimplemented for a lamentably long time, my
personal feeling is that it has too little meat in it to constitute a
GSoC project by itself.

Michael

--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
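Steps 2 and 3 of the autodetection scheme above can be sketched as follows. This is only an illustration in Python over a toy history (a dict mapping each commit to its parents); the function names and return values are assumptions, not anything taken from git-bisect.sh.

```python
def ancestors(parents, commit):
    """Return the set of all ancestors of `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def detect_direction(parents, good, bad):
    """Decide what to look for, given the sets of commits marked good and bad.

    Returns "first-bad" if some good commit is an ancestor of a bad one,
    "first-good" if it is the other way round, and None if no marked pair
    is related (the user must then say more). Raises ValueError when the
    ancestry relationships point in both directions (the step 3 check).
    """
    good_before_bad = any(g in ancestors(parents, b) for g in good for b in bad)
    bad_before_good = any(b in ancestors(parents, g) for g in good for b in bad)
    if good_before_bad and bad_before_good:
        raise ValueError("marks are inconsistent: ancestry goes both ways")
    if good_before_bad:
        return "first-bad"
    if bad_before_good:
        return "first-good"
    return None


if __name__ == "__main__":
    # Linear history A <- B <- C <- D, plus a side branch E forked from A.
    history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["A"]}
    print(detect_direction(history, {"A"}, {"D"}))  # good is older
    print(detect_direction(history, {"D"}, {"A"}))  # bad is older
    print(detect_direction(history, {"E"}, {"D"}))  # unrelated tips
```

Re-running detect_direction every time a commit is marked gives the loop between steps 2 and 3; once it returns a direction, step 4 can proceed with the usual bisection.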
* Re: An idea for "git bisect" and a GSoC enquiry
From: Matthieu Moy @ 2014-02-27 12:09 UTC
To: Michael Haggerty; +Cc: Jacopo Notarstefano, git, Christian Couder, Junio C Hamano

----- Original Message -----
> I don't understand the benefit of adding a new command "mark" rather
> than continuing to use "good", "bad", plus new commands "unfixed" and
> "fixed". Does this solve any problems?

I think it could be interesting to allow arbitrary words here. For
example, I recently walked through history to find a performance
regression, it would have been natural to use slow/fast instead of
bad/good (bad/good would actually do the job, but slightly less
naturally). One can look for a change which is neither a fix nor a bug
(e.g. when did command foo start behaving like that? when did we start
using such or such feature in the code).

I wouldn't fight for it, but I think it makes sense.

--
Matthieu Moy
http://www-verimag.imag.fr/~moy/
* Re: An idea for "git bisect" and a GSoC enquiry
From: Jacopo Notarstefano @ 2014-02-28 9:03 UTC
To: Michael Haggerty; +Cc: git, Christian Couder, Junio C Hamano

On Thu, Feb 27, 2014 at 12:18 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> I don't understand the benefit of adding a new command "mark" rather
> than continuing to use "good", "bad", plus new commands "unfixed" and
> "fixed". Does this solve any problems?
>

As Matthieu Moy remarked in a previous email, the main reason is
extensibility: I prefer having a single command to assign new
descriptive labels instead of having to patch git-bisect.sh to create
new labels like fixed, unfixed, fast, slow...

> What happens if the user mixes, say, "good" and "fixed" in a single
> bisect session?
>

I don't think that's an issue. If the user uses the label "fixed"
instead of "bad" she will have a hard time remembering to use it every
time she needs it, and maybe the output of "git bisect" will look very
confusing, but what can git do? This is a semantic user input error,
not a syntax one.

> I think it would be more convenient if "git bisect" would autodetect
> whether the history went from "good" to "bad" or vice versa. The
> algorithm could be:
>
> 1. Wait until the user has marked one commit "bad" and one commit "good".
>
> 2. If a "good" commit is an ancestor of a "bad" one, then "git bisect"
> should announce "I will now look for the first bad commit". If
> reversed, then announce "I will now look for the first good commit". If
> neither commit is an ancestor of the other, then explain the situation
> and ask the user to run "git bisect find-first-bad" or "git bisect
> find-first-good" or to mark another commit "bad" or "good".
>
> 3. If the user marks another commit, go back to step 2, also doing a
> consistency check to make sure that all of the ancestry relationships go
> in a consistent direction.
>
> 4. After the direction is clear, the old bisect algorithm can be used
> (though taking account of the direction). Obviously a lot of the output
> would have to be adjusted, as would the way that a bisect is visualized.
>
> I can't think of any fundamental problems with a scheme like this, and I
> think it would be easier to use than the unfixed/fixed scheme. But that
> is only my opinion; other opinions are undoubtedly available :-)
>

I like this idea! It also looks fun to implement. A minor difference
is that I'd rather die with an error on point 2) if there's no
ancestorship relation between the two commits; if the user is asking
for such a thing then she has a fundamental misconception of the state
of her repository.

> By the way, although "git bisect fixed/unfixed" would be a very useful
> improvement, and has gone unimplemented for a lamentably long time, my
> personal feeling is that it has too meat in it to constitute a GSoC
> project by itself.

Oh! Then in fact, as Christian Couder said, this project shouldn't be
marked as "easy".

(Sorry for sending this email twice! I thought I had sent it to the
list as well.)
* Re: An idea for "git bisect" and a GSoC enquiry
From: Junio C Hamano @ 2014-02-28 18:31 UTC
To: Jacopo Notarstefano; +Cc: Michael Haggerty, git, Christian Couder

Jacopo Notarstefano <jacopo.notarstefano@gmail.com> writes:

> On Thu, Feb 27, 2014 at 12:18 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> I don't understand the benefit of adding a new command "mark" rather
>> than continuing to use "good", "bad", plus new commands "unfixed" and
>> "fixed". Does this solve any problems?
>>
>
> As Matthieu Moy remarked in a previous email, the main reason is
> extensibility: I prefer having a single command to assign new
> descriptive labels instead of having to patch git-bisect.sh to create
> new labels like fixed, unfixed, fast, slow...
>
>> What happens if the user mixes, say, "good" and "fixed" in a single
>> bisect session?
>
> I don't think that's an issue. If the user uses the label "fixed"
> instead of "bad" she will have a hard time remembering to use it every
> time she needs it,...

I am not sure I understand what you are trying to say. Are you
saying that we should stick to good/bad and allow the users to use
nothing else, because allowing "fixed" will be confusing?

> and maybe the output of "git bisect" will look very
> confusing, but what can git do? This is a semantic user input error,
> not a syntax one.

For a young tool or a feature, catering to perfect humans perfectly
is a good first goal---if it does not work well even for error-free
human input, it would be worthless. However, its second goal after
achieving that first goal ought to be to help imperfect humans.

I can very well imagine somebody start hunting for an earlier bugfix
(perhaps trying to find it to backport to an older maintenance
track), start saying "fixed", "broken", "broken", ..., continue
after leaving for lunch for a while, and then try to mark the next
version he tests as "bad" because it has a bug.

It technically may be a user error, in the sense that in such a
"where is the fix?" session, you want to mark a "still-has-bug" one
as "broken" and mark a "no-longer-has-bug" one as "fixed" (just like
"still-broken" as "bad" and "no-longer-broken" as "good" in regular
bisection). But at that point, the tool *knows* that the user
earlier used "fixed" (or "broken") to mark some commits *already*.

Why do you think there is nothing it can do to help the user? Upon
seeing "bad", the tool should at least be able to say "Excuse me,
but you earlier said 'fixed' for one of the commits, so your
vocabulary now is limited to 'fixed' and 'broken'". I think it also
should be able to add "Did you mean to say 'broken'?", or even "I'd
assume that you meant 'broken' and will continue."

I have always wondered if we can introduce value-neutral synonyms
to good and bad. For a bisect session, we care only about two
states: "still-X" and "no-longer-X" where X may be 'working' for the
normal bug-hunting bisection and 'broken' for the fix-hunting one.

    $ git bisect still-working v1.6.0
    $ git bisect no-longer-working v1.8.0

would be a way to find a bug that was introduced during v1.6.0..v1.8.0,
while

    $ git bisect still-broken v1.6.0
    $ git bisect no-longer-broken v1.8.0

would be a way to find a fix in the same range. The lowest-level
bisection machinery could just _ignore_ anything after still/no-longer
and do its thing, while the end-user facing layer could enforce,
once one commit is marked as still-X (or no-longer-X), that nothing
other than the same X is used, and issue an error message, perhaps
like this:

    $ git bisect still-broken v1.6.0
    $ git bisect still-working v1.8.0
    error: You earlier marked v1.6.0 as "still-broken" and
    error: are hunting for the first commit that can be marked
    error: as "no-longer-broken". Say either "still-broken" or
    error: "no-longer-broken", not "still-working".

and that can be done without having to understand that "broken" is
the opposite of "working" (of course if we understood that, we could
even offer to guess that the user meant "no-longer-broken" by
"still-working", but we do not want to go there).
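The two-layer split described above (a user-facing layer that enforces one X per session, and a low-level layer that only sees still/no-longer) could look roughly like this. It is a Python sketch under stated assumptions: the BisectSession class and the helper names are hypothetical, and the error text is only modelled on the example above.

```python
PREFIXES = (("still-", "still"), ("no-longer-", "no-longer"))


def parse_mark(word):
    """Split "still-X" or "no-longer-X" into (state, X)."""
    for prefix, state in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix):
            return state, word[len(prefix):]
    raise ValueError(f'unrecognized mark "{word}"')


def vocabulary_error(session_x, word):
    """Return an error message if `word` uses a different X, else None."""
    _, x = parse_mark(word)
    if x == session_x:
        return None
    return (f'Say either "still-{session_x}" or "no-longer-{session_x}", '
            f'not "{word}".')


class BisectSession:
    """User-facing layer: remembers X and rejects any other vocabulary."""

    def __init__(self):
        self.x = None    # the X of this session, fixed by the first mark
        self.marks = {}  # commit -> "still" | "no-longer" (all the low level needs)

    def mark(self, word, commit):
        state, x = parse_mark(word)
        if self.x is None:
            self.x = x
        error = vocabulary_error(self.x, word)
        if error:
            raise ValueError(error)
        self.marks[commit] = state
        return state


if __name__ == "__main__":
    session = BisectSession()
    session.mark("still-broken", "v1.6.0")
    try:
        session.mark("still-working", "v1.8.0")
    except ValueError as err:
        print("error:", err)
```

The check compares X as an opaque string, so it never needs to know that "broken" is the opposite of "working", which is the property the message asks for.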
* Re: An idea for "git bisect" and a GSoC enquiry
From: Jacopo Notarstefano @ 2014-03-01 11:31 UTC
To: Junio C Hamano; +Cc: Michael Haggerty, Git Mailing List, Christian Couder

> I am not sure I understand what you are trying to say. Are you
> saying that we should stick to good/bad and allow the users to use
> nothing else, because allowing "fixed" will be confusing?
>

No! Pretty much the opposite of that. My idea (the "mark" subcommand)
is to let people define their own pairs of labels to represent two
opposite states of a commit. My point was that, if people choose pairs
of words that are not opposites (such as "good" and "fixed") then it's
their error, not something that git should attempt to fix or detect.

> For a young tool or a feature, catering to perfect humans perfectly
> is a good first goal---if it does not work well even for error-free
> human input, it would be worthless. However, its second goal after
> achieving that first goal ought to be to help imperfect humans.
>

Loved this.

> Why do you think there is nothing it can do to help the user? Upon
> seeing "bad", the tool should at least be able to say "Excuse me,
> but you earlier said 'fixed' for one of the commits, so your
> vocabulary now is limited to 'fixed' and 'broken'". I think it also
> should be able to add "Did you mean to say 'broken'?", or even "I'd
> assume that you meant 'broken' and will continue."
>

I haven't said this, but this is pretty much what I had in mind.
Suppose a user wants to find a bugfix between HEAD and HEAD~10, this
is what she would do:

    $ git bisect start
    $ git bisect mark working HEAD
    $ git bisect mark broken HEAD~10
    [git will now start bisecting as usual. Suppose that she is now at HEAD~5]
    $ git bisect mark bad
    -> Error: unrecognized label 'bad'. You previously used 'working' and
    'broken' to describe commits in this git bisect session. Please mark
    commits with one of these labels.

I suppose that we could cater a little better to imperfect humans if
we had two predefined parallel lists of antonyms in which to search for
given labels and infer whether they are positive or negative labels,
but this is beyond the scope of my proposal.

> I have always wondered if we can introduce value-neutral synonyms
> to good and bad. For a bisect session, we care only about two
> states: "still-X" and "no-longer-X" where X may be 'working' for the
> normal bug-hunting bisection and 'broken' for the fix-hunting one.
>
>     $ git bisect still-working v1.6.0
>     $ git bisect no-longer-working v1.8.0
>
> would be a way to find a bug that was introduced during v1.6.0..v1.8.0,
> while
>
>     $ git bisect still-broken v1.6.0
>     $ git bisect no-longer-broken v1.8.0
>
> would be a way to find a fix in the same range. The lowest-level
> bisection machinery could just _ignore_ anything after still/no-longer
> and do its thing, [...]

This is remarkably similar to my proposal. Using "mark", these would be:

    $ git bisect mark working v1.6.0
    $ git bisect mark not-working v1.8.0

and

    $ git bisect mark broken v1.6.0
    $ git bisect mark not-broken v1.8.0

> while the end-user facing layer could enforce,
> once one commit is marked as still-X (or no-longer-X), that nothing
> other than the same X is used, and issue an error message, perhaps
> like this:
>
>     $ git bisect still-broken v1.6.0
>     $ git bisect still-working v1.8.0
>     error: You earlier marked v1.6.0 as "still-broken" and
>     error: are hunting for the first commit that can be marked
>     error: as "no-longer-broken". Say either "still-broken" or
>     error: "no-longer-broken", not "still-working".
>
> and that can be done without having to understand that "broken" is
> the opposite of "working" (of course if we understood that, we could
> even offer to guess that the user meant "no-longer-broken" by
> "still-working", but we do not want to go there).

Here my proposal differs in that I have no way of knowing which label
is good and which label is bad: I blindly accept two distinct labels
and bisect with those. I gave an example of this behaviour above.
* Re: An idea for "git bisect" and a GSoC enquiry
From: Junio C Hamano @ 2014-03-03 18:34 UTC
To: Jacopo Notarstefano; +Cc: Michael Haggerty, Git Mailing List, Christian Couder

Jacopo Notarstefano <jacopo.notarstefano@gmail.com> writes:

> Here my proposal differs in that I have no way of knowing which label
> is good and which label is bad: I blindly accept two distinct labels
> and bisect with those. I gave an example of this behaviour above.

I think you fundamentally cannot use two labels that are merely
"distinct" and bisect correctly. You need to know which ones
(i.e. good) are to be excluded and the other (i.e. bad) are to be
included when computing the "remaining to be tested" set of commits.
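The objection above can be made concrete with a toy history. In this Python sketch (all names are illustrative assumptions), the candidate set is everything reachable from the bad commit minus everything reachable from any good commit, which is roughly what "git rev-list <bad> --not <good>..." computes. Swapping the roles of the two labels yields a different set, so two labels that are merely distinct do not determine it.

```python
def ancestors(parents, commit):
    """Return the set of all ancestors of `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def remaining(parents, good, bad):
    """Commits that can still be the first bad one.

    Included: everything reachable from `bad` (the single top commit).
    Excluded: everything reachable from any commit in `good` (the bottoms).
    """
    excluded = set()
    for g in good:
        excluded |= ancestors(parents, g)
    return ancestors(parents, bad) - excluded


if __name__ == "__main__":
    # Linear history A <- B <- C <- D.
    history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
    print(sorted(remaining(history, {"A"}, "D")))  # A is the bottom, D the top
    print(sorted(remaining(history, {"D"}, "A")))  # same marks, roles swapped
```

With A as the bottom and D as the top, three commits remain; with the roles swapped the set is empty, because everything reachable from A is also reachable from D.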
* Re: An idea for "git bisect" and a GSoC enquiry
From: Jacopo Notarstefano @ 2014-03-12 1:32 UTC
To: Junio C Hamano; +Cc: Michael Haggerty, Git Mailing List, Christian Couder

On Mon, Mar 3, 2014 at 7:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
> I think you fundamentally cannot use two labels that are merely
> "distinct" and bisect correctly. You need to know which ones
> (i.e. good) are to be excluded and the other (i.e. bad) are to be
> included when computing the "remaining to be tested" set of commits.

Good point. Yes, this isn't viable.
* Re: An idea for "git bisect" and a GSoC enquiry
From: Junio C Hamano @ 2014-03-12 18:31 UTC
To: Jacopo Notarstefano; +Cc: Michael Haggerty, Git Mailing List, Christian Couder

Jacopo Notarstefano <jacopo.notarstefano@gmail.com> writes:

> On Mon, Mar 3, 2014 at 7:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> I think you fundamentally cannot use two labels that are merely
>> "distinct" and bisect correctly. You need to know which ones
>> (i.e. good) are to be excluded and the other (i.e. bad) are to be
>> included when computing the "remaining to be tested" set of commits.
>
> Good point. Yes, this isn't viable.

But if you make them into --no-longer-X vs --still-X, then it will
be viable without us knowing what X means.
* Re: An idea for "git bisect" and a GSoC enquiry
From: Michael Haggerty @ 2014-03-13 17:19 UTC
To: Junio C Hamano; +Cc: Jacopo Notarstefano, Git Mailing List, Christian Couder

On 03/12/2014 07:31 PM, Junio C Hamano wrote:
> Jacopo Notarstefano <jacopo.notarstefano@gmail.com> writes:
>
>> On Mon, Mar 3, 2014 at 7:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> I think you fundamentally cannot use two labels that are merely
>>> "distinct" and bisect correctly. You need to know which ones
>>> (i.e. good) are to be excluded and the other (i.e. bad) are to be
>>> included when computing the "remaining to be tested" set of commits.
>>
>> Good point. Yes, this isn't viable.
>
> But if you make them into --no-longer-X vs --still-X, then it will
> be viable without us knowing what X means.

Yes, but who wants to type such long and inelegant option names?

It seems to me that we can infer which mark is which from the normal
bisect user interaction. At the startup phase of a bisect, there are
only three cases:

1. There are fewer than two different types of marks on tested commits.
   For example, maybe one commit has been marked "bad". Or two commits
   have both been marked "slow". In this case we wait for the user to
   choose another commit manually, so we don't have to know the meaning
   of the mark.

2. There are two different types of marks, but no commits with
   differing marks are ancestors of each other. In this case, we pick
   the merge base of two commits with differing marks and present it
   to the user for testing. But we can do that without knowing which
   mark is "before the change" and which mark means "after the
   change". So just defer the inference.

3. There are two different types of marks, and a commit with one mark
   is an ancestor of a commit with the other mark. In this case, it is
   clear from the ancestry which mark means "before the change" and
   which mark means "after the change". So record the "orientation" of
   the marks and continue like in the old days.

Of course, there are still details to be worked out, like how to tag the
commits before we know which mark means what. But that is just a
clerical problem, not a fundamental one.

Michael

--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
* Re: An idea for "git bisect" and a GSoC enquiry
From: Junio C Hamano @ 2014-03-13 18:47 UTC
To: Michael Haggerty; +Cc: Jacopo Notarstefano, Git Mailing List, Christian Couder

Michael Haggerty <mhagger@alum.mit.edu> writes:

> It seems to me that we can infer which mark is which from the normal
> bisect user interaction. At the startup phase of a bisect, there are
> only three cases:
>
> 1. There are fewer than two different types of marks on tested commits.
>    For example, maybe one commit has been marked "bad". Or two commits
>    have both been marked "slow". In this case we wait for the user to
>    choose another commit manually, so we don't have to know the meaning
>    of the mark.
>
> 2. There are two different types of marks, but no commits with
>    differing marks are ancestors of each other. In this case, we pick
>    the merge base of two commits with differing marks and present it
>    to the user for testing. But we can do that without knowing which
>    mark is "before the change" and which mark means "after the
>    change". So just defer the inference.
>
> 3. There are two different types of marks, and a commit with one mark
>    is an ancestor of a commit with the other mark. In this case, it is
>    clear from the ancestry which mark means "before the change" and
>    which mark means "after the change". So record the "orientation" of
>    the marks and continue like in the old days.
>
> Of course, there are still details to be worked out, like how to tag the
> commits before we know which mark means what. But that is just a
> clerical problem, not a fundamental one.

Yup, with an extra "state" kept somewhere in $GIT_DIR, we should in
principle be able to defer the "value judgement" (aka "which one
should be treated as a bottom of the range").

The first change that is needed for this scheme to be workable is to
decide how we mark such an unknown state at the beginning, though.
We assume that we need to keep track of a single top one ("bad", aka
"no-longer-good") while we have to keep track of multiple bottom
ones ("good").

There also is a safety valve in the current logic for transitioning
from case #2 to case #3; when a common ancestor is marked as "bad"
(aka "no-longer-good"), we notice that the original bisection is
screwy in the sense that the user is seeing not just a single state
flip that made something that used to be good into bad.

I am afraid that we may instead _silently_ decide that the user is
trying to locate a state flip that made something that used to be
bad (at the common ancestor) into good with the logic proposed
above. From the point of view of the user who wanted to find a
regression by marking one as "bad" and the other "good", running
bisection whose semantics suddenly and silently changed into an
opposite "where was it fixed" hunt would be an unpleasant and
confusing experience. I do not know, without knowing the meaning of
"slow" and "fast" (which implicitly tells us which way the user
intends to bisect), how well we can keep that safety valve.

Other than that, I like the idea.
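The three startup cases quoted above, and the silent flip that the reply worries about, can be sketched over a toy history. This is a Python illustration under stated assumptions: the startup_case function, its return values and the dict-based history are all hypothetical, and the sketch deliberately omits any check for contradictory orientations.

```python
def ancestors(parents, commit):
    """Return the set of all ancestors of `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(d != c and c in ancestors(parents, d) for d in common)}


def startup_case(parents, marks):
    """Classify a session; `marks` maps each tested commit to its label.

    Returns ("wait", None) for case 1, ("test", [merge bases]) for case 2,
    and ("oriented", (before_label, after_label)) for case 3.
    """
    labels = sorted(set(marks.values()))
    if len(labels) > 2:
        raise ValueError("more than two distinct marks in one session")
    if len(labels) < 2:
        return ("wait", None)
    for older, older_label in marks.items():
        for newer, newer_label in marks.items():
            if older_label != newer_label and older in ancestors(parents, newer):
                return ("oriented", (older_label, newer_label))
    first = next(c for c, label in marks.items() if label == labels[0])
    second = next(c for c, label in marks.items() if label == labels[1])
    return ("test", sorted(merge_bases(parents, first, second)))


if __name__ == "__main__":
    # Two branches B and C forked from a common ancestor A.
    history = {"A": [], "B": ["A"], "C": ["A"]}
    marks = {"B": "bad", "C": "good"}
    print(startup_case(history, marks))  # case 2: asks the user to test A
    marks["A"] = "bad"                   # the merge base turns out to be bad
    print(startup_case(history, marks))  # case 3, oriented from "bad" to "good"
```

In the last call the inference settles on "bad" as the before state and "good" as the after state without any complaint. A user who set out to find a regression is now hunting for a fix, which is exactly the situation the safety valve in the current logic is meant to catch.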
[parent not found: <CAL0uuq3TGb2wjaqNxwXYa++E5rjVoozox5mZbzTaE17OKtsVTg@mail.gmail.com>]
[parent not found: <a8cf74b4-bae1-4511-a45e-d4ca90e3c3e1@email.android.com>]
* Re: An idea for "git bisect" and a GSoC enquiry [not found] ` <a8cf74b4-bae1-4511-a45e-d4ca90e3c3e1@email.android.com> @ 2014-02-28 9:07 ` Jacopo Notarstefano 2014-02-28 9:13 ` Jacopo Notarstefano 1 sibling, 0 replies; 17+ messages in thread From: Jacopo Notarstefano @ 2014-02-28 9:07 UTC (permalink / raw) To: Michael Haggerty; +Cc: git, Junio C Hamano, Christian Couder This email was sent privately by Michael to me as a result of my previous error. I'm quoting it in its entirety so that he doesn't have to submit it twice. On Thu, Feb 27, 2014 at 8:32 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote: > Please forgive my typos and brevity; this was typed on a phone. > > Michael > On February 27, 2014 5:16:40 PM CET, Jacopo Notarstefano <jacopo.notarstefano@gmail.com> wrote: >>On Thu, Feb 27, 2014 at 12:18 PM, Michael Haggerty >><mhagger@alum.mit.edu> wrote: >>> What happens if the user mixes, say, "good" and "fixed" in a single >>> bisect session? >>> >> >>I don't think that's an issue. If the user uses the label "fixed" >>instead of "bad" she will have a hard time remembering to use it every >>time she needs it, and maybe the output of "git bisect" will look very >>confusing, but what can git do? This is a semantic user input error, >>not a syntax one. > > - git could emit an error message and refuse to continue > - git could interpret the command one way or the other, with or without a warning > > By my count that gives at least five possibilities. The feature cannot be implemented without choosing one. > >>> I think it would be more convenient if "git bisect" would autodetect >>> whether the history went from "good" to "bad" or vice versa. The >>> algorithm could be: >>> >>> 1. Wait until the user has marked one commit "bad" and one commit >>"good". >>> >>> 2. If a "good" commit is an ancestor of a "bad" one, then "git >>bisect" >>> should announce "I will now look for the first bad commit". 
If >>> reversed, then announce "I will now look for the first good commit". >>If >>> neither commit is an ancestor of the other, then explain the >>situation >>> and ask the user to run "git bisect find-first-bad" or "git bisect >>> find-first-good" or to mark another commit "bad" or "good". >>> >>> 3. If the user marks another commit, go back to step 2, also doing a >>> consistency check to make sure that all of the ancestry relationships >>go >>> in a consistent direction. >>> >>> 4. After the direction is clear, the old bisect algorithm can be used >>> (though taking account of the direction). Obviously a lot of the >>output >>> would have to be adjusted, as would the way that a bisect is >>visualized. >>> >>> I can't think of any fundamental problems with a scheme like this, >>and I >>> think it would be easier to use than the unfixed/fixed scheme. But >>that >>> is only my opinion; other opinions are undoubtedly available :-) >>> >> >>I like this idea! It also looks fun to implement. A minor difference >>is that I'd rather die with an error on point 2) if there's no >>ancestorship relation between the two commits; if the user is asking >>for such a thing then she has a fundamental misconception of the state >>of her repository. > > That is not correct. If there is a bug on one branch but not another, it is legitimate to ask when the bug was introduced, and git bisect can indeed handle this case today (think about how this could work, and try it!) > >>> By the way, although "git bisect fixed/unfixed" would be a very >>useful >>> improvement, and has gone unimplemented for a lamentably long time, >>my >>> personal feeling is that it has too meat in it to constitute a GSoC >>> project by itself. >>> >> >>Oh! Then in fact, as Christian Couder said, this project shouldn't be >>marked as "easy". > > Sorry for the typo; I meant to say "too LITTLE meat". > > > -- > Michael Haggerty > mhagger@alum.mit.edu ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: An idea for "git bisect" and a GSoC enquiry [not found] ` <a8cf74b4-bae1-4511-a45e-d4ca90e3c3e1@email.android.com> 2014-02-28 9:07 ` Jacopo Notarstefano @ 2014-02-28 9:13 ` Jacopo Notarstefano 1 sibling, 0 replies; 17+ messages in thread From: Jacopo Notarstefano @ 2014-02-28 9:13 UTC (permalink / raw) To: Michael Haggerty; +Cc: git, Junio C Hamano, Christian Couder > - git could emit an error message and refuse to continue > - git could interpret the command one way or the other, with or without a warning > > By my count that gives at least five possibilities. The feature cannot be implemented without choosing one. > Let me explain what I meant with an example. 1) The user starts bisecting with bisect start. 2) The user marks HEAD as good with git bisect mark good. 3) The user then marks HEAD~10 as fixed with git bisect mark fixed. 4) Git will then continue bisecting as usual with the labels "good" and "fixed" instead of "bad" and "good" respectively. This is very confusing, but is a result of a user semantic error, so no warning is emitted. After all, this might have been what the user wanted. > That is not correct. If there is a bug on one branch but not another, it is legitimate to ask when the bug was introduced, and git bisect can indeed handle this case today (think about how this could work, and try it!) > Interesting. I did not know that. Yes, I see how that might pan out, and why my idea is worse. > Sorry for the typo; I meant to say "too LITTLE meat". > Ok. Not a big issue for me: I might squash another project together in my proposal. I've already seen one that piqued my interest: "Unifying git branch -l, git tag -l, and git for-each-ref". (Sorry for sending this email twice! I thought I had sent it to the list as well.) On Thu, Feb 27, 2014 at 8:32 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote: > Please forgive my typos and brevity; this was typed on a phone. 
> > Michael > On February 27, 2014 5:16:40 PM CET, Jacopo Notarstefano <jacopo.notarstefano@gmail.com> wrote: >>On Thu, Feb 27, 2014 at 12:18 PM, Michael Haggerty >><mhagger@alum.mit.edu> wrote: >>> What happens if the user mixes, say, "good" and "fixed" in a single >>> bisect session? >>> >> >>I don't think that's an issue. If the user uses the label "fixed" >>instead of "bad" she will have a hard time remembering to use it every >>time she needs it, and maybe the output of "git bisect" will look very >>confusing, but what can git do? This is a semantic user input error, >>not a syntax one. > > - git could emit an error message and refuse to continue > - git could interpret the command one way or the other, with or without a warning > > By my count that gives at least five possibilities. The feature cannot be implemented without choosing one. > >>> I think it would be more convenient if "git bisect" would autodetect >>> whether the history went from "good" to "bad" or vice versa. The >>> algorithm could be: >>> >>> 1. Wait until the user has marked one commit "bad" and one commit >>"good". >>> >>> 2. If a "good" commit is an ancestor of a "bad" one, then "git >>bisect" >>> should announce "I will now look for the first bad commit". If >>> reversed, then announce "I will now look for the first good commit". >>If >>> neither commit is an ancestor of the other, then explain the >>situation >>> and ask the user to run "git bisect find-first-bad" or "git bisect >>> find-first-good" or to mark another commit "bad" or "good". >>> >>> 3. If the user marks another commit, go back to step 2, also doing a >>> consistency check to make sure that all of the ancestry relationships >>go >>> in a consistent direction. >>> >>> 4. After the direction is clear, the old bisect algorithm can be used >>> (though taking account of the direction). Obviously a lot of the >>output >>> would have to be adjusted, as would the way that a bisect is >>visualized. 
>>> >>> I can't think of any fundamental problems with a scheme like this, >>and I >>> think it would be easier to use than the unfixed/fixed scheme. But >>that >>> is only my opinion; other opinions are undoubtedly available :-) >>> >> >>I like this idea! It also looks fun to implement. A minor difference >>is that I'd rather die with an error on point 2) if there's no >>ancestorship relation between the two commits; if the user is asking >>for such a thing then she has a fundamental misconception of the state >>of her repository. > > That is not correct. If there is a bug on one branch but not another, it is legitimate to ask when the bug was introduced, and git bisect can indeed handle this case today (think about how this could work, and try it!) > >>> By the way, although "git bisect fixed/unfixed" would be a very >>useful >>> improvement, and has gone unimplemented for a lamentably long time, >>my >>> personal feeling is that it has too meat in it to constitute a GSoC >>> project by itself. >>> >> >>Oh! Then in fact, as Christian Couder said, this project shouldn't be >>marked as "easy". > > Sorry for the typo; I meant to say "too LITTLE meat". > > > -- > Michael Haggerty > mhagger@alum.mit.edu ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: An idea for "git bisect" and a GSoC enquiry 2014-02-26 8:28 An idea for "git bisect" and a GSoC enquiry Jacopo Notarstefano 2014-02-26 19:58 ` Junio C Hamano 2014-02-27 11:18 ` Michael Haggerty @ 2014-02-27 14:47 ` Christian Couder 2014-02-27 22:46 ` Andrew Ardill 2 siblings, 1 reply; 17+ messages in thread From: Christian Couder @ 2014-02-27 14:47 UTC (permalink / raw) To: Jacopo Notarstefano; +Cc: git Hi, On Wed, Feb 26, 2014 at 9:28 AM, Jacopo Notarstefano <jacopo.notarstefano@gmail.com> wrote: > Hey everyone, > > my name is Jacopo, a student developer from Italy, and I'm interested > in applying to this years' Google Summer of Code. I set my eyes on the > project called "git-bisect improvements", in particular the subtask > about swapping the "good" and "bad" labels when looking for a > bug-fixing release. > > I have a very simple proposal for that: add a new "mark" subcommand. > Here is an example of how it should work: > > 1) A developer wants to find in which commit a past regression was > fixed. She start bisecting as usual with "git bisect start". > 2) The current HEAD has the bugfix, so she marks it as fixed with "git > bisect mark fixed". > 3) She knows that HEAD~100 had the regression, so she marks it as > unfixed with "git bisect mark unfixed". > 4) Now that git knows what the two labels are, it starts bisecting as usual. > > For compatibility with already written scripts, "git bisect good" and > "git bisect bad" will alias to "git bisect mark good" and "git bisect > mark bad" respectively. > > Does this make sense? Did I overlook some details? As Junio said adding a command "mark" doesn't by itself solve the difficult problems related to this project. (By the way I think it is misleading to state that this GSoC is "easy".) > There were already several proposals on this topic, among which those > listed at https://git.wiki.kernel.org/index.php/SmallProjectsIdeas#git_bisect_fix.2Funfixed. 
> I'm interested in contacting the prospective mentor, Christian Couder, > to go over these. What's the proper way to ask for an introduction? As Michael said, you can just CC me or send me a private email. But I think the most important thing right now is first to gather as much information as you can from the previous discussions on this topic on this mailing list. Perhaps you should also gather information on how git bisect works. It will help you understand what the difficult problems are. One of the problems, for example, is that git bisect can work using a "good" commit that is not an ancestor of the "bad" commit. In this case it will check out the merge bases between the good and the bad commit. (And by the way this is related to the bug that should also be fixed as part of this project.) Then you are welcome to come back and ask questions, or suggest solutions. > I tried asking on IRC, but had no success. Sorry, but I don't use IRC. Thanks, Christian. ^ permalink raw reply [flat|nested] 17+ messages in thread
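The merge-base situation mentioned above can be illustrated on a toy history. This is only a sketch under stated assumptions: the PARENTS graph and the functions ancestors and merge_bases are hypothetical and brute-force, not git's implementation (in a real repository, "git merge-base" computes this).

```python
from typing import Dict, List, Set

# Toy history (hypothetical): each commit maps to its list of parents.
PARENTS: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "C": ["B"],  # "bad" tip on the main line: A - B - C
    "D": ["B"],  # "good" tip on a side branch: B - D
}


def ancestors(commit: str) -> Set[str]:
    """Return `commit` together with everything reachable from it."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def merge_bases(a: str, b: str) -> Set[str]:
    """Return the common ancestors of a and b that are not themselves
    ancestors of another common ancestor."""
    common = ancestors(a) & ancestors(b)
    return {
        c
        for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    }
```

Here the "good" commit D is not an ancestor of the "bad" commit C, and merge_bases("D", "C") yields {"B"}, the commit that would be checked out for testing first in this toy model.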
* Re: An idea for "git bisect" and a GSoC enquiry 2014-02-27 14:47 ` Christian Couder @ 2014-02-27 22:46 ` Andrew Ardill 0 siblings, 0 replies; 17+ messages in thread From: Andrew Ardill @ 2014-02-27 22:46 UTC (permalink / raw) To: Christian Couder; +Cc: Jacopo Notarstefano, git On 27 February 2014 06:47, Christian Couder <christian.couder@gmail.com> wrote: > But I think the most important thing right now is first to gather as > much information as you can from the previous discussions on this > topic on this mailing list. > Perhaps you should also gather information on how git bisect works. I have also, at one time, started working on this problem, though I never submitted any of my patches :(. I went the way of renaming the internal logic to make it less tied to the good/bad distinction that is currently hard coded in. That may not be the best starting point, but let me summarise the thoughts I had at the time, particularly around the different adjective pairs that we might use. A general description of git bisect is that you start with a commit that exhibits a given property, find a commit that does not have that property, and then look for when the property was introduced. I think of this property as the 'bisect property' of the bisect search. The property is described with our adjective pair, currently 'bad' (with the property) and 'good' (without the property). We assume that commits with the property have an ancestor without the property, and as this assumption is so essential to how git bisect works I think of it as the 'bisect relationship' of the bisect search, and we care about the direction of this relationship between commits. The proposed adjectives tend to be along the lines of the following: - good->bad (current); good<->bad The bisect property is currently always described as 'bad', the introduction of a bug being the motivating use case. 
The problem with this is that we often want to find when a 'good' behaviour was introduced, or when a neutral change occurred. A solution is to allow reversing our bisect relationship, by either detecting the intended direction or allowing the user to choose. If we reverse the direction our adjectives also flip, and so the bisect property we are now looking for is 'good' instead of 'bad'. The terms good and bad don't work well with neutral searches. - unfixed->fixed For this pair, the bisect property would always be described by the 'fixed' adjective. It seems odd to ever reverse the bisect relationship, as we don't usually say something was 'fixed' and then became 'unfixed'. The behaviour of this pair would thus be near identical to current usage of 'good->bad', but with the bisect property conceptually reversed (when was a bug fixed vs when was a bug introduced). - old->new This pair avoids making any judgement on what type of bisect property we have. The adjectives are thus simply describing the bisect relationship, and the user is free to use any bisect property they wish. The main problem with this is that it is possible to have commits without the property (thus described as 'old') that were made chronologically after a commit with the property ('new'). This has the potential to cause confusion for users. - without->with This pair also avoids making a judgement on the bisect property, but avoids potential chronological confusion that 'old->new' has. You could potentially allow users to reverse the bisect relationship's direction, but these adjectives allow you to easily invert the bisect property without causing confusion. For example, 'without bug XYZ' can instead be written as 'with bug XYZ fixed'. ---- My preference is for the without->with adjective pair, as I believe it maps most closely to the concept of finding a commit that changed a given property, and it allows that property to be negated without introducing too much confusion. 
Reversing the relationship's direction would also make sense, however that is a significantly greater change to the command's logic. Thus, my initial work was to refactor the internal naming to use the terms with and without, as that would make a better place from which to add other features (such as reversing the relationship direction, or adding new adjective pairs). Sorry if that is all confusing to read, or if I'm repeating things that have been said before :) Regards, Andrew Ardill ^ permalink raw reply [flat|nested] 17+ messages in thread
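As a rough illustration of the refactoring idea above, the adjective pairs could be mapped onto internal "without"/"with" names. This is only a sketch: TERM_PAIRS, build_lookup and normalize are hypothetical names, and nothing here reflects git's actual code.

```python
from typing import Dict, List, Tuple

# Each pair is (term for "without the property", term for "with the property"),
# following the adjective pairs listed in the message above.
TERM_PAIRS: List[Tuple[str, str]] = [
    ("good", "bad"),
    ("unfixed", "fixed"),
    ("old", "new"),
    ("without", "with"),
]


def build_lookup(pairs: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map every user-facing adjective to the internal name it stands for."""
    lookup: Dict[str, str] = {}
    for without_term, with_term in pairs:
        lookup[without_term] = "without"
        lookup[with_term] = "with"
    return lookup


LOOKUP = build_lookup(TERM_PAIRS)


def normalize(term: str) -> str:
    """Translate a user-facing adjective into 'with' or 'without'."""
    try:
        return LOOKUP[term]
    except KeyError:
        raise ValueError(f"unknown bisect term: {term!r}") from None
```

Under this mapping, "bad", "fixed" and "new" all normalize to "with", so the rest of the logic would only ever need to reason about one pair of internal names.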
end of thread, other threads:[~2014-03-13 18:47 UTC | newest]
Thread overview: 17+ messages
2014-02-26 8:28 An idea for "git bisect" and a GSoC enquiry Jacopo Notarstefano
2014-02-26 19:58 ` Junio C Hamano
2014-02-28 9:00 ` Jacopo Notarstefano
2014-02-27 11:18 ` Michael Haggerty
2014-02-27 12:09 ` Matthieu Moy
2014-02-28 9:03 ` Jacopo Notarstefano
2014-02-28 18:31 ` Junio C Hamano
2014-03-01 11:31 ` Jacopo Notarstefano
2014-03-03 18:34 ` Junio C Hamano
2014-03-12 1:32 ` Jacopo Notarstefano
2014-03-12 18:31 ` Junio C Hamano
2014-03-13 17:19 ` Michael Haggerty
2014-03-13 18:47 ` Junio C Hamano
[not found] ` <CAL0uuq3TGb2wjaqNxwXYa++E5rjVoozox5mZbzTaE17OKtsVTg@mail.gmail.com>
[not found] ` <a8cf74b4-bae1-4511-a45e-d4ca90e3c3e1@email.android.com>
2014-02-28 9:07 ` Jacopo Notarstefano
2014-02-28 9:13 ` Jacopo Notarstefano
2014-02-27 14:47 ` Christian Couder
2014-02-27 22:46 ` Andrew Ardill