* GSoC Project | Improvise git bisect
@ 2016-03-19 9:33 Pranit Bauva
2016-03-19 12:48 ` Matthieu Moy
0 siblings, 1 reply; 16+ messages in thread
From: Pranit Bauva @ 2016-03-19 9:33 UTC (permalink / raw)
To: Git List
Cc: Johannes Schindelin, Stefan Beller, Christian Couder,
Matthieu Moy, Lars Schneider, Jeff King, troy.moure,
Junio C Hamano, Eric Sunshine, me, Philip Oakley
Hey everyone!
I am Pranit Bauva. I am studying Mining Engineering at IIT Kharagpur.
I am interested in participating in Google Summer of Code under Git
organization. I have attempted a micro-project to add configuration to
commonly used command line options `git commit -v` ($gmane/288820).
I am interested in 2 project related to git bisect confused between the two:
- Implement git bisect --first-parent
> When your project is strictly “new features are merged into trunk, never the other
> way around”, it is handy to be able to first find a merge on the trunk that merged a
> topic to point fingers at when a bug appears, instead of having to drill down to the
> individual commit on the faulty side branch.
> Cf. http://thread.gmane.org/gmane.comp.version-control.git/264661/focus=264720
What I understood is that let's say the repository is like :
C13
|
C12
|
C11 (merge commit)
/ |
| C10
| |
| C9
| |
| C6 (merge commit)
C8 | \
| C3 |
C7 | |
\ | C5
C2 |
| C4
| /
C1
(master branch)
The commits numbers ie. C1...C13 are according to the time stamp, C1
being the first. On starting to debug with git bisect, given that C12
is bad and C1 is good, it starts a binary search from C1...C13. ie. It
first goes to C7, if its all good, it goes to C10 and so on an so
forth. If C7 is not good, it goes to C4 and so on and so forth. This
just makes the job of debugging a bit difficult for a repo which has
only 1 mainstream repository and it just has some short-term branches
to instantly get stuff done. It can be simplified by using
--first-parent. Given C1 is good and C12 is bad, it will find the mean
between {C1, C2, C3, C6, C9, C10, C11, C12, C13} which is C9, see if
its good. If not then it will go to C3 and then C2, if good then it
will go to C6, if not good then it will go to C5 and then C4. This
will greatly simplify the job of debugging.
- Rewrite git-bisect.sh as bisect.c and bisect.h
For this I plan to go along the guidelines of Paul Tan's previous
year work. I have followed his work and his way seems nice to go about
with rewriting.
What are your suggestions about these project? Which one would you
recommend me to do? I have tried to cc everyone who participated in
the discussion for -first-parent
Regards,
Pranit Bauva
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-19 9:33 GSoC Project | Improvise git bisect Pranit Bauva @ 2016-03-19 12:48 ` Matthieu Moy 2016-03-19 16:14 ` Christian Couder 0 siblings, 1 reply; 16+ messages in thread From: Matthieu Moy @ 2016-03-19 12:48 UTC (permalink / raw) To: Pranit Bauva Cc: Git List, Johannes Schindelin, Stefan Beller, Christian Couder, Lars Schneider, Jeff King, troy.moure, Junio C Hamano, Eric Sunshine, me, Philip Oakley > Subject: Re: GSoC Project | Improvise git bisect ^^^^ "Improve" I guess. Pranit Bauva <pranit.bauva@gmail.com> writes: > Hey everyone! Hi, > What I understood is that let's say the repository is like : > > C13 > | > C12 > | > C11 (merge commit) > / | > | C10 > | | > | C9 > | | > | C6 (merge commit) > C8 | \ > | C3 | > C7 | | > \ | C5 > C2 | > | C4 > | / > C1 > (master branch) When drawing ascii-art diagrams like this, try to use a fixed-width font. It looks ugly in my mailer. > The commits numbers ie. C1...C13 are according to the time stamp, C1 > being the first. One information is missing: which is the first parent. > On starting to debug with git bisect, given that C12 is bad and C1 is > good, it starts a binary search from C1...C13. ie. It first goes to > C7, I don't think so. It tries to find a commit which cuts the graph into 2 sub-graphs with roughly the same number of commits. If you pick C7, then C7 is bad, the regression may be anywhere except C1, C2, C7. This does not reduce the scope much. I guess you picked C7 because of the timestamps. "bisect" picks the commit according to the graph topology. > if its all good, it goes to C10 and so on an so forth. If C7 is not > good, it goes to C4 and so on and so forth. This just makes the job of > debugging a bit difficult for a repo which has only 1 mainstream > repository and it just has some short-term branches to instantly get > stuff done. Why? > It can be simplified by using --first-parent. Given C1 is good and C12 > is bad, it will find the mean between {C1, C2, C3, C6, C9, C10, C11, > C12, C13} which is C9, see if its good. Do you mean that C10 is the first parent of C11, and C3 the first parent of C6? That's an un-usual graphical convention: usually we represent first parent as leftmost parent. > If not then it will go to C3 > and then C2, if good then it will go to C6, if not good then it will > go to C5 and then C4. This will greatly simplify the job of debugging. Again, why? The missing part in your explanation is probably: Some projects do not enforce the policy "each commit must be compilable and correct", but instead consider that only commits on the mainline should have this property. This typically allows history like A Merge feature A |\ | B fix bug in feature A | | | C fix compilation error in previous commit | | | D implement feature A |/ E Merge feature B ... When bisecting through such history, testing commits B and C is meaningless, but it still makes sense to bisect through the mainling commits A and E. In this case, we can consider that if E is good and A is bad, then the regression was introduced in A. Once we know that, we can actually continue the bisection: "OK, the regression was introduced in mainline at merge commit A, let's see if the branch being merged is bisectable", which could be recursive if the topic branch contains merge commits. > - Rewrite git-bisect.sh as bisect.c and bisect.h > > For this I plan to go along the guidelines of Paul Tan's previous > year work. I have followed his work and his way seems nice to go about > with rewriting. Please elaborate. Your proposal needs to be convincing enough that mentors accept to commit to mentoring the project. "I'll do like Paul Tan" is by far not sufficient. I'm actually not sure the same plan applies here: there's already a C helper for bisect, so an incremental rewrite may be more appropriate: port functions one by one from shell to C untill the shell part is empty. I don't know the bisect code well enough to know which approach would work best. Cheers, -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-19 12:48 ` Matthieu Moy @ 2016-03-19 16:14 ` Christian Couder 2016-03-19 16:49 ` Pranit Bauva 0 siblings, 1 reply; 16+ messages in thread From: Christian Couder @ 2016-03-19 16:14 UTC (permalink / raw) To: Matthieu Moy Cc: Pranit Bauva, Git List, Johannes Schindelin, Stefan Beller, Lars Schneider, Jeff King, troy.moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley Hi, On Sat, Mar 19, 2016 at 1:48 PM, Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote: >> Subject: Re: GSoC Project | Improvise git bisect > ^^^^ > > "Improve" I guess. > > Pranit Bauva <pranit.bauva@gmail.com> writes: > >> Hey everyone! > > Hi, > >> What I understood is that let's say the repository is like : >> >> C13 >> | >> C12 >> | >> C11 (merge commit) >> / | >> | C10 >> | | >> | C9 >> | | >> | C6 (merge commit) >> C8 | \ >> | C3 | >> C7 | | >> \ | C5 >> C2 | >> | C4 >> | / >> C1 >> (master branch) > > When drawing ascii-art diagrams like this, try to use a fixed-width > font. It looks ugly in my mailer. Ah, it looks ok in gmail. >> The commits numbers ie. C1...C13 are according to the time stamp, C1 >> being the first. > > One information is missing: which is the first parent. Yeah it is not clear but we can suppose that the first parents are among C1, C2, C3,C6, C9, C10, C11, C12 and C13. So the first parent of C11 would be C10 and the first parent of C6 would be C3. >> On starting to debug with git bisect, given that C12 is bad and C1 is >> good, it starts a binary search from C1...C13. ie. It first goes to >> C7, First if C1 is good and C12 is bad then the binary search is between C1 and C12. C13 is excluded. > I don't think so. It tries to find a commit which cuts the graph into 2 > sub-graphs with roughly the same number of commits. If you pick C7, then > C7 is bad, the regression may be anywhere except C1, C2, C7. This does > not reduce the scope much. If C7 is bad then, as C1 is good the "first bad commit" is C7 or C2. It's when C7 is good that C7 and C2 are excluded. > I guess you picked C7 because of the timestamps. "bisect" picks the > commit according to the graph topology. Yeah. Basically it will pick the commit that is the farther away from the "bad" and "good" commits. That means C6 or C9 will be picked, so it looks like the graph is not a good example of why --first-parent could be useful. >> if its all good, it goes to C10 and so on an so forth. If C7 is not >> good, it goes to C4 and so on and so forth. This just makes the job of >> debugging a bit difficult for a repo which has only 1 mainstream >> repository and it just has some short-term branches to instantly get >> stuff done. > > Why? > >> It can be simplified by using --first-parent. Given C1 is good and C12 >> is bad, it will find the mean between {C1, C2, C3, C6, C9, C10, C11, >> C12, C13} which is C9, see if its good. It would find C6 or C9 even without --first-parent. > Do you mean that C10 is the first parent of C11, and C3 the first parent > of C6? That's an un-usual graphical convention: usually we represent > first parent as leftmost parent. Yeah. >> If not then it will go to C3 >> and then C2, if good then it will go to C6, if not good then it will >> go to C5 and then C4. This will greatly simplify the job of debugging. > > Again, why? > > The missing part in your explanation is probably: > > Some projects do not enforce the policy "each commit must be compilable > and correct", but instead consider that only commits on the mainline > should have this property. Yeah. And there were previous discussions on the mailing list where --first-parent was discussed. It would be nice if they were refered to. They might talk about other interesting use cases. > This typically allows history like > > A Merge feature A > |\ > | B fix bug in feature A > | | > | C fix compilation error in previous commit > | | > | D implement feature A > |/ > E Merge feature B > ... > > When bisecting through such history, testing commits B and C is > meaningless, but it still makes sense to bisect through the mainling > commits A and E. In this case, we can consider that if E is good and A > is bad, then the regression was introduced in A. > > Once we know that, we can actually continue the bisection: "OK, the > regression was introduced in mainline at merge commit A, let's see if > the branch being merged is bisectable", which could be recursive if the > topic branch contains merge commits. > >> - Rewrite git-bisect.sh as bisect.c and bisect.h >> >> For this I plan to go along the guidelines of Paul Tan's previous >> year work. I have followed his work and his way seems nice to go about >> with rewriting. > > Please elaborate. Your proposal needs to be convincing enough that > mentors accept to commit to mentoring the project. "I'll do like Paul > Tan" is by far not sufficient. > > I'm actually not sure the same plan applies here: there's already a C > helper for bisect, so an incremental rewrite may be more appropriate: > port functions one by one from shell to C untill the shell part is > empty. Yeah, I think an incremental rewrite is more appropriate. > I don't know the bisect code well enough to know which approach would > work best. Best, Christian. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-19 16:14 ` Christian Couder @ 2016-03-19 16:49 ` Pranit Bauva 2016-03-19 22:05 ` Stefan Beller 2016-03-20 11:35 ` Pranit Bauva 0 siblings, 2 replies; 16+ messages in thread From: Pranit Bauva @ 2016-03-19 16:49 UTC (permalink / raw) To: Christian Couder Cc: Matthieu Moy, Git List, Johannes Schindelin, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley On Sat, Mar 19, 2016 at 9:44 PM, Christian Couder <christian.couder@gmail.com> wrote: > Hi, > > On Sat, Mar 19, 2016 at 1:48 PM, Matthieu Moy > <Matthieu.Moy@grenoble-inp.fr> wrote: >>> Subject: Re: GSoC Project | Improvise git bisect >> ^^^^ >> >> "Improve" I guess. >> >> Pranit Bauva <pranit.bauva@gmail.com> writes: >> >>> Hey everyone! >> >> Hi, >> >>> What I understood is that let's say the repository is like : >>> >>> C13 >>> | >>> C12 >>> | >>> C11 (merge commit) >>> / | >>> | C10 >>> | | >>> | C9 >>> | | >>> | C6 (merge commit) >>> C8 | \ >>> | C3 | >>> C7 | | >>> \ | C5 >>> C2 | >>> | C4 >>> | / >>> C1 >>> (master branch) >> >> When drawing ascii-art diagrams like this, try to use a fixed-width >> font. It looks ugly in my mailer. > > Ah, it looks ok in gmail. > >>> The commits numbers ie. C1...C13 are according to the time stamp, C1 >>> being the first. >> >> One information is missing: which is the first parent. > > Yeah it is not clear but we can suppose that the first parents are > among C1, C2, C3,C6, C9, C10, C11, C12 and C13. > So the first parent of C11 would be C10 and the first parent of C6 would be C3. > >>> On starting to debug with git bisect, given that C12 is bad and C1 is >>> good, it starts a binary search from C1...C13. ie. It first goes to >>> C7, > > First if C1 is good and C12 is bad then the binary search is between C1 and C12. > C13 is excluded. > >> I don't think so. It tries to find a commit which cuts the graph into 2 >> sub-graphs with roughly the same number of commits. If you pick C7, then >> C7 is bad, the regression may be anywhere except C1, C2, C7. This does >> not reduce the scope much. > > If C7 is bad then, as C1 is good the "first bad commit" is C7 or C2. > It's when C7 is good that C7 and C2 are excluded. > >> I guess you picked C7 because of the timestamps. "bisect" picks the >> commit according to the graph topology. > > Yeah. Basically it will pick the commit that is the farther away from > the "bad" and "good" commits. > That means C6 or C9 will be picked, so it looks like the graph is not > a good example of why --first-parent could be useful. > >>> if its all good, it goes to C10 and so on an so forth. If C7 is not >>> good, it goes to C4 and so on and so forth. This just makes the job of >>> debugging a bit difficult for a repo which has only 1 mainstream >>> repository and it just has some short-term branches to instantly get >>> stuff done. >> >> Why? >> >>> It can be simplified by using --first-parent. Given C1 is good and C12 >>> is bad, it will find the mean between {C1, C2, C3, C6, C9, C10, C11, >>> C12, C13} which is C9, see if its good. > > It would find C6 or C9 even without --first-parent. > >> Do you mean that C10 is the first parent of C11, and C3 the first parent >> of C6? That's an un-usual graphical convention: usually we represent >> first parent as leftmost parent. > > Yeah. > >>> If not then it will go to C3 >>> and then C2, if good then it will go to C6, if not good then it will >>> go to C5 and then C4. This will greatly simplify the job of debugging. >> >> Again, why? >> >> The missing part in your explanation is probably: >> >> Some projects do not enforce the policy "each commit must be compilable >> and correct", but instead consider that only commits on the mainline >> should have this property. > > Yeah. And there were previous discussions on the mailing list where > --first-parent was discussed. > It would be nice if they were refered to. They might talk about other > interesting use cases. > >> This typically allows history like >> >> A Merge feature A >> |\ >> | B fix bug in feature A >> | | >> | C fix compilation error in previous commit >> | | >> | D implement feature A >> |/ >> E Merge feature B >> ... >> >> When bisecting through such history, testing commits B and C is >> meaningless, but it still makes sense to bisect through the mainling >> commits A and E. In this case, we can consider that if E is good and A >> is bad, then the regression was introduced in A. >> >> Once we know that, we can actually continue the bisection: "OK, the >> regression was introduced in mainline at merge commit A, let's see if >> the branch being merged is bisectable", which could be recursive if the >> topic branch contains merge commits. I guess I had quite a lot of conceptual doubts regarding this. I will search more about this. >> >>> - Rewrite git-bisect.sh as bisect.c and bisect.h >>> >>> For this I plan to go along the guidelines of Paul Tan's previous >>> year work. I have followed his work and his way seems nice to go about >>> with rewriting. >> >> Please elaborate. Your proposal needs to be convincing enough that >> mentors accept to commit to mentoring the project. "I'll do like Paul >> Tan" is by far not sufficient. >> >> I'm actually not sure the same plan applies here: there's already a C >> helper for bisect, so an incremental rewrite may be more appropriate: >> port functions one by one from shell to C untill the shell part is >> empty. > > Yeah, I think an incremental rewrite is more appropriate. > >> I don't know the bisect code well enough to know which approach would >> work best. Sorry it was a mistake on my part. I should have explained it in very detail. I will do it within a day. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-19 16:49 ` Pranit Bauva @ 2016-03-19 22:05 ` Stefan Beller 2016-03-20 8:10 ` Pranit Bauva 2016-03-20 11:35 ` Pranit Bauva 1 sibling, 1 reply; 16+ messages in thread From: Stefan Beller @ 2016-03-19 22:05 UTC (permalink / raw) To: Pranit Bauva, Christian Couder Cc: Matthieu Moy, Git List, Johannes Schindelin, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley, s-beyer On 19.03.2016 09:49, Pranit Bauva wrote: > On Sat, Mar 19, 2016 at 9:44 PM, Christian Couder > <christian.couder@gmail.com> wrote: >> Hi, >> >> On Sat, Mar 19, 2016 at 1:48 PM, Matthieu Moy >> <Matthieu.Moy@grenoble-inp.fr> wrote: >>>> Subject: Re: GSoC Project | Improvise git bisect >>> ^^^^ >>> >>> "Improve" I guess. >>> >>> Pranit Bauva <pranit.bauva@gmail.com> writes: >>> >>>> Hey everyone! >>> >>> Hi, >>> >>>> What I understood is that let's say the repository is like : >>>> >>>> C13 >>>> | >>>> C12 >>>> | >>>> C11 (merge commit) >>>> / | >>>> | C10 >>>> | | >>>> | C9 >>>> | | >>>> | C6 (merge commit) >>>> C8 | \ >>>> | C3 | >>>> C7 | | >>>> \ | C5 >>>> C2 | >>>> | C4 >>>> | / >>>> C1 >>>> (master branch) >>> >>> When drawing ascii-art diagrams like this, try to use a fixed-width >>> font. It looks ugly in my mailer. >> >> Ah, it looks ok in gmail. >> >>>> The commits numbers ie. C1...C13 are according to the time stamp, C1 >>>> being the first. >>> >>> One information is missing: which is the first parent. >> >> Yeah it is not clear but we can suppose that the first parents are >> among C1, C2, C3,C6, C9, C10, C11, C12 and C13. >> So the first parent of C11 would be C10 and the first parent of C6 would be C3. >> >>>> On starting to debug with git bisect, given that C12 is bad and C1 is >>>> good, it starts a binary search from C1...C13. ie. It first goes to >>>> C7, >> >> First if C1 is good and C12 is bad then the binary search is between C1 and C12. >> C13 is excluded. >> >>> I don't think so. It tries to find a commit which cuts the graph into 2 >>> sub-graphs with roughly the same number of commits. If you pick C7, then >>> C7 is bad, the regression may be anywhere except C1, C2, C7. This does >>> not reduce the scope much. >> >> If C7 is bad then, as C1 is good the "first bad commit" is C7 or C2. >> It's when C7 is good that C7 and C2 are excluded. >> >>> I guess you picked C7 because of the timestamps. "bisect" picks the >>> commit according to the graph topology. >> >> Yeah. Basically it will pick the commit that is the farther away from >> the "bad" and "good" commits. >> That means C6 or C9 will be picked, so it looks like the graph is not >> a good example of why --first-parent could be useful. >> >>>> if its all good, it goes to C10 and so on an so forth. If C7 is not >>>> good, it goes to C4 and so on and so forth. This just makes the job of >>>> debugging a bit difficult for a repo which has only 1 mainstream >>>> repository and it just has some short-term branches to instantly get >>>> stuff done. >>> >>> Why? >>> >>>> It can be simplified by using --first-parent. Given C1 is good and C12 >>>> is bad, it will find the mean between {C1, C2, C3, C6, C9, C10, C11, >>>> C12, C13} which is C9, see if its good. >> >> It would find C6 or C9 even without --first-parent. >> >>> Do you mean that C10 is the first parent of C11, and C3 the first parent >>> of C6? That's an un-usual graphical convention: usually we represent >>> first parent as leftmost parent. >> >> Yeah. >> >>>> If not then it will go to C3 >>>> and then C2, if good then it will go to C6, if not good then it will >>>> go to C5 and then C4. This will greatly simplify the job of debugging. >>> >>> Again, why? >>> >>> The missing part in your explanation is probably: >>> >>> Some projects do not enforce the policy "each commit must be compilable >>> and correct", but instead consider that only commits on the mainline >>> should have this property. >> >> Yeah. And there were previous discussions on the mailing list where >> --first-parent was discussed. >> It would be nice if they were refered to. They might talk about other >> interesting use cases. >> >>> This typically allows history like >>> >>> A Merge feature A >>> |\ >>> | B fix bug in feature A >>> | | >>> | C fix compilation error in previous commit >>> | | >>> | D implement feature A >>> |/ >>> E Merge feature B >>> ... >>> >>> When bisecting through such history, testing commits B and C is >>> meaningless, but it still makes sense to bisect through the mainling >>> commits A and E. In this case, we can consider that if E is good and A >>> is bad, then the regression was introduced in A. >>> >>> Once we know that, we can actually continue the bisection: "OK, the >>> regression was introduced in mainline at merge commit A, let's see if >>> the branch being merged is bisectable", which could be recursive if the >>> topic branch contains merge commits. > > I guess I had quite a lot of conceptual doubts regarding this. I will > search more about this. Once upon a time, a discussion produced this proposal[1], which tries to split up the set as good as possible (50:50) instead of inspecting the branch/merging structure of the underlying graph. There was a recent series on bisect by Stephan Beyer[2], who is cc'd now, maybe he has some thoughts on improving bisect. Thanks, Stefan [1] https://docs.google.com/document/d/1hzF8fZbsQtKwUPH60dsEwVZM2wmESFq713SeAsg_hkc/edit?usp=sharing [2] http://comments.gmane.org/gmane.comp.version-control.git/287513 > >>> >>>> - Rewrite git-bisect.sh as bisect.c and bisect.h >>>> >>>> For this I plan to go along the guidelines of Paul Tan's previous >>>> year work. I have followed his work and his way seems nice to go about >>>> with rewriting. >>> >>> Please elaborate. Your proposal needs to be convincing enough that >>> mentors accept to commit to mentoring the project. "I'll do like Paul >>> Tan" is by far not sufficient. >>> >>> I'm actually not sure the same plan applies here: there's already a C >>> helper for bisect, so an incremental rewrite may be more appropriate: >>> port functions one by one from shell to C untill the shell part is >>> empty. >> >> Yeah, I think an incremental rewrite is more appropriate. >> >>> I don't know the bisect code well enough to know which approach would >>> work best. > > Sorry it was a mistake on my part. I should have explained it in very > detail. I will do it within a day. > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-19 22:05 ` Stefan Beller @ 2016-03-20 8:10 ` Pranit Bauva 0 siblings, 0 replies; 16+ messages in thread From: Pranit Bauva @ 2016-03-20 8:10 UTC (permalink / raw) To: Stefan Beller Cc: Christian Couder, Matthieu Moy, Git List, Johannes Schindelin, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley, s-beyer On Sun, Mar 20, 2016 at 3:35 AM, Stefan Beller <stefanbeller@gmail.com> wrote: > Once upon a time, a discussion produced this proposal[1], > which tries to split up the set as good as possible (50:50) instead > of inspecting the branch/merging structure of the underlying graph. > Thanks! This helped me in getting to know how bisect actually works. > There was a recent series on bisect by Stephan Beyer[2], who is cc'd > now, maybe he has some thoughts on improving bisect. > > Thanks, > Stefan > > > [1] > https://docs.google.com/document/d/1hzF8fZbsQtKwUPH60dsEwVZM2wmESFq713SeAsg_hkc/edit?usp=sharing > > [2] http://comments.gmane.org/gmane.comp.version-control.git/287513 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-19 16:49 ` Pranit Bauva 2016-03-19 22:05 ` Stefan Beller @ 2016-03-20 11:35 ` Pranit Bauva 2016-03-20 13:24 ` Matthieu Moy ` (2 more replies) 1 sibling, 3 replies; 16+ messages in thread From: Pranit Bauva @ 2016-03-20 11:35 UTC (permalink / raw) To: Christian Couder Cc: Matthieu Moy, Git List, Johannes Schindelin, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley The project Idea: Incremental Rewrite from shell to C of git-bisect.sh The plan: - Place bisect.c in builtin/ - Implement a skeletal cmd_bisect() which will redirect to git-bisect.sh (1e1ea69f) - Introduce a structure for parsing the command line flags. - Start converting individual functions and place it in bisect.c The major objective would be to replicate function behavior rather than its actual implementation because what might seem to be a good choice in bash would not be a good choice in C and vice versa. For eg. bisect_skip() externally calls the command `git rev-list HEAD` to get a list of commits and then store the standard output in an array as bash can directly do the conversion, but it would not be a good thing to do in C. This might involve a very different approach to the existing code. To take care of this part, I will extensively debug git-bisect.sh and the other files it depends on ie. rev-list.c, rev-parse.c, for-each-ref.c, list_objects.c and bisect.c to study their behavior . I will spend some time in testing the functions for corner cases and running the ready-made tests to investigate its behavior. The goal would be to produce quality code that can be included in the next maintenance release 2.7.5. I will try to extensively review the patches myself first and then add it on the mailing list. I intend to discuss the whole structure of bisect.c based on my observations in the first week. Then accordingly send individual patches for each method. Also I am quite unsure how the patches would be maintained by Junio. Would he create a separate branch for me in his personal repo and then add the patches to it without merging it to pu and then when it is completely done, the merge will take place? Or he will individually first place every patch in pu and the normal process? Functions I intend to cover (This is from a first-read, I haven't actually debugged it. It might miss some technicalities or deviate from actual) : - bisect_head() : This tests whether there exists a regular file named "BISECT_HEAD". If it exists then it prints out the contents of it otherwise it prints the contents of current HEAD pointer. - bisect_autostart() : This sees if there exists a non-empty file "BISECT_HEAD". If yes, then it checks if standard input is available. If yes, then it asks whether the user wants to automatically start bisect. - bisect_start() : It calls `git rev-parse` with the arguments that this function was called to get the SHA-1 of the refs in the git repo. Then, if this repo is a bare one then it uses the flag `--no-checkout`. It then parses the options and accordingly updates the state variables. It then checks whether the user ran "git bisect start <sha1> <sha2>". If yes, then it assigns bad commit to sha1 and good commit to sha2. It then gets the path of HEAD pointer and stores its SHA-1 in head. The it checks whether the user has already started bisecting. If it has started, then it will reset to where it previously started. It will then get rid of any old bisect state. Then it will write new start state by putting the ref in BISECT_START, add them in BISECT_NAMES, and then store the log of the good and bad commit in BISECT_LOG. Then it will check whether the user can proceed to the next bisect state by using system signals. - bisect_write() : It understands whether the commit is good or bad and initializes variables accordingly. This will get the current ref to the commit it is on. It will then print its state in BISECT_LOG. - is_expected_rev() : It checks whether the argument matches the contents of BISECT_EXPECTED_REV - check_expected_rev() : It removes the revs BISECT_ANCESTORS_OK and BISECT_EXPECTED_REV is they fail the is_expect_rev() test - bisect_skip() : It specifies a range/single commit which needs to be skipped and then call bisect_state() with the required arguments to skip. - bisect_state() : It will first start auto bisect. It then verifies the ref and calls it bisect_write with the state and ref. Then it calls bisect_auto_next() - bisect_check_next() : It will check whether there exists a ref for a good commit and a bad commit. If not then it will just report it. It can bisect with only 1 bad commit without a good commit. - bisect_auto_next() : It checks whether bisection for next is possible. If yes then it calls bisect_next(). - bisect_next() : It calls first bisect_autostart() and it calls bisect_check_next() with the good commit. Then it will perform the bisection computation, display the results and then checkout. Then it will check if it should exit because bisection is finished. And then check for an error in the bisection process. - bisect_visualize() : It will call gitk and git log (if gitk is not available) and pass on the arguments also. - bisect_reset() : It checks for the file BISECT_START, if present then it will first checkout to the branch/commit/ref specified in the argument and then call bisect_clean_state(). - bisect_clean_state() : It will update all the refs which bisect changed safely with the help of `git update-ref`. It will then remove all the extra refs which bisect introduced during bisection. - bisect_replay() : It first calls bisect_reset(). Then it reads from log file and then runs git bisect and marks the commits good or bad all by itself using the information obtained from log file. - bisect_run() : This basically checks whether the script you provided with the arguments that you provided with and performs a fully automated bisection. - bisect_log() : Prints the contents of BISECT_LOG - get_terms() : It reads the contents of BISECT_TERMS and stores the SHA-1 of good commits and bad commits. - write_terms() : This will write good commit and bad commit in proper format to BISECT_TERMS - check_term_format() : Checks whether BISECT_TERMS has a proper SHA-1 and nothing else. - check_and_set_terms() : It verify and set the good/bad or new/old commit SHA-1 in BISECT_TERMS - bisect_voc() : simple switch case to print "bad|new" if argument is "bad" and "good|old" if argument is "good". - bisect_terms() : It will first receive terms using get_terms() or by using the arguments passed to it. Should I also include "How git bisect works internally?" in the proposal along with this? This is not my "first look" of the proposal. I am still working on the proposal. Half way there! And most importantly, Would anyone like to mentor me for this project? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-20 11:35 ` Pranit Bauva @ 2016-03-20 13:24 ` Matthieu Moy 2016-03-20 13:25 ` Christian Couder 2016-03-20 15:31 ` Johannes Schindelin 2 siblings, 0 replies; 16+ messages in thread From: Matthieu Moy @ 2016-03-20 13:24 UTC (permalink / raw) To: Pranit Bauva Cc: Christian Couder, Git List, Johannes Schindelin, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley Pranit Bauva <pranit.bauva@gmail.com> writes: > The project Idea: Incremental Rewrite from shell to C of git-bisect.sh > > The plan: > > - Place bisect.c in builtin/ > - Implement a skeletal cmd_bisect() which will redirect to > git-bisect.sh (1e1ea69f) > - Introduce a structure for parsing the command line flags. > - Start converting individual functions and place it in bisect.c The most important question is missing from this: how do you ensure that you periodically fall back to a consistant state (i.e. a state where your code is used and the whole testsuite passes)? Without this, the timeline may look like: - day 1: the 3 first steps above - 2 months: convert individual functions - last day: submit the whole thing And you absolutely want to avoid this. You need to get reviewable and mergeable code ASAP. As I wrote earlier, I'm not convinced that your plan is the best approach. Converting shell functions to the bisect helper is IMHO a better first step, and having a builtin cmd_bisect can come later. > a list of commits and then store the standard output in an array as > bash can directly do the conversion, but it would not be a good thing > to do in C. Small confusion here: there's no bash involved, just a POSIX /bin/sh (which may, or may not point to bash), and POSIX /bin/sh has no notion of array. > The goal would be to produce quality code that can be included in the > next maintenance release 2.7.5. I don't think this is a candidate for a maintenance release. Maintenance release include fixes for which we're confident enough that no regression or unexpected behavior change will happen. Complete rewrite is the opposite of that. > Also I am quite unsure how > the patches would be maintained by Junio. Would he create a separate > branch for me in his personal repo and then add the patches to it > without merging it to pu and then when it is completely done, the > merge will take place? Or he will individually first place every patch > in pu and the normal process? Junio usually does not maintain branches that are not merged in pu. "merged in pu" does not mean "accepted", just "worth paying attention". But this implies "does not introduce obvious regression" (cf. above). Each series is applied in its own branch, and then this branch is merged into pu. We consider GSoC students as normal contributors when it comes to submitting code (that's one of the reason why Junio is not a mentor). > Should I also include "How git bisect works internally?" in the > proposal along with this? You should include any convincing argument to support the claim that you are able to comptlete the project. I think this includes re-explaining in your own words how you understand bisect in its current form. > And most importantly, Would anyone like to mentor me for this project? Christian is a potential mentor and he knows bisect very well. I can't speak for him, but he'd probably be a good mentor for such project. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-20 11:35 ` Pranit Bauva 2016-03-20 13:24 ` Matthieu Moy @ 2016-03-20 13:25 ` Christian Couder 2016-03-20 15:31 ` Johannes Schindelin 2 siblings, 0 replies; 16+ messages in thread From: Christian Couder @ 2016-03-20 13:25 UTC (permalink / raw) To: Pranit Bauva Cc: Matthieu Moy, Git List, Johannes Schindelin, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley On Sun, Mar 20, 2016 at 12:35 PM, Pranit Bauva <pranit.bauva@gmail.com> wrote: > The project Idea: Incremental Rewrite from shell to C of git-bisect.sh > > The plan: > > - Place bisect.c in builtin/ > - Implement a skeletal cmd_bisect() which will redirect to > git-bisect.sh (1e1ea69f) > - Introduce a structure for parsing the command line flags. > - Start converting individual functions and place it in bisect.c I sent the following PR 8 days ago to add something about this idea on the SoC Idea page, but unfortunately it has been merged only yesterday: https://github.com/git/git.github.io/pull/135 Also a plan like the one I outline in this PR has been discussed on the list at least a few times. As for working on --first-parent and generally speaking any project, it is a very good idea to first search the mailing list (using for example http://dir.gmane.org/gmane.comp.version-control.git/) to see if the idea has already been discussed and to link to the previous discussions in the proposal. For example you could have found this (from a few weeks ago): http://article.gmane.org/gmane.comp.version-control.git/286496/match=move+bisect+c > The major objective would be to replicate function behavior rather > than its actual implementation because what might seem to be a good > choice in bash would not be a good choice in C and vice versa. For eg. > bisect_skip() externally calls the command `git rev-list HEAD` to get > a list of commits and then store the standard output in an array as > bash can directly do the conversion, but it would not be a good thing > to do in C. This might involve a very different approach to the > existing code. To take care of this part, I will extensively debug > git-bisect.sh and the other files it depends on ie. rev-list.c, > rev-parse.c, for-each-ref.c, list_objects.c and bisect.c to study > their behavior. "builtin/bisect--helper.c" is missing from the list. > I will spend some time in testing the functions for > corner cases and running the ready-made tests to investigate its > behavior. The goal would be to produce quality code that can be > included in the next maintenance release 2.7.5. Developments like porting commands from shell to C are very unlikely to be merged in maintenance releases. > I will try to > extensively review the patches myself first and then add it on the > mailing list. I intend to discuss the whole structure of bisect.c > based on my observations in the first week. Then accordingly send > individual patches for each method. Also I am quite unsure how the > patches would be maintained by Junio. Would he create a separate > branch for me in his personal repo and then add the patches to it > without merging it to pu and then when it is completely done, the > merge will take place? Or he will individually first place every patch > in pu and the normal process? I don't think it's worth worrying about that at this point in a proposal. (And you could just look at what happened during the previous GSoC to get a good idea about what happens.) > Functions I intend to cover (This is from a first-read, I haven't > actually debugged it. It might miss some technicalities or deviate > from actual) : ... I don't think such a list is very useful. At this point please focus on the big picture, that is what you call "The plan" above. > Should I also include "How git bisect works internally?" in the > proposal along with this? Yeah, I think it would help. But just show the big picture, not too many details. For example you could just tell what are the bisect related features that are implemented in the different bisect related files that you list above. > This is not my "first look" of the proposal. I am still working on the > proposal. Half way there! > > And most importantly, Would anyone like to mentor me for this project? Matthieu and myself are listed as possible mentors for the bisect related projects on the Ideas page (http://git.github.io/SoC-2016-Ideas/). ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-20 11:35 ` Pranit Bauva 2016-03-20 13:24 ` Matthieu Moy 2016-03-20 13:25 ` Christian Couder @ 2016-03-20 15:31 ` Johannes Schindelin 2016-03-20 16:26 ` Pranit Bauva 2 siblings, 1 reply; 16+ messages in thread From: Johannes Schindelin @ 2016-03-20 15:31 UTC (permalink / raw) To: Pranit Bauva Cc: Christian Couder, Matthieu Moy, Git List, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley Hi, On Sun, 20 Mar 2016, Pranit Bauva wrote: > The plan: > > - Place bisect.c in builtin/ > - Implement a skeletal cmd_bisect() which will redirect to > git-bisect.sh (1e1ea69f) I would highly advise a different course of action: - move functionality one by one from bisect.sh to builtin/bisect--helper.c That way, you do not have to deal with a lot of infrastructure at the beginning of your project because bisect--helper is already a builtin; Instead, you can dive right in and implement satisfyingly noticable improvements. Ciao, Johannes ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-20 15:31 ` Johannes Schindelin @ 2016-03-20 16:26 ` Pranit Bauva 2016-03-20 18:08 ` Matthieu Moy 2016-03-21 7:18 ` Johannes Schindelin 0 siblings, 2 replies; 16+ messages in thread From: Pranit Bauva @ 2016-03-20 16:26 UTC (permalink / raw) To: Johannes Schindelin Cc: Christian Couder, Matthieu Moy, Git List, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley I didn't mention it before, I was thinking to move directly everything from git-bisect.sh to bisect.c . But then I would have to redirect to a shell script. And ultimately remove bisect--helper.c Your suggestions is a lot better. I could first move individual functions to bisect--helper.c. This would not affect the current status of the project. When I complete this, I can shift all the methods to bisect.c and place it in the directory builtin/. Then I would remove linking of the git-bisect.sh and introduce builtin/bisect.c thus completing the process. When I am writing functions, I am thinking to make some tests which will be copied contents of t/t6030-bisect-porcelain.sh with s/bisect/bisect--helper/g uncommenting tests as I implement more functions. Should I keep these tests restricted to myself or I send it as a patch along with the corresponding function? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-20 16:26 ` Pranit Bauva @ 2016-03-20 18:08 ` Matthieu Moy 2016-03-21 7:18 ` Johannes Schindelin 1 sibling, 0 replies; 16+ messages in thread From: Matthieu Moy @ 2016-03-20 18:08 UTC (permalink / raw) To: Pranit Bauva Cc: Johannes Schindelin, Christian Couder, Git List, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley Pranit Bauva <pranit.bauva@gmail.com> writes: > When I am writing > functions, I am thinking to make some tests which will be copied > contents of t/t6030-bisect-porcelain.sh with s/bisect/bisect--helper/g > uncommenting tests as I implement more functions. I don't think you need that. When you write a new function, you are removing pieces of shell and adding pieces of C as replacement. If the testsuite is complete, it covers the shell implementation today and will cover the C implementation when it replaces the shell. The existing tests should be usable without modification. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-20 16:26 ` Pranit Bauva 2016-03-20 18:08 ` Matthieu Moy @ 2016-03-21 7:18 ` Johannes Schindelin 2016-03-21 7:29 ` Pranit Bauva 1 sibling, 1 reply; 16+ messages in thread From: Johannes Schindelin @ 2016-03-21 7:18 UTC (permalink / raw) To: Pranit Bauva Cc: Christian Couder, Matthieu Moy, Git List, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley Hi Pranit, On Sun, 20 Mar 2016, Pranit Bauva wrote: > I could first move individual functions to bisect--helper.c. My suggestion would be to give it a try already with some functionality you deem small enough to move to the bisect--helper within a day or so. It is always good to test the waters like that, and to include this early work in the proposal, also to assess (and to let the reviewers assess) how feasible the project is. Ciao, Johannes ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-21 7:18 ` Johannes Schindelin @ 2016-03-21 7:29 ` Pranit Bauva 2016-03-21 17:53 ` Christian Couder 0 siblings, 1 reply; 16+ messages in thread From: Pranit Bauva @ 2016-03-21 7:29 UTC (permalink / raw) To: Johannes Schindelin Cc: Christian Couder, Matthieu Moy, Git List, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley On Mon, Mar 21, 2016 at 12:48 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi Pranit, > > On Sun, 20 Mar 2016, Pranit Bauva wrote: > >> I could first move individual functions to bisect--helper.c. > > My suggestion would be to give it a try already with some functionality > you deem small enough to move to the bisect--helper within a day or so. It > is always good to test the waters like that, and to include this early > work in the proposal, also to assess (and to let the reviewers assess) how > feasible the project is. Sure! I will start with check_term_format(). Also, I think implementing a new algorithm for bisect along with this would be too big for a GSoC project and I am no algorithm expert. I will stick to --first-parent along with incremental rewrite. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-21 7:29 ` Pranit Bauva @ 2016-03-21 17:53 ` Christian Couder 2016-03-21 18:13 ` Pranit Bauva 0 siblings, 1 reply; 16+ messages in thread From: Christian Couder @ 2016-03-21 17:53 UTC (permalink / raw) To: Pranit Bauva Cc: Johannes Schindelin, Matthieu Moy, Git List, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley Hi Pranit, On Mon, Mar 21, 2016 at 8:29 AM, Pranit Bauva <pranit.bauva@gmail.com> wrote: > On Mon, Mar 21, 2016 at 12:48 PM, Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: >> Hi Pranit, >> >> On Sun, 20 Mar 2016, Pranit Bauva wrote: >> >>> I could first move individual functions to bisect--helper.c. >> >> My suggestion would be to give it a try already with some functionality >> you deem small enough to move to the bisect--helper within a day or so. It >> is always good to test the waters like that, and to include this early >> work in the proposal, also to assess (and to let the reviewers assess) how >> feasible the project is. > > Sure! I will start with check_term_format(). > > Also, I think implementing a new algorithm for bisect along with this > would be too big for a GSoC project Yeah, I also think so. > and I am no algorithm expert. I > will stick to --first-parent along with incremental rewrite. My opinion is that an incremental rewrite by itself is big enough for a GSoC. And it might be difficult to do something else that is bisect related at the same time. So if you want to do the rewrite, just focus on it. If you ever have some time left we will easily find other interesting bisect related improvements. If you really want to do "--first-parent" then you should probably add "Improve git bisect terms" to it in your proposal. Best, Christian. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: GSoC Project | Improvise git bisect 2016-03-21 17:53 ` Christian Couder @ 2016-03-21 18:13 ` Pranit Bauva 0 siblings, 0 replies; 16+ messages in thread From: Pranit Bauva @ 2016-03-21 18:13 UTC (permalink / raw) To: Christian Couder Cc: Johannes Schindelin, Matthieu Moy, Git List, Stefan Beller, Lars Schneider, Jeff King, Troy Moure, Junio C Hamano, Eric Sunshine, Kevin Daudt, Philip Oakley On Mon, Mar 21, 2016 at 11:23 PM, Christian Couder <christian.couder@gmail.com> wrote: > Hi Pranit, > > On Mon, Mar 21, 2016 at 8:29 AM, Pranit Bauva <pranit.bauva@gmail.com> wrote: >> On Mon, Mar 21, 2016 at 12:48 PM, Johannes Schindelin >> <Johannes.Schindelin@gmx.de> wrote: >>> Hi Pranit, >>> >>> On Sun, 20 Mar 2016, Pranit Bauva wrote: >>> >>>> I could first move individual functions to bisect--helper.c. >>> >>> My suggestion would be to give it a try already with some functionality >>> you deem small enough to move to the bisect--helper within a day or so. It >>> is always good to test the waters like that, and to include this early >>> work in the proposal, also to assess (and to let the reviewers assess) how >>> feasible the project is. >> >> Sure! I will start with check_term_format(). >> >> Also, I think implementing a new algorithm for bisect along with this >> would be too big for a GSoC project > > Yeah, I also think so. > >> and I am no algorithm expert. I >> will stick to --first-parent along with incremental rewrite. > > My opinion is that an incremental rewrite by itself is big enough for > a GSoC. And it might be difficult to do something else that is bisect > related at the same time. So if you want to do the rewrite, just focus > on it. If you ever have some time left we will easily find other > interesting bisect related improvements. I just realised it after I finished rewriting a simple function. I will send that patch. It would be much better on doing one thing properly. > If you really want to do "--first-parent" then you should probably add > "Improve git bisect terms" to it in your proposal. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2016-03-21 18:14 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-03-19 9:33 GSoC Project | Improvise git bisect Pranit Bauva 2016-03-19 12:48 ` Matthieu Moy 2016-03-19 16:14 ` Christian Couder 2016-03-19 16:49 ` Pranit Bauva 2016-03-19 22:05 ` Stefan Beller 2016-03-20 8:10 ` Pranit Bauva 2016-03-20 11:35 ` Pranit Bauva 2016-03-20 13:24 ` Matthieu Moy 2016-03-20 13:25 ` Christian Couder 2016-03-20 15:31 ` Johannes Schindelin 2016-03-20 16:26 ` Pranit Bauva 2016-03-20 18:08 ` Matthieu Moy 2016-03-21 7:18 ` Johannes Schindelin 2016-03-21 7:29 ` Pranit Bauva 2016-03-21 17:53 ` Christian Couder 2016-03-21 18:13 ` Pranit Bauva
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).