* Could this be done simpler?
@ 2009-06-24 21:35 Linus Torvalds
2009-06-25 1:04 ` Junio C Hamano
0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2009-06-24 21:35 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List
Ok, so I have a practice of occasionally doing octopus merges when I have
two branches with trivial fixes from the same person.

That all works fine when they use the "multiple branches in the same
repository" approach (eg x86 "tip" tree), but other people tend to prefer
to use multiple repositories for different features, rather than branches.
And git generally lets you do things either way with no real difference.

But for the octopus case, it does make a difference. You can easily make
octopus merges only from one repository.

Which is kind of sad.

So I did kernel commit c6223048259006759237d826219f0fa4f312fb47 by
basically doing the "git pull" logic by hand, and while this was just a
trial and maybe I'll never feel the urge to do it again, I'm wondering if
maybe we should make it easier to do.

Right now the "git pull" syntax is

    git pull <repo> <branch>*

and you cannot specify multiple repositories, only multiple branches.

But at the same time, it should be pretty unambiguous whether an argument
is a repository or a branch (':' in a remote repository, or "/" or ".." at
the beginning of a local one - all invalid in branch names).

So it _should_ be syntactically unambiguous to allow

    git pull (<repo> <branch>*)+

for the octopus case. Hmm?

        Linus
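
[Editor's sketch] The proposed grammar "(<repo> <branch>*)+" can be tried
out with a few lines of Python. This is only an illustration of the
heuristic described above, not anything git implements; the function names
looks_like_repo and group_pull_args are made up for this example. Note that
a storing refspec such as "for-linus:tmp" also contains a colon, so this
simple rule would misclassify it as a repository.

```python
def looks_like_repo(arg):
    """Heuristic from the message above (an assumption, not git's parser):
    ':' suggests a remote URL; a leading '/' or '..' suggests a local path."""
    return ":" in arg or arg.startswith("/") or arg.startswith("..")


def group_pull_args(args):
    """Split a flat argument list into (repo, [branch, ...]) groups."""
    groups = []
    for arg in args:
        if looks_like_repo(arg):
            # A repository starts a new group.
            groups.append((arg, []))
        elif not groups:
            raise ValueError("branch %r given before any repository" % arg)
        else:
            # Anything else is a branch of the most recent repository.
            groups[-1][1].append(arg)
    return groups


if __name__ == "__main__":
    for repo, branches in group_pull_args([
        "git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6",
        "for-linus",
        "git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current",
        "for-linus",
    ]):
        print(repo, branches)
```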
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Could this be done simpler?
2009-06-24 21:35 Could this be done simpler? Linus Torvalds
@ 2009-06-25 1:04 ` Junio C Hamano
2009-06-25 14:33 ` Randal L. Schwartz
2009-06-25 22:02 ` Christian Couder
0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 1:04 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Ok, so I have a practice of occasionally doing octopus merges when I have
> two branches with trivial fixes from the same person.
>
> That all works fine when they use the "multiple branches in the same
> repository" approach (eg x86 "tip" tree), but other people tend to prefer
> to use multiple repositories for different features, rather than branches.
> And git generally lets you do things either way with no real difference.
>
> But for the octopus case, it does make a difference. You can easily make
> octopus merges only from one repository.
>
> Which is kind of sad.
>
> So I did kernel commit c6223048259006759237d826219f0fa4f312fb47 by
> basically doing the "git pull" logic by hand, and while this was just a
> trial and maybe I'll never feel the urge to do it again, I'm wondering if
> maybe we should make it easier to do.

Every once in a while I have this urge to see how it feels to be Linus by
pretending to be him, trying what he did.

(1) So where is he?

    $ git pull
    ...
    From git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
       f234012..28d0325  master      -> linus
     * [new tag]         v2.6.31-rc1 -> v2.6.31-rc1
    Updating f234012..28d0325
    Fast forward
    ...

(2) Let's pretend to be Linus, just before he made this merge.

    $ git checkout c62230^

(3) Let's see what he did with that thing.

    $ git show c62230
    commit c6223048259006759237d826219f0fa4f312fb47
    Merge: bd453cd d5bb68a 3a6a6c1
    Author: Linus Torvalds <torvalds@linux-foundation.org>
    Date:   Wed Jun 24 14:17:14 2009 -0700

        Merge branches 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/{vfs-2.6,audit-current}

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
          another race fix in jfs_check_acl()
          Get "no acls for this inode" right, fix shmem breakage
          inline functions left without protection of ifdef (acl)

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
          audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL

    Ah, so we know the two repositories and branches involved.

(4) Let's pretend to be Linus. Fetch the first branch and drop the
    necessary information in FETCH_HEAD.

    $ git fetch \
        git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 \
        for-linus

(5) Continue pretending to be Linus, complete the octopus. The key is to
    let the "fetch" phase of this to append to the FETCH_HEAD, not
    replacing it.

    $ git pull --append \
        git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
        for-linus

(6) Did I succeed? Let's see.

    $ git diff c62230

    Yay, identical tree.

(7) How does the log message look?

    $ git show
    commit cb1e4198421091ea5844d93624d5d5499537dbe0
    Merge: bd453cd d5bb68a 3a6a6c1
    Author: Junio C Hamano <gitster@pobox.com>
    Date:   Wed Jun 24 17:45:09 2009 -0700

        Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6; branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current into HEAD

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
          another race fix in jfs_check_acl()
          Get "no acls for this inode" right, fix shmem breakage
          inline functions left without protection of ifdef (acl)

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
          audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL

    Hmm, Linus's combined notation on the summary line that uses {} is
    much nicer.

> Right now the "git pull" syntax is
>
> git pull <repo> <branch>*
>
> and you cannot specify multiple repositories, only multiple branches.
>
> But at the same time, it should be pretty unambiguous whether an argument
> is a repository or a branch (':' in a remote repository, or "/" or ".." at
> the beginning of a local one - all invalid in branch names).
>
> So it _should_ be syntactically unambiguous to allow
>
> git pull (<repo> <branch>*)+
>
> for the octopus case. Hmm?

Strictly speaking, you are not quite correct. Arguments after <repo> can
be storing refspecs and they do come with a colon.

Conclusion.

    git-fmt-merge-msg may need to learn the trick of using {}. No other
    changes needed.

Side note.

    People sometimes say, and I am certain I agreed with them on more
    than one occasion, that Octopus hurts bisectability and does not have
    much value in real life. I've always thought this bisectability issue
    was a downside of Octopus merges, but now I think about it, perhaps
    "git bisect" can be taught to dynamically decompose an Octopus merge
    into a sequence of two-head virtual merges while bisecting. We
    strongly discourage and do not allow conflicting Octopus merges, so
    when you need to bisect a history with an Octopus that looks like
    this:

        ---o---A
                \
        ---o---B---M---o
                /
        ---o---C

    it should be able to mechanically decompose it, without conflicts,
    into

        ---o---A
                \
        ---o---B---M1--M2--o
                      /
        ---o---C

    where the tree of M and the tree of M2 are identical.

^ permalink raw reply [flat|nested] 15+ messages in thread
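
[Editor's sketch] The "{}" trick on the summary line can be illustrated in
Python. This is a hypothetical helper, not the code of git-fmt-merge-msg;
the names brace_combine and merge_summary are invented here, and the
fallback format only approximates the message shown in step (7).

```python
import os.path


def brace_combine(urls):
    """Fold URLs that share a leading directory into prefix{a,b} form."""
    if len(urls) < 2:
        return urls[0] if urls else ""
    prefix = os.path.commonprefix(urls)
    # Cut back to the last '/' so only whole path components are shared.
    prefix = prefix[: prefix.rfind("/") + 1]
    tails = [u[len(prefix):] for u in urls]
    if not prefix or not all(tails):
        # Nothing useful in common: fall back to a plain list.
        return ", ".join(urls)
    return prefix + "{" + ",".join(tails) + "}"


def merge_summary(sources):
    """sources is a list of (branch, url) pairs, one per merged head."""
    branches = {branch for branch, _ in sources}
    urls = [url for _, url in sources]
    if len(branches) == 1 and len(urls) > 1:
        # Same branch name everywhere: use the combined notation.
        (branch,) = branches
        return "Merge branches '%s' of %s" % (branch, brace_combine(urls))
    return "Merge " + "; ".join(
        "branch '%s' of %s" % (b, u) for b, u in sources
    )


if __name__ == "__main__":
    base = "git://git.kernel.org/pub/scm/linux/kernel/git/viro/"
    print(merge_summary([
        ("for-linus", base + "vfs-2.6"),
        ("for-linus", base + "audit-current"),
    ]))
```

With the two repositories from this thread, the sketch reproduces the
summary line of commit c62230.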
* Re: Could this be done simpler?
2009-06-25 1:04 ` Junio C Hamano
@ 2009-06-25 14:33 ` Randal L. Schwartz
2009-06-25 16:32 ` Matthias Andree
2009-06-25 17:19 ` Michael J Gruber
2009-06-25 22:02 ` Christian Couder
1 sibling, 2 replies; 15+ messages in thread
From: Randal L. Schwartz @ 2009-06-25 14:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

>>>>> "Junio" == Junio C Hamano <gitster@pobox.com> writes:

Junio> (5) Continue pretending to be Linus, complete the octopus. The key is to
Junio> let the "fetch" phase of this to append to the FETCH_HEAD, not
Junio> replacing it.

Junio> $ git pull --append \
Junio> git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
Junio> for-linus

The relatively current doc of "--append" looks like this:

    -a, --append
        Append ref names and object names of fetched refs to the existing
        contents of will be overwritten.

I read this three times, and still don't know what it means (and it doesn't
even scan well as English), so I would have never known to use this strategy.
Can you explain this more in detail, or point at something in the mailing list
that does?

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Could this be done simpler?
2009-06-25 14:33 ` Randal L. Schwartz
@ 2009-06-25 16:32 ` Matthias Andree
2009-06-25 17:25 ` Junio C Hamano
2009-06-25 18:32 ` Junio C Hamano
2009-06-25 17:19 ` Michael J Gruber
1 sibling, 2 replies; 15+ messages in thread
From: Matthias Andree @ 2009-06-25 16:32 UTC (permalink / raw)
To: Randal L. Schwartz; +Cc: Junio C Hamano, Linus Torvalds, Git Mailing List

Randal L. Schwartz wrote:
>>>>>> "Junio" == Junio C Hamano <gitster@pobox.com> writes:
>
> Junio> (5) Continue pretending to be Linus, complete the octopus. The key is to
> Junio> let the "fetch" phase of this to append to the FETCH_HEAD, not
> Junio> replacing it.
>
> Junio> $ git pull --append \
> Junio> git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
> Junio> for-linus
>
> The relatively current doc of "--append" looks like this:
>
> -a, --append
> Append ref names and object names of fetched refs to the existing
> contents of will be overwritten.
>
> I read this three times, and still don't know what it means (and it doesn't
> even scan well as English), so I would have never known to use this strategy.
> Can you explain this more in detail, or point at something in the mailing list
> that does?

Greetings,

If I may:

So the existing description is incomprehensible. I sort of believed I
understood it, but apparently I didn't understand enough of it.

Could we ditch the current git-pull --append description? Can then please
somebody rewrite this paragraph? This somebody must have completely understood

(1) what this feature is good for (practically speaking)

(2) how it works (technically speaking, to provide reference information)

That would be much more useful, and the use would last longer :-)

I don't dare ask Junio directly.

However, it appears to me that git-pull already does most of what Linus needs,
and could take some final cosmetic touch-ups WRT logs.

So could somebody please rewrite this?

And if I may be so bold: Please rewrite before somebody starts polishing the
bisect facilities WRT octopus merges. These seem unrelated, as in: you don't
need to make bisect more convenient to be able to fix the description of
git-pull --append...

Thanks for not slashing me to pieces. 8-)

Best regards
MA

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Could this be done simpler?
2009-06-25 16:32 ` Matthias Andree
@ 2009-06-25 17:25 ` Junio C Hamano
2009-06-25 21:54 ` Matthias Andree
2009-06-25 18:32 ` Junio C Hamano
1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 17:25 UTC (permalink / raw)
To: Matthias Andree
Cc: Randal L. Schwartz, Junio C Hamano, Linus Torvalds, Git Mailing List

Matthias Andree <matthias.andree@gmx.de> writes:

> Could we ditch the current git-pull --append description? Can then please
> somebody rewrite this paragraph? This somebody must have completely understood

> (1) what this feature is good for (practically speaking)
>
> (2) how it works (technically speaking, to provide reference information)
>
> That would be much more useful, and the use would last longer :-)
>
> I don't dare ask Junio directly.

But if you run blame and mailing list archive search, you would discover
that "fetch --append" was my invention. After all, the entire Octopus
idea originates from me at 211232b (Octopus merge of the following five
patches., 2005-05-05). It is interesting to realize that it was actually
a Pentapus made on the day of 5/5/5 ;-)

I thought I was going to take blame on the incomprehensible documentation
and pass it on to me being a non-native speaker/writer of English, but the
situation is a bit funny. Documentation/fetch-options.txt says this:

    -a::
    --append::
            Append ref names and object names of fetched refs to the
            existing contents of `.git/FETCH_HEAD`. Without this
            option old data in `.git/FETCH_HEAD` will be overwritten.

Perhaps there was a cut&paste error? I haven't looked.

Now answers to (1) and (2).

(1) The feature was designed exactly for the use case Linus described.

(2) "git fetch" leaves a list of <commit object, repo, branch, flag> for
    each ref fetched from repository in .git/FETCH_HEAD, where flag tells
    if it is meant for merging. "git pull" runs "git fetch", reads from
    this file to learn which ones to pass to "git merge". The
    information also is given to "git fmt-merge-msg" to come up with the
    message.

    Usually "git fetch" first empties the existing contents of the file
    and stores the list of refs it fetched. With --append, it doesn't
    empty the file; refs fetched by the previous invocation of "git
    fetch" will be kept and the refs it fetched are appended.

    So:

        $ git fetch one a
        $ git fetch --append two b
        $ git pull --append three c

    will end up having all the three refs from different repositories in
    .git/FETCH_HEAD. I.e.

        branch a, from repo one, to be merged
        branch b, from repo two, to be merged
        branch c, from repo three, to be merged

    when "git fetch" run by the last "git pull" returns. "git pull"
    reads the file and learns what to give to "git fmt-merge-msg" (to come
    up with the message for the merge commit) and "git merge" (to create
    the merge commit).

^ permalink raw reply [flat|nested] 15+ messages in thread
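
[Editor's sketch] The overwrite-versus-append behaviour described in (2)
can be modelled in a few lines of Python. This is a toy model of the
bookkeeping only, under the assumption that FETCH_HEAD is a list of
<commit, repo, branch, flag> records as stated above. The names FetchedRef,
fetch and merge_candidates are invented for this example, and the commit
ids are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FetchedRef:
    commit: str       # placeholder object name
    repo: str
    branch: str
    for_merge: bool   # the "flag" from the description above


def fetch(fetch_head, new_refs, append=False):
    """Return the new FETCH_HEAD contents after one simulated fetch."""
    # Without --append the old contents are discarded first.
    kept = list(fetch_head) if append else []
    return kept + list(new_refs)


def merge_candidates(fetch_head):
    """What a simulated 'git pull' would hand to the merge step."""
    return [ref for ref in fetch_head if ref.for_merge]


if __name__ == "__main__":
    head = []
    head = fetch(head, [FetchedRef("aaaa", "one", "a", True)])
    head = fetch(head, [FetchedRef("bbbb", "two", "b", True)], append=True)
    head = fetch(head, [FetchedRef("cccc", "three", "c", True)], append=True)
    for ref in merge_candidates(head):
        print("branch %s, from repo %s, to be merged" % (ref.branch, ref.repo))
```

Dropping append=True from the last two calls leaves only the final record,
which is why a plain sequence of fetches cannot build an Octopus.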
* Re: Could this be done simpler?
2009-06-25 17:25 ` Junio C Hamano
@ 2009-06-25 21:54 ` Matthias Andree
2009-06-27 0:26 ` Junio C Hamano
0 siblings, 1 reply; 15+ messages in thread
From: Matthias Andree @ 2009-06-25 21:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Randal L. Schwartz, Linus Torvalds, Git Mailing List

On 25.06.2009 at 19:25, Junio C Hamano <gitster@pobox.com> wrote:

> Matthias Andree <matthias.andree@gmx.de> writes:
>
>> Could we ditch the current git-pull --append description? Can then
>> please somebody rewrite this paragraph? This somebody must have
>> completely understood
>
>> (1) what this feature is good for (practically speaking)
>>
>> (2) how it works (technically speaking, to provide reference
>> information)
>>
>> That would be much more useful, and the use would last longer :-)
>>
>> I don't dare ask Junio directly.
>
> But if you run blame and mailing list archive search, you would discover
> that "fetch --append" was my invention. After all, the entire Octopus
> idea originates from me at 211232b (Octopus merge of the following five
> patches., 2005-05-05). It is interesting to realize that it was actually
> a Pentapus made on the day of 5/5/5 ;-)

Fair enough, but I hadn't looked at who wrote it because more people than
just the original author can be able to write it. In fact, I've drifted
into doing that. Suggestions are at the very end if you're not interested
in the rationale but only in the solution. :-)

I've seen your later message on the octopus merges, and therefore suggest
that we get this --append stuff documented first.

> I thought I was going to take blame on the incomprehensible documentation
> and pass it on to me being a non-native speaker/writer of English, but the
> situation is a bit funny. Documentation/fetch-options.txt says this:

Neither am I a native writer, why bother… it's more important to write
things clearly than to polish things, and writing something correctly from
the beginning is a very hard problem. Writing something that works, and
later polish, are two simpler problems. Language isn't the concern here,
but understanding the feature is.

> -a::
> --append::
> Append ref names and object names of fetched refs to the
> existing contents of `.git/FETCH_HEAD`. Without this
> option old data in `.git/FETCH_HEAD` will be overwritten.
>
> Perhaps there was a cut&paste error? I haven't looked.

Nevermind, that's irrelevant. The key problem is that Linus could not tell
from this description that it was the feature he was looking for. So let's
fix that and make documentation clearer.

Particularly: let's fix the git-fetch manpage, let's untangle it from
git-pull, and let git-pull reference the git-fetch description rather than
copy it.

This quoted section "Append ref... overwritten." explains how the beast
works technically. So what? What is it good for? What can I do with it?

You made FETCH_HEAD the focus point of the description, but that's not the
point. (It may be the point of the implementation, but I don't care).

In order for a reader to understand this feature from the docs, he must
know what FETCH_HEAD is good for in the whole git context (as a requisite),
but that is just a diversion, not the key point.

The point is that you can mark several branches for merge, or in other
words, accumulate other tips/heads that you want to merge, before doing the
merge. It is useful for merges of more than one branch at a time, "octopus
merges" or similar.

Preface: the next comments don't mean to criticize what you are presenting,
but just to select which should and which shouldn't go in the reference
manual, and if yes, where. I think we've got the whole description
organized backwards, let's fix that, too.

> (2) "git fetch" leaves a list of <commit object, repo, branch, flag> for
> each ref fetched from repository in .git/FETCH_HEAD, where flag tells
> if it is meant for merging. "git pull" runs "git fetch", reads from
> this file to learn which ones to pass to "git merge". The
> information also is given to "git fmt-merge-msg" to come up with the
> message.

This is a technical detail - this belongs into a separate FETCH_HEAD
document in section 5 (file formats).

> Usually "git fetch" first empties the existing contents of the file
> and stores the list of refs it fetched. With --append, it doesn't
> empty the file; refs fetched by the previous invocation of "git
> fetch" will be kept and the refs it fetched are appended.

OK. Also for later, so I know how it differs from regular behaviour.

So, this is technically more comprehensive, but that leaves the old
question unanswered - what is FETCH_HEAD good for?

Let's change roles (or perspective) for a moment, for the sake of clarity
and usability: I am just a Git user. I don't want to hack Git. I couldn't
care less about implementation details such as FETCH_HEAD, I only need to
know how I can tell Git to merge branches foo, bar, baz into master in one
single merge.

> So:
>
> $ git fetch one a
> $ git fetch --append two b
> $ git pull --append three c
>
> will end up having all the three refs from different repositories in
> .git/FETCH_HEAD. I.e.
>
> branch a, from repo one, to be merged
> branch b, from repo two, to be merged
> branch c, from repo three, to be merged
>
> when "git fetch" run by the last "git pull" returns. "git pull"
> reads the file and learns what to give to "git fmt-merge-msg" (to come
> up with the message for the merge commit) and "git merge" (to create
> the merge commit).

Let's leave git pull out of this picture. If you mention it, you must
explain the interaction between pull and fetch, but you don't want this
here. You only want to explain the interaction between fetching more than
one branch and merging all of them.

"git pull" (at bird eye's view) is just a short-cut for "git fetch
something" and "git merge with somehow configured branch" (somehow =
implicitly through setting up tracking branches, or clone), or explicitly
through git branch, or git remote -- let's leave this aside.

So, here's my first stab at it (just content, not ASCIIDOC markup, as I'm
not fluent in ASCIIDOC and you can easily do that when merging later) -
feel free to correct, edit, rewrite, amend to it... I'm not sure

FETCH_HEAD(5)
-------------------------------------------------
This file in the git directory records which heads have been downloaded,
from where, and for what purpose. Each line in this file is one
TAB-delimited record with three fields. From left to right, these fields
contain:

1 - the commit of the remote head
2 - "not-for-merge" if the branch is not meant to be merged, otherwise,
    this field remains empty
3 - branch 'xxx' of UUU, where xxx and UUU are the remote repository's
    refname and base URL, respectively.

This file is written by git-fetch and used by git-merge.
-------------------------------------------------

git-fetch(1)
-------------------------------------------------
...
-a::
--append::
This option allows you to fetch and accumulate multiple remote refs for
future merging. Normally, git-fetch records the latest fetch for a later
merge, by writing them to .git/FETCH_HEAD (there can be multiple recorded
heads in FETCH_HEAD although the name suggests there were just one). The
--append option lets git-fetch keep, rather than delete, prior contents of
the file. This can be useful when consolidating multiple topic branches in
one single merge (a so-called octopus merge, see git-merge(1)). Example:

> $ git fetch one a
> $ git fetch --append two b
> $ git pull --append three c

(git pull first runs git fetch --append three c, and then git merge with
all remotes that have been recorded for merging in .git/FETCH_HEAD).
...
You can use git-pull as short-cut for the all too common
"git-fetch"-"git-merge" sequence.
-------------------------------------------------

NOTE: git-fetch accepts command lines without refspec. These mark fetched
heads as "not-for-merge". IOW, a refspec is needed that heads are marked as
for-merge. I haven't found this documented in git-fetch.

-- 
Matthias Andree

^ permalink raw reply [flat|nested] 15+ messages in thread
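
[Editor's sketch] The three-field record layout proposed for FETCH_HEAD(5)
above can be read with a short Python function. This follows the draft
description in the message (TAB-delimited; commit, optional
"not-for-merge", description) and is not taken from git's sources. The
function name parse_fetch_head is made up, and the object names in the
sample are placeholders rather than real commits.

```python
def parse_fetch_head(text):
    """Parse FETCH_HEAD-style text into a list of dicts.

    Assumes each non-empty line is 'commit<TAB>flag<TAB>description',
    where flag is either empty or the literal 'not-for-merge'.
    """
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # maxsplit=2 keeps any further TABs inside the description.
        commit, flag, description = line.split("\t", 2)
        records.append({
            "commit": commit,
            "for_merge": flag != "not-for-merge",
            "description": description,
        })
    return records


if __name__ == "__main__":
    sample = (
        "1" * 40 + "\t\tbranch 'a' of one\n"
        + "2" * 40 + "\tnot-for-merge\tbranch 'x' of one\n"
        + "3" * 40 + "\t\tbranch 'b' of two\n"
    )
    for record in parse_fetch_head(sample):
        if record["for_merge"]:
            print(record["description"])
```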
* Re: Could this be done simpler?
2009-06-25 21:54 ` Matthias Andree
@ 2009-06-27 0:26 ` Junio C Hamano
0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-27 0:26 UTC (permalink / raw)
To: Matthias Andree; +Cc: Randal L. Schwartz, Linus Torvalds, Git Mailing List

"Matthias Andree" <matthias.andree@gmx.de> writes:

> Neither am I a native writer, why bother… it's more important to write
> things clearly than to polish things,...

Hey, calm down.

The current documentation was written back when everybody knew what git
fetch internally did (e.g. left state in .git/FETCH_HEAD) and describing
things from the perspective of what is done internally was "accepted" back
when the alternative was not describing anything in any form ;-)

I took your two questions literally as they were. That is,

 * You, like other people, realize that times have changed since then, and
   noticed that even with the correct rendition (it appears the problem
   Merlyn saw was primarily caused by Asciidoc toolchain), the bottom-up
   description based on what is done internally is not sufficient.

 * You are volunteering to make things better, but you first need input to
   make sure the result is not just readable but technically correct.

 * And I was among the few people who were around when .git/FETCH_HEAD and
   "git fetch --append" were invented to give precise answers to these
   questions.

No way I meant that these two answers should replace the current
documentation.

> Let's change roles (or perspective) for a moment, for the sake of
> clarity and usability: I am just a Git user. I don't want to hack
> Git. I couldn't care less about implementation details such as
> FETCH_HEAD, I only need to know how I can tell Git to merge branches
> foo, bar, baz into master in one single merge.

Yes, that is the good starting point.

> "git pull" (at bird eye's view) is just a short-cut for "git fetch
> something" and "git merge with somehow configured branch" (somehow =
> implicitly through setting up tracking branches, or clone)

Actually the latter is "with information somehow left by git-fetch".

> FETCH_HEAD(5)
> -------------------------------------------------
> This file in the git directory records which heads have been
> downloaded, from where, and for what purpose. Each line in this file
> is one TAB-delimited record with three fields. From left to right,
> these fields contain:
>
> 1 - the commit of the remote head
> 2 - "not-for-merge" if the branch is not meant to be merged,
> otherwise, this field remains empty
> 3 - branch 'xxx' of UUU, where xxx and UUU are the remote repository's
> refname and base URL, respectively.
>
> This file is written by git-fetch and used by git-merge.
> -------------------------------------------------

It is true that git-merge does use it, but not under its normal mode of
operation. Unless the reader of this paragraph is hacking git, I do not
think s/he needs to (nor wants to) know about it. IIRC, it only triggers
if you do

    $ git merge FETCH_HEAD

The more prominent user is git-pull. git-fetch leaves the instructions to
git-pull in this file, so that the latter knows what to use when it drives
git-merge.

> git-fetch(1)
> -------------------------------------------------
> ...
> -a::
> --append::
> This option allows you to fetch and accumulate multiple remote
> refs for future merging. Normally, git-fetch records the latest
> fetch for a later merge, by writing them to .git/FETCH_HEAD (there
> can be multiple recorded heads in FETCH_HEAD although the name
> suggests there were just one).

I personally find the parenthesized comment at the end just distracting
and confusing. You are explicitly saying "by writing THEM" so it is clear
that the file can and does record more than one when the user instructs
the command to.

> .... The --append option lets git-fetch
> keep, rather than delete, prior contents of the file. This can be
> useful when consolidating multiple topic branches in one single merge
> (a so-called octopus merge, see git-merge(1)). Example:

The description lacks one important point. It can be useful only when
consolidating multiple topic branches _that come from more than one remote
repository_.

Other than that, the above paragraph is perfect.

> NOTE: git-fetch accepts command lines without refspec. These mark
> fetched heads as "not-for-merge". IOW, a refspec is needed that heads
> are marked as for-merge. I haven't found this documented in
> git-fetch.

Sorry, I have no idea what you are talking about in these four lines.
Perhaps "DEFAULT BEHAVIOUR" section in Documentation/git-pull.txt, the
paragraph that begins with "The rule to determine which remote branch to
merge ..." may be what you are looking for?

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Could this be done simpler?
2009-06-25 16:32 ` Matthias Andree
2009-06-25 17:25 ` Junio C Hamano
@ 2009-06-25 18:32 ` Junio C Hamano
1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 18:32 UTC (permalink / raw)
To: Matthias Andree
Cc: Randal L. Schwartz, Christian Couder, Linus Torvalds, Git Mailing List

Matthias Andree <matthias.andree@gmx.de> writes:

> And if I may be so bold: Please rewrite before somebody starts polishing the
> bisect facilities WRT octopus merges. These seem unrelated, as in: you don't
> need to make bisect more convenient to be able to fix the description of
> git-pull --append...

Let's have a refresher course of how bisection works with a history with
merges. Assume that you have this history (time flows from left to right,
recent commits are known to be bad, old commits are known to be good).

                         o---o---o---o---A
                        /                 \
    ---o---o---o---o---F---o---o---o---B---M

In real life, you would start from a history with more commits on top of M
and only know that the tip of that sequence is bad, but for brevity, let's
assume we bisected and already know M is bad.

If B is good, the breakage was either introduced at M, or was on the side
branch leading to A, but not older than F where A and B forked from.

    Side note. As in all other discussion in this message, remember that
    bisect is for finding a _single_ breakage that was left unfixed until
    the tip of the history being bisected. "B is good" means "the
    _single_ breakage is not in the commit that would affect B, i.e. in
    B's ancestors".

If B is bad, on the other hand, the branch leading to A since the fork
point F is exonerated and we do not have to look at the side branch that
leads to A.

Which means that by seeing that the tip of one merged branch is good, you
can see that everything before the merge base is good and you need to only
look at _the other_ branch.

What happens if M is an Octopus?

                         o---o---o---o---A
                        /                 \
    ---o---o---o---o---F---o---o---o---B---M
                    \       \             /|
                     \       o---o---o---C |
                      \                    |
                       o---o---o---o---o---D

If B is good, you still need to look at histories leading to A, C, and D
individually. Of course if B is bad, then you do not have to look at the
histories leading to A, C and D from their respective fork points, but you
still do have to look at the shared past.

But we could optimize further.

After knowing M, an Octopus merge, is bad, when we are tempted to test one
of the tips of the branches that was merged (say B), we can instead give a
tree that is a result of merging only A and B (i.e. excluding C and D) for
testing. If it is good, then the histories leading to both A and B are
good, and we only need to check side branches leading to C and D since
they forked from the shared common history. If the combination of A and B
is bad, on the other hand, then we do not have to check branch histories
leading to C nor D.

Doing so essentially shifts the balance between what happens if a single
test turns out to be good or bad. If we test the tip of the branch, and
if it is bad, we will eliminate other forks (but still need to test the
shared history). If it is good, we only eliminate that particular branch
and shared history, but all the other forks remain suspect. So it is a
tradeoff between:

 - the size of all the other side branches since they forked == number of
   commits we do not have to test if this round says "bad";

 - the size of this side branch and the shared history == number of
   commits we do not have to test if this round says "good";

The current bisect algorithm makes this tradeoff, by computing the above
two numbers and finding the point that makes them closest to each other.
It however does not let you test two commits at the same time (i.e.
testing the merge of A and B in the above example) which could make the
tradeoff even more efficient.

I see there is another window for optimization we could make from the
above observation. Making the number of commits eliminated when the test
is "good" and "bad" as close to equal as possible is the best strategy
when the tested commit has a 50-50 chance of being "good" or "bad". If we
somehow know that the tested commit is likely to be "bad", we would want
to maximize the number of commits eliminated when the commit is indeed
"bad" (and vice versa). I do not see an easy way to exploit this window
offhand, though...

^ permalink raw reply [flat|nested] 15+ messages in thread
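
[Editor's sketch] The good/bad tradeoff described above can be made
concrete with a small Python model. This is a simplified illustration of
the balancing idea only, not git's actual bisect code. The history is a
hypothetical dict mapping each commit name to its parents, and the function
names are invented for the example.

```python
def ancestors(parents, commit):
    """Return the commit itself plus everything reachable via parents."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def best_bisect_point(parents, bad, good):
    """Pick the suspect whose test outcome splits the suspects most evenly."""
    suspects = ancestors(parents, bad)
    for g in good:
        suspects -= ancestors(parents, g)
    best_score, best_commit = -1, None
    for commit in sorted(suspects):
        reach = ancestors(parents, commit) & suspects
        if_good = len(reach)                 # cleared when the test says "good"
        if_bad = len(suspects) - len(reach)  # cleared when the test says "bad"
        score = min(if_good, if_bad)
        if score > best_score:
            best_score, best_commit = score, commit
    return best_commit


if __name__ == "__main__":
    # Linear toy history: g (known good) <- c1 <- c2 <- c3 <- c4 (known bad).
    history = {"g": [], "c1": ["g"], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"]}
    print(best_bisect_point(history, bad="c4", good=["g"]))  # prints c2
```

In the linear case the choice is simply the midpoint. With merges, a branch
tip clears only its own side when it tests good, which is the imbalance the
message discusses.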
* Re: Could this be done simpler?
2009-06-25 14:33 ` Randal L. Schwartz
2009-06-25 16:32 ` Matthias Andree
@ 2009-06-25 17:19 ` Michael J Gruber
1 sibling, 0 replies; 15+ messages in thread
From: Michael J Gruber @ 2009-06-25 17:19 UTC (permalink / raw)
To: Randal L. Schwartz; +Cc: Junio C Hamano, Linus Torvalds, Git Mailing List

Randal L. Schwartz venit, vidit, dixit 25.06.2009 16:33:
>>>>>> "Junio" == Junio C Hamano <gitster@pobox.com> writes:
>
> Junio> (5) Continue pretending to be Linus, complete the octopus. The key is to
> Junio> let the "fetch" phase of this to append to the FETCH_HEAD, not
> Junio> replacing it.
>
> Junio> $ git pull --append \
> Junio> git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
> Junio> for-linus
>
> The relatively current doc of "--append" looks like this:
>
> -a, --append
> Append ref names and object names of fetched refs to the existing
> contents of will be overwritten.
>
> I read this three times, and still don't know what it means (and it doesn't
> even scan well as English), so I would have never known to use this strategy.
> Can you explain this more in detail, or point at something in the mailing list
> that does?

Uhm, my version of git-fetch.1 has

    -a, --append
        Append ref names and object names of fetched refs to the existing
        contents of .git/FETCH_HEAD. Without this option old data in
        .git/FETCH_HEAD will be overwritten.

That at least scans better in English. It does not make it very clear
what the consequences are, though.

Michael

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Could this be done simpler?
2009-06-25 1:04 ` Junio C Hamano
2009-06-25 14:33 ` Randal L. Schwartz
@ 2009-06-25 22:02 ` Christian Couder
2009-06-25 22:23 ` Christian Couder
1 sibling, 1 reply; 15+ messages in thread
From: Christian Couder @ 2009-06-25 22:02 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

On Thursday 25 June 2009, Junio C Hamano wrote:
> Side note.
>
> People sometimes say, and I am certain I agreed to them on more than one
> occasions, that Octopus hurt bisectability and does not have much value
> in real life. I've always thought this bisectability issue was a
> downside of Octopus merges, but now I think about it, perhaps "git
> bisect" can be taught to dynamically decompose an Octopus merges into a
> sequence of two-head virtual merges while bisecting. We strongly
> discourage and do not allow conflicting Octopus merges, so when you need
> to bisect a history with an Octopus that looks like this:
>
>     ---o---A
>             \
>     ---o---B---M---o
>             /
>     ---o---C
>
> it should be able to mechanically decompose it, without conflicts, into
>
>
>     ---o---A
>             \
>     ---o---B---M1--M2--o
>                   /
>     ---o---C
>
> where the tree of M and the tree of M2 are identical.

If someone creates a "git decompose-octopus <commit>" command then you
only need to do "git replace M M2" after that and you can bisect as
usual. (Of course after that you can remove the replacement with "git
replace -d M".)

Best regards,
Christian.
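No "git decompose-octopus" command exists; it is only proposed above. As a rough illustration of what such a decomposition plus "git replace" could look like when done by hand, here is a sketch assuming a POSIX shell and a git that has "git replace". The repository, the branch names A, B and C, and the file names are made up to mirror the diagram. M1 is an ordinary merge of A into B, and M2 reuses the tree of M with M1 and C as its parents.

```shell
#!/bin/sh
# Sketch: decompose an octopus merge M into M1 and M2 by hand, then
# substitute M2 for M with "git replace". All names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name Example
git config user.email example@example.com

# commit_file <name>: add one new file and commit it.
commit_file () {
    echo "$1" >"$1.txt"
    git add "$1.txt"
    git commit -q -m "add $1"
}

# Three branches that fork from a common base commit.
git symbolic-ref HEAD refs/heads/B
commit_file base
git branch A
git branch C
commit_file b
git checkout -q A
commit_file a
git checkout -q C
commit_file c

# M: the octopus merge of A and C into B.
git checkout -q B
git merge -q -m "M: octopus of A and C" A C >/dev/null
M=$(git rev-parse HEAD)
parents_before=$(($(git rev-list --parents -1 "$M" | wc -w) - 1))

# M1: a two-head merge of A into B, made on a detached HEAD.
git checkout -q --detach "${M}^1"
git merge -q -m "M1: merge A" A >/dev/null
M1=$(git rev-parse HEAD)

# M2: same tree as M, with M1 and C as its two parents.
M2=$(git commit-tree "${M}^{tree}" -p "$M1" -p C -m "M2: merge C")

# While the replacement is in place, history walks see M2 instead of M.
git checkout -q B
git replace "$M" "$M2"
parents_after=$(($(git rev-list --parents -1 "$M" | wc -w) - 1))
git replace -d "$M" >/dev/null

echo "parents of M: $parents_before before, $parents_after while replaced"
```

The tree of M2 is identical to the tree of M by construction, since it is passed to "git commit-tree" unchanged. M1 and M2 become unreachable once the replace ref is deleted.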
* Re: Could this be done simpler?
2009-06-25 22:02 ` Christian Couder
@ 2009-06-25 22:23 ` Christian Couder
2009-06-25 22:29 ` Junio C Hamano
0 siblings, 1 reply; 15+ messages in thread
From: Christian Couder @ 2009-06-25 22:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

On Friday 26 June 2009, Christian Couder wrote:
> On Thursday 25 June 2009, Junio C Hamano wrote:
> > Side note.
> >
> > People sometimes say, and I am certain I agreed to them on more than
> > one occasions, that Octopus hurt bisectability and does not have much
> > value in real life. I've always thought this bisectability issue was a
> > downside of Octopus merges, but now I think about it, perhaps "git
> > bisect" can be taught to dynamically decompose an Octopus merges into a
> > sequence of two-head virtual merges while bisecting. We strongly
> > discourage and do not allow conflicting Octopus merges, so when you
> > need to bisect a history with an Octopus that looks like this:
> >
> >     ---o---A
> >             \
> >     ---o---B---M---o
> >             /
> >     ---o---C
> >
> > it should be able to mechanically decompose it, without conflicts, into
> >
> >
> >     ---o---A
> >             \
> >     ---o---B---M1--M2--o
> >                   /
> >     ---o---C
> >
> > where the tree of M and the tree of M2 are identical.
>
> If someone creates a "git decompose-octopus <commit>" command then you
> only need to do "git replace M M2" after that and you can bisect as
> usual. (Of course after that you can remove the replacement with "git
> replace -d M".)

(Or if we make the "refs/replace/bisect/" directory special so that it is
only used when bisecting, and if the replace ref is created in this
directory, then no need to remove the replacement ref. On the contrary
it's better to leave it there so that people who fetch it benefit from it
too.)

Best regards,
Christian.
* Re: Could this be done simpler?
2009-06-25 22:23 ` Christian Couder
@ 2009-06-25 22:29 ` Junio C Hamano
2009-06-25 22:50 ` Linus Torvalds
2009-06-25 22:55 ` Christian Couder
0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 22:29 UTC (permalink / raw)
To: Christian Couder; +Cc: Junio C Hamano, Linus Torvalds, Git Mailing List

Christian Couder <chriscool@tuxfamily.org> writes:

>> If someone creates a "git decompose-octopus <commit>" command then ...

I am afraid that misses the entire point of my discussion.

Such a decomposed octopus would _only_ be necessary during bisection, only
when the user chooses to test two tips at once (instead of testing one by
one), _and_ only its tree is needed for that purpose. In other words, we
should be able to do this _without_ creating an extra commit, let alone
replace mechanism.
* Re: Could this be done simpler?
2009-06-25 22:29 ` Junio C Hamano
@ 2009-06-25 22:50 ` Linus Torvalds
2009-06-25 23:17 ` Junio C Hamano
2009-06-25 22:55 ` Christian Couder
1 sibling, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2009-06-25 22:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Christian Couder, Git Mailing List

On Thu, 25 Jun 2009, Junio C Hamano wrote:
>
> Such a decomposed octopus would _only_ be necessary during bisection, only
> when the user chooses to test two tips at once (instead of testing one by
> one), _and_ only its tree is needed for that purpose. In other words, we
> should be able to do this _without_ creating an extra commit, let alone
> replace mechanism.

Keep in mind, though, that realistically, I don't think we've ever seen
any bisection attempts that end at an octopus.

Sure, I suspect that being really clever about decomposing an octopus
merge might allow us to bisect things _faster_ to one of the branches
involved in the merge, but the amount of smarts to do that just for that
reason seems pretty outlandish.

And if we ever do end up with an actual bug being bisected to the octopus
merge itself, at that point I don't think it's unreasonable to take the
same approach we do with any normal merge: just try to figure out what the
conflict is all about (clearly it's not a data conflict, since the
octopus wouldn't have succeeded in that case, but subtle merge errors can
be due to two branches each introducing their own assumptions without
actually ever clashing on a source file level).

With regular merges, if you really don't see what the conceptual conflict
is, you could try to do a temporary rebase to try to figure it out, and I
suspect that that is what you'd want to do with an octopus merge too -
rather than try to decompose the octopus merge into multiple simpler
merges, you'd like to try to linearize history and then re-do the
bisection attempt on that totally modified/simplified history.

Linus
* Re: Could this be done simpler?
2009-06-25 22:50 ` Linus Torvalds
@ 2009-06-25 23:17 ` Junio C Hamano
0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 23:17 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Christian Couder, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Sure, I suspect that being really clever about decomposing an octopus
> merge might allow us to bisect things _faster_ to one of the branches
> involved in the merge, but the amount of smarts to do that just for that
> reason seems pretty outlandish.
>
> And if we ever do end up with an actual bug being bisected to the octopus
> merge itself, at that point I don't think it's unreasonable to take the
> same approach we do with any normal merge: just try to figure out what the
> conflict is all about (clearly it's not a data conflict, since the
> octopus wouldn't have succeeded in that case, but subtle merge errors can
> be due to two branches each introducing their own assumptions without
> actually ever clashing on a source file level).
>
> With regular merges, if you really don't see what the conceptual conflict
> is, you could try to do a temporary rebase to try to figure it out, and I
> suspect that that is what you'd want to do with an octopus merge too -
> rather than try to decompose the octopus merge into multiple simpler
> merges, you'd like to try to linearize history and then re-do the
> bisection attempt on that totally modified/simplified history.

All true. Thanks for thoughts.
* Re: Could this be done simpler?
2009-06-25 22:29 ` Junio C Hamano
2009-06-25 22:50 ` Linus Torvalds
@ 2009-06-25 22:55 ` Christian Couder
1 sibling, 0 replies; 15+ messages in thread
From: Christian Couder @ 2009-06-25 22:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

On Friday 26 June 2009, Junio C Hamano wrote:
> Christian Couder <chriscool@tuxfamily.org> writes:
> >> If someone creates a "git decompose-octopus <commit>" command then ...
>
> I am afraid that misses the entire point of my discussion.
>
> Such a decomposed octopus would _only_ be necessary during bisection,
> only when the user chooses to test two tips at once (instead of testing
> one by one), _and_ only its tree is needed for that purpose. In other
> words, we should be able to do this _without_ creating an extra commit,
> let alone replace mechanism.

But suppose the result from the bisection tells that M1 is the first bad
commit, then the user will need to look at M1, and perhaps check it out or
use it in other ways after the bisection is finished. So why shouldn't it
be a real commit?

It's not like a few more commits are a big problem as they will be
reclaimed by garbage collection anyway if the replace ref is deleted.

Best regards,
Christian.
end of thread, other threads:[~2009-06-27 0:26 UTC | newest]

Thread overview: 15+ messages
2009-06-24 21:35 Could this be done simpler? Linus Torvalds
2009-06-25 1:04 ` Junio C Hamano
2009-06-25 14:33 ` Randal L. Schwartz
2009-06-25 16:32 ` Matthias Andree
2009-06-25 17:25 ` Junio C Hamano
2009-06-25 21:54 ` Matthias Andree
2009-06-27 0:26 ` Junio C Hamano
2009-06-25 18:32 ` Junio C Hamano
2009-06-25 17:19 ` Michael J Gruber
2009-06-25 22:02 ` Christian Couder
2009-06-25 22:23 ` Christian Couder
2009-06-25 22:29 ` Junio C Hamano
2009-06-25 22:50 ` Linus Torvalds
2009-06-25 23:17 ` Junio C Hamano
2009-06-25 22:55 ` Christian Couder