* git-name-rev off-by-one bug
From: linux @ 2005-11-28 23:42 UTC
To: git; +Cc: linux

I've been trying to wrap my head around git for a while now, and finding things a bit confusing. Basically, the reason that I'm scared to trust it with my code is that all sharing is done via push and pull, and they are done by merging, and merging isn't described very well anywhere.

There's lots of intimate *detail* of merge algorithms (hiding in, of all places, the git-read-tree documentation, which is not the obvious place for a beginner to look), but the important high-level questions like "what happens to all my hard work if there's a merge conflict?" or "what if I forget to git-update-index before doing the merge?" are not really clear.

I don't like to go ahead if I'm not confident I can get back. (Being able to back up the object database is obviously simple, but what happens if the index holds HEAD+1, the working directory holds HEAD+2, and I try to merge the latest changes from origin? Are either HEAD+1 or HEAD+2 in danger of being lost, or will checking them in later overwrite the merge, or what?)

Anyway, I'm doing some experiments and trying to understand it, and writing what I learn as I go, which will hopefully be useful to someone.

Another very confusing thing is the ref syntax with all those ~12^3^22^2 suffixes. The git tutorial uses "master^" and "master^2" syntax, but doesn't actually explain it. The meaning can be found on the second page of the git-rev-parse manual. If, that is, you think to read that man page, and if you don't stop reading after the first page tells you that it's a helper for scripts not meant to be invoked directly by the end-user.
Trying to see if I understood what was going on, I picked a random rev out of git-show-branch output and tried git-name-rev:

> $ git-name-rev 365a00a3f280f8697e4735e1ac5b42a1c50f7887
> 365a00a3f280f8697e4735e1ac5b42a1c50f7887 maint~404^1~7

(If you care, maint=93dcab2937624ebb97f91807576cddb242a55a46)

And was very confused when git-rev-parse didn't invert the operation:

> $ git-rev-parse maint~404^1~7
> f69714c38c6f3296a4bfba0d057e0f1605373f49

I spent a while verifying that I understood that ^1 == ^ == ~1, so ~404^1~7 = ~412, and that gave the same unwanted result:

> $ git-rev-parse maint~412
> f69714c38c6f3296a4bfba0d057e0f1605373f49

After confusing myself for a while, I looked to see why git-name-rev would output such a redundant name and found that it was simply wrong. Fixing the symbolic name worked:

> $ git-rev-parse maint~404^2~7
> 365a00a3f280f8697e4735e1ac5b42a1c50f7887

You can either go with a minimal fix:

diff --git a/name-rev.c b/name-rev.c
index 7d89401..f7fa18c 100644
--- a/name-rev.c
+++ b/name-rev.c
@@ -61,9 +61,10 @@ copy_data:
 
 			if (generation > 0)
 				sprintf(new_name, "%s~%d^%d", tip_name,
-						generation, parent_number);
+						generation, parent_number+1);
 			else
-				sprintf(new_name, "%s^%d", tip_name, parent_number);
+				sprintf(new_name, "%s^%d", tip_name,
+						parent_number+1);
 
 			name_rev(parents->item, new_name,
 				merge_traversals + 1 , 0, 0);

Or you can get a bit more ambitious and write ~1 as ^:

diff --git a/name-rev.c b/name-rev.c
index 7d89401..82053c8 100644
--- a/name-rev.c
+++ b/name-rev.c
@@ -57,13 +57,17 @@ copy_data:
 			parents;
 			parents = parents->next, parent_number++) {
 		if (parent_number > 0) {
-			char *new_name = xmalloc(strlen(tip_name)+8);
+			unsigned len = strlen(tip_name);
+			char *new_name = xmalloc(len+8);
 
-			if (generation > 0)
-				sprintf(new_name, "%s~%d^%d", tip_name,
-						generation, parent_number);
-			else
-				sprintf(new_name, "%s^%d", tip_name, parent_number);
+			memcpy(new_name, tip_name, len);
+
+			if (generation == 1)
+				new_name[len++] = '^';
+			else if (generation > 1)
+				len += sprintf(new_name+len, "~%d", generation);
+
+			sprintf(new_name+len, "^%d", parent_number+1);
 
 			name_rev(parents->item, new_name,
 				merge_traversals + 1 , 0, 0);

While I'm at it, I notice some unnecessary invocations of expr in some of the shell scripts. You can do it far more simply using the ${var#pat} and ${var%pat} expansions to strip off leading and trailing patterns. For example:

diff --git a/git-cherry.sh b/git-cherry.sh
index 867522b..c653a6a 100755
--- a/git-cherry.sh
+++ b/git-cherry.sh
@@ -23,8 +23,7 @@ case "$1" in -v) verbose=t; shift ;; esa
 
 case "$#,$1" in
 1,*..*)
-    upstream=$(expr "$1" : '\(.*\)\.\.') ours=$(expr "$1" : '.*\.\.\(.*\)$')
-    set x "$upstream" "$ours"
+    set x "${1%..*}" "${1#*..}"
     shift ;;
 esac
 

This works in dash and is in the POSIX spec. It doesn't work in some very old /bin/sh implementations (such as Solaris still ships), but I'm pretty sure it was introduced at the same time as $(), and the scripts use *that* all over the place.

% sh
$ uname -s -r
SunOS 5.9
$ foo=bar
$ echo ${foo#b}
bad substitution
$ echo `echo $foo`
bar
$ echo $(echo $foo)
syntax error: `(' unexpected

Anyway, if it's portable enough, it's faster.

Ah... I just found discussion of this in late September, but it's not clear what the resolution was.
http://marc.theaimsgroup.com/?t=112746188000003

(Oh, yes: all of the above patches are released into the public domain. Copyright abandoned. Have fun.)
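The naming rule behind the fix can be sketched in shell. This is only an illustration, not git's code: `rev_name` is a hypothetical helper, and its three arguments stand in for the `tip_name`, `generation` and `parent_number` variables in the patch, with the parent index counted from zero as in the C loop. It follows the second variant of the patch, which writes a single generation as `^`.

```shell
#!/bin/sh
# Illustrative sketch only: rev_name is a hypothetical helper, not part of git.
#   $1 = tip name, e.g. "maint"
#   $2 = generation: number of first-parent steps taken from the tip
#   $3 = zero-based index of the parent being followed (the C loop counter)
rev_name() {
    tip=$1 generation=$2 parent_index=$3
    name=$tip
    if [ "$generation" -eq 1 ]; then
        name="${name}^"                  # a single step is written as ^
    elif [ "$generation" -gt 1 ]; then
        name="${name}~${generation}"     # N steps are written as ~N
    fi
    # The off-by-one fix: the ^N suffix is one-based, the loop index is not.
    printf '%s^%d\n' "$name" "$((parent_index + 1))"
}

rev_name maint 404 1    # prints maint~404^2
```

Because `^1`, `^` and `~1` all name the first parent, the unfixed output `maint~404^1~7` collapses to `maint~412` (404 + 1 + 7), which is why it resolved to the wrong commit.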
* Re: git-name-rev off-by-one bug
From: Junio C Hamano @ 2005-11-29 5:54 UTC
To: linux; +Cc: git

linux@horizon.com writes:

> (Being able to back up the object database is obviously simple, but what
> happens if the index holds HEAD+1, the working directory holds HEAD+2,
> and I try to merge the latest changes from origin? Are either HEAD+1 or
> HEAD+2 in danger of being lost, or will checking them in later overwrite
> the merge, or what?)

Thanks for the complaints. No sarcasm intended. Yours is exactly the kind of message we (people who've been around here for too long) need to hear.

Although the technical details are hidden in the documentation which needs reorganization to make them easier to find [*1*] as you point out, the guiding principle for merge is quite simple.

To the git barebone Porcelain layer (things that start with git-*, not with cg-*) [*2*], a merge is always between the current HEAD and one or more remote branch heads, and the index file must exactly match the tree of HEAD commit (i.e. the contents of the last commit) when it happens. In other words, "git-diff --cached HEAD" must report no changes [*3*]. So HEAD+1 must be HEAD in your above notation, or merge will refuse to do any harm to your repository (that is, it may fetch the objects from remote, and it may even update the local branch used to keep track of the remote branch with "git pull remote rbranch:lbranch", but your working tree, .git/HEAD pointer and index file are left intact).

You may have local modifications in the working tree files. In other words, "git-diff" is allowed to report changes (the difference between HEAD+2 and HEAD+1 in your notation).
However, the merge uses your working tree as the working area, and in order to prevent the merge operation from losing such changes, it makes sure that they do not interfere with the merge. Those complex tables in read-tree documentation define what it means for a path to "interfere with the merge". And if your local modifications interfere with the merge, again, it stops before touching anything.

So in the above two "failed merge" cases, you do not have to worry about lossage of data --- you simply were not ready to do a merge, so no merge happened at all. You may want to finish whatever you were in the middle of doing, and retry the same pull after you are done and ready.

When things cleanly merge, these things happen:

 (1) the results are updated both in the index file and in your working tree,
 (2) index file is written out as a tree,
 (3) the tree gets committed, and
 (4) the HEAD pointer gets advanced.

Because of (2), we require the original state of the index file to match exactly the current HEAD commit; otherwise we will write out your local changes already registered in your index file (the difference between HEAD+1 and HEAD in your notation) along with the merge result, which is not good. Because (1) involves only the paths different between your branch and the remote branch you are pulling from during the merge (which is typically a fraction of the whole tree), you can have local modifications in your working tree as long as they do not overlap with what the merge updates.

When there are conflicts, these things happen:

 (0) HEAD stays the same.

 (1) Cleanly merged paths are updated both in the index file and in your working tree.

 (2) For conflicting paths, the index file records the version from HEAD. The working tree files have the result of "merge" program; i.e. 3-way merge result with familiar conflict markers <<< === >>>.

 (3) No other changes are done.
     In particular, the local modifications you had before you started merge will stay the same and the index entries for them stay as they were, i.e. matching HEAD.

After seeing a conflict, you can do two things:

 * Decide not to merge. The only clean-up you need is to reset the index file to the HEAD commit to reverse (1) and to clean up working tree changes made by (1) and (2); "git-reset" can be used for this.

 * Resolve the conflicts. "git-diff" would report only the conflicting paths because of the above (1) and (2). Edit the working tree files into a desirable shape, git-update-index them, to make the index file contain what the merge result should be, and run "git-commit" to commit the result.

[Footnotes]

*1* It is a shame that the most comprehensive definition of 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test script.

*2* Cogito (things that start with cg-*) seems to try to be cleverer. Pasky might want to brag about the rules in Cogito land.

*3* This is a bit of a lie. In certain special cases, your index is allowed to be different from the tree of HEAD commit; basically your index entries are allowed to match the result of trivial merge already (e.g. you received the same patch from external source to produce the same result as what you are merging). For example, if a path did not exist in the common ancestor and your head commit but exists in the tree you are merging into your repository, and if you already happen to have that path exactly in your index, the merge does not have to fail. This is case #2 in the 3-way read-tree table in t/t1000.
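The precondition described in this message (the index must match HEAD before a merge starts) can be checked by hand. A minimal sketch, assuming a current git in which `git diff --cached --quiet` is available; the thread itself uses the 2005 spelling `git-diff --cached HEAD`, and `index_matches_head` is a hypothetical helper name.

```shell
#!/bin/sh
# index_matches_head: hypothetical helper, not a git command.
# Exit status is 0 when the index is identical to the tree of HEAD,
# i.e. when "git diff --cached HEAD" would report no changes.
index_matches_head() {
    git diff --cached --quiet HEAD
}
```

A wrapper could run `index_matches_head || echo "commit or reset first"` before pulling. Edits that exist only in the working tree do not change the result, in line with the rule that such edits are tolerated as long as the merge does not touch the same paths.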
* Re: git-name-rev off-by-one bug
From: linux @ 2005-11-29 8:05 UTC
To: junkio; +Cc: git, linux

> Thanks for the complaints. No sarcasm intended. Yours is
> exactly the kind of message we (people who've been around here
> for too long) need to hear.

Thanks for taking it so well! I'm trying to really *understand* git, so I can predict its behaviour, but I've been coming to the conclusion that the only way to do that is by re-reading the mailing list from day 1.

And to understand git at all, you have to understand merging, since doing merging fast and well is the central reason for git's entire existence. Single-developer porcelain like "cvs annotate" is noticeably lacking, but branching and merging is great.

(In particular, and unlike other SCMs, "push" and "pull" are based on merging! So I can't even understand what pulling from Linus's tree does until I understand merging.)

I'm working on some notes to explain git to myself and people I work with, which I'll post when they're vaguely complete.

> To the git barebone Porcelain layer (things that start with
> git-*, not with cg-*) [*2*], a merge is always between the
> current HEAD and one or more remote branch heads, and the index
> file must exactly match the tree of HEAD commit (i.e. the
> contents of the last commit) when it happens. In other words,
> "git-diff --cached HEAD" must report no changes [*3*]. So
> HEAD+1 must be HEAD in your above notation, or merge will refuse
> to do any harm to your repository (that is, it may fetch the
> objects from remote, and it may even update the local branch
> used to keep track of the remote branch with "git pull remote
> rbranch:lbranch", but your working tree, .git/HEAD pointer and
> index file are left intact).

Right!
Since the object database is strictly append-only, it's easy to see how such changes are quite harmless, and updating a tracking branch is hardly a big nasty surprise. It's the index and working directory that are volatile, and that I was worried about.

BUT... what's the second argument to git-read-tree for, if it always has to be HEAD?

BTW, I'd change the description of git-read-tree from

> Reads the tree information given by <tree-ish> into the directory
> cache, but does not actually update any of the files it "caches". (see:
> git-checkout-index)

to

    Reads the tree information given by <tree-ish> into the index. The working directory is not modified in any way (unless -u is used). Use git-checkout-index to do that.

    In addition to the simple one-tree case, this can (with the -m flag) merge 2 or 3 trees into the index. When used with -m, the -u flag causes it to also update the files in the working directory.

    Trivial merges are done by "git-read-tree" itself. Conflicts are left in an unmerged state for git-merge-index to resolve.

> You may have local modifications in the working tree files. In
> other words, "git-diff" is allowed to report changes (the
> difference between HEAD+2 and HEAD+1 in your notation).
> However, the merge uses your working tree as the working area,
> and in order to prevent the merge operation from losing such
> changes, it makes sure that they do not interfere with the
> merge. Those complex tables in read-tree documentation define
> what it means for a path to "interfere with the merge". And if
> your local modifications interfere with the merge, again, it
> stops before touching anything.

THANK YOU for finally making this clear! I was wondering why the hell a 2-way merge looked more complex than a 3-way. (Although, admittedly, I'm *still* not clear on what the difference is. It seems like a 2-way just picks the origin commit automatically.)
So it goes like this:

- git-merge will refuse to do anything if there are any uncommitted changes in the index. [footnote about harmless exceptions]
- git-merge will refuse to do anything if there are changes in the working directory to a file that would be affected by the merge. You CAN, however, have unindexed changes to files that are unchanged by the merge.

The description of a 1-way merge in git-read-tree is quite confusing. Here's a rephrasing; do I understand it?

READING

    Without -m, git-read-tree pulls a specified tree into the index. Any file whose underlying blob is changed by the update will have its cached stat data invalidated, so it will appear in the output of git-diff-files, and git-checkout-index -f will overwrite it.

    A single-tree merge is a slightly nicer variant on this.

MERGING
...
Single Tree Merge

    If only one tree is specified, any file whose blob changes as a result of the update is compared with the working directory file, and the cached stat data updated if the working directory file matches the new blob.

    Thus, working directory files which happen to match the contents of the new tree will be excluded from the output of git-diff-files, and git-checkout-index -f will not update their file modification times. This is basically the effect of git-update-index --refresh.

    This is actually usually preferable to the non-merge case, but does do extra work.

I haven't found the code yet, but obviously if the working directory file is clean relative to the previous blob, it's dirty relative to the changed blob, so there's no need to actually read the file.

Um... I just tried it, and it appears that things do NOT work as I just described. Time to read the code...

BTW, an even cuter way of writing same() in read-tree.c would be:

    static int same(struct cache_entry *a, struct cache_entry *b)
    {
        if (!a || !b)
            return a == b;
        return a->ce_mode == b->ce_mode && !memcmp(a->sha1, b->sha1, 20);
    }

(I presume that

        return a&&b ? a->ce_mode == b->ce_mode && !memcmp(a->sha1, b->sha1, 20) : a==b;

would be taking it too far...)

Oh, and the git-read-tree man page fails to mention --reset and --trivial. And the usage message should have <sha> changed to <tree>.

> So in the above two "failed merge" cases, you do not have to
> worry about lossage of data --- you simply were not ready to do
> a merge, so no merge happened at all. You may want to finish
> whatever you were in the middle of doing, and retry the same
> pull after you are done and ready.

I feel about a thousand percent better about git already.

> When things cleanly merge, these things happen:
>
> (1) the results are updated both in the index file and in your
> working tree,
> (2) index file is written out as a tree,
> (3) the tree gets committed, and
> (4) the HEAD pointer gets advanced.

This is git-merge, as opposed to the more primitive git-read-tree -m plus git-merge-index, right?

(Aside, you should document that git-merge --no-commit saves the <msg> in .git/MERGE_MSG, and git-commit uses it as the default message. Otherwise, users wonder why the hell it asks for a commit message it knows won't be used.)

> Because of (2), we require the original state of the index
> file to match exactly the current HEAD commit; otherwise we will
> write out your local changes already registered in your index
> file (the difference between HEAD+1 and HEAD in your notation)
> along with the merge result, which is not good.

> *3* This is a bit of a lie. In certain special cases, your index
> is allowed to be different from the tree of HEAD commit;
> basically your index entries are allowed to match the result of
> trivial merge already (e.g. you received the same patch from
> external source to produce the same result as what you are
> merging).
> For example, if a path did not exist in the common
> ancestor and your head commit but exists in the tree you are
> merging into your repository, and if you already happen to have
> that path exactly in your index, the merge does not have to
> fail. This is case #2 in the 3-way read-tree table in t/t1000.

Ah, yes, the light dawns! So it *is* okay if you have an uncommitted change in the index which exactly matches what git-read-tree -m would have done anyway. It doesn't make a difference to the final state of the index, and forcing the poor user to undo it just so the merge can redo it is simply annoying.

But it's NOT okay if you have any other change in the index, even in a file unaffected by the merge, because then the final state would not be the result of the merge, and that could be confusing.

> Because (1)
> involves only the paths different between your branch and the
> remote branch you are pulling from during the merge (which is
> typically a fraction of the whole tree), you can have local
> modifications in your working tree as long as they do not
> overlap with what the merge updates.

Wonderfully clear. Such changes must not exist at the git-read-tree level, since there's not even a way to represent them to git-merge-index. Still, this can save considerable bother when trying to track someone else's tree.

> When there are conflicts, these things happen:
>
> (0) HEAD stays the same.
>
> (1) Cleanly merged paths are updated both in the index file and
> in your working tree.
>
> (2) For conflicting paths, the index file records the version
> from HEAD. The working tree files have the result of
> "merge" program; i.e. 3-way merge result with familiar
> conflict markers <<< === >>>.
>
> (3) No other changes are done. In particular, the local
> modifications you had before you started merge will stay the
> same and the index entries for them stay as they were,
> i.e. matching HEAD.
So this is a lot like what CVS does, but neater because all the merge sources are available unmodified in the index. Excellent.

> After seeing a conflict, you can do two things:
>
> * Decide not to merge. The only clean-up you need is to reset
> the index file to the HEAD commit to reverse (1) and to clean
> up working tree changes made by (1) and (2); "git-reset" can
> be used for this.

Cool. Two minor questions:

- Doesn't any non-trivial merge or invocation of git-update-index produce blob objects in the database that become garbage if you do this? Or are they somehow kept separate until a tree object is created to point to them? (You could have an "unreferenced" bit in the index, indicating that the blobs in question could be found in .git/pending rather than .git/objects, until git-write-tree moved them into the database. But I don't see mention of any such scheme.)
- Is there any difference between "git-reset --hard" and "git-checkout -f"?

> * Resolve the conflicts. "git-diff" would report only the
> conflicting paths because of the above (1) and (2). Edit the
> working tree files into a desirable shape, git-update-index
> them, to make the index file contain what the merge result
> should be, and run "git-commit" to commit the result.

Okay, so git-update-index will overwrite a staged file with a fresh stage-0 copy. And git-commit will refuse to commit (to be precise, it'll stop at the git-write-tree stage) if there are unresolved conflicts.

If you want to see the unmodified input files, you can find their IDs with "git-ls-files -u" and then get a copy with "git-cat-file blob" or "git-unpack-file". git-merge-index is basically a different way to process the output of git-ls-files -u.

> *1* It is a shame that the most comprehensive definition of
> 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test
> script.

Thanks for the pointer; I'll go and read it!

> *2* Cogito (things that start with cg-*) seems to try to be
> cleverer.
> Pasky might want to brag about the rules in Cogito
> land.

In fact, he might want to explain what the difference is between cogito and git. Most particularly, are there any restrictions on mixing cg-* and git-* operations from within the same directory?

I've been assuming that cogito is just a series of friendlier utilities, in much the same way that a text editor is friendlier than "cat > kernel/sched.c", so I'll study it after I understand core git.

One more question to be sure I understand merging... AFAICT, it would be theoretically possible to stop distinguishing "stage 2" and "stage 0". If there is only a stage 2 file, then in-index will immediately collapse it to stage 0, and if there are any other stage files, you know there's an incomplete merge.

(Alternatively, you could collapse "stage 3" and "stage 0", since stages 2 and 3 are treated identically, but traditionally, stage 2 is the "trunk" and stage 3 is the "branch" being merged in.)
* Re: git-name-rev off-by-one bug
From: Junio C Hamano @ 2005-11-29 9:29 UTC
To: linux; +Cc: junkio, git

linux@horizon.com writes:

> (In particular, and unlike other SCMs, "push" and "pull" are
> based on merging! So I can't even understand what pulling from
> Linus's tree does until I understand merging.)

If you do "git pull git://.../linux-2.6/", it fetches the objects and stores Linus's "master" into .git/FETCH_HEAD, and then merges that into whatever branch you happen to be on (but you already know that).

> BUT... what's the second argument to git-read-tree for, if it
> always has to be HEAD?

git-read-tree is purely "a building block to be scripted" in your second mail (private?). In my message you are replying to, I outlined what the end-user level tool, git-pull (and git-merge it uses) does, which is built using git-read-tree, and we happen to always pass HEAD as its second argument. But it does not have to be that way. One thing planned in the future is to do a merge in a temporary working directory, not in your primary working tree, and when we implement that, the second argument will be whatever branch head you are pulling into, not necessarily your current HEAD.

> BTW, I'd change the description of git-read-tree from

Thanks. I am a bit too tired tonight so I may send you a correction later if what I am going to comment here turns out to be incorrect, but a quick glance tells me this is a good clarification.

> ... I was wondering why the
> hell a 2-way merge looked more complex than a 3-way.

2-way is called merge but it is not about the merge at all (git-merge is mostly about 3-way merge). It is more about checkout. Suppose you have checked out branch A, and have local modifications (both in index and in working tree).
You would want to switch to branch B by "git checkout B" (without -f). 2-way "read-tree -m -u" is used to ensure that you take your local modifications with you while checking out branch B into your working tree, meaning:

 (1) local changes already registered in the index stay in the index file; obviously this can be done only if A and B are the same at such paths.

 (2) local changes not registered in the index stay in the working tree; similar restriction applies but the rules are more involved.

Similar to 3-way case, 2-way will refuse to lose your local modifications, and that is what the 2-way case table in Documentation/git-read-tree.txt is about.

> This is git-merge, as opposed to the more primitive git-read-tree -m
> plus git-merge-index, right?

Everything I wrote in the previous message was about the end-user tools git-pull/git-merge.

> - Doesn't any non-trivial merge or invocation of git-update-index
> produce blob objects in the database that become garbage if you
> do this?

We produce garbage blobs all the time, and we do not care. Even the following sequence that does not involve any merge produces a garbage blob for the first version of A that was faulty:

    $ git checkout
    $ edit A
    $ git update-index A
    $ make ;# oops, there is a mistake.
    $ edit A
    $ make ;# this time it is good.
    $ git commit -a -m 'Finally compiles.'

Occasional fsck-objects, prune and repack are your friends.

> - Is there any difference between "git-reset --hard" and "git-checkout -f"?

"reset --hard" does a more thorough job removing unwanted files from your working tree. It looks at your current HEAD, the commit you are resetting to (when you say "reset --hard <commit>"), and your index file, and paths mentioned by any of these three that should not remain (that is, not in the commit you are resetting to) are removed from your working tree. In addition, "reset --hard <commit>" updates the branch head and can be used to rewind it.
On the other hand, "checkout -f" tells git to *ignore* what is in the index, so any file in the working tree that used to be in the index (or old branch you were working on) that does not exist in the branch you are checking out is not removed.

On a related topic of removing unwanted paths, earlier I said 2-way is used to make sure "git checkout" takes your changes with you when you switch branches. As a natural consequence of this, if you do not have any local changes, "git checkout" without "-f" does the right thing -- it removes unwanted paths that existed in the original branch but not in the branch you are switching to.

> Okay, so git-update-index will overwrite a staged file with a
> fresh stage-0 copy. And git-commit will refuse to commit
> (to be precise, it'll stop at the git-write-tree stage) if there
> are unresolved conflicts.

Sorry, I was unclear that I was talking about end-user level tool. The update-index here is not about the conflict resolution in the index file read-tree documentation talks about. That has already been done when "merge" ran in the conflicting case. In the conflicting case, the working tree holds 3-way merge conflicting result, and the index holds HEAD version at stage0 for such a path. Hand resolving after update-index is to record what you eventually want to commit (i.e. you are not replacing higher stage entry in the index with stage0 entry -- you are replacing stage0 entry with another).

> If you want to see the unmodified input files, you can find their
> IDs with "git-ls-files -u" and then get a copy with "git-cat-file blob"
> or "git-unpack-file". git-merge-index is basically a different way to
> process the output of git-ls-files -u.

Yes, in principle. But in practice you usually do not use these low level tools yourself. When git-merge returns with conflicting paths, most of them have already been collapsed into stage0 and git-ls-files --unmerged would not show.
The only case I know of that you may still see higher stage entries in the index these days is merging paths with different mode bits. We used to leave higher stage entries when both sides added new file at the same path, but even that we show as merge from common these days.
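The "garbage blob" left behind by the edit, update-index, edit, commit sequence in this message can be made visible. A minimal sketch, assuming a current git where `git fsck` (the successor of `git-fsck-objects`) reports unreferenced objects as `dangling blob <id>` lines; `dangling_blobs` is a hypothetical helper name.

```shell
#!/bin/sh
# dangling_blobs: hypothetical helper, not a git command.
# Prints the object name of every blob that no commit, tree or index
# entry refers to, one per line.
dangling_blobs() {
    git fsck 2>/dev/null | sed -n 's/^dangling blob //p'
}
```

After staging a faulty version of a file and then staging and committing a corrected one, the first version's blob should appear in this list until it is pruned.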
* Re: git-name-rev off-by-one bug
From: Junio C Hamano @ 2005-11-30 8:37 UTC
To: linux; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> On a related topic of removing unwanted paths, earlier I said
> 2-way is used to make sure "git checkout" takes your changes
> with you when you switch branches. As a natural consequence of
> this, if you do not have any local changes, "git checkout"
> without "-f" does the right thing -- it removes unwanted paths
> that existed in the original branch but not in the branch you
> are switching to.

Here is some unsolicited advice ("tip of the day").

I was on a branch which had some local "throwaway" changes, and I wanted to switch back to the master branch. To be honest, I even forgot I had local changes there. So I ran "git checkout", and here is what happened.

    junio@siamese:~/git$ git checkout master
    fatal: Entry 'Documentat...' not uptodate. Cannot merge.

The easiest is "git checkout -f master" at this point, but I usually do not do that. If that entry "git checkout" complains about is something that is not in the master branch and I have throwaway changes, "git checkout -f master" would leave that file with throwaway changes behind. So I did this first:

    junio@siamese:~/git$ git reset --hard

This would sync my working tree to the current branch. Then

    junio@siamese:~/git$ git checkout master

would switch branches properly, removing that new file that should not exist in the working tree.
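The two-step tip above can be wrapped in a small function. This is a sketch only, assuming a current git; `discard_and_switch` is a hypothetical helper name, and it deliberately throws away every uncommitted change to tracked files, so it suits throwaway edits and nothing else.

```shell
#!/bin/sh
# discard_and_switch: hypothetical helper, not a git command.
# WARNING: "git reset --hard" discards uncommitted changes to tracked files.
#   $1 = branch to switch to
discard_and_switch() {
    git reset -q --hard &&      # sync index and working tree to the current branch
    git checkout -q "$1"        # a clean tree lets checkout remove stale paths
}
```

Calling `discard_and_switch master` from the branch with the throwaway changes would leave the working tree matching master, without the leftover file that "git checkout -f master" is described as leaving behind.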
* Re: git-name-rev off-by-one bug
From: Petr Baudis @ 2005-11-29 10:31 UTC
To: linux; +Cc: junkio, git

Dear diary, on Tue, Nov 29, 2005 at 09:05:29AM CET, I got a letter where linux@horizon.com said that...

> > *2* Cogito (things that start with cg-*) seems to try to be
> > cleverer. Pasky might want to brag about the rules in Cogito
> > land.
> In fact, he might want to explain what the difference is between cogito
> and git. Most particularly, are there any restrictions on mixing cg-*
> and git-* operations from within the same directory?

Nope, except:

    Cogito vs. other GIT tools
    ~~~~~~~~~~~~~~~~~~~~~~~~~~

    You can *MOSTLY* use Cogito in parallel with other GIT frontends (e.g. StGIT), as well as the GIT plumbing and core GIT tools - the tools only need to keep HEAD in place and follow the standardized `refs/` hierarchy. The only notable exception is that you should stick with a single toolkit during a merge.

    (-- Cogito README)

So exactly during a merge, things might not blend well, since Cogito does things a bit differently. It knows of no MERGE_HEAD, MERGE_MSG and such, and instead passes stuff over different channels or computes/asks it at different times. Historically, cg-merge and git-merge evolution has been almost entirely separate.

From the user POV, the main difference between Cogito and GIT merging is that:

(i) Cogito tries to never leave the index "dirty" (i.e. containing unmerged entries), and instead all conflicts should propagate to the working tree, so that the user can resolve them without any further special tools. (What is lacking here is that Cogito won't proofcheck that you really resolved them all during a commit. That's a big TODO.
But core GIT won't warn you about committing the classical << >> == conflicts either.)

(ii) Cogito will handle trees with some local modifications better - basically any local modifications git-read-tree -m won't care about. I didn't read the whole conversation, so to reiterate: git-read-tree will complain when the index does not match the HEAD, but won't complain about modified files in the working tree if the merge is not going to touch them. Now, let's say you do this (output is visually only roughly or not at all resembling what the real tools would tell you):

    $ ls
    a b c
    $ echo 'somelocalhack' >>a
    $ git merge "blah" HEAD remotehead
    File-level merge of 'b' and 'c'...
    Oops, 'b' contained local conflicts.
    Automatic merge aborted, fix up by hand.
    $ fixup b
    $ git commit
    Committed files 'a', 'b', 'c'.

Oops. It grabbed your local hack and committed it along the merge. Cogito won't do this, it will hold 'a' back when doing the merge commit (if it works right; in the past, there were several bugs related to this, but hopefully they are all fixed by now):

    $ ls
    a b c
    $ echo 'somelocalhack' >>a
    $ cg-merge remotehead
    ... Merging c
    ... Merging b
    Conflicts during merge of 'b'.
    Fix up the conflicts, then kindly do cg-commit.
    $ fixup b
    $ cg-commit -m"blah"
    Committed files 'b', 'c'.

Also note that the cg-merge usage is simpler and you give the "blah" message only to cg-commit, when it's for sure you are going to use it.

(iii) Cogito does not support the smart recursive merging strategy. That means it won't follow renames, and in case of multiple merge bases, it will not merge them recursively, but it will just ask you to choose one manually, or suggest you the most conservative merge base (where you should get no false clean merges, but you will probably have to deal with a lot of conflicts).

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which it doesn't.
* Re: git-name-rev off-by-one bug 2005-11-29 10:31 ` Petr Baudis @ 2005-11-29 18:46 ` Junio C Hamano 2005-12-04 21:34 ` Petr Baudis 2005-11-29 21:40 ` git-name-rev off-by-one bug linux 1 sibling, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-11-29 18:46 UTC (permalink / raw) To: Petr Baudis; +Cc: linux, junkio, git Petr Baudis <pasky@suse.cz> writes: > (ii) Cogito will handle trees with some local modifications better - > basically any local modifications git-read-tree -m won't care about. > I didn't read the whole conversation, so to reiterate: git-read-tree > will complain when the index does not match the HEAD, but won't > complain about modified files in the working tree if the merge is not > going to touch them. Now, let's say you do this (output is visually > only roughly or not at all resembling what would real tools tell you): > > $ ls > a b c > $ echo 'somelocalhack' >>a > $ git merge "blah" HEAD remotehead > File-level merge of 'b' and 'c'... > Oops, 'b' contained local conflicts. > Automatic merge aborted, fix up by hand. > $ fixup b > $ git commit > Committed files 'a', 'b', 'c'. > > Oops. It grabbed your local hack and committed it along the merge. Are you sure about this? In the above sequence, after you touch a with 'somelocalhack', there is no 'git update-index a', until you say 'git commit' there, so I do not think that mixup is possible. The "fixup b" step is actually two commands, so after merge command, you would do: $ edit b $ git update-index b ;# mark that you are dealt with it $ git commit ;# commits what is in index After the above steps, "git diff" (that is working tree against index) still reports your local change to "a", which were _not_ committed. 
Maybe you were mistaken because Cogito tries to be nice to its users and always does a moral equivalent of "git commit -a" (unless the user tells it to commit only specific paths), but you needed to special case merge resolution commit to make sure that you exclude "a" in the above example?

"git commit" does not do "-a" by default, and it will stay that way, so I do not think we have the "Oops" you described above. "Oops" would happen only if you did "git commit -a" instead at the last step.
* Re: git-name-rev off-by-one bug 2005-11-29 18:46 ` Junio C Hamano @ 2005-12-04 21:34 ` Petr Baudis 2005-12-08 6:34 ` as promised, docs: git for the confused linux 0 siblings, 1 reply; 64+ messages in thread From: Petr Baudis @ 2005-12-04 21:34 UTC (permalink / raw) To: Junio C Hamano; +Cc: linux, git Dear diary, on Tue, Nov 29, 2005 at 07:46:20PM CET, I got a letter where Junio C Hamano <junkio@cox.net> said that... > Petr Baudis <pasky@suse.cz> writes: > > > (ii) Cogito will handle trees with some local modifications better - > > basically any local modifications git-read-tree -m won't care about. > > I didn't read the whole conversation, so to reiterate: git-read-tree > > will complain when the index does not match the HEAD, but won't > > complain about modified files in the working tree if the merge is not > > going to touch them. Now, let's say you do this (output is visually > > only roughly or not at all resembling what would real tools tell you): > > > > $ ls > > a b c > > $ echo 'somelocalhack' >>a > > $ git merge "blah" HEAD remotehead > > File-level merge of 'b' and 'c'... > > Oops, 'b' contained local conflicts. > > Automatic merge aborted, fix up by hand. > > $ fixup b > > $ git commit > > Committed files 'a', 'b', 'c'. > > > > Oops. It grabbed your local hack and committed it along the merge. > > Are you sure about this? > > In the above sequence, after you touch a with 'somelocalhack', > there is no 'git update-index a', until you say 'git commit' > there, so I do not think that mixup is possible. > > The "fixup b" step is actually two commands, so after merge > command, you would do: > > $ edit b > $ git update-index b ;# mark that you are dealt with it > $ git commit ;# commits what is in index > > After the above steps, "git diff" (that is working tree against > index) still reports your local change to "a", which were _not_ > committed. Yes. 
I actually tried it out, but I was confused by the file list in the commit message (I'm used to seeing just committed files there) and I didn't check the status of the 'a' file after the commit. Sorry about the confusion. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. ^ permalink raw reply [flat|nested] 64+ messages in thread
* as promised, docs: git for the confused 2005-12-04 21:34 ` Petr Baudis @ 2005-12-08 6:34 ` linux 2005-12-08 21:53 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 64+ messages in thread From: linux @ 2005-12-08 6:34 UTC (permalink / raw) To: git; +Cc: linux As I mentioned with all my questions, I was writing up the answers I got. Here's the current status. If anyone would like to comment on its accuracy or usefulness, feedback is appreciated. I've tried to omit or skim very lightly over subjects I think are adequately explained in existing docs, unless that would leave an uncomfortable hole in the explanation. TODO: Describe the config file. It's a recent invention, and I haven't found a good description of its contents. "I Don't Git It" Git for the confused Git is hardly lacking in documentation, but coming at it fresh, I found it somewhat confusing. Git is a toolkit in the Unix tradition. There are a number of primitives written in C, which are made friendly by a layer of shell scripts. These are known in git-speak, as the "plumbing" and the "porcelain", respectively. The porcelain should work and look nice. The plumbing should just deal with lots of crap efficiently. Much of git's documentation was first written to explain the plumbing to the people writing the porcelain. Since then, although the essentials haven't changed, porcelain has been added and conventions have been established that make it a lot more pleasant to deal with. Some commands have been changed or replaced, and it's not quite the same. Using the original low-level commands is now most likely more difficult than necessary, unless you want to do something not supported by the existing porcelain. This document retraces (with fewer false turns) how I learned my way around git. There are some concepts I didn't understand so well the first time through, and an overview of all the git commands, grouped by application. 
A good rule of thumb is that the commands with one-word names (git-diff, git-commit, git-merge, git-push, git-pull, git-status, git-tag, etc.) are designed for end-user use. Multi-word names (git-count-objects, git-write-tree, git-cat-file) are generally designed for use from a script. This isn't ironclad. The first command to start using git is git-init-db, and git-show-branch is pure porcelain, while git-mktag is a primitive. And you don't often run git-daemon by hand. But still, it's a useful guideline. * Background material. To start with, read "man git". Or Documentation/git.txt in the git source tree, which is the same thing. Particularly note the description of the index, which is where all the action in git happens. One thing that's confusing is why git allows you to have one version of a file in the current HEAD, a second version in the index, and possibly a third in the working directory. Why doesn't the index just contain a copy of the current HEAD until you commit a new one? The answer is merging, which does all its work in the index. Neither the object database nor the working directory let you have multiple files with the same name. The index is really very simple. It's a series of structures, each describing one file. There's an object ID (SHA1) of the contents, some file metadata to detect changes (time-stamps, inode number, size, permissions, owner, etc.), and the path name relative to the root of the working directory. It's always stored sorted by path name, for efficient merging. At (almost) any time, you can take a snapshot of the index and write it as a tree object. The only interesting feature is that each entry has a 2-bit stage number. Normally, this is always zero, but each path name is allowed up to three different versions (object IDs) in the index at once. This is used to represent an incomplete merge, and an unmerged index entry (with more than one version) prevents committing the index to the object database. 
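To make the stage mechanism concrete, here is a minimal Python sketch of an index as a sorted list of entries. It is a toy model, not git's on-disk format: the IndexEntry class, its field names and the can_write_tree helper are all invented for illustration, the object IDs are shortened placeholders, and the stat metadata is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class IndexEntry:
    """One index entry (hypothetical model; stat metadata omitted)."""
    path: str        # path name relative to the working directory root
    stage: int       # 0 = normal entry; 1..3 = versions of an unfinished merge
    object_id: str   # SHA1 of the file contents (shortened placeholder here)


def unmerged_paths(index):
    """Return the sorted paths that still have a nonzero-stage entry."""
    return sorted({entry.path for entry in index if entry.stage != 0})


def can_write_tree(index):
    """A snapshot (tree object) is only allowed when nothing is unmerged."""
    return not unmerged_paths(index)


# Hypothetical index in the middle of a merge: 'b' is present in three versions.
index = sorted([
    IndexEntry("c", 0, "5555"),
    IndexEntry("b", 3, "4444"),
    IndexEntry("a", 0, "1111"),
    IndexEntry("b", 1, "2222"),
    IndexEntry("b", 2, "3333"),
])

print([(e.path, e.stage) for e in index])
print("unmerged:", unmerged_paths(index))
print("can write tree:", can_write_tree(index))
```

Sorting by (path, stage) keeps all versions of one path adjacent, which is what makes a linear pass over several trees enough to merge them.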
* Terminology - heads, branches, refs, and revisions (This is a supplement to what's already in "man git".) The most common object needed by git primitives is a tree. Since a commit points to a tree and a tag points to a commit, both of these are acceptable "tree-ish" objects and can be used interchangeably. Likewise, a tag is "commit-ish" and can be used where a commit is required. As soon as you get to the porcelain, the most commonly used object is a commit. Also known as a revision, this is a tree plus a history. While you can always use the full object ID, you can also use a reference. A reference is a file that contains a 40-character hex SHA1 object ID (and a trailing newline). When you specify the name of a reference, it is searched for in one of the directories: .git/ (or $GIT_DIR) .git/refs/ (or $GIT_DIR/refs/) .git/refs/heads/ (or $GIT_DIR/refs/heads/) .git/refs/tags/ (or $GIT_DIR/refs/tags/) You may use subdirectories by including slashes in the reference name. There is no search order; if searching the above four path prefixes produces more than one match for the reference name (it's ambiguous), then the name is not valid. There is additional syntax (which looks like "commit~3^2~17") for specifying an ancestor of a given commit (or tag). This is documented in detail in the documentation for git-rev-parse. Briefly, commit^ is the parent of the given commit. commit^^ is the grandparent, etc.. If there is more than one ancestor (a merge), then they can be referenced as commit^1 (a synonym for commit^), commit^2, commit^3, etc. (commit^0 gives the commit object itself. A no-op if you're starting from a commit, but it lets you get the commit object from a tag object.) As long strings of ^ can be annoying, they can be abbreviated using ~ syntax. commit^^^ is the same as commit~3, ^^^^ is the same as ~4, etc. You can see lots of examples in the output of "git-show-branch". 
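The suffix rules are easier to check against a small example. The following Python sketch resolves names such as "M^2~1" over a toy history; the PARENTS table, the commit names and the resolve function are made up for illustration, and the real implementation in git-rev-parse also handles tags, abbreviated object IDs and reference lookup, none of which is modelled here.

```python
import re

# Toy history (hypothetical): each commit maps to its ordered list of parents.
PARENTS = {
    "M": ["A3", "B2"],            # a merge: first parent A3, second parent B2
    "A3": ["A2"], "A2": ["A1"], "A1": [],
    "B2": ["B1"], "B1": ["A1"],
}


def resolve(name, parents=PARENTS):
    """Resolve a commit name followed by any mix of ^, ^N and ~N suffixes."""
    match = re.fullmatch(r"([^~^]+)((?:[~^]\d*)*)", name)
    if match is None:
        raise ValueError("cannot parse %r" % name)
    commit = match.group(1)
    for op, digits in re.findall(r"([~^])(\d*)", match.group(2)):
        n = int(digits) if digits else 1      # bare ^ or ~ means 1
        if op == "^":
            if n == 0:
                continue                      # ^0 is the commit itself
            commit = parents[commit][n - 1]   # ^N selects the Nth parent
        else:
            for _ in range(n):                # ~N follows first parents N times
                commit = parents[commit][0]
    return commit


for name in ["M^", "M^2", "M^^", "M~2", "M^2~1", "M^0"]:
    print(name, "->", resolve(name))
```

Note that ^N selects among the parents of one commit, while ~N walks N generations back along first parents, so "M^2" and "M~2" are different commits.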
Now, although the most primitive git tools don't care, a convention among all the porcelain is that the current head of development is .git/HEAD, a symbolic link to a reference under refs/heads/. git-init-db creates HEAD pointing to refs/heads/master, and that is traditionally the name used for the "main trunk" of development. Note that initially refs/heads/master doesn't exist - HEAD is a dangling symlink! This is okay, and will cause the initial commit to have zero parents.

A "head" is mostly synonymous with a "branch", but the terms have different emphasis. The "head" is particularly the tip of the branch, where future development will be appended. A "branch" is the entire development history leading to the head. However, as far as git is concerned, they're both references to commit objects, referred to from refs/heads/.

When you actually create a new commit (with git-commit or git-merge), the current HEAD reference is overwritten with the new commit's id, and the old HEAD becomes HEAD^. Since HEAD is a symlink, it's the file in refs/heads/ that's actually overwritten. (See the git-update-ref documentation for further details.)

The git-checkout command actually changes the HEAD symlink. git-checkout enforces the rule that it will only check out a branch under refs/heads. You can use refs/tags as a source for git-diff or any other command that only examines the revision, but if you want to check it out, you have to copy it to refs/heads.

* Resetting

The "undo" command for commits to the object database is git-reset. Like all deletion-type commands, be careful or you'll hurt yourself. Given a commit (using any of the syntaxes mentioned above), this sets the current HEAD to refer to the given commit. This does NOT alter the HEAD symlink (as git-checkout <branch> will do), but actually changes the reference pointed to by HEAD (e.g. refs/heads/master) to contain a new object ID. The classic example: to undo an erroneous commit, use "git-reset HEAD^".
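The difference between moving HEAD and moving the reference it points to can be modelled in a few lines of Python. This is only a sketch of the bookkeeping described above: the Repo class and its method names are hypothetical, commit IDs are placeholder strings, and the index and working directory are ignored.

```python
class Repo:
    """Toy model of HEAD and refs/heads/ (hypothetical, for illustration)."""

    def __init__(self):
        self.head = "refs/heads/master"  # HEAD names a reference
        self.refs = {}                   # reference name -> commit id

    def current_commit(self):
        # None models the dangling HEAD of a freshly initialised repository.
        return self.refs.get(self.head)

    def commit(self, new_id):
        """Overwrite the reference HEAD points to; return the parent id."""
        parent = self.current_commit()
        self.refs[self.head] = new_id
        return parent

    def checkout(self, branch):
        """Change HEAD itself; only names under refs/heads/ are accepted."""
        ref = "refs/heads/" + branch
        if ref not in self.refs:
            raise KeyError("no such branch: " + branch)
        self.head = ref

    def reset(self, commit_id):
        """Leave HEAD alone; rewrite the reference it points to."""
        self.refs[self.head] = commit_id


repo = Repo()
print(repo.commit("c1"))                 # initial commit: no parent
print(repo.commit("c2"))                 # parent is c1
repo.refs["refs/heads/topic"] = "c1"     # create a second branch at c1
repo.reset("c1")                         # master now refers to c1 again
repo.checkout("topic")                   # HEAD now names refs/heads/topic
print(repo.head, repo.refs)
```

In this model checkout never touches the refs table and reset never touches head, which is the distinction the text draws between the two commands.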
There are actually three kinds of git-reset:

git-reset --soft: Only overwrite the reference. If you can find the old object ID, you can put everything back with a second git-reset --soft ORIG_HEAD.

git-reset --mixed: This is the default, which I always think of as "--medium". Overwrite the reference, and (using git-read-tree) read the commit into the index. The working directory is unchanged.

git-reset --hard: Do everything --mixed does, and also check out the index into the working directory. This really erases all traces of the previous version. (One caveat: this will not delete any files in the working directory that were added as part of the changes being undone.)

The space taken up by the abandoned commit won't actually be reclaimed until you collect garbage with git-prune.

git-reset with no commit specified is "git-reset HEAD", which is much safer because the object reference is not actually changed. This can be used to undo changes in the index or working directory that you did not intend. Note, however, that it is not selective. "git-commit" has options for doing this selectively.

Like being sure what directory you're in when typing "rm -r", think carefully about what branch you're on when typing "git-reset <commit>". There is an undelete: git-reset stores the previous HEAD commit in ORIG_HEAD. And git-lost-found can find leftover commits until you do a git-prune.

* Merging

Merging is central to git operations. Indeed, a big difference between git and other version control systems is that git assumes that a change will be merged more often than it's written, as it's passed around different developers' repositories. Even "git checkout" is a merge.

The heart of merging is git-read-tree, but if you can understand it from the man page, you're doing better than me. As mentioned, the index and the working directory versions of a file could both be different from the HEAD.
Git lets you merge "under" your current working directory edits, as long as the merge doesn't change the files you're editing.

There are some special cases of merging, but let me start with the procedure for the general 3-way merge case: merging branch B into branch A (the current HEAD).

1) Given two commits, find a common ancestor O to serve as the origin of the merge. The basic "resolve" algorithm uses git-merge-base for the task, but the "recursive" merge strategy gets more clever in the case where there are multiple candidates. I won't go into what it does, but it does a pretty good job.

2) Add all three input trees (the Origin, A, and B) to the index by "git-read-tree -m O A B". The index now contains up to three copies of every file. (Four, including the original, but that is discarded before git-read-tree returns.) Then, for each file in the index, git-read-tree does the following:

2a) For each file, git-read-tree tries to collapse the various versions into one using the "trivial in-index merge". This just uses the file blob object names to see if the file contents are identical, and if two or more of the three trees contain an identical copy of this file, it is merged. A missing (deleted) file matches another missing file. Note that this is NOT a majority vote. If A and B agree on the contents of the file, that's what is used. (Whether O agrees is irrelevant in this case.) But if O and A agree, then the change made in B is taken as the final value. Likewise, if O and B agree, then A is used.

2b) If this is possible, then a check is made to see if the merge would conflict with any uncommitted work in the index or change the index out from under a modified working directory file. If either of those cases happens, the entire merge is backed out and fails. (In the git source, the test "t/t1000-read-tree-m-3way.sh" has a particularly detailed description of the various cases.)
If the merge is possible and safe, the versions are collapsed into one final result version. 2c) If all three versions differ, the trivial in-index merge is not possible, and the three source versions are left in the index unmerged. Again, if there was uncommitted work in the index or the working directory, the entire merge fails. 3) Use git-merge-index to iterate over the remaining unmerged files, and apply an intra-file merge. The intra-file merge is usually done with git-merge-one-file, which does a standard RCS-style three-way merge (see "man merge"). 4) Check out all the successfully merged files into the working directory. 5) If automatic merging was successful on every file, commit the merged version immediately and stop. 6) If automatic merging was not complete, then replace the working directory copies of any remaining unmerged files with a merged copy with conflict markers (again, just like RCS or CVS) in the working directory. All three source versions are available in the index for diffing against. (We have not destroyed anything, because in step 2c), we checked to make sure the working directory file didn't have anything not in the repository.) 7) Manually edit the conflicts and resolve the merge. As long as an unmerged, multi-version file exists in the index, committing the index is forbidden. (You can use the options to git-diff to see the changes.) 8) Commit the final merged version of the conflicting file(s), replacing the unmerged versions with the single finished version. Note that if the merge is simple, with no one file edited on both branches, git never has to open a single file. It reads three tree objects (recursively) and stat(2)s some working directory files to verify that they haven't changed. Also note that this aborts and backs out rather than overwrite anything not committed. You can merge "under" uncommitted edits only if those edits are to files not affected by the merge. 
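The decision made per file in steps 2a to 2c can be written down as a tiny function. The Python sketch below is a simplification under stated assumptions: blob IDs are plain strings, None stands for a missing (deleted) file, the CONFLICT marker and the trivial_merge name are invented here, and the safety checks against uncommitted index or working directory changes from step 2b are not modelled.

```python
CONFLICT = object()   # marker: the three versions must stay unmerged


def trivial_merge(o, a, b):
    """Trivial in-index merge of one path.

    o, a, b are the blob IDs of the file in the Origin, A and B trees,
    or None where the file does not exist in that tree.
    Returns the merged blob ID (None = file absent) or CONFLICT.
    """
    if a == b:
        return a          # both sides agree; what O says is irrelevant
    if o == a:
        return b          # A left the file alone, so take B's change
    if o == b:
        return a          # B left the file alone, so take A's change
    return CONFLICT       # all three differ: needs an intra-file merge


print(trivial_merge("x", "x", "y"))              # y
print(trivial_merge("x", None, "x") is None)     # True: deleted in A only
print(trivial_merge("x", "y", "z") is CONFLICT)  # True
```

The order of the tests shows why this is not a majority vote: when O and A agree, the minority version from B wins.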
* 2-way merging A "2-way merge" is basically a 3-way merge with the contents of the index as the "current HEAD", and the original HEAD as the Origin. However, this merge is designed only for simple cases and only supports the "trivial merge" cases. It does not fall back to an intra-file merge. [[ I'm not sure why it couldn't, I confess. For reversibility? Or just because it's likely to be too confusing. ]] This merge is used by git-checkout to switch between two branches, while preserving any changes in the working directory and index. Like the 3-way case, if a particular file hasn't changed between the two heads, then git will preserve any uncommitted edits. If the file has changed in any way, git doesn't try to perform any sort of intra-file merge, it just fails. * 1-way merging This is not actually used by the git-core porcelain, and so is only useful to someone writing more porcelain, but I'll describe it for completeness. Plain (non-merging) git-read-tree will overwrite the index entries with those from the tree. This invalidates the cached stat data, causing git to think all the working directory files are "potentially changed" until you do a git-update-index --refresh. By specifying a 1-way merge, any index entry whose contents (object ID) matches the incoming tree will have its cached stat data preserved. Thus, git will know if the working directory file is not changed, and will not overwrite if you execute git-checkout-index. This is purely an efficiency hack. * Special merges - already up to date, and fast-forward There are two cases of 3-way or 2-way merging that are special. Recall that the basic merge pattern is B--------> A+B / / / / O -----> A The two special cases arise if one of A or B is a direct ancestor of the other. In that case, the common ancestor of both A and B is the older of the two commits. And the merged result is simply the newer of the two, unchanged. 
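Detecting the two special cases only needs an ancestry test. Here is a minimal Python sketch over a toy commit graph; the graph, the commit names and the function names are hypothetical, and walking the full ancestor set like this is chosen for clarity, not for how git actually computes merge bases.

```python
def ancestors(commit, parents):
    """Return every commit reachable from `commit`, including itself."""
    seen, todo = set(), [commit]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(parents[current])
    return seen


def classify_merge(a, b, parents):
    """Classify merging b into a, where a is the current HEAD."""
    if b in ancestors(a, parents):
        return "already up to date"   # a already contains b; result is a
    if a in ancestors(b, parents):
        return "fast-forward"         # b contains a; result is simply b
    return "true merge"               # histories diverged; merge from O


# Toy history (hypothetical): O -> A1 -> A2 on one side, O -> B1 on the other.
PARENTS = {"O": [], "A1": ["O"], "A2": ["A1"], "B1": ["O"]}

print(classify_merge("A2", "A1", PARENTS))
print(classify_merge("A1", "A2", PARENTS))
print(classify_merge("A2", "B1", PARENTS))
```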
Recalling that we are merging B into A, if B is a direct ancestor of A, then A already includes all of B. A is "already up to date" and not changed at all by the merge. The other case you'll hear mentioned, because it happens a lot when pulling, is when A is a direct ancestor of B. In this case, the result of the merge is a "fast-forward" to B. Both of these cases are handled very efficiently by the in-index merge done by git-read-tree. * Deleted files during merges There is one small wrinkle in git's merge algorithm that will probably never bite you, but I ought to explain anyway, just because it's so rare that it's difficult to discover it by experiment. The index contains a list of all files that git is tracking. If the index file is empty or missing and you do a commit, you write an empty tree with no files. When merging, if git finds no pre-existing index entry for a path it is trying to merge, it considers that to mean "status unknown" rather than "modified by being deleted". Thus, this is not uncommitted work in the index file, and does not block the merge. Instead, the file will reappear in the merge. This is because it is possible to blow away the index file (rm .git/index will do it quite nicely), and if this was considered a modification to be preserved, it would cause all sorts of conflicts. So the one change to the index that will NOT be preserved by a merge is the removal of a file. A missing index entry is treated the same as an unmodified index entry. The index will be updated, and when you check out the revision, the working directory file will be (re-)created. Note that none of this affects you in the usual case where you make changes in the working directory only, and leave the index equal to HEAD until you're ready to commit. * Packs Originally, git stored every object in its own file, and used rsync to share repositories. It was quickly discovered that this brought mighty servers to their knees. 
It's great for retrieving a small subset of the database the way git usually does, but rsync scans the whole .git/objects tree every time.

So packs were developed. A pack is a file, built all at once, which contains many delta-compressed objects. With each .pack file, there's an accompanying .idx file that indexes the pack so that individual objects can be retrieved quickly.

You can reduce the disk space used by your repositories by periodically repacking them with git-repack. Normally, this makes a new incremental pack of everything not already packed. With the -a flag, this repacks everything for even greater compression (but takes longer).

The git wire protocol basically consists of negotiation over which objects need to be transferred, followed by sending a custom-built pack. The .idx file can be reconstructed from the .pack file, so it's never transferred.

[[ Is once every few hundred commits a good rule of thumb for repacking? When .git/objects/?? reaches X megabytes? I think too many packs is itself a bad thing, since they all have to be searched. ]]

* Raw diffs

A major source of git's speed is that it tries to avoid accessing files unnecessarily. In particular, files can be compared based on their object IDs without needing to open and read them. As part of this, the responsibility for finding file differences (printing diffs) is divided into finding what files have changed, and finding the changes within those files.

This is all explained in Documentation/diffcore.txt in the git distribution, but the basic idea is that many of the primitives spit out a line like this:

:100755 100755 68838f3fad1d22ab4f14977434e9ce73365fb304 0000000000000000000000000000000000000000 M git-bisect.sh

when asked for a diff. This is known as a "raw diff". They can be told to generate a human-readable diff with the "-p" (patch) flag. The git-diff command includes this by default.
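A script that consumes raw diff lines mostly has to split them into fields. The Python sketch below does that for the simple single-path form shown above; the RawDiff tuple and parse_raw_diff function are names invented for this example, and it makes no attempt to handle rename or copy records (which carry two paths) or NUL-terminated "-z" output.

```python
from collections import namedtuple

# Field names are descriptive labels chosen for this sketch.
RawDiff = namedtuple("RawDiff", "old_mode new_mode old_id new_id status path")


def parse_raw_diff(line):
    """Split one single-path raw diff line into its six fields."""
    if not line.startswith(":"):
        raise ValueError("not a raw diff line: %r" % line)
    # Split on whitespace at most five times so that a path containing
    # spaces stays in one piece.
    fields = line[1:].rstrip("\n").split(None, 5)
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d" % len(fields))
    return RawDiff(*fields)


line = (":100755 100755 68838f3fad1d22ab4f14977434e9ce73365fb304 "
        "0000000000000000000000000000000000000000 M\tgit-bisect.sh")
entry = parse_raw_diff(line)
print(entry.status, entry.path)
print("modes differ:", entry.old_mode != entry.new_mode)
```

Because both object IDs are present on the line, a caller can decide whether a file changed by comparing old_id and new_id without opening the file at all.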
* Advice on using git

If you're used to CVS, where branches and merges are "advanced" features that you can go a long time without using, you need to learn to use branches in git a lot more. Branch early and often. Every time you think about developing a feature or fixing a bug, create a branch to do it on. In fact, avoid doing any development on the master branch. Merges only.

A branch is the git equivalent of a patch, and merging a branch is the equivalent of applying that patch. A branch gives it a name that you can use to refer to it. This is particularly useful if you're sharing your changes.

Once you're done with a branch, you can delete it. This is basically just removing the refs/heads/<branch> file, but "git-branch -d" adds a few extra safety checks. Assuming you merged the branch in, you can still find all the commits in the history, it's just the name that's been deleted.

You can also rename a branch by renaming the refs/heads/branch file. There's no git command to do this, but as long as you update the HEAD symlink if necessary, you don't need one.

Periodically merge all of the branches you're working on into a testing branch to see if everything works. Blow away and re-create the testing branch whenever you do this. When you like the result, merge them into the master.

* The .git directory

There are a number of files in the .git directory used by the porcelain. In case you're curious (I was), this is what they are:

index - The actual index file.
objects/ - The object database. Can be overridden by $GIT_OBJECT_DIRECTORY
hooks/ - where the hook scripts are kept. The standard git template includes examples, but disabled by being marked non-executable.
info/exclude - Default project-wide list of file patterns to exclude from notice. To this is added the per-directory list in .gitignore. See the git-ls-files docs for full details.
refs/ - References to development heads (branches) and tags.
remotes/ - Short names of remote repositories we pull from or push to.
  Details are in the "git-fetch" man page.
HEAD - The current default development head.
  - Created by git-init-db and never deleted
  - Changed by git-checkout
  - Used by git-commit and any other command that commits changes.
  - May be a dangling pointer, in which case git-commit does an "initial checkin" with no parent.
COMMIT_EDITMSG - Temp used by git-commit to edit a commit message.
COMMIT_MSG - Temp used by git-commit to form a commit message, post-processed from COMMIT_EDITMSG.
FETCH_HEAD - Just-fetched commits, to be merged into the local trunk.
  - Created by git-fetch.
  - Used by git-pull as the source of data to merge.
MERGE_HEAD - Keeps track of what heads are currently being merged into HEAD.
  - Created by git-merge --no-commit with the heads used
  - Deleted by git-checkout and git-reset (since you're abandoning the merge)
  - Used by git-commit to supply additional parents to the current commit. (And deleted when done.)
MERGE_MSG - Generated by git-merge --no-commit.
  - Used by git-commit as the commit message for a merge (If present, git-commit doesn't prompt.)
MERGE_SAVE - cpio archive of all locally modified files created by "git-merge" before starting to do anything, if multiple merge strategies are being attempted. Used to rewind the tree in case a merge fails.
ORIG_HEAD - Previous HEAD commit prior to a merge or reset operation.
LAST_MERGE - Set by the "resolve" strategy to the most recently merged-in branch. Basically, a copy of MERGE_HEAD. Not used by the other merge strategies, and resolve is no longer the default, so its utility is very limited.
BISECT_LOG - History of a git-bisect operation.
  - Can be replayed (or, more usefully, a prefix can) by "git-bisect replay"
BISECT_NAMES - The list of files to be modified by git-bisect.
  - Set by "git-bisect start"
TMP_HEAD (used by git-fetch)
TMP_ALT (used by git-fetch)

* Git command summary

There are slightly over a hundred git commands.
This section tries to classify them by purpose, so you can know which commands are intended to be used for what. You can always use the low-level plumbing directly, but that's inconvenient and error-prone.

Helper programs (not for direct use) for a specific utility are shown indented under the program they help.

Note that all of these can be invoked using the "git" wrapper by replacing the leading "git-" with "git ". The results are exactly the same. There is a suggestion to reduce the clutter in /usr/bin and move all the git binaries to their own directory, leaving just the git wrapper in /usr/bin, so that you would have to use the wrapper or adjust your $PATH. But that hasn't happened yet. In the meantime, including the hyphen makes tab-completion work.

I include ".sh", ".perl", etc. suffixes to show what the programs are written in, so you can read those scripts written in languages you're familiar with. These are the names in the git source tree, but the suffix is not included in the /usr/bin copies.
+ Administrative commands git-init-db + Object database maintenance: git-convert-objects git-fsck-objects git-lost-found.sh git-prune.sh git-relink.perl + Pack maintenance git-count-objects.sh git-index-pack git-pack-objects git-pack-redundant git-prune-packed git-repack.sh git-show-index git-unpack-objects git-verify-pack + Important primitives git-commit-tree git-rev-list git-rev-parse + Useful primitives git-ls-files git-update-index + General script helpers (used only by scripts) git-cat-file git-check-ref-format git-checkout-index git-fmt-merge-msg.perl git-hash-object git-ls-tree git-repo-config git-unpack-file git-update-ref git-sh-setup.sh git-stripspace git-symbolic-ref git-var git-write-tree + Oddballs git-mv.perl + Code browsing git-diff.sh git-diff-files git-diff-index git-diff-tree git-diff-stages git-grep.sh git-log.sh git-name-rev git-shortlog.perl git-show-branch git-whatchanged.sh + Making local changes git-add.sh git-bisect.sh git-branch.sh git-checkout.sh git-commit.sh git-reset.sh git-status.sh + Cherry-picking git-cherry.sh git-patch-id git-cherry-pick.sh git-rebase.sh git-revert.sh + Accepting changes by e-mail git-apply git-am.sh git-mailinfo git-mailsplit git-applypatch.sh git-applymbox.sh + Publishing changes by e-mail git-format-patch.sh git-send-email.perl + Merging git-merge.sh git-merge-base git-merge-index git-merge-one-file.sh git-merge-octopus.sh git-merge-ours.sh git-merge-recursive.py git-merge-resolve.sh git-merge-stupid.sh git-read-tree git-resolve.sh git-octopus.sh + Making releases git-get-tar-commit-id git-tag.sh git-mktag git-tar-tree git-verify-tag.sh + Accepting changes by network git-clone.sh git-clone-pack git-fetch.sh git-fetch-pack git-local-fetch git-http-fetch git-ssh-fetch git-ls-remote.sh git-peek-remote git-parse-remote.sh git-pull.sh git-ssh-pull git-shell git-receive-pack + Publishing changes by network git-daemon git-push.sh git-http-push git-ssh-push git-ssh-upload git-request-pull.sh git-send-pack 
    git-update-server-info git-upload-pack

All of the basic git commands are designed to be scripted. When scripting, use the "--" option to ensure that files beginning with "-" won't be interpreted as options, and the "-z" option to output NUL-terminated file names so embedded newlines won't break things. (A person who'd do either of these on purpose is probably crazy, but it's not actually illegal.)

Looking at existing shell scripts can be very informative.

* Detailed list

Here's a repeat, including descriptions. I don't try to include every detail you can find on the man page, but to explain when you'd want to use a command.

+ Administrative commands

git-init-db
    This creates an empty git repository in ./.git (or $GIT_DIR if that is non-null) using a system-wide template. It won't hurt an existing repository.

+ Object database maintenance:

git-convert-objects
    You will *never* need to use this command. The git repository format has undergone some revision since its first release. If you have an ancient and crufty git repository from the very very early days, this will convert it for you. But as you're new to git, it doesn't apply.
git-fsck-objects
    Validate the object database. Checks that all references point somewhere, all the SHA1 hashes are correct, and that sort of thing. This walks the entire repository, uncompressing and hashing every object, so it takes a while. Note that by default, it skips over packs, which can make it seem misleadingly fast.
git-lost-found.sh
    Find (using git-fsck-objects) any unreferenced commits and tags in the object database, and place them in a .git/lost-found directory. This can be used to recover from accidentally deleting a tag or branch reference that you wanted to keep. This is the opposite of git-prune.
git-prune.sh
    Delete all unreachable objects from the object database. It deletes useless packs, but does not remove useless objects from the middle of partially useful packs.
    Git leaks objects in a number of cases, such as unsuccessful merges. The leak rate is generally a small fraction of the rate at which the desired history grows, so it's not very alarming, but occasionally running git-prune will eliminate the leaked objects.
    If you deliberately throw away a development branch, you will need to run this command to fully reclaim the disk space.
    On something like the full Linux repository, this takes a while.
git-relink.perl
    Merge the object stores of multiple git repositories by making hard links between them. Useful to save space if duplicate copies are accidentally created on one machine.

+ Pack maintenance

The classic git format is to compress and store each object separately. This is still used for all newly created changes. However, objects can also be stored en masse in "packs" which contain many objects and can take advantage of delta-compressing. Repacking your repositories periodically can save space. (Repacking is pretty quick but not quick enough to be comfortable doing every commit.)

git-count-objects.sh
    Print the number and total size of unpacked objects in the repository, to help you decide when is a good time to repack.
git-index-pack
    A pack file has an accompanying .idx file to allow rapid lookup. This regenerates the .idx file from the .pack. This is almost never needed directly, but can be used after transferring a .pack file between machines.
git-pack-objects
    Given a list of objects on stdin, build a pack file. This is a helper used by the various network communication scripts.
git-pack-redundant
    Produce a list of redundant packs, for feeding to "xargs rm". A helper for git-prune.
git-prune-packed
    Delete unpacked object files that are duplicated in packs. (With -n, only lists them.) A helper for git-prune.
git-repack.sh
    Make a new pack with all the unpacked objects. With -a, include already-packed objects in the new pack. With -d as well, deletes all the old packs thereby made redundant.
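
To picture how unpacked objects, packs, git-repack and git-prune-packed relate, here is a toy model in Python. It is only a sketch of the bookkeeping described above, not how git stores or packs anything: the PackStore class, its field names and its method names are invented for this illustration, object IDs are plain strings, and I assume that a plain repack picks up only objects that are not in any pack yet.

```python
class PackStore:
    """Toy model only: object IDs are strings, a pack is a frozenset of IDs."""

    def __init__(self, loose=(), packs=()):
        self.loose = set(loose)                      # individually stored objects
        self.packs = [frozenset(p) for p in packs]   # each pack holds many objects

    def packed(self):
        """Return every object ID that lives in some pack."""
        return set().union(*self.packs)

    def repack(self, all_objects=False, delete_old=False):
        """Mimic git-repack; all_objects stands in for -a, delete_old for -d."""
        contents = self.loose - self.packed()        # unpacked objects only
        if all_objects:
            contents |= self.packed()                # -a: re-pack packed ones too
        if not contents:
            return                                   # nothing to pack
        new_pack = frozenset(contents)
        if delete_old:
            # -d: drop old packs whose objects are all in the new pack.
            self.packs = [p for p in self.packs if not p <= new_pack]
        self.packs.append(new_pack)

    def prune_packed(self):
        """Mimic git-prune-packed: drop loose copies of packed objects."""
        self.loose -= self.packed()


store = PackStore(loose={"a", "b"}, packs=[{"c"}])
store.repack()                  # two packs now: {c} and {a, b}; loose copies remain
store.prune_packed()            # the loose copies of a and b are gone
store.loose.add("d")            # a newly created object starts out unpacked
store.repack(all_objects=True, delete_old=True)
print(len(store.packs))         # one pack holding a, b, c and d
```

The point of the model is the order of operations: repacking alone never frees the space used by the unpacked copies, and -a without -d leaves the old packs in place next to the new one.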
git-show-index
    Dump the contents of a pack's .idx file. Mostly for debugging git itself.
git-unpack-objects
    Unpack a .pack file, the opposite of git-pack-objects. With -n, doesn't actually create the files. With -q, suppresses the progress indicator.
git-verify-pack
    Validate a pack file. Useful when debugging git, and when downloading from a remote source. A helper for git-clone.

+ Important primitives

Although these primitives are not used directly very frequently, understanding them will help you understand other git commands that wrap them.

git-commit-tree
    Create a new commit object from a tree and a list of parent commits. This is the primitive that's the heart of git-commit. (It's also used by git-am, git-applypatch, git-merge, etc.)
git-rev-list
    Print a list of commits (revisions), in reverse chronological order. This is the heart of git-log and other history examination commands, and the options for specifying parts of history are shared by all of them.
    In particular, it takes an arbitrary number of revisions as arguments, some of which may be prefixed with ^ to negate them. These make up "include" and "exclude" sets. git-rev-list lists all revisions that are ancestors of the "include" set but not ancestors of the "exclude" set. For this purpose, a revision is considered an ancestor of itself.
    Thus, "git-rev-list ^v1.1 v1.2" will list all revisions from the v1.2 release back to (but not including) the v1.1 release. Because this is so convenient, a special syntax, "v1.1..v1.2" is allowed as an equivalent. However, occasionally the general form is useful. For example, adding ^branch will show the trunk (including merges from the branch), but exclude the branch itself. Similarly, "branch ^trunk", a.k.a. trunk..branch, will show all work on the branch that hasn't been merged to the trunk. This works even though trunk is not a direct ancestor of branch.
    Git-rev-list has a variety of flags to control its output format.
    The default is to just print the raw SHA1 object IDs of the commits, but --pretty produces a human-readable log.
    You can also specify a set of file names (or directories), in which case output will be limited to commits that modified those files.
    This command is used extensively by the git:// protocol to compute a set of objects to send to update a repository.
git-rev-parse
    This is a very widely used command line canonicalizer for git scripts. It converts relative commit references (e.g. master~3) to absolute SHA1 hashes, and can also pass through arguments not recognizable as references, so the script can interpret them. It is important because it defines the <rev> syntax.
    This takes a variety of options to specify how to prepare the command line for the script's use. --verify is a particularly important one.

+ Useful primitives

These primitives are potentially useful directly.

git-ls-files
    List files in the index and/or working directory. A variety of options control which files to list, based on whether they are the same in both places or have been modified. This command is the start of most check-in scripts.
git-update-index
    Copy the given files from the working directory into the index. This creates the blob objects, but no trees yet. (Note that editing a file and executing this multiple times without creating a commit will generate orphaned objects. Harmless.)
    One common safe option is "git-update-index --refresh". This looks for files whose metadata (modification time etc.) has changed, but not their contents, and updates the metadata in the index so the file contents won't have to be examined again.

+ General script helpers (used only by scripts)

These are almost exclusively helpers for use in porcelain scripts and have little use by themselves from the command line.

git-cat-file
    Extract a file from the object database. You can ask for an object's type or size given only an object ID, but to get its contents, you have to specify the type.
    This is a deliberate safety measure.
git-check-ref-format
    Verify that the reference specified on the command line is syntactically valid for a new reference name under $GIT_DIR/refs. A number of characters (^, ~, :, and ..) are reserved; see the man page for the full list of rules.
git-checkout-index
    Copy files from the index to the working directory, or to a specified directory. Most important as a helper for git-checkout, this is also used by git-merge and git-reset.
git-fmt-merge-msg.perl
    Generate a reasonable default commit message for a merge. Used by git-pull and git-octopus.
git-hash-object
    Very primitive helper to turn an arbitrary file into an object, returning just the ID or actually adding it to the database. Used by the cvs-to-git and svn-to-git import filters.
git-ls-tree
    List the contents of a tree object. Will tell you all the files in a commit. Used by the checkout scripts git-checkout and git-reset.
git-repo-config
    Get and set options in .git/config. The .git/config format is designed to be human-readable. This gives programmatic access to the settings. This currently has a lot of overlap with the function of git-var.
git-unpack-file
    Write the contents of the given blob to a temporary file, and return the name of that temp file. Used most often by merging scripts.
git-update-ref
    Rewrite a reference (in .git/refs/) to point to a new object. "echo $sha1 > $file" is mostly equivalent, but this adds locking so two people don't update the same reference at once.
git-sh-setup.sh
    This is a general prefix script that sets up $GIT_DIR and $GIT_OBJECT_DIRECTORY for a script, or errors out if the git control files can't be found.
git-stripspace
    Remove unnecessary whitespace. Used mostly on commit messages received by e-mail.
git-symbolic-ref
    This queries or creates symlinks to references such as HEAD. Basically equivalent to readlink(1) or ln -s, this also works on platforms that don't have symlinks. See the man page.
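
As a rough illustration of what git-hash-object and git-cat-file are dealing with: an object's ID is the SHA1 of a short header ("<type> <size>" followed by a NUL byte) plus the raw contents, which is why the type and size can be read back without interpreting the contents. The Python sketch below only computes and parses that form; the function names are made up for this example, and it never reads or writes a repository.

```python
import hashlib


def object_bytes(data: bytes, obj_type: str = "blob") -> bytes:
    """Prefix the contents with the '<type> <size>' header and a NUL byte."""
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\0"
    return header + data


def hash_object(data: bytes, obj_type: str = "blob") -> str:
    """Return the object ID: the SHA1 of the header plus the contents."""
    return hashlib.sha1(object_bytes(data, obj_type)).hexdigest()


def type_and_size(raw: bytes):
    """Read the type and size back out of the header alone."""
    header, _, _contents = raw.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    return obj_type, int(size)


if __name__ == "__main__":
    print(hash_object(b""))           # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    print(hash_object(b"hello\n"))    # ce013625030ba8dba906f756967f9e9ca394464a
    print(type_and_size(object_bytes(b"hello\n")))   # ('blob', 6)
```

Because the type is part of what gets hashed, the same bytes stored as a different type would get a different ID.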
git-var
    Provide access to the GIT_AUTHOR_IDENT and/or GIT_COMMITTER_IDENT values, used in various commit scripts. This currently has a lot of overlap with the function of git-repo-config.
git-write-tree
    Generate a tree object reflecting the current index. The output is the tree object; if you don't remember it somewhere (usually, pass it to git-commit-tree), it'll be lost.
    This requires that the index be fully merged. If any incomplete merges are present in the index (files in stages 1, 2 or 3), git-write-tree will fail.

+ Oddballs

git-mv.perl
    I have to admit, I'm not quite sure what advantages this is supposed to have over plain "mv" followed by "git-update-index", or why it's complex enough to need perl.
    Basically, this renames a file, deleting its old name and adding its new name to the index. Otherwise, it's a two-step process to rename a file:
    - Rename the file
    - git-add the new name
    after which you must commit both the old and new names.

+ Code browsing

git-diff.sh
    Show changes between various trees. Takes up to two tree specifications, and shows the difference between the versions.
    Zero arguments:  index vs. working directory (git-diff-files)
    One:             tree vs. working directory (git-diff-index)
    One, --cached:   tree vs. index (git-diff-index)
    Two:             tree vs. tree (git-diff-tree)
    This wrapper always produces human-readable patch output. The helpers all produce "diff-raw" format unless you supply the -p option.
    There are some interesting options. Unfortunately, the git-diff man page is annoyingly sparse, and refers to the helper scripts' documentation rather than describing the many useful options they all have in common. Please do read the man pages of the helpers to see what's available.
    In particular, although git does not explicitly record file renames, it has some pretty good heuristics to notice things. -M tries to detect renamed files by matching up deleted files with similar newly created files. -C tries to detect copies as well.
    By default, -C only looks among the modified files for the copy source. For common cases like splitting a file in two, this works well. The --find-copies-harder option searches ALL files in the tree for the copy source. This can be slow on large trees!
    See Documentation/diffcore.txt for an explanation of how all this works.
git-diff-files
    Compare the index and the working directory.
git-diff-index
    Compare the working directory and a given tree. This is the git equivalent of the single-operand form of "cvs diff". If "--cached" is specified, uses the index rather than the working directory.
git-diff-tree
    Compare two trees. This is the git equivalent of the two-operand form of "cvs diff". This command is sometimes useful by itself to see the changes made by a single commit. If you give it only one commit on the command line, it shows the diff between that commit and its first parent. If the commit specification is long and awkward to type, using "git-diff-tree -p <commit>" can be easier than "git-diff <commit>^ <commit>".
git-diff-stages
    Although not called by git-diff, there is a fourth diff helper routine, used to compare the various versions of an unmerged file in the index. It is intended for use by merging porcelain.
git-grep.sh
    A very simple wrapper that runs git-ls-files and greps the listed files for a pattern. Does nothing fancy except save typing.
git-log.sh
    Wrapper around git-rev-list --pretty. Shows a history of changes made to the repository. Takes all of git-rev-list's options for specifying which revisions to list.
git-name-rev
    Find a symbolic name for the commit specified on the command line, and print it in a form like "maint~404^2~7". Basically, this does a breadth-first search from all the heads in .git/refs looking for the given commit.
git-shortlog.perl
    This is a filter for the output of "git-log --pretty=short" to generate a one-line-per-change "shortlog" as Linus likes.
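
Both the "maint~404^2~7" style of name and the include/exclude sets that git-log inherits from git-rev-list are easier to follow on a small example. The Python sketch below is a toy model, not git code: the history in PARENTS, the refs in REFS and the function names are all invented for illustration. It assumes the usual reading of the suffixes (~N follows the first parent N times, ^N picks the Nth parent, a bare ^ or ~ means 1) and returns an unordered set where git-rev-list would print an ordered list.

```python
import re

# Hypothetical history: commit -> parents, first parent first.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],          # work on the trunk
    "X": ["B"],          # work on the branch
    "Y": ["X"],
    "M": ["C", "Y"],     # the trunk merges the branch
    "Z": ["Y"],          # branch work that has not been merged yet
}
REFS = {"trunk": "M", "branch": "Z"}


def resolve(name):
    """Resolve 'ref' followed by any chain of ~N and ^N suffixes."""
    match = re.fullmatch(r"([^~^]+)((?:[~^]\d*)*)", name)
    if match is None:
        raise ValueError(f"cannot parse revision {name!r}")
    base, suffix = match.groups()
    commit = REFS.get(base, base)
    if commit not in PARENTS:
        raise KeyError(f"unknown revision {base!r}")
    for op, digits in re.findall(r"([~^])(\d*)", suffix):
        count = int(digits) if digits else 1
        if op == "~":
            for _ in range(count):               # ~N: N first-parent steps
                commit = PARENTS[commit][0]
        elif count > 0:                          # ^N: the Nth parent, from 1
            commit = PARENTS[commit][count - 1]
    return commit


def ancestors(commit):
    """All ancestors of a commit; a commit counts as its own ancestor."""
    seen, todo = set(), [commit]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(PARENTS[current])
    return seen


def rev_list(*args):
    """Ancestors of the include set minus ancestors of the exclude set."""
    include, exclude = set(), set()
    for arg in args:
        if ".." in arg:                          # old..new == ^old new
            old, new = arg.split("..", 1)
            exclude |= ancestors(resolve(old))
            include |= ancestors(resolve(new))
        elif arg.startswith("^"):
            exclude |= ancestors(resolve(arg[1:]))
        else:
            include |= ancestors(resolve(arg))
    return include - exclude


print(resolve("trunk^2~1"))                      # X
print(sorted(rev_list("trunk", "^branch")))      # ['C', 'M']
print(sorted(rev_list("trunk..branch")))         # ['Z']
```

In this history "trunk ^branch" shows the trunk-only commit and the merge, while "trunk..branch" shows only the branch work that the trunk has not picked up, even though trunk is not a direct ancestor of branch.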
git-show-branch
    Visually show the merge history of the references given as arguments. Prints one column per reference and one line per commit showing whether that commit is an ancestor of each reference.
git-whatchanged.sh
    A simple wrapper around git-rev-list and git-diff-tree, this shows the change history of a repository. Specify a directory or file on the command line to limit the output to changes affecting those files. This isn't the same as "cvs annotate", but it serves a similar purpose among git folks.
    You can add the -p option to include patches as well as log comments. You can also add the -M or -C option to follow history back through file renames.
    -S is interesting: it's the "pickaxe" option. Given a string, this limits the output to changes that make that string appear or disappear. This is for "digging through history" to see when a piece of code was introduced. The string may (and often does) contain embedded newlines. See Documentation/cvs-migration.txt.

+ Making local changes

All of these are examples of "porcelain" scripts. Reading the scripts themselves can be informative; they're generally not too confusing.

git-add.sh
    A simple wrapper around "git-ls-files | git-update-index --add" to add new files to the index. You may specify directories. You need to invoke this for every new file you want git to track.
git-bisect.sh
    Utility to do a binary search to find the change that broke something. The heart of this is in "git-rev-list --bisect". A very handy little utility! Kernel developers love it when you tell them exactly which patch broke something.
    NOTE: this uses the head named "bisect", and will blow away any existing branch by that name. Try not to create a branch with that name.
    There are three steps:
    git-bisect start [<files>] - Reset to start bisecting. If any files are specified, only they will be checked out as bisection proceeds.
    git-bisect good [<revision>] - Record the revision as "good".
      The change being sought must be after this revision.
    git-bisect bad [<revision>] - Record the revision as "bad". The change being sought must be before or equal to this revision.
    As soon as you have specified one good version and one bad version, git-bisect will find a halfway point and check out that revision. Build and test it, then report it as good or bad, and git-bisect will narrow the search. Finally, git-bisect will tell you exactly which change caused the problem.
    git-bisect log - Show a history of revisions.
    git-bisect replay - Replay (part of) a git-bisect log. Generally used to recover from a mistake, you can truncate the log before the mistake and replay it to continue.
    If git-bisect chooses a version that cannot build, or you are otherwise unable to determine whether it is good or bad, you can change revisions with "git-reset --hard <revision>" to another checkout between the current good and bad limits, and continue from there. "git-reset --hard <revision>" is generally dangerous, but you are on a scratch branch.
    This can, of course, be used to look for any change, even one for the better, if you can avoid being confused by the terms "good" and "bad".
git-branch.sh
    Most commonly used bare, to show the available branches. Show, create, or delete a branch. The current branches are simply the contents of .git/refs/heads/. Note that this does NOT switch to the created branch! For the common case of creating a branch and immediately switching to it, "git-checkout -b <branch>" is simpler.
git-checkout.sh
    This does two superficially similar but very different things depending on whether any files or paths are specified on the command line.
    git-checkout [-f] [-b <new-branch>] <branch>
        This switches (changes the HEAD symlink to) the specified branch, updating the index and working directory to reflect the change. This preserves changes in the working directory unless -f is specified.
        If -b is specified, a new branch is started from the specified point and switched to. If <branch> is omitted, it defaults to HEAD. This is the usual way to start a new branch.
    git-checkout [<branch>] [--] <paths>...
        This replaces the files specified by the given paths with the versions from the index or the specified branch. It does NOT affect the HEAD symlink, just replaces the specified paths. This form is like a selective form of "git-reset". Normally, this can guess whether the first argument is a branch name or a path, but you can use "--" to force the latter interpretation.
        With no branch, this is used to revert a botched edit of a particular file.
    Both forms use git-read-tree internally, but the net effect is quite different.
git-commit.sh
    Commit changes to the revision history. In terms of primitives, this does three things:
    1) Updates the index file with the working directory files specified on the command line, or -a for all (using git-diff-files --name-only | git-update-index),
    2) Prompts for or generates a commit message, and then
    3) Creates a commit object with the current index contents.
    This also executes the pre-commit, commit-msg, and post-commit hooks if present.
    This will remove deleted files from the index, but will not add new files to the index, even if explicitly specified on the command line; you must use git-add for that.
git-reset.sh
    Explained in detail in "resetting", above. This modifies the current branch head reference (as pointed to by .git/HEAD) to refer to the given commit. It does not modify .git/HEAD itself.
    Reset the current HEAD to the specified commit, so that future checkins will be relative to it. There are three variations:
    --soft: Just move the HEAD link. The index is unchanged.
    --mixed (default): Move the HEAD link and update the index file. Any local changes will appear not checked in.
    --hard: Move the HEAD link, update the index file, and check out the index, overwriting the working directory. Like "cvs update -C".
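
The three git-reset variations differ only in how far down the chain "head, index, working directory" the change reaches. The Python sketch below models just that; it is not how git-reset is implemented. The COMMITS table, the ToyRepo class and its field names are invented for illustration, and each commit is reduced to a {path: contents} dictionary. The orig_head field stands in for the ORIG_HEAD file that git-reset writes.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical commits, each reduced to a {path: contents} tree.
COMMITS = {
    "c1": {"README": "first draft"},
    "c2": {"README": "second draft", "main.c": "int main(void) { return 0; }"},
}


@dataclass
class ToyRepo:
    head: str                          # commit the current branch points at
    index: Dict[str, str]              # what the next commit would contain
    worktree: Dict[str, str]           # the files on disk
    orig_head: Optional[str] = None    # stand-in for ORIG_HEAD


def reset(repo: ToyRepo, commit: str, mode: str = "mixed") -> None:
    """Model the --soft, --mixed and --hard variations described above."""
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError(f"unknown mode {mode!r}")
    repo.orig_head = repo.head                 # remember where we were
    repo.head = commit                         # all three move the head
    if mode != "soft":
        repo.index = dict(COMMITS[commit])     # mixed and hard update the index
    if mode == "hard":
        repo.worktree = dict(repo.index)       # hard overwrites the files too


def fresh() -> ToyRepo:
    """A repository at c2 with one uncommitted edit to README."""
    worktree = dict(COMMITS["c2"], README="local edit")
    return ToyRepo(head="c2", index=dict(COMMITS["c2"]), worktree=worktree)


for mode in ("soft", "mixed", "hard"):
    repo = fresh()
    reset(repo, "c1", mode)
    print(mode, repo.head, sorted(repo.index), repo.worktree["README"])
```

Only the "hard" run loses the local edit to README; after "soft" the index still describes c2, so the difference between c1 and c2 shows up as changes ready to be committed.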
    In case of accidents, this copies the previous head object ID to ORIG_HEAD (which is NOT a symlink).
git-status.sh
    Show all files in the directory not current with respect to the git HEAD. The basic categories are:
    1) Changed in the index, will be included in the next commit.
    2) Changed in the working directory but NOT in the index; will be committed only if added via git-update-index or the git-commit command line.
    3) Not tracked by git.

+ Cherry-picking

Cherry-picking is the process of taking part of the changes introduced on one tree and applying those changes to another. This doesn't produce a parent/descendant relationship in the commit history.

To produce that relationship, there's a special type of merge you can do if you've taken everything you want off a branch and want to show it in the merge history without actually importing any changes from it: ours. "git-merge -s ours" will generate a commit that shows some branches were merged in, but not actually alter the current HEAD source code in any way.

One thing cherry-picking is sometimes used for is taking a development branch and re-organizing the changes into a patch series for submission to the Linux kernel.

git-cherry.sh
    This searches a branch for patches which have not been applied to another. Basically, it finds the unpicked cherries. It searches back to the common ancestor of the named branch and the current head using git-patch-id to identify similarity in patches.
git-patch-id
    Generate a hash of a patch, ignoring whitespace and line numbers so that "the same" patch, even relative to a different code base, probably has the same hash, and different patches probably have different ones. git-cherry looks for patch hashes which are present on the branch (source branch) that are not present on the trunk (destination branch).
git-cherry-pick.sh
    Given a commit (on a different branch), compute a diff between it and its immediate parent, and apply it to the current HEAD.
    This is actually the same script as "git-revert", but works forward. git-cherry finds the patches; this merges them. Handles failures gracefully.
git-rebase.sh
    Move a branch to a more recent "base" release. This just extracts all the patches applied on the local head since the last merge with upstream (using git-format-patch) and re-applies them relative to the current upstream with git-am (explained under "accepting changes by e-mail"). Finally, it deletes the old branch and gives its name to the new one, so your branch now contains all the same changes, but relative to a different base.
    Basically the same as cherry-picking an entire branch.
git-revert.sh
    Undo a commit. Basically "patch -R" followed by a commit. This is actually the same script as "git-cherry-pick", just applies the patch in reverse, undoing a change that you don't wish to back up to using git-reset. Handles failures gracefully by telling the user what to do.

+ Accepting changes by e-mail

git-apply
    Apply a (git-style extended) patch to the current index and working directory.
git-am.sh
    The new and improved "apply an mbox" script. Takes an mbox-style concatenation of e-mails as input and batch-applies them, generating one commit per message. Can resume after stopping on a patch problem. (Invoke it as "git-am --skip" or "git-am --resolved" to deal with the problematic patch and continue.)
git-mailinfo
    Given a single mail message on stdin (in the Linux standard SubmittingPatches format), extract a commit message and the patch proper.
git-mailsplit
    Split an mbox into separate files.
git-applypatch.sh
    Tries simple git-apply, then tries a few other clever merge strategies to get a patch to apply. Used in the main loop of git-am and git-applymbox.
git-applymbox.sh
    This is Linus's original apply-mbox script. Mostly superseded by git-am (which is friendlier and has more features), but he still uses it, so it's maintained.
    This is so old it was originally a test of the git core called "dotest", and that name is still lurking in the temp file names.

+ Publishing changes by e-mail

git-format-patch.sh
    Generate a series of patches, in the preferred Linux kernel (Documentation/SubmittingPatches) format, for posting to lkml or the like. This formats every commit on a branch as a separate patch.
git-send-email.perl
    Actually e-mail the output of git-format-patch. (This uses direct SMTP, a matter of some controversy. Others feel that /bin/mail is the correct local mail-sending interface.)

+ Merging

git-merge.sh
    Merge one or more "remote" heads into the current head. Some changes, when there has been change only on one branch or the same change has been made to all branches, can be resolved by the "trivial in-index" merge done by git-read-tree. For more complex cases, git provides a number of different merge strategies (with reasonable defaults).
    Note that merges are done on a filename basis. While git tries to detect renames when generating diffs, most merge strategies don't track them by renaming. (The "recursive" strategy, which recently became the default, is a notable exception.)
git-merge-base
    Finds a common ancestor to use when comparing the changes made on two branches. The simple case is straightforward, but if there have been cross-merges between the branches, it gets somewhat hairy. The algorithm is not 100% final yet. (There's also --all, which lists all candidates.)
git-merge-index
    This is the outer merging loop. It takes the name of a one-file merge executable as an argument, and runs it for every incomplete merge.
git-merge-one-file.sh
    This is the standard git-merge-index helper, that tries to resolve a 3-way merge. A helper used by all the merge strategies. (Except "recursive" which has its own equivalent.)
git-merge-octopus.sh
    Many-way merge. Overlaps should be negligible.
git-merge-ours.sh
    A "dummy" merge strategy helper.
    Claims that we did the merge, but actually takes the current tree unmodified. This is used to cleanly terminate side branches that have been cherry-picked in.
git-merge-recursive.py
    A somewhat fancier 3-way merge. This handles multiple cross-merges better by using multiple common ancestors.
git-merge-resolve.sh
git-merge-stupid.sh
    Not actually used by git-merge, this is a simple example merge strategy.
git-read-tree
    Read the given tree into the index. This is the difference between the "--soft" and "--mixed" modes of git-reset, but the important thing this command does is simple merging. If -m is specified, this can take up to three trees as arguments.
git-resolve.sh
    OBSOLETE. Perform a merge using the "resolve" strategy. Has been superseded by the "-s resolve" option to git-merge and git-pull.
git-octopus.sh
    OBSOLETE. Perform a merge using the "octopus" strategy. Has been superseded by the "-s octopus" option to git-merge and git-pull.

+ Making releases

git-get-tar-commit-id
    Reads out the commit ID that git-tar-tree puts in its output. (Or fails if this isn't a git-generated tar file.)
git-tag.sh
    Create a tag in the refs/tags directory. There are two kinds: "lightweight tags" are just references to commits. More serious tags are GPG-signed tag objects, and people receiving the git tree can verify that it is the version that you released.
git-mktag
    Creates a tag object. Verifies syntactic correctness of its input. (If you want to cheat, use git-hash-object.)
git-tar-tree
    Generate a tar archive of the named tree. Because git does NOT track file timestamps, this uses the timestamp of the commit, or the current time if you specify a tree. Also stores the commit ID in an extended tar header.
git-verify-tag.sh
    Given a tag object, GPG-verify the embedded signature.

+ Accepting changes by network

Pulling consists of two steps: retrieving the remote commit objects and everything they point to (including ancestors), then merging that into the desired tree.
There are still separate fetch and merge commands, but it's more commonly done with a single "git-pull" command. git-fetch leaves the commit objects, one per line, in .git/FETCH_HEAD. git-merge will merge those in if that file exists when it is run.

References to remote repositories can be made with long URLs, or with files in the .git/remotes/ directory. The latter also specifies the local branches to merge the fetched data into, making it very easy to track a remote repository.

git-clone.sh
    Create a new local clone of a remote repository. (Can do a couple of space-sharing hacks when "remote" is on a local machine.) You only do this once.
git-clone-pack
    Runs git-upload-pack remotely and places the resultant pack into the local repository. Supports a variety of network protocols, but "remote" can also be a different directory on the current machine.
git-fetch.sh
    Fetch the named refs and all linked objects from a remote repository. The resultant refs (tags and commits) are stored in .git/FETCH_HEAD, which is used by a later git-resolve or git-octopus.
    This is the first half of a "git pull" operation.
git-fetch-pack
    Retrieve missing objects from a remote repository.
git-local-fetch
    Duplicates a git repository from the local system. (Er... is this used anywhere???)
git-http-fetch
    Do a fetch via http. Http requires some kludgery on the server (see git-update-server-info), but it works.
git-ssh-fetch
    Do a fetch via ssh.
git-ls-remote.sh
    Show the contents of the refs/heads/ and/or refs/tags/ directories of a remote repository. Useful to see what's available.
git-peek-remote
    Helper C program for the git-ls-remote script. Implements the git protocol form of it.
git-parse-remote.sh
    Helper script to parse a .git/remotes/ file. Used by a number of these programs.
git-pull.sh
    Fetches specific commits from the given remote repository, and merges everything into the current branch.
    If a remote commit is named as src:dst, this merges the remote head "src" into the branch "dst" as well as the trunk. Typically, the "dst" branch is not modified locally, but is kept as a pristine copy of the remote branch.
    One very standard example of this convention is that a repository that is tracking another specifies "master:origin" to provide a pristine local copy of the remote "master" branch in the local branch named "origin".
git-ssh-pull
    A helper program that pulls over ssh.
git-shell
    A shell that can be used for git-only users. Allows git push (git-receive-pack) and pull (git-upload-pack) only.
git-receive-pack
    Receive a pack from git-send-pack, validate it, and add it to the repository. Adding just the bare objects has no security implications, but this can also update branches and tags, which does have an effect.
    Runs pre-update and post-update hooks; the former may do permissions checking and disallow the upload.
    This is the command run remotely via ssh by git-push.

+ Publishing changes by network

git-daemon
    A daemon that serves up the git native protocol so anonymous clients can fetch data. For it to allow export of a directory, the magic file name "git-daemon-export-ok" must exist in it.
    This does not accept (receive) data under any circumstances.
git-push.sh
    Git-pull, only backwards. Send local changes to a remote repository. The same .git/remotes/ short-cuts can be used, and the same src:dst syntax. (But this time, the src is local and the dst is remote.)
git-http-push
    A helper to git-push to implement the http: protocol.
git-ssh-push
    A helper to git-push to push over ssh.
git-ssh-upload
    Another helper. This just does the equivalent of "fetch" ("throw"?) and doesn't actually merge the result. Obsolete?
git-request-pull.sh
    Generate an e-mail summarizing the changes between two commits, and request that the recipient pull them from your repository. Just a little helper to generate a consistent and informative format.
git-send-pack
    Pack together a pile of objects missing at the destination and send them. This is the sending half that talks to a remote git-receive-pack.
git-update-server-info
    To run git over http, auxiliary info files are required that describe what objects are in the repository (since git-upload-pack can't generate this on the fly). If you want to publish a repository via http, run this after every commit. (Typically via the hooks/post-update script.)
git-upload-pack
    Like git-send-pack, but this is invoked by a remote git-fetch-pack.

* Re: as promised, docs: git for the confused
From: Junio C Hamano @ 2005-12-08 21:53 UTC
To: linux; +Cc: git

linux@horizon.com writes:

> * Terminology - heads, branches, refs, and revisions
>
> (This is a supplement to what's already in "man git".)
>
> The most common object needed by git primitives is a tree. Since a
> commit points to a tree and a tag points to a commit, both of these are
> acceptable "tree-ish" objects and can be used interchangeably. Likewise,
> a tag is "commit-ish" and can be used where a commit is required.

I am unsure if we want to further confuse readers by saying this, but technically, "Likewise, a tag which is commit-ish can be used in place of commit". Not all tags are necessarily commit-ish. v2.6.11 tag is tree-ish but not commit-ish for example. Typically, however, a tag is commit-ish.
* Re: as promised, docs: git for the confused 2005-12-08 21:53 ` Junio C Hamano @ 2005-12-08 22:02 ` H. Peter Anvin 0 siblings, 0 replies; 64+ messages in thread From: H. Peter Anvin @ 2005-12-08 22:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: linux, git Junio C Hamano wrote: > linux@horizon.com writes: > > >>* Terminology - heads, branches, refs, and revisions >> >>(This is a supplement to what's already in "man git".) >> >>The most common object needed by git primitives is a tree. Since a >>commit points to a tree and a tag points to a commit, both of these are >>acceptable "tree-ish" objects and can be used interchangeably. Likewise, >>a tag is "commit-ish" and can be used where a commit is required. > > > I am unsure if we want to further confuse readers by saying > this, but technically, "Likewise, a tag which is commit-ish can > be used in place of commit". Not all tags are necessarily > commit-ish. v2.6.11 tag is tree-ish but not commit-ish for > example. Typically, however, a tag is commit-ish. > Saying they can be used interchangeably is just plain wrong, however. It's not a bijective relation. Something like: >> The most common object needed by git primitives is a tree. Since a >> commit points and tags uniquely identify a tree, a commit or tag can >> be used anywhere a tree is expected. >> Likewise, most tags point to commits and can be used anywhere a >> commit is expected. ... might be better, and avoids the colloquialisms. -hpa ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: as promised, docs: git for the confused 2005-12-08 6:34 ` as promised, docs: git for the confused linux 2005-12-08 21:53 ` Junio C Hamano @ 2005-12-09 0:47 ` Alan Chandler 2005-12-09 1:45 ` Petr Baudis 2005-12-09 1:19 ` Josef Weidendorfer 2 siblings, 1 reply; 64+ messages in thread From: Alan Chandler @ 2005-12-09 0:47 UTC (permalink / raw) To: git On Thursday 08 Dec 2005 06:34, linux@horizon.com wrote: > As I mentioned with all my questions, I was writing up the answers > I got. Here's the current status. If anyone would like to comment on > its accuracy or usefulness, feedback is appreciated. ... > * Background material. > > To start with, read "man git". Or Documentation/git.txt in the git > source tree, which is the same thing. Particularly note the description > of the index, which is where all the action in git happens. > > One thing that's confusing is why git allows you to have one version of > a file in the current HEAD, a second version in the index, and possibly a > third in the working directory. Why doesn't the index just contain a copy > of the current HEAD until you commit a new one? The answer is merging, > which does all its work in the index. Neither the object database nor > the working directory let you have multiple files with the same name. If I was a complete newbie, I would be lost right here. You start referring to the term HEAD without any introduction to what it means and (as far as I could see on a quick glance - which is what a newbie would do - man git doesn't start out here either). If your audience really is a complete newcomer, then as a minimum I think you need to describe the concept of a "branch of development" with a series of snapshots of the state, the current of which is called HEAD. You might even at this stage hint about there being several such branches. The next bit, which goes on about the index is great - just put it into context with a simple explanation first. 
-- Alan Chandler http://www.chandlerfamily.org.uk Open Source. It's the difference between trust and antitrust. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: as promised, docs: git for the confused 2005-12-09 0:47 ` Alan Chandler @ 2005-12-09 1:45 ` Petr Baudis 0 siblings, 0 replies; 64+ messages in thread From: Petr Baudis @ 2005-12-09 1:45 UTC (permalink / raw) To: Alan Chandler; +Cc: git Dear diary, on Fri, Dec 09, 2005 at 01:47:56AM CET, I got a letter where Alan Chandler <alan@chandlerfamily.org.uk> said that... > On Thursday 08 Dec 2005 06:34, linux@horizon.com wrote: > > As I mentioned with all my questions, I was writing up the answers > > I got. Here's the current status. If anyone would like to comment on > > its accuracy or usefulness, feedback is appreciated. > ... > > * Background material. > > > > To start with, read "man git". Or Documentation/git.txt in the git > > source tree, which is the same thing. Particularly note the description > > of the index, which is where all the action in git happens. > > > > One thing that's confusing is why git allows you to have one version of > > a file in the current HEAD, a second version in the index, and possibly a > > third in the working directory. Why doesn't the index just contain a copy > > of the current HEAD until you commit a new one? The answer is merging, > > which does all its work in the index. Neither the object database nor > > the working directory let you have multiple files with the same name. > > > If I was a complete newbie, I would be lost right here. You start refering to > the term HEAD without any introduction to what it means and (as far as I > could see on a quick glance - which is what a newbie would do - man git > doesn't start out here either). I think that the first paragraph of the background material means "insert Documentation/git.txt here", the second one is then "now what might've been unclear there". That said, the "git for the confused" contains a lot of nice points, but I don't think it's a good approach to just have extra document for clarifying this stuff. 
It would be much better if the stock documentation itself would not be confusing in the first place. Same goes for the "commands overview" (BOUND to get out-of-date over time since it's detached from the normal per-command documentation; we have troubles huge enough to keep usage strings in sync, let alone the manpages). -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: as promised, docs: git for the confused 2005-12-08 6:34 ` as promised, docs: git for the confused linux 2005-12-08 21:53 ` Junio C Hamano 2005-12-09 0:47 ` Alan Chandler @ 2005-12-09 1:19 ` Josef Weidendorfer 2 siblings, 0 replies; 64+ messages in thread From: Josef Weidendorfer @ 2005-12-09 1:19 UTC (permalink / raw) To: linux; +Cc: git On Thursday 08 December 2005 07:34, you wrote: > As I mentioned with all my questions, I was writing up the answers > I got. Here's the current status. If anyone would like to comment on > its accuracy or usefulness, feedback is appreciated. > ... > + Oddballs > git-mv.perl > I have to admit, I'm not quite sure what advantages this is > supposed to have over plain "mv" followed by "git-update-index", > or why it's complex enough to need perl. > > Basically, this renames a file, deleting its old name and adding > its new name to the index. Otherwise, it's a two-step process > to rename a file: > - Rename the file > - git-add the new name > Followed by which you must commit both the old and new names The nice thing about it is that you can move huge directories around, or multiple files/dirs at once, and it will do the right thing. E.g. git-mv -k foo* bar/ will only move files which are version controlled. It is actually a 3-step process: rename, delete old, add new. Perhaps it should be noted that this has nothing to do with any explicit renaming feature like in other SCMs. Josef ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-29 10:31 ` Petr Baudis 2005-11-29 18:46 ` Junio C Hamano @ 2005-11-29 21:40 ` linux 2005-11-29 23:14 ` Junio C Hamano 1 sibling, 1 reply; 64+ messages in thread From: linux @ 2005-11-29 21:40 UTC (permalink / raw) To: junkio, pasky; +Cc: git, linux I'm feeling slightly guilty about eliciting such a flood of help, but I'm certainly learning a lot. But there's one statement that, while I'm not doubting its accuracy, seems at odds with the mental model I'm building. I must be misunderstanding something. junkio wrote: >> Okay, so git-update-index will overwrite a staged file with a >> fresh stage-0 copy. And git-commit will refuse to commit >> (to be precise, it'll stop at the git-write-tree stage) if there >> are unresolved conflicts. > > Sorry, I was unclear that I was talking about end-user level > tool. The update-index here is not about the conflict > resolution in the index file read-tree documentation talks > about. That has already been done when "merge" ran in the > conflicting case. In the conflicting case, the working tree > holds 3-way merge conflicting result, and the index holds HEAD > version at stage0 for such a path. Hand resolving after > update-index is to record what you eventually want to commit > (i.e. you are not replacing higher stage entry in the index with > stage0 entry -- you are replacing stage0 entry with another). > >> If you want to see the unmodified input files, you can find their >> IDs with "git-ls-files -u" and then get a copy with "git-cat-file blob" >> or "git-unpack-file". git-merge-index is basically a different way to >> process the output of git-ls-files -u. > > Yes, in principle. But in practice you usually do not use these > low level tools yourself. When git-merge returns with > conflicting paths, most of them have already been collapsed into > stage0 and git-ls-files --unmerged would not show. 
The only > case I know of that you may still see higher stage entries in > the index these days is merging paths with different mode bits. > We used to leave higher stage entries when both sides added new > file at the same path, but even that we show as merge from > common these days. And pasky reiterated: > From the user POV, the main difference between Cogito and GIT merging > is that: > > (i) Cogito tries to never leave the index "dirty" (i.e. containing > unmerged entries), and instead all conflicts should propagate to the > working tree, so that the user can resolve them without any further > special tools. (What is lacking here is that Cogito won't proofcheck > that you really resolved them all during a commit. That's a big TODO. > But core GIT won't warn you about committing the classical << >> == > conflicts either.) This seems odd to me. There's an alternate implementation that I described that makes a lot more sense to me, based on my current state of knowledge. Can someone explain why my idea is silly? I'd imagine you'd consider user editing to be a last-resort merge algorithm, but treat it like the other merges, and leave the file staged while it's in progress. Either git-checkout-index or something similar would "check out" the staged file with CVS-style merge markers. And an eventual git-update-index would replace the staged file with a stage-0, just like git-merge-one-file does automatically. "git-diff" could default to diffing against the stage-2 file to produce the same results as now, but you could also have an option to diff against a different stage, which might be useful. (This is another reason for my earlier comment that I don't think the distinction between stage-0 and stage-2 is actually necessary.) And git-write-tree would naturally stop you from committing with unresolved conflicts. You could still commit the conflict markers, but it would be a two-step process. 
You'd have the simple principle that all merges start with git-read-tree producing a staged file, and end with git-update-index collapsing them when it's been resolved. (Or something like git-reset throwing everything away.) Having said all this, there's presumably a good reason why this is a bad idea. Could someone enlighten me? Thanks! ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-29 21:40 ` git-name-rev off-by-one bug linux @ 2005-11-29 23:14 ` Junio C Hamano 2005-11-30 0:15 ` linux 0 siblings, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-11-29 23:14 UTC (permalink / raw) To: linux; +Cc: junkio, pasky, git linux@horizon.com writes: > This seems odd to me. There's an alternate implementation that > I described that makes a lot more sense to me, based on my current > state of knowledge. Can someone explain why my idea is silly? It is not silly. Actually we have "been there, done that". We used to leave the higher stages around in the index after automerge failure. Note that you would not just have stage2 in such a case. stage1 keeps the common ancestor, stage2 has what you started with, and stage3 holds the version from other branch. diff-stages can be used to diff between these stages. We _could_ have added feature to either diff-stages or diff-files to compare between stageN and working tree. However, this turned out to be not so convenient as we wished initially. What you would do after inspecting diffs between stage1 and stage3, between stage2 and stage3 and between stage1 and stage2 typically ends up doing what "merge" have tried (and failed) manually anyway, and being able to find the conflict markers by simply running "git diff" was just as good, except that we risk getting still-unresolved files checked in if the user is not careful. If you want to be clever about an automated merge, you could write a new merge strategy to take the three trees and produce a better automerge result. That is what Fredrik has done in his git-merge-recursive (now default). Or you could "improve" git-merge-one-file to take three blob object names and leave file~1 file~2 file~3 in the working tree, instead of (or in addition to) leaving a "merge" result with conflict markers, to give the user ready access to the version from each stage. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-29 23:14 ` Junio C Hamano @ 2005-11-30 0:15 ` linux 2005-11-30 0:53 ` Junio C Hamano 2005-11-30 1:51 ` Linus Torvalds 0 siblings, 2 replies; 64+ messages in thread From: linux @ 2005-11-30 0:15 UTC (permalink / raw) To: junkio; +Cc: git, linux, pasky > It is not silly. Actually we have "been there, done that". Um, okay, but I don't see why you changed... > We used to leave the higher stages around in the index after > automerge failure. Note that you would not just have stage2 in > such a case. stage1 keeps the common ancestor, stage2 has what > you started with, and stage3 holds the version from other > branch. diff-stages can be used to diff between these stages. > We _could_ have added feature to either diff-stages or > diff-files to compare between stageN and working tree. Yes, exactly. This is what I expected. > However, this turned out to be not so convenient as we wished > initially. What you would do after inspecting diffs between > stage1 and stage3, between stage2 and stage3 and between stage1 > and stage2 typically ends up doing what "merge" have tried (and > failed) manually anyway, and being able to find the conflict > markers by simply running "git diff" was just as good, except > that we risk getting still-unresolved files checked in if the > user is not careful. You seem to be saying that producing a merge with conflict markers is what you (almost) always want, so it's the default. No objections. But why collapse the index and only keep stage2? Why not leave all stages in the index *and* the merge-with-conflict-markers in the working directory? Then you could, for example, try alternate single-file merge algorithms on the conflict, or regenerate the conflict markers if you wanted. By keeping all of the source material around until the user has decided on a resolution, you achieve maximal flexibility. 
This is no more effort for the user to use in the common case (edit the conflicts and git-update-index), but lets you try various things in the working directory and easily back out of them. ("git-merge-index -s manual -a" would regenerate all of the conflict markers.) And it prevents a checkin until the matter has been resolved. I'm wondering if this isn't a philosophical issue. One side says that, since all automated merging is complete, the stages should be collapsed. To me, it makes more sense to leave out the adjective "automated" and consider the merge to be incomplete; we're just putting the user in the loop when software fails. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 0:15 ` linux @ 2005-11-30 0:53 ` Junio C Hamano 2005-11-30 1:27 ` Junio C Hamano 2005-11-30 1:51 ` Linus Torvalds 1 sibling, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 0:53 UTC (permalink / raw) To: linux; +Cc: junkio, git, pasky linux@horizon.com writes: > I'm wondering if this isn't a philosophical issue. I do not think so. I have to admit I did not exactly agree with the current behaviour when it was changed from the previous one, but at the same time I did not have anything concrete against it, and I did not care too much about the details back then. I suspect it was primarily done to make things easier for the end user without changing already existing tools (i.e., git-diff-files did not have to start taking --stage=2 flag to tell it to compare stage2 and working tree). This is the message from Linus that announced the current behaviour: http://marc.theaimsgroup.com/?l=git&m=111826424425624 ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 0:53 ` Junio C Hamano @ 2005-11-30 1:27 ` Junio C Hamano 0 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 1:27 UTC (permalink / raw) To: linux; +Cc: git Junio C Hamano <junkio@cox.net> writes: > linux@horizon.com writes: > >> I'm wondering if this isn't a philosophical issue. > > I do not think so.... > ... > This is the message from Linus that announced the current > behaviour: > > http://marc.theaimsgroup.com/?l=git&m=111826424425624 Replying to myself. In the message, Linus talks about being able to do (diff-cache is an old name for diff-index): git-diff-files -p xyzzy ;# to compare with our version git-diff-cache -p MERGE_HEAD xyzzy ;# to compare with his But because of the "index before merge has to match HEAD" rule, the first one could have been written as: git-diff-index -p HEAD xyzzy ;# to compare with ours So in that sense, I suspect it may not be too bad if we just changed merge-one-file with the patch at the end. However, git-diff-index HEAD without paths restriction would show everything the merge brought in, not just the conflicting path, so in that sense it may make things slightly harder for the end user to use. --- diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index c3eca8b..df6dd67 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -79,11 +79,12 @@ case "${1:-.}${2:-.}${3:-.}" in ;; esac - # We reset the index to the first branch, making - # git-diff-file useful - git-update-index --add --cacheinfo "$6" "$2" "$4" - git-checkout-index -u -f -- "$4" && - merge "$4" "$orig" "$src2" + # Leave the conflicts in stages; failed merge result can be + # seen by "git-diff-index HEAD" or "git-diff-index MERGE_HEAD" + rm -fr "$4" && + git-cat-file blob "$2" >"$4" && + case "$6" in *?7??) chmod +x "$4" ;; esac && + merge "$4" "$orig" "$src2" ret=$? rm -f -- "$orig" "$src2" ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 0:15 ` linux 2005-11-30 0:53 ` Junio C Hamano @ 2005-11-30 1:51 ` Linus Torvalds 2005-11-30 2:06 ` Junio C Hamano ` (2 more replies) 1 sibling, 3 replies; 64+ messages in thread From: Linus Torvalds @ 2005-11-30 1:51 UTC (permalink / raw) To: linux; +Cc: junkio, git, pasky On Tue, 29 Nov 2005, linux@horizon.com wrote: > > You seem to be saying that producing a merge with conflict markers is > what you (almost) always want, so it's the default. No objections. > > But why collapse the index and only keep stage2? Why not leave all > stages in the index *and* the merge-with-conflict-markers in the working > directory? That may actually work really well. It would also avoid one bug that we have right now: if you fix things up by hand, but forget to explicitly do a "git-update-index filename" or "git commit filename", a plain regular "git commit" will happily commit all the changes _except_ for the ones you have merged manually. It's happened once to me. If we left things in the index in an unmerged state, we'd be guaranteed to either _fail_ that git commit unless somebody has done the git-update-index (or names the files specifically on the commit command line, which will do it for you). So I think I agree. Junio? The problem (I think) was that "git-diff-file" did bad things with unmerged entries. That's what the comment in git-merge-one-file implies. But otherwise this should just make it so.. Do you want to test this out? Linus --- diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index c3eca8b..739a072 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -79,11 +79,7 @@ case "${1:-.}${2:-.}${3:-.}" in ;; esac - # We reset the index to the first branch, making - # git-diff-file useful - git-update-index --add --cacheinfo "$6" "$2" "$4" - git-checkout-index -u -f -- "$4" && - merge "$4" "$orig" "$src2" + merge "$4" "$orig" "$src2" ret=$? 
rm -f -- "$orig" "$src2" ^ permalink raw reply related [flat|nested] 64+ messages in thread
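The safety argument above (a commit should fail while any path is still unmerged) can be sketched without touching a real repository. This is only an illustration, assuming input shaped like `git-ls-files --stage` output ("mode object stage<TAB>path"); the helper names `unmerged_paths` and `refuse_if_unmerged` are made up for the example and are not part of git.

```sh
#!/bin/sh
# Sketch only: works on text shaped like `git-ls-files --stage` output
# ("mode SP object SP stage TAB path"); it does not call git itself.

# unmerged_paths (hypothetical helper): print each path that still has
# an entry at stage 1, 2 or 3, once per path.
unmerged_paths() {
    awk -F '\t' '
        {
            split($1, field, " ")      # field[3] is the stage number
            if (field[3] != 0 && !($2 in seen)) {
                seen[$2] = 1
                print $2
            }
        }'
}

# refuse_if_unmerged (hypothetical helper): succeed only when the index
# listing on stdin contains no unmerged path.
refuse_if_unmerged() {
    pending=$(unmerged_paths)
    if [ -n "$pending" ]; then
        echo "unresolved: $pending" >&2
        return 1
    fi
    return 0
}
```

With a real repository one would feed it the actual listing, e.g. `git-ls-files --stage | refuse_if_unmerged`; running git-update-index on a resolved file collapses its entries back to stage 0, after which the check passes.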
* Re: git-name-rev off-by-one bug 2005-11-30 1:51 ` Linus Torvalds @ 2005-11-30 2:06 ` Junio C Hamano 2005-11-30 2:33 ` Junio C Hamano 2005-11-30 18:11 ` Daniel Barkalow 2 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 2:06 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Linus Torvalds <torvalds@osdl.org> writes: > If we left things in the index in an unmerged state, we'd be guaranteed to > either _fail_ that git commit unless somebody has done the > git-update-index (or names the files specifically on the commit command > line, which will do it for you). > > So I think I agree. I suspect we are saying the same thing. Funny. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 1:51 ` Linus Torvalds 2005-11-30 2:06 ` Junio C Hamano @ 2005-11-30 2:33 ` Junio C Hamano 2005-11-30 3:12 ` Linus Torvalds 2005-11-30 3:15 ` linux 2005-11-30 18:11 ` Daniel Barkalow 2 siblings, 2 replies; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 2:33 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux, junkio, git, pasky Linus Torvalds <torvalds@osdl.org> writes: > Junio? > > The problem (I think) was that "git-diff-file" did bad things with > unmerged entries. That's what the comment in git-merge-one-file implies. > But otherwise this should just make it so.. > > Do you want to test this out? I have actually resolved one conflicting merge with this and it was OK, except that it was a bit unpleasant when I first did "git-diff-index HEAD" without giving any path ;-), but the users will get used to it. Pushed out as a part of the proposed updates collection. Here is what I wrote as the commit log message for the hand-resolved merge using this updated merge-one-file. commit 6b48f6ff7ffff6ca0f9da53d9423a0474dd008fd Merge: b4f40b90ed1d9e1f3c0557e1ba064d169ba03a1c 99e01692063cc48adee19e1f738472a579c14ca2 Author: Junio C Hamano <junkio@cox.net> Date: Tue Nov 29 18:25:29 2005 -0800 Merge branch 'jc/subdir' This one is done with the updated merge-one-file, which leaves unmerged entries in the index file to prevent unresolved merge from getting committed by mistake. After "git pull ..." fails, earlier the user said: $ git-diff to see half-merged state. Now git-diff just says: $ git-diff * Unmerged path ls-tree.c In order to get the earlier "show me the failed merge relative to my HEAD", you can say: $ git-diff HEAD ls-tree.c Signed-off-by: Junio C Hamano <junkio@cox.net> ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 2:33 ` Junio C Hamano @ 2005-11-30 3:12 ` Linus Torvalds 2005-11-30 5:06 ` Linus Torvalds 2005-11-30 7:18 ` Junio C Hamano 2005-11-30 3:15 ` linux 1 sibling, 2 replies; 64+ messages in thread From: Linus Torvalds @ 2005-11-30 3:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: linux, git, pasky On Tue, 29 Nov 2005, Junio C Hamano wrote: > > I have actually resolved one conflicting merge with this and it > was OK, except that it was a bit unpleasant when I first did > "git-diff-index HEAD" without giving any path ;-), What does "git-diff-files" do? Just output a lot of nasty "unmerged" messages? The _nice_ thing to do would be to output one "unmerged" message, but then diff against stage2 if it exists (and it basically always should, since otherwise we wouldn't have gotten a merge error). If it did that, then you'd have the best of both worlds: the old nice "git diff" behaviour _and_ being safe (and saying that it's unmerged). Something like this (untested, of course). It _should_ write out * Unmerged path <filename> followed by a regular diff, exactly like you'd want. [ This all assumes that merge-one-file leaves the stages right. I think my patch to do that was just broken. Yours was probably not. 
] Linus --- diff --git a/diff-files.c b/diff-files.c index 38599b5..8a78326 100644 --- a/diff-files.c +++ b/diff-files.c @@ -95,11 +95,23 @@ int main(int argc, const char **argv) if (ce_stage(ce)) { show_unmerge(ce->name); - while (i < entries && - !strcmp(ce->name, active_cache[i]->name)) + while (i < entries) { + struct cache_entry *nce = active_cache[i]; + + if (strcmp(ce->name, nce->name)) + break; + /* Prefer to diff against stage 2 (original branch) */ + if (ce_stage(nce) == 2) + ce = nce; i++; - i--; /* compensate for loop control increments */ - continue; + } + /* + * Compensate for loop update + */ + i--; + /* + * Show the diff for the 'ce' we chose + */ } if (lstat(ce->name, &st) < 0) { ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 3:12 ` Linus Torvalds @ 2005-11-30 5:06 ` Linus Torvalds 2005-11-30 5:51 ` Junio C Hamano 2005-11-30 6:09 ` git-name-rev off-by-one bug linux 2005-11-30 7:18 ` Junio C Hamano 1 sibling, 2 replies; 64+ messages in thread From: Linus Torvalds @ 2005-11-30 5:06 UTC (permalink / raw) To: Junio C Hamano; +Cc: linux, git, pasky On Tue, 29 Nov 2005, Linus Torvalds wrote: > > Something like this (untested, of course). > > It _should_ write out > > * Unmerged path <filename> > > followed by a regular diff, exactly like you'd want. The more I think about this, the more I think this is a wonderful approach, but it would be even better to add a flag to let it choose between diffing against stage2 by default or diffing against stage3 (and hey, maybe even diffing against the original). In fact, here's a patch that does that, and also makes the "resolve" merge create these kinds of merges. As usual, my python knowledge is useless, since the only thing I know about python is that thou shalt not count to four. As a result, the standard recursive merge doesn't do this yet ;( The magic incantation is to just do git diff and you'll get a diff against the first branch. If you want a diff against the second branch, just use the "-2" option, and if you want a diff against the common base (which is actually surprisingly useful, I noticed, when I tried this with a conflict), use "-0". I've also changed "git diff" to _not_ drop the "-M" and "-p" options just because you give some other diff option. That was always a mistake. If you really want the raw git diff format, use the raw "git-diff-xyz" programs directly. Whaddaya think? I really like it. Here's an example, where I merged two branches that had the file "hello" in it, and the first branch had: Hi there This is the master branch and the second one had Hi there This is the 'other' branch and the base version had just the "Hi there", of course. 
The default (or "-1" arg) behaviour is: [torvalds@g5 test-merge]$ git diff * Unmerged path hello diff --git a/hello b/hello index 7cebcf8..3fa4697 100644 --- a/hello +++ b/hello @@ -1,2 +1,6 @@ Hi there +<<<<<<< hello This is the master branch +======= +This is the 'other' branch +>>>>>>> .merge_file_fJWiNf which is obvious enough. You see exactly the conflict, and you see the part of the first branch that is unchanged. Diffing against the original gives you [torvalds@g5 test-merge]$ git diff -0 * Unmerged path hello diff --git a/hello b/hello index 6530b63..3fa4697 100644 --- a/hello +++ b/hello @@ -1 +1,6 @@ Hi there +<<<<<<< hello +This is the master branch +======= +This is the 'other' branch +>>>>>>> .merge_file_fJWiNf which I actually found really readable. I realize that this is a really stupid example, but for a lot of trivial merges, this won't be _that_ far off, and it basically shows what happened in both branches, and ignores what neither side changed. The "-2" in this case is just the same as "-1" except obviously the "+" characters are situated differently. Still useful (especially if the changes were more complex). Me likee. Hope you guys do too. (And this is quite independently of the advantage that you can't commit an unmerged state by mistake, which is perhaps an even bigger one). Linus --- diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 6b496ed..afc7334 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -18,6 +18,12 @@ object name of pre- and post-image blob on the "index" line when generating a patch format output. +-0 -1 -2:: + When an unmerged entry is seen, diff against the base version, + the "first branch" or the "second branch" respectively. + + The default is to diff against the first branch. + -B:: Break complete rewrite changes into pairs of delete and create. 
diff --git a/diff-files.c b/diff-files.c index 38599b5..d744636 100644 --- a/diff-files.c +++ b/diff-files.c @@ -13,6 +13,7 @@ COMMON_DIFF_OPTIONS_HELP; static struct diff_options diff_options; static int silent = 0; +static int diff_unmerged_stage = 2; static void show_unmerge(const char *path) { @@ -46,7 +47,13 @@ int main(int argc, const char **argv) argc--; break; } - if (!strcmp(argv[1], "-q")) + if (!strcmp(argv[1], "-0")) + diff_unmerged_stage = 1; + else if (!strcmp(argv[1], "-1")) + diff_unmerged_stage = 2; + else if (!strcmp(argv[1], "-2")) + diff_unmerged_stage = 3; + else if (!strcmp(argv[1], "-q")) silent = 1; else if (!strcmp(argv[1], "-r")) ; /* no-op */ @@ -95,11 +102,23 @@ int main(int argc, const char **argv) if (ce_stage(ce)) { show_unmerge(ce->name); - while (i < entries && - !strcmp(ce->name, active_cache[i]->name)) + while (i < entries) { + struct cache_entry *nce = active_cache[i]; + + if (strcmp(ce->name, nce->name)) + break; + /* Prefer to diff against the proper unmerged stage */ + if (ce_stage(nce) == diff_unmerged_stage) + ce = nce; i++; - i--; /* compensate for loop control increments */ - continue; + } + /* + * Compensate for loop update + */ + i--; + /* + * Show the diff for the 'ce' we chose + */ } if (lstat(ce->name, &st) < 0) { diff --git a/git-diff.sh b/git-diff.sh index b3ec84b..efe8f75 100755 --- a/git-diff.sh +++ b/git-diff.sh @@ -3,12 +3,13 @@ # Copyright (c) 2005 Linus Torvalds # Copyright (c) 2005 Junio C Hamano +# Some way to turn these off? +default_flags="-M -p" + rev=$(git-rev-parse --revs-only --no-flags --sq "$@") || exit -flags=$(git-rev-parse --no-revs --flags --sq "$@") +flags=$(git-rev-parse --no-revs --flags --sq $default_flags "$@") files=$(git-rev-parse --no-revs --no-flags --sq "$@") -: ${flags:="'-M' '-p'"} - # I often say 'git diff --cached -p' and get scolded by git-diff-files, but # obviously I mean 'git diff --cached -p HEAD' in that case. 
case "$rev" in diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index c3eca8b..739a072 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -79,11 +79,7 @@ case "${1:-.}${2:-.}${3:-.}" in ;; esac - # We reset the index to the first branch, making - # git-diff-file useful - git-update-index --add --cacheinfo "$6" "$2" "$4" - git-checkout-index -u -f -- "$4" && - merge "$4" "$orig" "$src2" + merge "$4" "$orig" "$src2" ret=$? rm -f -- "$orig" "$src2" ^ permalink raw reply related [flat|nested] 64+ messages in thread
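As a rough illustration of the selection rule in the diff-files.c hunk above: "-0", "-1" and "-2" map to index stages 1, 2 and 3, and among the entries for one path the entry at the requested stage wins, falling back to the first entry seen when that stage is absent. The sketch below reproduces that rule over text shaped like `git-ls-files --stage` output. The function names and the blob ids in the usage note are placeholders invented for the example, not anything git provides.

```sh
#!/bin/sh
# Sketch only: mimics the stage choice from the patch on plain text,
# one "mode SP object SP stage TAB path" entry per line.

# stage_for_option (hypothetical helper): map the proposed option to
# the index stage it selects; no option means the first branch.
stage_for_option() {
    case "$1" in
    -0)      echo 1 ;;   # common base
    -1 | '') echo 2 ;;   # first branch (default)
    -2)      echo 3 ;;   # second branch
    *)       return 1 ;;
    esac
}

# blob_for_diff PATH [OPTION] (hypothetical helper): print the object
# id that would be diffed against the working file for PATH.
blob_for_diff() {
    path=$1
    want=$(stage_for_option "$2") || return 1
    awk -F '\t' -v p="$path" -v s="$want" '
        $2 == p {
            split($1, field, " ")
            # Start from the first entry for this path ...
            if (!seen) { chosen = field[2]; seen = 1 }
            # ... but prefer the entry at the requested stage.
            if (field[3] == s) chosen = field[2]
        }
        END { if (seen) print chosen }'
}
```

For a listing with entries `base`, `ours` and `theirs` at stages 1, 2 and 3 of "hello", `blob_for_diff hello -0` prints `base` and a plain `blob_for_diff hello` prints `ours`.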
* Re: git-name-rev off-by-one bug 2005-11-30 5:06 ` Linus Torvalds @ 2005-11-30 5:51 ` Junio C Hamano 2005-11-30 6:11 ` Junio C Hamano ` (2 more replies) 2005-11-30 6:09 ` git-name-rev off-by-one bug linux 1 sibling, 3 replies; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 5:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Linus Torvalds <torvalds@osdl.org> writes: > Whaddaya think? I really like it. Yes. Maybe split this into 3 pieces. I do not want to waste your time with that, so will take the liberty to do so myself, with appropriate commit log messages, if you do not mind. 1. give diff-files -[012] flags. 2. merge-one-file leaves unmerged index entries. 3. always use -M -p in git-diff. I do not have any issue against #1. Regarding #2, in an earlier message you said something about "patch to do that was just broken" which I did not understand; I think your patch I am replying to is doing the right thing. That case arm is dealing with a path that exists in "our" branch and the working tree blob should be the same as recorded in the HEAD, so I did not have to do the unpack-cat-chmod like I did in mine. Am I simply confused? About #3, I am not quite sure. I often use --name-status and I do _not_ want -p to kick in when I do so. How about something like this? --- diff --git a/git-diff.sh b/git-diff.sh index b3ec84b..8e0fe34 100755 --- a/git-diff.sh +++ b/git-diff.sh @@ -7,8 +7,6 @@ rev=$(git-rev-parse --revs-only --no-fla flags=$(git-rev-parse --no-revs --flags --sq "$@") files=$(git-rev-parse --no-revs --no-flags --sq "$@") -: ${flags:="'-M' '-p'"} - # I often say 'git diff --cached -p' and get scolded by git-diff-files, but # obviously I mean 'git diff --cached -p HEAD' in that case. case "$rev" in @@ -20,6 +18,21 @@ case "$rev" in esac esac +# If we do not have --name-status, --name-only nor -r, default to -p. +# If we do not have -B nor -C, default to -M. 
+case " $flags " in +*" '--name-status' "* | *" '--name-only' "* | *" '-r' "* ) + ;; +*) + flags="$flags '-p'" ;; +esac +case " $flags " in +*" '-"[BCM]* | *" '--find-copies-harder' "*) + ;; # something like -M50. +*) + flags="$flags '-M'" ;; +esac + case "$rev" in ?*' '?*' '?*) echo >&2 "I don't understand" ^ permalink raw reply related [flat|nested] 64+ messages in thread
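A minimal sketch of the flag-defaulting logic from the patch above, wrapped in a function so it can be tried outside git-diff.sh. The function name `add_default_diff_flags` is hypothetical; the two `case` statements follow the patch, and the input is assumed to be the `--sq`-quoted flag string that git-rev-parse produces.

```sh
# Hypothetical helper, not part of git: takes the --sq-quoted flag string
# and prints it with the defaults from the patch applied.
add_default_diff_flags() {
    flags=$1
    # Default to -p unless an output format that excludes patches was asked for.
    case " $flags " in
    *" '--name-status' "* | *" '--name-only' "* | *" '-r' "*)
        ;;
    *)
        flags="${flags:+$flags }'-p'" ;;
    esac
    # Default to -M unless some -B/-C/-M variant (e.g. -M50) is already present.
    case " $flags " in
    *" '-"[BCM]* | *" '--find-copies-harder' "*)
        ;;
    *)
        flags="${flags:+$flags }'-M'" ;;
    esac
    printf '%s\n' "$flags"
}
```

Padding the string with a space on each side lets every pattern match a whole quoted word, so `'-r'` is not confused with a longer flag that merely starts with `-r`.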
* Re: git-name-rev off-by-one bug 2005-11-30 5:51 ` Junio C Hamano @ 2005-11-30 6:11 ` Junio C Hamano 2005-11-30 16:13 ` Linus Torvalds 2005-11-30 16:08 ` Linus Torvalds 2005-12-02 8:25 ` Junio C Hamano 2 siblings, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 6:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Junio C Hamano <junkio@cox.net> writes: > Linus Torvalds <torvalds@osdl.org> writes: > >> Whaddaya think? I really like it. > > Yes. Maybe split this into 3 pieces. I do not want to waste > your time with that, so will take the liberty to do so myself, > with appropriate commit log messages, if you do not mind. > > 1. give diff-files -[012] flags. > 2. merge-one-file leaves unmerged index entries. > 3. always use -M -p in git-diff. > > I do not have any issue against #1. Actually there is one. If we are asked to do diff -1 and an unmerged path does not have stage #2 but stage #1 entry exists, we would end up showing that stage #1, without telling the user that we are showing something different from what was asked. How about doing something like this, on top of yours? --- diff-files.c +++ diff-files.c @@ -117,8 +117,11 @@ */ i--; /* - * Show the diff for the 'ce' we chose + * Show the diff for the 'ce' if we found the one + * from the desired stage. */ + if (ce_stage(ce) != diff_unmerged_stage) + continue; } if (lstat(ce->name, &st) < 0) { ^ permalink raw reply [flat|nested] 64+ messages in thread
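The stage selection discussed above can be modelled in a few lines of shell. This is only a sketch of the control flow, not the C code: `pick_stage` is a hypothetical name, and the list of stage numbers stands in for the index entries that share one path.

```sh
# pick_stage DESIRED STAGE...
# Models the loop in diff-files.c: start with the first entry for the path,
# prefer the entry whose stage matches, and (per the follow-up patch) show
# nothing at all if the desired stage is absent.
pick_stage() {
    desired=$1
    shift
    chosen=$1
    for s in "$@"; do
        if [ "$s" = "$desired" ]; then
            chosen=$s
        fi
    done
    if [ "$chosen" != "$desired" ]; then
        return 1    # corresponds to the added 'continue'
    fi
    printf '%s\n' "$chosen"
}
```

Without the final check, a path that has only stage #1 would be diffed against stage #1 even though stage #2 was requested, which is the silent substitution the message objects to.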
* Re: git-name-rev off-by-one bug 2005-11-30 6:11 ` Junio C Hamano @ 2005-11-30 16:13 ` Linus Torvalds 0 siblings, 0 replies; 64+ messages in thread From: Linus Torvalds @ 2005-11-30 16:13 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, 29 Nov 2005, Junio C Hamano wrote: > > Actually there is one. If we are asked to do diff -1 and an > unmerged path does not have stage #2 but stage #1 entry exists, > we would end up showing that stage #1, without telling the user > that we are showing something different from what was asked. > How about doing something like this, on top of yours? Yes. Linus ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 5:51 ` Junio C Hamano 2005-11-30 6:11 ` Junio C Hamano @ 2005-11-30 16:08 ` Linus Torvalds 2005-12-02 8:25 ` Junio C Hamano 2 siblings, 0 replies; 64+ messages in thread From: Linus Torvalds @ 2005-11-30 16:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, 29 Nov 2005, Junio C Hamano wrote: > > About #3, I am not quite sure. I often use --name-status and I > do _not_ want -p to kick in when I do so. How about something > like this? Yes. I was thinking about something like that, but I decided it was too much work ;) Linus ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 5:51 ` Junio C Hamano 2005-11-30 6:11 ` Junio C Hamano 2005-11-30 16:08 ` Linus Torvalds @ 2005-12-02 8:25 ` Junio C Hamano 2005-12-02 9:14 ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano ` (2 more replies) 2 siblings, 3 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02 8:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> 2. merge-one-file leaves unmerged index entries.
>
> Regarding #2, in an earlier message you said something about
> "patch to do that was just broken" which I did not understand; I
> think your patch I am replying to is doing the right thing. That
> case arm is dealing with a path that exists in "our" branch and
> the working tree blob should be the same as recorded in the
> HEAD, so I did not have to do the unpack-cat-chmod like I did in
> mine. Am I simply confused?

The only difference is that, from the old tradition, we are supposed to allow the merge to happen in an unchecked-out working tree [*1*]. The version you did and I merged in the master branch breaks that, while the patch I posted keeps that premise.

I can throw in my change on top of what is already committed for now to "fix" this, but do we still care about the "merge should succeed in an unchecked-out working tree" rule, or does it not matter anymore these days?

One thing is that the check with "git diff" to show diff between half-merged and stage2 after a failed merge does not work very well in a sparsely checked out working tree, because the real change is buried among tons of deletes ("diff --diff-filter=UM" helps, though [*2*]).

[Footnote]

*1* ... and that is why we special case a non-existent working tree file as if it is clean with the index. After a merge, you would end up with a sparsely checked-out working tree that contains only the files that were involved in the merge.

*2* Maybe --diff-filter should always include U in the output, because it is rare and when an unmerged entry exists the user would always want to see it.

^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] merge-one-file: make sure we create the merged file. 2005-12-02 8:25 ` Junio C Hamano @ 2005-12-02 9:14 ` Junio C Hamano 2005-12-02 9:15 ` [PATCH] merge-one-file: make sure we do not mismerge symbolic links Junio C Hamano 2005-12-02 9:16 ` [PATCH] git-merge documentation: conflicting merge leaves higher stages in index Junio C Hamano 2 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-12-02 9:14 UTC (permalink / raw) To: git The "update-index followed by checkout-index" chain served two purposes -- to collapse the index to "our" version, and make sure that file exists in the working tree. In the recent update to leave the index unmerged on conflicting path, we wanted to stop doing the former, but we still need to do the latter (we allow merging to work in an un-checked-out working tree). Signed-off-by: Junio C Hamano <junkio@cox.net> --- git-merge-one-file.sh | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) 7afd8d297cd0c24e51188181769b56e0fb0f4171 diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index 9a049f4..906098d 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -79,7 +79,13 @@ case "${1:-.}${2:-.}${3:-.}" in ;; esac - merge "$4" "$orig" "$src2" + # Create the working tree file, with the correct permission bits. + # we can not rely on the fact that our tree has the path, because + # we allow the merge to be done in an unchecked-out working tree. + rm -f "$4" && + git-cat-file blob "$2" >"$4" && + case "$6" in *7??) chmod +x "$4" ;; esac && + merge "$4" "$orig" "$src2" ret=$? rm -f -- "$orig" "$src2" -- 0.99.9.GIT ^ permalink raw reply related [flat|nested] 64+ messages in thread
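The three-step "create the working tree file" chain from the patch above can be tried without a repository. In this sketch, `materialize` is a hypothetical name and standard input stands in for the output of `git-cat-file blob`; the `*7??` pattern is the one the patch uses to detect an executable mode such as 100755.

```sh
# materialize PATH MODE
# Content arrives on stdin (an assumption of this sketch; the patch reads
# it from git-cat-file). Any existing file at PATH is removed first so the
# new content never goes through a stale file or link.
materialize() {
    rm -f "$1" &&
    cat >"$1" &&
    case "$2" in
    *7??) chmod +x "$1" ;;    # owner-executable modes end in 7 plus two digits
    esac
}
```

A `case` statement with no matching arm exits with status 0, so the `&&` chain continues for non-executable modes just as it does in the patch.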
* [PATCH] merge-one-file: make sure we do not mismerge symbolic links. 2005-12-02 8:25 ` Junio C Hamano 2005-12-02 9:14 ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano @ 2005-12-02 9:15 ` Junio C Hamano 2005-12-02 9:16 ` [PATCH] git-merge documentation: conflicting merge leaves higher stages in index Junio C Hamano 2 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-12-02 9:15 UTC (permalink / raw) To: git We ran "merge" command on O->A, O->B, A!=B case without verifying the path involved is not a symlink. Signed-off-by: Junio C Hamano <junkio@cox.net> --- git-merge-one-file.sh | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) 01655e7c9a0b05d930aa7e27e74f75e086005bfc diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index 906098d..c262dc6 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -58,6 +58,14 @@ case "${1:-.}${2:-.}${3:-.}" in # Modified in both, but differently. # "$1$2$3" | ".$2$3") + + case ",$6,$7," in + *,120000,*) + echo "ERROR: $4: Not merging symbolic link changes." + exit 1 + ;; + esac + src2=`git-unpack-file $3` case "$1" in '') -- 0.99.9.GIT ^ permalink raw reply related [flat|nested] 64+ messages in thread
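The comma-wrapping trick in the patch above is worth spelling out. A small sketch, with the hypothetical name `involves_symlink`, taking the two modes that the script receives as `$6` and `$7`:

```sh
# involves_symlink MODE_A MODE_B
# Succeeds if either mode is exactly 120000 (a symbolic link).
# Joining the modes with commas and adding one at each end means the
# pattern can only match a complete field, never part of a longer string.
involves_symlink() {
    case ",$1,$2," in
    *,120000,*) return 0 ;;
    *)          return 1 ;;
    esac
}
```

One `case` covers both positions, and an empty mode (a side where the path does not exist) is handled without any extra test.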
* [PATCH] git-merge documentation: conflicting merge leaves higher stages in index 2005-12-02 8:25 ` Junio C Hamano 2005-12-02 9:14 ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano 2005-12-02 9:15 ` [PATCH] merge-one-file: make sure we do not mismerge symbolic links Junio C Hamano @ 2005-12-02 9:16 ` Junio C Hamano 2 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-12-02 9:16 UTC (permalink / raw) To: git This hopefully concludes the latest updates that changes the behaviour of the merge on an unsuccessful automerge. Instead of collapsing the conflicted path in the index to show HEAD, we leave it unmerged, now that diff-files can compare working tree files with higher stages. Signed-off-by: Junio C Hamano <junkio@cox.net> --- Documentation/git-merge.txt | 10 ++++++---- 1 files changed, 6 insertions(+), 4 deletions(-) e8be26e9282e346b01aa41fd7ab0b5f7bf7dcfc3 diff --git a/Documentation/git-merge.txt b/Documentation/git-merge.txt index c117404..0cac563 100644 --- a/Documentation/git-merge.txt +++ b/Documentation/git-merge.txt @@ -108,10 +108,12 @@ When there are conflicts, these things h 2. Cleanly merged paths are updated both in the index file and in your working tree. -3. For conflicting paths, the index file records the version - from `HEAD`. The working tree files have the result of - "merge" program; i.e. 3-way merge result with familiar - conflict markers `<<< === >>>`. +3. For conflicting paths, the index file records up to three + versions; stage1 stores the version from the common ancestor, + stage2 from `HEAD`, and stage3 from the remote branch (you + can inspect the stages with `git-ls-files -u`). The working + tree files have the result of "merge" program; i.e. 3-way + merge result with familiar conflict markers `<<< === >>>`. 4. No other changes are done. 
In particular, the local modifications you had before you started merge will stay the -- 0.99.9.GIT ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 5:06 ` Linus Torvalds 2005-11-30 5:51 ` Junio C Hamano @ 2005-11-30 6:09 ` linux 2005-11-30 6:39 ` Junio C Hamano 2005-11-30 16:12 ` git-name-rev off-by-one bug Linus Torvalds 1 sibling, 2 replies; 64+ messages in thread From: linux @ 2005-11-30 6:09 UTC (permalink / raw) To: junkio, torvalds; +Cc: git, linux, pasky > +-0 -1 -2:: > + When an unmerged entry is seen, diff against the base version, > + the "first branch" or the "second branch" respectively. > + > + The default is to diff against the first branch. > + Er... why are these flags zero-based? git-ls-files -s displays them as "1", "2" and "3". All the docs talk about "stage1", "stage2" and "stage3". Change the nomenclature if you want, but this mixed messages business is kind of weird... (Heartened by the response to my previous question of "why do you do this thing that makes no sense to me", I'm going to be bold and not ask why this is a good idea.) ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 6:09 ` git-name-rev off-by-one bug linux @ 2005-11-30 6:39 ` Junio C Hamano 2005-11-30 13:10 ` More merge questions linux 2005-11-30 16:12 ` git-name-rev off-by-one bug Linus Torvalds 1 sibling, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 6:39 UTC (permalink / raw) To: linux; +Cc: junkio, torvalds, git, pasky linux@horizon.com writes: >> +-0 -1 -2:: >> + When an unmerged entry is seen, diff against the base version, >> + the "first branch" or the "second branch" respectively. >> + >> + The default is to diff against the first branch. >> + > > Er... why are these flags zero-based? Because -1 means "first branch" (usually "ours", aka HEAD), and -2 means "second branch" ("theirs", aka MERGE_HEAD), and -0 is for the base (aka merge base)? But I think you are right. The numeric parameters should match stage number for consistency. How about if I redo the patch to make diff-files accept -1/-2/-3 instead, and in addition accept "--base", "--ours", and "--theirs" as synonyms? Side note. diff3 says MINE OLDER YOURS and the way to remember the order is they are alphabetical. We can say the same for base, ours and theirs. ^ permalink raw reply [flat|nested] 64+ messages in thread
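The naming proposed above amounts to a small lookup table. A sketch of that table in shell, assuming exactly the spellings named in the message; `stage_for_flag` is a hypothetical helper, and nothing here says how diff-files itself ended up parsing its options.

```sh
# stage_for_flag FLAG
# Prints the index stage number a flag would select under the proposal:
# numeric flags match the stage numbers shown by git-ls-files -s, and
# --base/--ours/--theirs are synonyms (alphabetical, like the stages).
stage_for_flag() {
    case "$1" in
    -1 | --base)   echo 1 ;;
    -2 | --ours)   echo 2 ;;
    -3 | --theirs) echo 3 ;;
    *)             return 1 ;;
    esac
}
```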
* More merge questions 2005-11-30 6:39 ` Junio C Hamano @ 2005-11-30 13:10 ` linux 2005-11-30 18:37 ` Daniel Barkalow 2005-11-30 20:23 ` Junio C Hamano 0 siblings, 2 replies; 64+ messages in thread
From: linux @ 2005-11-30 13:10 UTC (permalink / raw)
To: git; +Cc: junkio, linux

I'm working my way through a thorough understanding of merging.

First I got git-read-tree's 3-way merge down to 6 conditionals, where a missing entry is considered equal to a missing entry, and a missing index entry is considered clean.

a) If stage2 == stage3, use stage2
b) If stage1 == stage3, use stage2
c) If the index entry exists and is dirty (working dir changes), FAIL
d) If stage1 == stage2, use stage3
e) If trivial-only, FAIL
f) Return unmerged result for 3-way resolution by git-merge-index.

Case c is needed so you don't change the world out from under your working directory changes. You could move it earlier and make things stricter, but that's the minimal restriction.

Then I started thinking about 2-way merge, and how that differed from a 3-way merge where stage2 was the previous index contents. If you apply the same rules (with trivial-only true), the only difference from the big 22-case table in the git-read-tree docs is:

3) This says that if stage1 and stage3 exist, use stage3. 3-way says if they're equal, delete the file, while if they're unequal, it fails.

If 3-way git-merge-index were allowed, then the conditions that would change to do it are cases 8 and 12.

The full list of cases, and the conditional that applies to each, is:

0) a
1) d
2) a
3) see above. It's b or e by my logic, but d by the table.
4) b
5) b
6) a
7) a
8) e
9) c
10) d
11) c
12) e
13) c
14) a or b
15) a or b
16) e
17) c
18) a
19) a
20) d
21) c

Given that it all matches up so nicely, I'd like to honestly ask if case 3 of the conditions is correct. I'd think that if I deleted a file from the index, and the file wasn't changed on the head I'm tracking, the right resolution is to keep it deleted. Why override my deletion?
Sorry if this is a dumb question, but it's not obvious to me. ^ permalink raw reply [flat|nested] 64+ messages in thread
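The six conditionals in the message above translate almost line for line into shell. This is a sketch of the poster's summary, not of git-read-tree itself: `merge3` is a hypothetical name, each stage is passed as a blob id or as an empty string when the entry is missing, and the last two arguments are `yes` or `no`.

```sh
# merge3 STAGE1 STAGE2 STAGE3 DIRTY TRIVIAL_ONLY
# Two missing entries compare equal because both are the empty string.
merge3() {
    s1=$1; s2=$2; s3=$3; dirty=$4; trivial_only=$5
    if   [ "$s2" = "$s3" ];          then echo "use stage2"   # a
    elif [ "$s1" = "$s3" ];          then echo "use stage2"   # b
    elif [ "$dirty" = yes ];         then echo "FAIL"         # c
    elif [ "$s1" = "$s2" ];          then echo "use stage3"   # d
    elif [ "$trivial_only" = yes ];  then echo "FAIL"         # e
    else                                  echo "unmerged"     # f
    fi
}
```

For example, `merge3 O O "" no no` reaches rule d and prints `use stage3`; since stage3 is empty, that means the path is removed.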
* Re: More merge questions 2005-11-30 13:10 ` More merge questions linux @ 2005-11-30 18:37 ` Daniel Barkalow 2005-11-30 20:23 ` Junio C Hamano 1 sibling, 0 replies; 64+ messages in thread
From: Daniel Barkalow @ 2005-11-30 18:37 UTC (permalink / raw)
To: linux; +Cc: git, junkio

On Wed, 30 Nov 2005, linux@horizon.com wrote:

> Given that it all matches up so nicely, I'd like to honestly ask if
> case 3 of the conditions is correct. I'd think that if I deleted
> a file from the index, and the file wasn't changed on the head I'm
> tracking, the right resolution is to keep it deleted. Why override
> my deletion?

You're allowed to do the two-way merge with your index empty, and this means that you just hadn't read the ancestor, not that you want to remove everything. I'm not sure what this is useful for.

You're definitely allowed to do a three-way merge with your index empty, meaning that you don't have any local changes at all, which lets you do a merge in a temporary index that didn't exist before. (The two-way case is less interesting, because it's the same as just reading the new tree.)

-Daniel
*This .sig left intentionally blank*

^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions 2005-11-30 13:10 ` More merge questions linux 2005-11-30 18:37 ` Daniel Barkalow @ 2005-11-30 20:23 ` Junio C Hamano 2005-12-02 9:19 ` More merge questions (why doesn't this work?) linux 1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30 20:23 UTC (permalink / raw)
To: linux; +Cc: git

linux@horizon.com writes:

> 3) This says that if stage1 and stage3 exist, use stage3.
> 3-way says if they're equal, delete the file, while if they're
> unequal, it's fail.
>
> Given that it all matches up so nicely, I'd like to honestly ask if
> case 3 of the conditions is correct. I'd think that if I deleted
> a file from the index, and the file wasn't changed on the head I'm
> tracking, the right resolution is to keep it deleted. Why override
> my deletion?
>
> Sorry if this is a dumb question, but it's not obvious to me.

Funny that I asked exactly the same question when it was first done:

    http://marc.theaimsgroup.com/?l=git&m=111804744926989

It was a question about then-current code, so other cases might have been changed/corrected/enhanced since then, but I believe the behaviour for the case in question here stays the same to this day, and the response from Linus to that article still applies.

    http://marc.theaimsgroup.com/?l=git&m=111807024201485

I'll quote only the punch line here, but the whole thing is worth a read if you want to understand how this evolved and what the design choices and decisions were:

    Right. We didn't lose anything hugely important. In theory this could be a delete that we've missed, and we could add a flag to actually reject this case. However, it's always easy to "recover" deletes (just delete it again ;), so the loss of information is absolutely minimal, and it allows starting from an empty index file.

^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-11-30 20:23 ` Junio C Hamano @ 2005-12-02 9:19 ` linux 2005-12-02 10:12 ` Junio C Hamano 2005-12-02 11:37 ` linux 0 siblings, 2 replies; 64+ messages in thread
From: linux @ 2005-12-02 9:19 UTC (permalink / raw)
To: git; +Cc: junkio, linux

I was playing with the implications of the "deleted file in the index is not a conflict" merge rule, and came up with the following octopus test which fails to work. Note line 2 when choosing a directory to run it in!

#!/bin/bash -xe
rm -rf .git
git-init-db
echo "File A" > a
echo "File B" > b
echo "File C" > c
git-add a b c
git-commit -a -m "Octopus test repository"
git-checkout -b a
echo "Modifications to a" >> a
git-commit -a -m "Modified file a"
git-checkout -b b master
echo "Modifications to b" >> b
git-commit -a -m "Modified file b"
git-checkout -b c master
rm c
git-commit -a -m "Deleted file c"
git-checkout master
#git merge --no-commit "" master c b a
#git merge --no-commit "" master a b c
git-rev-parse a b c > .git/FETCH_HEAD
git-octopus

(Commented out are the first few things I tried.)

Can someone tell me why this doesn't work? It should be a simple in-index merge. Right after the incomplete merge (I hacked this into the git-octopus script), git-ls-files -s produces

100644 8fb437b77759c7709c122fbc8ba43f720e1fbc0a 0 a
100644 b3418f25da4393974aa205e2863f012e5b503369 0 b
100644 df78d3d51c369e1d2f1eadb73464aadd931d56b4 1 c
100644 df78d3d51c369e1d2f1eadb73464aadd931d56b4 2 c

Which should be case 10 of the t/t1000-read-tree-m-3way.sh table and succeed.

Other things I've discovered...

1) The MAJOR difference between "git checkout" and "git reset --hard" is that git-checkout takes a *head* as an argument and changes the .git/HEAD *symlink* to point to that head (ln -sf refs/heads/<head> .git/HEAD). "git reset" takes a *commit* (<rev>) as an argument and changes the head that .git/HEAD points to to have that commit as its new tip (git-rev-parse <rev> > .git/HEAD). All the other behavioural differences are relatively minor, and appropriate for this big difference.

2) Don't use "git branch" to create branches, unless you really *don't* want to switch to them. Use "git checkout -b".

3) Dumb question: why does "git-commit-tree" need "-p" before the parent commit arguments? Isn't just argv[2]..argv[argc-1] good enough?

4) In the "git-read-tree" docs for "--reset", does "ignored" mean "not overwritten" or "overwritten"?

5) The final "error" message on "git-merge --no-commit" is a bit alarming for a newbie who uses it because they don't quite trust git enough to enable auto-commit. And it should be changed from "Automatic merge failed/prevented; fix up by hand" to "fix up and commit by hand". Or how about: "Automatic commit prevented; edit and commit by hand." which actually tells the truth.

6) The "pickaxe" options are a bit confusing, and the fact they're only documented in cvs-migration.txt doesn't help.

7) The git-tag man page could use a little better description of -a.

^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 9:19 ` More merge questions (why doesn't this work?) linux @ 2005-12-02 10:12 ` Junio C Hamano 2005-12-02 13:09 ` Sven Verdoolaege 2005-12-02 11:37 ` linux 1 sibling, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-12-02 10:12 UTC (permalink / raw) To: linux; +Cc: git linux@horizon.com writes: > Which should be case 10 of the t/t1000-read-tree-m-3way.sh > table and succeed. Yes. The reason is git-read-tree's behaviour was changed underneath while octopus was looking elsewhere ;-). See Documentation/technical/trivial-merge.txt, last couple of lines. There are two schools of thoughts about "both sides remove" (case #10) case. Some people argued that "the branches might have renamed that path to different paths and might indicate a rename/rename conflict" (meaning read-tree should not consider it trivial, and leave that to upper level "policy layer" to decide). merge-one-file policy simply says "no, they both wanted to remove them". If I recall correctly, read-tree itself merged this case before multi-base rewrite happened (if you are curious, run 'git whatchanged -p read-tree.c' and look for "Rewrite read-tree"). > 1) The MAJOR difference between "git checkout" and "git reset --hard" True. "git reset --hard" should be used without <rev> by novices and with <rev> after they understand what they are doing (it is used for rewinding/warping heads). > 2) Don't use "git branch" to create branches, unless you really > *don't* want to switch to them. Use "git checkout -b". Because...? "git branch foo && git checkout foo" may be suboptimal to type, but it is not _wrong_; it does not do anything bad or incorrect. > 3) Dumb question: why does "git-commit-tree" need "-p" before the > parent commit arguments? Isn't just argv[2]..argv[argc-1] > good enough? 1. Why not? 2. I myself wondered about it long time ago. 3. It does not matter; nobody types that command by hand. 4. 
It allows us to later add some other flags to commit-tree (none planned currently). > 4) If the "git-read-tree" docs for "--reset", does "ignored" mean > "not overwritten" or "overwritten"? That sentence is very poorly written; a better paraphrasing is appreciated. $ git whatchanged -S--reset \ read-tree.c Documentation/git-read-tree.txt shows logs for 438195ccedce7270cf5ba167a940c90467cb72d7 commit (run "git-cat-file commit 438195cc" to read it). It ignores existing unmerged entries when reconstructing the index from the given tree ("git-read-tree -m", given an unmerged index, refuses to operate, but "--reset" *ignores* the unmerged ones hence it does not refuse to operate). > 5) The final "error" message on "git-merge --no-commit" is a bit > alarming for a newbie who uses it... First of all, --no-commit is not meant to be used by newbies, but you are right. Patches to make the failure message conditional are welcome. It should switch on these three cases: - "--no-commit" option is given, but a merge conflict would have prevented autocommit anyway; - "--no-commit" option is given, but automerge succeeded; - conflict prevented autocommit. > 6) The "pickaxe" options are being a bit confusing, and the fact they're > only documented in cvs-migration.txt doesn't help. Docs of git-diff-* family have OPTIONS section, at the end of which refers you to the diffcore documentation. Suggestions to a better organization and a patch is appropriate here. > 7) The git-tag man page could use a little better description of -a. Please. It should have the same "OPTIONS" section as others do. ^ permalink raw reply [flat|nested] 64+ messages in thread
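The checkout versus "reset --hard" distinction discussed in this exchange can be shown with a toy model built from plain files. Everything here is an assumption made for illustration: `toy_checkout` and `toy_reset` are hypothetical names, a one-line `HEAD` file stands in for the `.git/HEAD` symlink, and the index and working tree updates that the real commands also perform are left out.

```sh
# Toy layout: DIR/HEAD names a ref, DIR/refs/heads/NAME holds a commit id.

# toy_checkout DIR BRANCH
# Repoints HEAD at another branch; no branch tip changes.
toy_checkout() {
    printf 'refs/heads/%s\n' "$2" >"$1/HEAD"
}

# toy_reset DIR COMMIT
# Leaves HEAD alone and moves the tip of whichever branch HEAD names.
toy_reset() {
    ref=$(cat "$1/HEAD")
    printf '%s\n' "$2" >"$1/$ref"
}
```

In this model a checkout never loses a tip, while a reset overwrites one, which is why the reply suggests giving `git reset --hard` a <rev> only once the consequences are understood.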
* Re: More merge questions (why doesn't this work?) 2005-12-02 10:12 ` Junio C Hamano @ 2005-12-02 13:09 ` Sven Verdoolaege 2005-12-02 20:32 ` Junio C Hamano 0 siblings, 1 reply; 64+ messages in thread From: Sven Verdoolaege @ 2005-12-02 13:09 UTC (permalink / raw) To: Junio C Hamano; +Cc: linux, git On Fri, Dec 02, 2005 at 02:12:42AM -0800, Junio C Hamano wrote: > linux@horizon.com writes: > > 3) Dumb question: why does "git-commit-tree" need "-p" before the > > parent commit arguments? Isn't just argv[2]..argv[argc-1] > > good enough? > > 3. It does not matter; nobody types that command by hand. > I do. git commit won't let me commit an empty tree, or at least I haven't figured out how to make it do that. I also used it when, after resolving a merge initiated by cg-merge, cogito (or at least the version I had installed at the time) wouldn't let me commit it because a new file I had pulled in contained non-ascii characters in its name. skimo ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 13:09 ` Sven Verdoolaege @ 2005-12-02 20:32 ` Junio C Hamano 2005-12-05 15:01 ` Sven Verdoolaege 0 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02 20:32 UTC (permalink / raw)
To: Sven Verdoolaege; +Cc: git

Sven Verdoolaege <skimo@kotnet.org> writes:

>> 3. It does not matter; nobody types that command by hand.

I should have said "nobody should need to type that, otherwise fix your Porcelain".

> I do. git commit won't let me commit an empty tree, or at
> least I haven't figured out how to make it do that.

You are right, at least for the initial commit (for subsequent commits it happily commits an empty tree).

Now why anybody would want to do it is a different matter. Is it because you would want to record that your project started from scratch, as opposed to some import from an existing non versioned (or versioned by another SCM) working tree?

> I also used it when, after resolving a merge initiated by
> cg-merge, cogito (or at least the version I had installed
> at the time) wouldn't let me commit it because a new file
> I had pulled in contained non-ascii characters in its name.

That sounds like a simple Porcelain bug and I hope neither Cogito nor git has that problem now.

^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 20:32 ` Junio C Hamano @ 2005-12-05 15:01 ` Sven Verdoolaege 0 siblings, 0 replies; 64+ messages in thread
From: Sven Verdoolaege @ 2005-12-05 15:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git

On Fri, Dec 02, 2005 at 12:32:17PM -0800, Junio C Hamano wrote:
> Sven Verdoolaege <skimo@kotnet.org> writes:
> > I do. git commit won't let me commit an empty tree, or at
> > least I haven't figured out how to make it do that.
>
> You are right, at least for the initial commit (for subsequent
> commits it happily commits an empty tree).
>
> Now why anybody would want to do it is a different matter. Is it
> because you would want to record that your project started from
> scratch, as opposed to some import from an existing non
> versioned (or versioned by another SCM) working tree?

Something like that, yes. In the beginning there was nothing and git committed the nothingness.

skimo

^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 9:19 ` More merge questions (why doesn't this work?) linux 2005-12-02 10:12 ` Junio C Hamano @ 2005-12-02 11:37 ` linux 2005-12-02 20:31 ` Junio C Hamano 1 sibling, 1 reply; 64+ messages in thread From: linux @ 2005-12-02 11:37 UTC (permalink / raw) To: junkio; +Cc: git, linux > Yes. The reason is git-read-tree's behaviour was changed > underneath while octopus was looking elsewhere ;-). See > Documentation/technical/trivial-merge.txt, last couple of > lines. > There are two schools of thoughts about "both sides remove" > (case #10) case. Um, I'm looking at the one-side remove case, which t/t1000 calls O A B result index requirements ------------------------------------------------------------------- 10 exists O==A missing remove ditto ------------------------------------------------------------------ while trivial-merge.txt says is: case ancest head remote result ---------------------------------------- 10 ancest^ ancest (empty) no merge I assumed the test case was probably more accurate, given that it's coupled to code which actually verifies the behaviour. > Some people argued that "the branches might > have renamed that path to different paths and might indicate a > rename/rename conflict" (meaning read-tree should not consider > it trivial, and leave that to upper level "policy layer" to > decide). merge-one-file policy simply says "no, they both > wanted to remove them". If I recall correctly, read-tree itself > merged this case before multi-base rewrite happened (if you are > curious, run 'git whatchanged -p read-tree.c' and look for > "Rewrite read-tree"). Aren't you talking about case #6? O A B result index requirements ------------------------------------------------------------------- 6 exists missing missing remove must not exist. 
------------------------------------------------------------------ case ancest head remote result ---------------------------------------- 6 ancest+ (empty) (empty) no merge >> 1) The MAJOR difference between "git checkout" and "git reset --hard" > True. "git reset --hard" should be used without <rev> by > novices and with <rev> after they understand what they are > doing (it is used for rewinding/warping heads). For the longest time I had been under the delusion that "git-checkout <branch> *" and "git-reset --hard <branch>" were very similar operations (modulo your comments about deleting files): overwrite the index and working directory files with the versions from that branch. It's hard to say how much I managed to confuse myself by damaging test repositories while I didn't understand what was going on. >> 2) Don't use "git branch" to create branches, unless you really >> *don't* want to switch to them. Use "git checkout -b". > Because...? "git branch foo && git checkout foo" may be > suboptimal to type, but it is not _wrong_; it does not do > anything bad or incorrect. Yes, I know it works. I suggest avoiding it because there's a much more convenient alternative and I kept forgetting the second half and checking my changes in to the wrong branch. >> 3) Dumb question: why does "git-commit-tree" need "-p" before the >> parent commit arguments? Isn't just argv[2]..argv[argc-1] >> good enough? > 1. Why not? > 3. It does not matter; nobody types that command by hand. Because it's a real pain to get it properly quoted and set up in a shell script. "$@" is a lot simpler and easier, and old /bin/sh only has the one array which provides that magic quoting behaviour. (Admittedly, you usually pass the arguments through git-rev-parse first, and are then guaranteed no embedded whitespace.) > 4. It allows us to later add some other flags to commit-tree > (none planned currently). 
Making it disappear wouldn't preclude having more options, either, any more than the variable number of arguments to cp(1) or mv(1)... >> 4) If the "git-read-tree" docs for "--reset", does "ignored" mean >> "not overwritten" or "overwritten"? > That sentence is very poorly written; a better paraphrasing is > appreciated. diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt index 8b91847..47e2f93 100644 --- a/Documentation/git-read-tree.txt +++ b/Documentation/git-read-tree.txt @@ -31,8 +31,8 @@ OPTIONS Perform a merge, not just a read. --reset:: - - Same as -m except that unmerged entries will be silently ignored. + Same as -m except that unmerged entries will be silently overwritten + (instead of failing). -u:: After a successful merge, update the files in the work @@ -47,7 +47,6 @@ OPTIONS trees that are not directly related to the current working tree status into a temporary index file. - <tree-ish#>:: The id of the tree object(s) to be read/merged. >> 5) The final "error" message on "git-merge --no-commit" is a bit >> alarming for a newbie who uses it... > First of all, --no-commit is not meant to be used by newbies, > but you are right. Well, I can tell you that it's very very attractive to newbies. The first 5 or 10 times I tried git-merge, I used --no-commit. (My surprise was mostly that there wasn't a one-letter -x form.) "Do something really complicated and then commit it to the repository" is a frightening concept. "Do something really complicated and then stop and wait for you to see if it was what you expected" is a lot more comforting. >> 6) The "pickaxe" options are being a bit confusing, and the fact they're >> only documented in cvs-migration.txt doesn't help. > Docs of git-diff-* family have OPTIONS section, at the end of > which refers you to the diffcore documentation. Suggestions to > a better organization and a patch is appropriate here. 
That's a bigger job; I'll work on it when I've finished the docs I'm writing right now. :-) >> 7) The git-tag man page could use a little better description of -a. > Please. It should have the same "OPTIONS" section as others do. I know NOTHING about asciidoc, and really wish I could fix its lack-of-line-break problem: GIT-BISECT(1) GIT-BISECT(1) NAME git-bisect - Find the change that introduced a bug SYNOPSIS git bisect start git bisect bad <rev> git bisect good <rev> git bisect reset [<branch>] git bisect visualize git bisect replay <logfile> git bisect log but emulating what I saw elsewhere... diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt index 95de436..7635b1e 100644 --- a/Documentation/git-tag.txt +++ b/Documentation/git-tag.txt @@ -10,6 +10,26 @@ SYNOPSIS -------- 'git-tag' [-a | -s | -u <key-id>] [-f | -d] [-m <msg>] <name> [<head>] +OPTIONS +------- +-a:: + Make an unsigned (anotation) tag object + +-s:: + Make a GPG-signed tag, using the default e-mail address's key + +-u <key-id>:: + Make a GPG-signed tag, using the given key + +-f:: + Replace an existing tag with the given name (instead of failing) + +-d:: + Delete an existing tag with the given name + +-m <msg>:: + Use the given tag message (instead of prompting) + DESCRIPTION ----------- Adds a 'tag' reference in .git/refs/tags/ @@ -23,7 +43,7 @@ creates a 'tag' object, and requires the in the tag message. Otherwise just the SHA1 object name of the commit object is -written (i.e. an lightweight tag). +written (i.e. a lightweight tag). A GnuPG signed tag object will be created when `-s` or `-u <key-id>` is used. When `-u <key-id>` is not used, the ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 11:37 ` linux @ 2005-12-02 20:31 ` Junio C Hamano 2005-12-02 21:32 ` linux 2005-12-02 21:56 ` More merge questions linux 0 siblings, 2 replies; 64+ messages in thread From: Junio C Hamano @ 2005-12-02 20:31 UTC (permalink / raw) To: linux; +Cc: git linux@horizon.com writes: > Um, I'm looking at the one-side remove case, which t/t1000 calls > > O A B result index requirements > ------------------------------------------------------------------- > 10 exists O==A missing remove ditto > ------------------------------------------------------------------ > > while trivial-merge.txt says is: > > case ancest head remote result > ---------------------------------------- > 10 ancest^ ancest (empty) no merge > > I assumed the test case was probably more accurate, given that it's coupled > to code which actually verifies the behaviour. You are right. And the test expects something different from that table in t/t1000 test. Relevant are the lines for ND (one side No action the other Delete) in the "expected" file. The test expects the result to be unmerged. Interesting is that it did so from the day one [*1*]. The very original read-tree 3-way was quite conservative and left more things unmerged for the policy script to handle, and it is not surprising it started like this, but during the course of the project I thought read-tree was made to collapse more cases in index. I am a bit surprised we did not loosen it ever since [*2*]. Thanks for pointing out the discrepancy. We earlier agreed that the table in t/t1000 test should go and superseded by trivial-merge.txt, so what the table says right now is a non-issue, but we _might_ want to revisit the issue of what should happen in case #8 and #10 sometime in the future, as the last three lines of trivial-merge.txt mentions. I'd say we should leave things as they are for now, though. > --reset:: > - > - Same as -m except that unmerged entries will be silently ignored. 
> + Same as -m except that unmerged entries will be silently overwritten > + (instead of failing). Thanks. > "Do something really complicated and then commit it to the repository" > is a frightening concept. "Do something really complicated and > then stop and wait for you to see if it was what you expected" is > a lot more comforting. Fair enough. >>> 7) The git-tag man page could use a little better description of -a. > >> Please. It should have the same "OPTIONS" section as others do. > > I know NOTHING about asciidoc, and really wish I could fix its > lack-of-line-break problem: Thanks for pointing that one out. I think Josef recently did a similar linebreak fix on the git-mv page. I'll try and see if I can mimic what he did [*3*]. > diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt > index 95de436..7635b1e 100644 > --- a/Documentation/git-tag.txt > +++ b/Documentation/git-tag.txt Thanks; applied. [Footnotes] *1* A pickaxe example: $ git whatchanged -p -S'100644 1 ND 100644 2 ND' shows only two commits. One is the first version of the test, and the other is to adjust for the output format from *2* Further archaeology revealed that I did loosening primarily for the 2-way side, and did not touch much about 3-way merge other than what used to be marked with ALT. There was no 10ALT ever so it shows that my memory is simply faulty ;-). *3* I did that, and it renders HTML side nicer, but it breaks manpages X-<. Inputs from asciidoc gurus are appreciated. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 20:31 ` Junio C Hamano @ 2005-12-02 21:32 ` linux 2005-12-02 22:00 ` Junio C Hamano 2005-12-02 22:12 ` Linus Torvalds 2005-12-02 21:56 ` More merge questions linux 1 sibling, 2 replies; 64+ messages in thread From: linux @ 2005-12-02 21:32 UTC (permalink / raw) To: junkio, linux; +Cc: git > We earlier agreed that the table in t/t1000 test should go and > superseded by trivial-merge.txt, so what the table says right > now is a non-issue, but we _might_ want to revisit the issue of > what should happen in case #8 and #10 sometime in the future, as > the last three lines of trivial-merge.txt mentions. I'd say we > should leave things as they are for now, though. But back to my original problem... I don't much care whether it's done as a trivial merge or a non-trivial merge, but why the #%@#$ can't it be done as an automatic merge? As I said, I'm trying to build (and write down) a mental model, so the behaviour of git can be predicted. My mental model says this should work. It doesn't. Therefore my mental model is incorrect, and I don't actually understand what it's doing. #!/bin/bash -xe rm -rf .git git-init-db echo "File A" > a echo "File B" > b echo "File C" > c git-add a b c git-commit -a -m "Octopus test repository" git-checkout -b a echo "Modifications to a" >> a git-commit -a -m "Modified file a" git-checkout -b b master echo "Modifications to b" >> b git-commit -a -m "Modified file b" git-checkout -b c master rm c git-commit -a -m "Deleted file c" git-checkout master git merge "Merge a, b, c" master a b c produces... + git merge 'Merge a, b, c' master a b c Trying simple merge with a Trying simple merge with b Trying simple merge with c Simple merge did not work, trying automatic merge. Removing c fatal: merge program failed No merge strategy handled the merge. > *3* I did that, and it renders HTML side nicer, but it breaks > manpages X-<. Inputs from asciidoc gurus are appreciated. 
I tried adding " +" at end-of-line, which is supposed to force a line break, but that didn't have any effect. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 21:32 ` linux @ 2005-12-02 22:00 ` Junio C Hamano 2005-12-02 22:12 ` Linus Torvalds 1 sibling, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-12-02 22:00 UTC (permalink / raw) To: git <linux <at> horizon.com> writes: > + git merge 'Merge a, b, c' master a b c > Trying simple merge with a > Trying simple merge with b > Trying simple merge with c > Simple merge did not work, trying automatic merge. > Removing c > fatal: merge program failed > No merge strategy handled the merge. I think this is the same problem I fixed yesterday after the breakage report from Luben Tuikov. You need the ce3ca275452cf069eb6451d6f5b0f424a6f046aa commit. Sorry about that. Could you try the latest and see if it still breaks? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 21:32 ` linux 2005-12-02 22:00 ` Junio C Hamano @ 2005-12-02 22:12 ` Linus Torvalds 2005-12-02 23:14 ` linux 1 sibling, 1 reply; 64+ messages in thread From: Linus Torvalds @ 2005-12-02 22:12 UTC (permalink / raw) To: linux; +Cc: junkio, git On Fri, 2 Dec 2005, linux@horizon.com wrote: > > produces... > > + git merge 'Merge a, b, c' master a b c > Trying simple merge with a > Trying simple merge with b > Trying simple merge with c > Simple merge did not work, trying automatic merge. > Removing c > fatal: merge program failed > No merge strategy handled the merge. I'm getting ... + git merge 'Merge a, b, c' master a b c Trying simple merge with a Trying simple merge with b Trying simple merge with c Simple merge did not work, trying automatic merge. Removing c Merge 9ca217790c7e6581fe0b8b3b4baf026d03584c66, made by octopus. a | 1 + b | 1 + c | 1 - 3 files changed, 2 insertions(+), 1 deletions(-) delete mode 100644 c and I don't see why you wouldn't get that too. Do you have that broken version of git that had problems with "rmdir" and thought the unlink failed? Linus ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions (why doesn't this work?) 2005-12-02 22:12 ` Linus Torvalds @ 2005-12-02 23:14 ` linux 0 siblings, 0 replies; 64+ messages in thread From: linux @ 2005-12-02 23:14 UTC (permalink / raw) To: linux, torvalds; +Cc: git, junkio > and I don't see why you wouldn't get that too. > > Do you have that broken version of git that had problems with "rmdir" and > thought the unlink failed? Quite possibly; my previous version of git was 27 November. (I've been using the Debian package builder, which insists on a full rebuild each time and is thus annoyingly slow... especially the "xmlto man" part. I think I'll switch to "make; make install") Anyway, updated and it works as expected. Sorry for the spurious complaint. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: More merge questions 2005-12-02 20:31 ` Junio C Hamano 2005-12-02 21:32 ` linux @ 2005-12-02 21:56 ` linux 1 sibling, 0 replies; 64+ messages in thread From: linux @ 2005-12-02 21:56 UTC (permalink / raw) To: git; +Cc: junkio, linux Just thinking about the difference between 2-way and 3-way merge... *Mostly* a 2-way merge is just a 3-way merge where one of the ways is taken from the index rather than from a tree. But there are some subtle differences. This difference is what forces octopus merge to form intermediate tree objects when doing its merges. If there were a way to merge directly into the index, octopus merge wouldn't have to make intermediate tree objects that would have to be garbage-collected later. (Indeed, I originally assumed that Octopus did all its merges in the index; it's only when I traced the code that I saw it calls git-write-tree multiple times.) Is the time saved, and space not wasted, worth implementing a 2-way merge that more exactly matches 3-way? It should be fairly straightforward to share the actual merging code. Opinions solicited. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 6:09 ` git-name-rev off-by-one bug linux 2005-11-30 6:39 ` Junio C Hamano @ 2005-11-30 16:12 ` Linus Torvalds 1 sibling, 0 replies; 64+ messages in thread From: Linus Torvalds @ 2005-11-30 16:12 UTC (permalink / raw) To: linux; +Cc: junkio, git, pasky On Tue, 30 Nov 2005, linux@horizon.com wrote: > > > +-0 -1 -2:: > > + When an unmerged entry is seen, diff against the base version, > > + the "first branch" or the "second branch" respectively. > > + > > + The default is to diff against the first branch. > > Er... why are these flags zero-based? Because it makes more sense from a "git diff" standpoint to do that. The fact that _internally_, git puts the first branch into "stage 2", and the second one into "stage 3", that's very much an internal git implementation issue that makes no sense to expose to a regular user. > git-ls-files -s displays them as "1", "2" and "3". All the docs talk > about "stage1", "stage2" and "stage3". Yes, but those are _technical_ docs, not docs aimed toward a user. Nobody sane uses "git-ls-files --stage" outside of a script, or unless they really know git and are trying to debug something. From a user standpoint, it makes a lot more sense to say "primary branch" and "other branch", and then "-1" and "-2" make sense (and then the "base of the merge" makes sense as "-0"). Linus ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 3:12 ` Linus Torvalds 2005-11-30 5:06 ` Linus Torvalds @ 2005-11-30 7:18 ` Junio C Hamano 2005-11-30 9:05 ` Junio C Hamano 2005-11-30 9:42 ` Junio C Hamano 1 sibling, 2 replies; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 7:18 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Linus Torvalds <torvalds@osdl.org> writes: > On Tue, 29 Nov 2005, Junio C Hamano wrote: >> >> I have actually resolved one conflicting merge with this and it >> was OK, except that it was a bit unpleasant when I first did >> "git-diff-index HEAD" without giving any path ;-), > > What does "git-diff-files" do? Just output a lot of nasty "unmerged" > messages? That was not what was unpleasant. What was unpleasant was those "unmerged" messages were buried under heap of normal diffs, showing the successfully merged entries as the result of merge. I am inclined to munge your patch to do this: * Change -0/-1/-2 to -1/-2/-3 to be consistent with stage numbers, for technically minded. * Give --base, --ours, and --theirs as synonyms for -1/-2/-3, for end users. * Change it not to pick other unmerged stage (I sent a separate message about this already). * Change the diff_unmerged_stage default to 0. While you are inspecting a conflicted merge, you need to give --ours (or -2) explicitly. Alternatively we could first check if the whole index is unmerged and make it default to 2 without flags, but that would mean inspecting 19K entries first before starting the main loop for the kernel for normal case. With hot cache it is fine, so I'll try it first. This "with unmerged defaults to 2 otherwise defaults to 0" behaviour needs to be made overridable with an option, say -0 (or --merged, but that is overkill); otherwise you cannot get diffs for merged paths until the index file is unmerged. * When diff_unmerged_stage is zero, keep the current behaviour. Show diff for only specified stage when diff_unmerged_stage is not zero. 
^ permalink raw reply [flat|nested] 64+ messages in thread
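The flag scheme proposed in the message above can be sketched in a few lines. This is only an illustration of the proposal as described, not git's code: the names STAGE_FOR_FLAG, default_stage and stage_to_diff are made up here, and "the index is unmerged" is assumed to mean "at least one entry sits at a non-zero stage".

```python
# Illustrative sketch of the proposed flag-to-stage mapping.
# All names here are hypothetical; none of them come from git itself.
STAGE_FOR_FLAG = {
    "-0": 0,                 # force diffs for merged (stage 0) paths
    "-1": 1, "--base": 1,    # common ancestor
    "-2": 2, "--ours": 2,    # our branch
    "-3": 3, "--theirs": 3,  # the other branch
}


def default_stage(index_stages):
    """Pick the stage to diff against when no flag is given.

    index_stages is an iterable of the stage numbers found in the index.
    Assumption: any non-zero stage means the index has unmerged entries,
    in which case the default becomes 2 ("ours"); otherwise it stays 0.
    """
    return 2 if any(stage != 0 for stage in index_stages) else 0


def stage_to_diff(flag, index_stages):
    """Resolve an optional command-line flag to a stage number."""
    if flag is None:
        return default_stage(index_stages)
    return STAGE_FOR_FLAG[flag]
```

With a fully merged index and no flag the sketch picks stage 0; once any entry is unmerged it picks stage 2, and "-0" overrides that so merged paths can still be diffed.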
* Re: git-name-rev off-by-one bug 2005-11-30 7:18 ` Junio C Hamano @ 2005-11-30 9:05 ` Junio C Hamano 2005-11-30 9:42 ` Junio C Hamano 1 sibling, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 9:05 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Junio C Hamano <junkio@cox.net> writes: > Linus Torvalds <torvalds@osdl.org> writes: > >> On Tue, 29 Nov 2005, Junio C Hamano wrote: >>> >>> I have actually resolved one conflicting merge with this and it >>> was OK, except that it was a bit unpleasant when I first did >>> "git-diff-index HEAD" without giving any path ;-), >> >> What does "git-diff-files" do? Just output a lot of nasty "unmerged" >> messages? > > That was not what was unpleasant. What was unpleasant was those > "unmerged" messages were buried under heap of normal diffs, > showing the successfully merged entries as the result of merge. > > I am inclined to munge your patch to do this: This I have done, and pushed out to "pu" for tonight. After doing some more test I'll have this graduate to "master" sometime tomorrow along with other accumulated changes. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 7:18 ` Junio C Hamano 2005-11-30 9:05 ` Junio C Hamano @ 2005-11-30 9:42 ` Junio C Hamano 1 sibling, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 9:42 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Junio C Hamano <junkio@cox.net> writes: > Linus Torvalds <torvalds@osdl.org> writes: > >> What does "git-diff-files" do? Just output a lot of nasty "unmerged" >> messages? > > That was not what was unpleasant. What was unpleasant was those > "unmerged" messages were buried under heap of normal diffs, > showing the successfully merged entries as the result of merge. Correction. The above is a faulty memory, and does not happen, with or without your "stage0 or stage2" patch. Cleanly merged paths are written out to working tree and collapsed to stage0 in the index, so diff-files wouldn't have shown them at all. Sorry about the confused statement. What I saw was my local modifications on the paths unrelated to the merge. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 2:33 ` Junio C Hamano 2005-11-30 3:12 ` Linus Torvalds @ 2005-11-30 3:15 ` linux 1 sibling, 0 replies; 64+ messages in thread From: linux @ 2005-11-30 3:15 UTC (permalink / raw) To: junkio, torvalds; +Cc: git, linux, pasky > This one is done with the updated merge-one-file, which leaves > unmerged entries in the index file to prevent unresolved merge > from getting committed by mistake. > > After "git pull ..." fails, earlier the user said: > > $ git-diff > > to see half-merged state. Now git-diff just says: > > $ git-diff > * Unmerged path ls-tree.c > > In order to get the earlier "show me the failed merge relative > to my HEAD", you can say: > > $ git-diff HEAD ls-tree.c Cool! You all know I like this change, mostly because it makes git's merging conceptually cleaner and easier to explain. Looking at git, the difference between it and other SCMs is the emphasis on merging over editing. The tools for local development are a bit primitive in core git, but that's a well-understood problem and the tools can be implemented as needed. What makes git special is the assumption that a patch is going to pass through several people on its way from the text editor to the release, so merging is actually more important than initial writing. Rather than saying "Linus doesn't scale" and giving up, it's seen as an Amdahl's law problem - the goal is to remove as much work from Linus as possible and thus make him scale. (The shorter and earthier way to say this is that Linus is a lazy bastard, which is no surprise to anyone who's seen him try to hide behind a podium. ;-) ) And an essential part of making that work is a good toolkit for dealing with merging, and particularly in-progress merges. By having the concept of an unmerged index, git lets you develop merging algorithms in a modular way, as opposed to "one big hairy pile of magic DWIMmery" that people are afraid to touch.
For example, one thing I'm sure will arrive fairly soon is file-type specific merge algorithms. For something like a .po file where the order of sections doesn't matter, merging ad->abd and ad->acd can be fully automated. There are a number of good ideas in git, but from what I've seen so far, "git-read-tree -m" is the most important one. Making git-diff Do The Right Thing is a relatively minor matter. ^ permalink raw reply [flat|nested] 64+ messages in thread
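The .po example in the message above (merging ad->abd with ad->acd when section order does not matter) can be illustrated with a set-based three-way merge. This is a minimal sketch under strong assumptions: each section is reduced to a bare identifier, the function name merge_unordered is hypothetical, and conflicting edits inside one section are not handled at all.

```python
def merge_unordered(base, ours, theirs):
    """Three-way merge of unordered collections of section identifiers.

    Anything added on either side is kept; anything removed on either
    side (relative to base) is dropped. Ordering is ignored entirely.
    """
    base, ours, theirs = set(base), set(ours), set(theirs)
    removed = (base - ours) | (base - theirs)
    return (ours | theirs) - removed


# The example from the message: base "ad", one side adds b, the other adds c.
print(sorted(merge_unordered("ad", "abd", "acd")))  # ['a', 'b', 'c', 'd']
```

A real file-type specific merge driver would also have to compare the contents of sections present on both sides; the sketch only shows why the "both sides inserted at the same spot" conflict disappears once order is irrelevant.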
* Re: git-name-rev off-by-one bug 2005-11-30 1:51 ` Linus Torvalds 2005-11-30 2:06 ` Junio C Hamano 2005-11-30 2:33 ` Junio C Hamano @ 2005-11-30 18:11 ` Daniel Barkalow 2 siblings, 0 replies; 64+ messages in thread From: Daniel Barkalow @ 2005-11-30 18:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux, junkio, git, pasky On Tue, 29 Nov 2005, Linus Torvalds wrote: > If we left things in the index in an unmerged state, we'd be guaranteed to > either _fail_ that git commit unless somebody has done the > git-update-index (or names the files specifically on the commit command > line, which will do it for you). At this point, we could have a "git-merged-by-hand" script that would take filenames, check that they're unmerged now, and, if so, call git-update-index for them. And it could have a -a to do all of the unmerged entries (i.e., "I'm done merging by hand"), and maybe also have a flag to git-commit that does this, so you can say, "Commit the merge I did by hand, whatever filenames it used, but not any other changes I may have had beforehand." The "merged-by-hand" script would probably be a sensible place to complain about leftover conflict markers (unless you force it). -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 64+ messages in thread
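The "complain about leftover conflict markers" check suggested in the message above could look roughly like this. It is a sketch of the idea only: the function name is hypothetical, "git-merged-by-hand" does not exist as a command, and the marker shapes assumed here are the usual `<<<<<<<`, `=======` and `>>>>>>>` lines written by merge tools.

```python
def leftover_conflict_markers(text):
    """Return the 1-based line numbers that look like conflict markers.

    Assumption: a marker is a line starting with seven '<' or seven '>'
    characters, or a line consisting of exactly seven '=' characters.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.startswith(("<<<<<<<", ">>>>>>>")) or line == "=======":
            hits.append(lineno)
    return hits
```

A wrapper script would run this over each file named on its command line and refuse to update the index for any file with a non-empty result, unless forced.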
* Re: git-name-rev off-by-one bug 2005-11-29 5:54 ` Junio C Hamano 2005-11-29 8:05 ` linux @ 2005-11-30 17:46 ` Daniel Barkalow 2005-11-30 20:05 ` Junio C Hamano 1 sibling, 1 reply; 64+ messages in thread From: Daniel Barkalow @ 2005-11-30 17:46 UTC (permalink / raw) To: Junio C Hamano; +Cc: linux, git On Mon, 28 Nov 2005, Junio C Hamano wrote: > *1* It is a shame that the most comprehensive definition of > 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test > script. Isn't Documentation/technical/trivial-merge.txt more comprehensive? Probably the tables in various other places should be replaced with references to this document. -Daniel ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 17:46 ` Daniel Barkalow @ 2005-11-30 20:05 ` Junio C Hamano 2005-11-30 21:06 ` Daniel Barkalow 0 siblings, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 20:05 UTC (permalink / raw) To: Daniel Barkalow; +Cc: git Daniel Barkalow <barkalow@iabervon.org> writes: > On Mon, 28 Nov 2005, Junio C Hamano wrote: > >> *1* It is a shame that the most comprehensive definition of >> 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test >> script. > > Isn't Documentation/technical/trivial-merge.txt more comprehensive? It describes the multi-base extension while the old one was done before the multi-base, so content-wise it may be more up to date. One thing I have most trouble with is that it is not obvious if the table is covering all the cases. You have to read from top to bottom and consider the first match as its fate [*1*]. I was about to write "with no match resulting in no merge", but it is not even obvious if there are cases that would fall off at the end from the table by just looking at it. Even worse, if we add "no match results in no merge" at the end, by definition it covers all the cases, but it is not obvious what those fall-off cases are (IOW, what kinds of conflict they are and why they are not handled). Another thing, perhaps more important, is that it does not seem to talk about index and up-to-dateness requirements much; it says something about what happens when "no merge" result is taken, but it is not clear about other cases. The table in t1000 test marks the case with "must match X" when index and tree X must agree at the path, and with "must match X and be up-to-date" when in addition the file in the working tree must match what is recorded in the index at the path (i.e. the former can have local modification in the working tree as long as index entry and tree match). This is vital in making sure that read-tree 3-way merge does not lose information from the working tree.
I am sure your updated *code* is doing the right thing, but the documentation is not clear about it. E.g. case 3ALT in the table says "take our branch if the path does not exist in one or more of common ancestors and the other branch does not have it" without saying anything about index nor up-to-dateness requirements. In this case, the index must match HEAD but the working tree file is allowed to have local modification (t1000 table says "must match A"). If somebody wants to audit if the current read-tree.c does the right thing for this case, he needs the documentation to tell him what should happen. There may be thinko in the design (IOW, the index requirements the documentation places may not make sense) that can be found during such an audit. There may be implementation error that the code does not match what the documentation says should happen. Not having that information in the case table makes these verification difficult. > Probably the tables in various other places should be replaced with > references to this document. I agree 100% that having them scattered all over is bad and the trivial-merge.txt is the logical place to consolidate them, but I do not think simply removing others and pointing at trivial-merge.txt without updating it is good enough. [Footnote] *1* That is OK from an implementation point of view (i.e. we can look at the table, and then go to C implementation and follow its if-elif chain to see if the same checks are done in the same order as specified in the document), but for somebody who wants to understand the semantics, i.e. what the thing it does means, by looking at the documentation it is harder to read. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-30 20:05 ` Junio C Hamano @ 2005-11-30 21:06 ` Daniel Barkalow 2005-11-30 22:00 ` Junio C Hamano 0 siblings, 1 reply; 64+ messages in thread From: Daniel Barkalow @ 2005-11-30 21:06 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Wed, 30 Nov 2005, Junio C Hamano wrote: > Daniel Barkalow <barkalow@iabervon.org> writes: > > > On Mon, 28 Nov 2005, Junio C Hamano wrote: > > > >> *1* It is a shame that the most comprehensive definition of > >> 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test > >> script. > > > > Isn't Documentation/technical/trivial-merge.txt more comprehensive? > > It describes the multi-base extension while the old one was done > before the multi-base, so content-wise it may be more up to date. > > One thing I have most trouble with is that it is not obvious if > the table is covering all the cases. You have to read from top > to bottom and consider the first match as its fate [*1*]. I actually had that problem with the original tables; there isn't a canonical order in which to list a table of all of the possible matches and non-matches between items so as to be complete. Perhaps it ought to list, on each line, which previous cases would match, so that you could see that case 2 is really the conditions of 2 minus the conditions for 2ALT, which is "all of the ancestors are empty, the head has a directory/file conflict, and remote exists." It can't fall off the table, because 1, 2, 3, 4, 6, 7, 9, and 11 cover all of the possibilities with respect to inputs being empty, and do not care about matching between the inputs. > I was about to write "with no match resulting in no merge", but it is > not even obvious if there are cases that would fall off at the > end from the table by just looking at it.
Even worse, if we add > "no match results in no merge" at the end, by definition it > covers all the cases, but it is not obvious what those fall-off > cases are (IOW, what kinds of conflict they are and why they are > not handled). > > Another thing, perhaps more important, is that it does not seem > to talk about index and up-to-dateness requirements much; it > says something about what happens when "no merge" result is > taken, but it is not clear about other cases. The table in > t1000 test marks the case with "must match X" when index and > tree X must agree at the path, and with "must match X and be > up-to-date" when in addition the file in the working tree must > match what is recorded in the index at the path (i.e. the former > can have local modification in the working tree as long as index > entry and tree match). But that is redundant information. I was actually confused by that part of the table for a long time, because it was not clear that the requirements followed a couple of simple rules (which I give above the table), and weren't actually chosen on a case-by-case basis. The implementation I did is actually much easier to verify, because it doesn't go into each case for the index requirements, but checks the actual rules: the index must match either the head or (if there is one) the merge result, and the index must not be dirty if there is a "no merge" result. Therefore, you can't lose any work in the index (either you didn't have any, or you did the same thing), and you can't lose any work in the working tree (either you didn't have any, or we're not going to use the working tree). Last time we discussed it ("Multi-ancestor read-tree notes"), you said: I like the second sentence in three-way merge description. That is a very easy-to-understand description of what the index requirements are. > This is vital in making sure that read-tree 3-way merge does not > lose information from the working tree.
I am sure your updated > *code* is doing the right thing, but the documentation is not > clear about it. E.g. case 3ALT in the table says "take our > branch if the path does not exist in one or more of common > ancestors and the other branch does not have it" without saying > anything about index nor up-to-dateness requirements. "If the index exists, it is an error for it not to match either the head or (if the merge is trivial) the result." "A result of "no merge" is an error if the index is not empty and not up-to-date." So the index is permitted to not exist (you missed this case), but if it exists, it must match HEAD (or, well, HEAD, which is the result). The index need not be up-to-date (since the result is not "no merge"), so the working tree doesn't matter. > > Probably the tables in various other places should be replaced with > > references to this document. > > I agree 100% that having them scattered all over is bad and the > trivial-merge.txt is the logical place to consolidate them, but > I do not think simply removing others and pointing at > trivial-merge.txt without updating it is good enough. Certainly, your complaints about the table should be addressed first. I think I'd addressed all your complaints from last time, but at that point, we got sidetracked into a discussion of the details of case #16. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 64+ messages in thread
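The two index rules quoted in the message above are compact enough to write down as a check. The following is a sketch of those rules only, not of read-tree.c: the function name, the NO_MERGE sentinel, and the representation of entries (a string for a blob, None for "does not exist") are all assumptions made for illustration.

```python
# Sentinel for a "no merge" outcome; purely illustrative.
NO_MERGE = object()


def check_index(index, head, result, up_to_date):
    """Apply the two stated index rules to one path.

    index, head: an entry (e.g. a blob id string) or None if missing.
    result: the trivially merged entry (possibly None), or NO_MERGE.
    up_to_date: True if the working tree file matches the index entry.
    Returns None when the merge may proceed, otherwise an error string.
    """
    if index is not None:
        allowed = [head]
        if result is not NO_MERGE:
            allowed.append(result)
        if index not in allowed:
            return "index matches neither head nor result"
    if result is NO_MERGE and index is not None and not up_to_date:
        return "'no merge' result with an index entry that is not up-to-date"
    return None
```

For case 3ALT as discussed above (the result is the head's entry), an index that matches HEAD passes even when the working tree has local modifications, and a missing index entry passes as well.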
* Re: git-name-rev off-by-one bug 2005-11-30 21:06 ` Daniel Barkalow @ 2005-11-30 22:00 ` Junio C Hamano 2005-11-30 23:12 ` Daniel Barkalow 0 siblings, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-11-30 22:00 UTC (permalink / raw) To: Daniel Barkalow; +Cc: git Daniel Barkalow <barkalow@iabervon.org> writes: > I actually had that problem with the original tables; there isn't a > canonical order in which to list a table of all of the possible matches > and non-matches between items so as to be complete. > > Perhaps it ought to list, on each line, which previous cases would match, > so that you could see that case 2 is really the conditions of 2 minus the > conditions for 2ALT, which is "all of the ancestors are empty, the head > has a directory/file conflict, and remote exists." Sorry, I was not clear about it when I did that table the first time. 2ALT was "alternatives suggested to replace 2" and listed in the same table for comparison purpose. The original table was designed in a way that if you have a match on case N, there would not be any other case M that matches the case, either N<M or M<N. IOW, the order to read the table did not matter. At least that was the intention. If you read "missing" = 0, "exists" = 1, and take OAB as bit2, bit1, and bit0, you can easily see the pattern in the table. It counts in binary, although bit1 has various subcases so the table has more than 8 rows, and it is easy to see it covers all. > "If the index exists, it is an error for it not to match either the > head or (if the merge is trivial) the result." > > "A result of "no merge" is an error if the index is not empty and not > up-to-date." That is good. > Certainly, your complaints about the table should be addressed first. I > think I'd addressed all your complaints from last time, but at that point, > we got sidetracked into a discussion of the details of case #16. ... which was a good thing to think about in itself. 
I feel I understand the new table a bit better. ^ permalink raw reply [flat|nested] 64+ messages in thread
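The binary counting described in the message above can be sketched as follows. This is an illustration only: it assumes O is the ancestor, A the head, and B the remote, and the helper name `oab_groups` is hypothetical. It enumerates only the eight top-level exists/missing groups, not the subcases that give the real table more than eight rows.

```python
def oab_groups():
    """List the eight exists/missing groups in table order.

    Each row is (bits, o_exists, a_exists, b_exists), where O is bit2,
    A is bit1 and B is bit0, with "missing" = 0 and "exists" = 1.
    """
    rows = []
    for n in range(8):
        o_exists = bool((n >> 2) & 1)  # bit2
        a_exists = bool((n >> 1) & 1)  # bit1
        b_exists = bool(n & 1)         # bit0
        rows.append((format(n, "03b"), o_exists, a_exists, b_exists))
    return rows


if __name__ == "__main__":
    for bits, o, a, b in oab_groups():
        print(bits, "O" if o else "-", "A" if a else "-", "B" if b else "-")
```

Because the loop runs over every value from 0 to 7, each combination of missing and existing entries appears exactly once, which is why it is easy to see that the top level of the table is complete.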
* Re: git-name-rev off-by-one bug 2005-11-30 22:00 ` Junio C Hamano @ 2005-11-30 23:12 ` Daniel Barkalow 2005-12-01 7:46 ` Junio C Hamano 0 siblings, 1 reply; 64+ messages in thread From: Daniel Barkalow @ 2005-11-30 23:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Wed, 30 Nov 2005, Junio C Hamano wrote: > Daniel Barkalow <barkalow@iabervon.org> writes: > > > I actually had that problem with the original tables; there isn't a > > canonical order in which to list a table of all of the possible matches > > and non-matches between items so as to be complete. > > > > Perhaps it ought to list, on each line, which previous cases would match, > > so that you could see that case 2 is really the conditions of 2 minus the > > conditions for 2ALT, which is "all of the ancestors are empty, the head > > has a directory/file conflict, and remote exists." > > Sorry, I was not clear about it when I did that table the first > time. 2ALT was "alternatives suggested to replace 2" and listed > in the same table for comparison purpose. I understood that; actually, when I found it, a number of the ALT cases had been implemented, and some of them supplemented rather than replaced the originals. > The original table was designed in a way that if you have a > match on case N, there would not be any other case M that matches > the case, either N<M or M<N. IOW, the order to read the table > did not matter. At least that was the intention. > > If you read "missing" = 0, "exists" = 1, and take OAB as bit2, > bit1, and bit0, you can easily see the pattern in the table. It > counts in binary, although bit1 has various subcases so the > table has more than 8 rows, and it is easy to see it covers all. The hard thing is to verify that all the subcases are listed. I switched the orders of matching and non-matching, so that I could make it matching and need-not-match. Your table is actually missing a few cases: what happens if O is missing, 2 has a directory, and 3 has a file?
You note that we have to be careful, but don't list the result (which is "no merge"). Perhaps the table would be clearer if the lines were grouped in exists/missing? (With 5ALT repeated in the 011 and 111 groups, since it applies to both) Then you would only need to look at 5 lines with cascading (in the most complex case), rather than having to read the whole top of the table. (It is actually written like that, with the exception of 5ALT, 2ALT, and 3ALT, but it's not visually obvious.) So case 11 is really: All three exist. Head and remote don't match (-5ALT), no ancestor matches remote (-13), and no ancestor matches head (-14). Case 13 is really: All three exist. Head and remote don't match (-5ALT), there aren't different ancestors which match head and remote (-16), and an ancestor matches remote. The tricky bit is really cases 2ALT and 3ALT, which can be used in cases where some but not all of the ancestors are empty, and can't be used if there's a directory/file conflict; neither of these conditions matters for anything else in the table, so it's hard to fit this in. My strategy is to have those as special cases, and have the rest of the table cover everything (rather than having case 2 require a directory/file conflict and case 7 require that no ancestor be empty, which would be accurate, but would make it harder to check for missing cases). -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 64+ messages in thread
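The cascading reading of the "all three exist" group described above can be sketched like this. It is an illustration under stated assumptions, not the read-tree implementation: the function name `classify_all_exist` is hypothetical, blob ids are plain strings, every ancestor is assumed to have the path, and placing case 14 after case 13 is inferred by symmetry from the description of cases 11 and 13. Only the case labels are returned; the merge result for each case is left to the table itself.

```python
def classify_all_exist(ancestors, head, remote):
    """Return the case label for a path present in all ancestors, head and remote.

    The checks cascade: each case implicitly excludes the ones tested before it.
    """
    if head == remote:
        return "5ALT"  # head and remote match
    anc_matches_head = head in ancestors
    anc_matches_remote = remote in ancestors
    if anc_matches_head and anc_matches_remote:
        # head != remote here, so these must be different ancestors
        return "16"
    if anc_matches_remote:
        return "13"  # an ancestor matches remote (minus 5ALT and 16)
    if anc_matches_head:
        return "14"  # an ancestor matches head (minus 5ALT and 16)
    return "11"      # minus 5ALT, 13 and 14: nothing matches


if __name__ == "__main__":
    print(classify_all_exist(["base"], "ours", "theirs"))
```

Read this way, case 11 never has to spell out its negative conditions: it is simply whatever falls through the earlier checks, which is the "-5ALT, -13, -14" notation used in the message.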
* Re: git-name-rev off-by-one bug 2005-11-30 23:12 ` Daniel Barkalow @ 2005-12-01 7:46 ` Junio C Hamano 0 siblings, 0 replies; 64+ messages in thread From: Junio C Hamano @ 2005-12-01 7:46 UTC (permalink / raw) To: Daniel Barkalow; +Cc: git The delight of working on a free software project is that you have so many good people with you that you can have many "Aha, lightbulb!" moments. This is one of them for me. I realized where the trouble I felt when reading your table came from; it was that I was focused too much on the old way the table was organized. When one constructs a case table to make sure one covered everything, one first lists the variables and the possible values they can take, and makes an NxMxOxPx... grid. In the original table I did, I chose what is in O and A and B as my variables (and that's where my comment about O being bit2 etc comes from). I did not realize the semantics and algorithm you used can be better described by a different set of variables (namely, how ancestors match the HEAD, and how the remote matches the HEAD, if I understand correctly). I had trouble understanding your version only because I kept thinking in terms of (O,A,B). So after thinking about that... Daniel Barkalow <barkalow@iabervon.org> writes: > On Wed, 30 Nov 2005, Junio C Hamano wrote: > > Perhaps the table would be clearer if the lines were grouped > in exists/missing? (With 5ALT repeated in the 011 and 111 > groups, since it applies to both) Then you would only need > to look at 5 lines with cascading (in the most complex > case), rather than having to read the whole top of the > table. I think the current ordering of cases makes more sense. If we forget about the case labels from the original table (and the way the original table classified cases), I suspect we could reorganize the cases to describe the semantics even better and more clearly. That is, not grouping by exists/missing, but grouping by matching/unmatching.
> (It is actually written like that, with the exception of 5ALT, 2ALT, and > 3ALT, but it's not visually obvious.) Yeah, I now realize that. > The tricky bit is really cases 2ALT and 3ALT, which can be used in cases > where some but not all of the ancestors are empty, and can't be used if > there's a directory/file conflict; neither of these conditions matters for > anything else in the table, so it's hard to fit this in. My strategy is to > have those as special cases, and have the rest of the table cover > everything (rather than having case 2 require a directory/file conflict > and case 7 require that no ancestor be empty, which would be accurate, but > would make it harder to check for missing cases). Makes sense. Thanks for the clarification and lightbulb moment. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-11-28 23:42 git-name-rev off-by-one bug linux 2005-11-29 5:54 ` Junio C Hamano @ 2005-12-01 10:14 ` Junio C Hamano 2005-12-01 21:50 ` Petr Baudis 1 sibling, 1 reply; 64+ messages in thread From: Junio C Hamano @ 2005-12-01 10:14 UTC (permalink / raw) To: linux; +Cc: git, Petr Baudis linux@horizon.com writes: > Anyway, if it's portable enough, it's faster. Ah... I just found discussion > of this in late September, but it's not clear what the resolution was. > http://marc.theaimsgroup.com/?t=112746188000003 Although updating our shell scripts to this century is lower on my priority scale, ideally I'd want to see things work with dash, not because I do not like bash/ksh, but because it seems the smallest minimally POSIXy shell. Speaking of shell gotchas, I do not know what the resolution was on the problem Merlyn was having the other day in "lost again on syntax change - local repository?" thread, which seemed that the failure described in <868xv86dam.fsf@blue.stonehenge.com> was his bash mishandling an if..then..elif..else..fi chain, which was sort of unexpected for me. I was curious but do not remember seeing the conclusion. Pasky, what happened to that thread? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-12-01 10:14 ` Junio C Hamano @ 2005-12-01 21:50 ` Petr Baudis 2005-12-01 21:53 ` Randal L. Schwartz 0 siblings, 1 reply; 64+ messages in thread From: Petr Baudis @ 2005-12-01 21:50 UTC (permalink / raw) To: Junio C Hamano; +Cc: linux, git, Randal L. Schwartz Dear diary, on Thu, Dec 01, 2005 at 11:14:19AM CET, I got a letter where Junio C Hamano <junkio@cox.net> said that... > Speaking of shell gotchas, I do not know what the resolution was > on the problem Merlyn was having the other day in "lost again on > syntax change - local repository?" thread, which seemed that the > failure described in <868xv86dam.fsf@blue.stonehenge.com> was > his bash mishandling an if..then..elif..else..fi chain, which > was sort of unexpected for me. I was curious but do not > remember seeing the conclusion. Pasky, what happened to that > thread? I'm still perplexed and curious about what _did_ git-send-pack actually receive as URL, since it apparently decided it's ssh as well. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: git-name-rev off-by-one bug 2005-12-01 21:50 ` Petr Baudis @ 2005-12-01 21:53 ` Randal L. Schwartz 0 siblings, 0 replies; 64+ messages in thread From: Randal L. Schwartz @ 2005-12-01 21:53 UTC (permalink / raw) To: Petr Baudis; +Cc: Junio C Hamano, linux, git >>>>> "Petr" == Petr Baudis <pasky@suse.cz> writes: Petr> I'm still perplexed and curious about what _did_ git-send-pack actually Petr> receive as URL, since it apparently decided it's ssh as well. Sorry... $work is swallowing my time right now. It's on my list of "very important things to get back to sometime real soon". -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! ^ permalink raw reply [flat|nested] 64+ messages in thread
end of thread, other threads:[~2005-12-09 1:45 UTC | newest] Thread overview: 64+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-11-28 23:42 git-name-rev off-by-one bug linux 2005-11-29 5:54 ` Junio C Hamano 2005-11-29 8:05 ` linux 2005-11-29 9:29 ` Junio C Hamano 2005-11-30 8:37 ` Junio C Hamano 2005-11-29 10:31 ` Petr Baudis 2005-11-29 18:46 ` Junio C Hamano 2005-12-04 21:34 ` Petr Baudis 2005-12-08 6:34 ` as promised, docs: git for the confused linux 2005-12-08 21:53 ` Junio C Hamano 2005-12-08 22:02 ` H. Peter Anvin 2005-12-09 0:47 ` Alan Chandler 2005-12-09 1:45 ` Petr Baudis 2005-12-09 1:19 ` Josef Weidendorfer 2005-11-29 21:40 ` git-name-rev off-by-one bug linux 2005-11-29 23:14 ` Junio C Hamano 2005-11-30 0:15 ` linux 2005-11-30 0:53 ` Junio C Hamano 2005-11-30 1:27 ` Junio C Hamano 2005-11-30 1:51 ` Linus Torvalds 2005-11-30 2:06 ` Junio C Hamano 2005-11-30 2:33 ` Junio C Hamano 2005-11-30 3:12 ` Linus Torvalds 2005-11-30 5:06 ` Linus Torvalds 2005-11-30 5:51 ` Junio C Hamano 2005-11-30 6:11 ` Junio C Hamano 2005-11-30 16:13 ` Linus Torvalds 2005-11-30 16:08 ` Linus Torvalds 2005-12-02 8:25 ` Junio C Hamano 2005-12-02 9:14 ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano 2005-12-02 9:15 ` [PATCH] merge-one-file: make sure we do not mismerge symbolic links Junio C Hamano 2005-12-02 9:16 ` [PATCH] git-merge documentation: conflicting merge leaves higher stages in index Junio C Hamano 2005-11-30 6:09 ` git-name-rev off-by-one bug linux 2005-11-30 6:39 ` Junio C Hamano 2005-11-30 13:10 ` More merge questions linux 2005-11-30 18:37 ` Daniel Barkalow 2005-11-30 20:23 ` Junio C Hamano 2005-12-02 9:19 ` More merge questions (why doesn't this work?) 
linux 2005-12-02 10:12 ` Junio C Hamano 2005-12-02 13:09 ` Sven Verdoolaege 2005-12-02 20:32 ` Junio C Hamano 2005-12-05 15:01 ` Sven Verdoolaege 2005-12-02 11:37 ` linux 2005-12-02 20:31 ` Junio C Hamano 2005-12-02 21:32 ` linux 2005-12-02 22:00 ` Junio C Hamano 2005-12-02 22:12 ` Linus Torvalds 2005-12-02 23:14 ` linux 2005-12-02 21:56 ` More merge questions linux 2005-11-30 16:12 ` git-name-rev off-by-one bug Linus Torvalds 2005-11-30 7:18 ` Junio C Hamano 2005-11-30 9:05 ` Junio C Hamano 2005-11-30 9:42 ` Junio C Hamano 2005-11-30 3:15 ` linux 2005-11-30 18:11 ` Daniel Barkalow 2005-11-30 17:46 ` Daniel Barkalow 2005-11-30 20:05 ` Junio C Hamano 2005-11-30 21:06 ` Daniel Barkalow 2005-11-30 22:00 ` Junio C Hamano 2005-11-30 23:12 ` Daniel Barkalow 2005-12-01 7:46 ` Junio C Hamano 2005-12-01 10:14 ` Junio C Hamano 2005-12-01 21:50 ` Petr Baudis 2005-12-01 21:53 ` Randal L. Schwartz