* [DRAFT] Branching and merging with git
From: linux @ 2006-11-16 22:17 UTC
To: git; +Cc: linux
I know it took me a while to get used to playing with branches, and I
still get nervous when doing something creative. So I've been trying
to get more comfortable, and wrote the following to document what I've
learned.
It's a first draft - I just finished writing it, so there are probably
some glaring errors - but I thought it might be of interest anyway.
* Branching and merging in git
In CVS, branches are difficult and awkward to use, and generally
considered an advanced technique. Many people use CVS for a long time
without departing from the trunk.
Git is very different. Branching and merging are central to effective use
of git, and if you aren't comfortable with them, you won't be comfortable
with git. In particular, they are required to share work with other
people.
The only things that are a bit confusing are some of the names.
In particular, at least when beginning:
- You create new branches with "git checkout -b".
"git branch" should only be used to list and delete branches.
- You share work with "git fetch" and "git push". These are opposites.
- You merge with "git pull", not "git merge". "git pull" can
also do a "git fetch", but that's optional. What's not optional
is the merge.
* A brief digression on command names.
Originally, all git commands were named "git-foo". When there got to
be over a hundred, people started complaining about the clutter in
/usr/bin. After some discussion, the following solution was reached:
- It's now possible to place all of the git-foo commands into a separate
directory. (Despite the complaints, not too many people are doing it
yet.)
- One option for git users is to add that directory to their $PATH.
- Another is provided by a wrapper called just "git". It's intended to
live in a public directory like /usr/bin, and knows the location of
the separate directory. When you type "git foo", it finds and executes
"git-foo".
- Some simple commands are built into the git wrapper. When you type
"git add", it just does it internally. (On the git mailing list,
you will see patches like "make git diff a builtin"; this is what
they're talking about.)
- For compatibility, for each builtin, there is a "git-add" file,
which is just a link to the "git" wrapper. It looks at the name it
was invoked as to figure out what it should do.
The one confusing thing is that, although people usually type "git foo"
in examples, "git foo" and "git-foo" are interchangeable in practice.
I go back and forth
for no good reason. The main caveat is that to get the man page, you
still need to type "man git-foo". Fortunately, there are two other ways
to get the man page:
1) "git help foo"
2) "git foo --help"
Git doesn't have a specialized built-in help system; it just shows you
the man pages.
One outstanding problem with git's man pages is that often the most detail
is in the command page that was written first, not the user-friendly
one that you should use. For example, there are a number of special
cases of the "git diff" command that were written first, and the man
pages for these commands (git-diff-index, git-diff-files, git-diff-tree,
and git-diff-stages) are considerably more informative than the page for
plain git-diff, even though that's the command that you should use 99%
of the time.
* Git's representation of history
As you recall from Git 101, there are exactly four kinds of objects in
Git's object database. All of them have globally unique 40-character hex
names made by hashing their type and contents. Blob objects record file
contents; they contain bytes. Tree objects record directory contents;
they contain file names, permissions, and the associated tree or blob
object names. Tag objects are shareable pointers to other objects;
they're generally used to store a digital signature.
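To make "hashing their type and contents" a bit more concrete, here is a
small sketch that computes the name of a blob. It assumes the usual git
object layout (a header of the type, a space, the length in bytes and a
NUL byte, followed by the content, all fed to SHA-1); that layout comes
from git's object format in general, not from anything stated above, and
the function name object_id is made up for the example.

```python
import hashlib


def object_id(obj_type: str, content: bytes) -> str:
    """Return the 40-character hex name for an object of the given type."""
    # Header is "<type> <size>\0", e.g. b"blob 12\0" for a 12-byte file.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty = object_id("blob", b"")
hello = object_id("blob", b"hello world\n")
print(empty)
print(hello)
```

Because the name depends only on the type and the bytes, two files with
the same contents always get the same blob name, wherever they live.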
And then, we come to commit objects. Every commit points to (contains
the name of) an associated tree object which records the state of the
source code at the time of the commit, and some descriptive data (time,
author, committer, commit comment) about the commit.
And most importantly, it contains a list of "parent commits", older
commits from which this one is derived. These pointers are what produce
the history graph.
Typically only one commit (the initial commit) has zero parents. It's
possible to have more than one such commit (if you merge two projects
with different history), but that's unusual.
Many commits have exactly one parent. These are made by a normal commit
after editing. From a branching and merging point of view, they're not
too exciting.
And then there are commits which have multiple parents. Two is most
common, but git allows many more. (There's a limit of sixteen in the
source code, and the most anyone's ever used in real life is 12, and
that was generally regarded as overdoing it. Google on "dodecapus"
for discussion of it.)
Finally, there are references, stored in the .git/refs directory.
These are the human-readable names associated with commits, and the
"root set" from which all other commits should be reachable.
These references are generally divided into two types, although
there is no fundamental difference:
- Tags are references that are intended to be immutable.
The "v1.2" tag is a historical record. A tag may point to
a tag object (which will hold a signature), or just to a commit
directly. The latter isn't cryptographically authenticated, but
works just fine for everyday use.
- Heads are references that are intended to be updated. "Head"
is actually synonymous with "branch", although one emphasizes the
tip more, while the other directs your attention to the entire
path that got there.
Either way, each one is just a 41-byte file that contains a 40-byte hex
object ID, plus a newline. Tags are stored in .git/refs/tags, and heads
are stored in .git/refs/heads. Creating a new branch is literally just
picking a file name and writing the ID of an existing commit into it.
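As a sketch of how little is involved, the following mimics that file
layout in a scratch directory. The commit ID is made up, and the
directory is a temporary one rather than a real repository; in a real
repository you would normally let the git commands write the file for you.

```python
import os
import tempfile

# A made-up commit ID; any 40 hex digits serve for the illustration.
commit_id = "0123456789abcdef0123456789abcdef01234567"

# Build a throwaway directory tree shaped like .git/refs/heads.
repo = tempfile.mkdtemp()
heads = os.path.join(repo, ".git", "refs", "heads")
os.makedirs(heads)

# "Creating a branch": pick a file name and write the commit ID into it.
ref_path = os.path.join(heads, "topic")
with open(ref_path, "w", newline="\n") as f:
    f.write(commit_id + "\n")

ref_size = os.path.getsize(ref_path)
print(ref_size)
```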
The git programs enforce the immutability of tags, but that's a safety
feature, not something fundamental. You can move a tag's file into the
heads directory and go wild.
The only limit on branches is clutter. A number of git commands have
ways to operate on "all heads", and if you have too many, it can get
annoying. If you're not using a branch, either delete it, or move it
somewhere (like the tags directory) where it won't clutter up the list of
"currently active heads".
(Note that CVS doesn't have this all-heads default, so people tend to
use longer branch names and keep them around after they've been merged
into the trunk. Old CVS repositories converted to git generally need
an old-branch cleanup.)
Another thing that's worth mentioning is that head and tag names can
contain slashes; i.e. you're allowed to make subdirectories in the
.git/refs/heads and .git/refs/tags directories. See the man page
for "git-check-ref-format" for full details of legal names.
* Naming revisions
CVS encourages you to tag like crazy, because the only other way to
find a given revision is by date. Git makes it a lot easier, so most
revisions don't need names.
You can find a full description in the git-rev-parse man page, but here's
a summary.
First of all, every commit has a globally unique name, its 40-digit hex
object ID. It's a bit long and awkward, but always works. This is useful
for talking about a specific commit on a mailing list. You can abbreviate
it to a unique prefix; most people find about 8 digits sufficient.
(Subversion is easier yet, because it assigns a sequential number to each
commit. However, that isn't possible in a distributed system like git.)
Second, you can refer to a head or tag name. Git looks in the
following places, in order, for a head:
1) .git
2) .git/refs
3) .git/refs/heads
4) .git/refs/tags
You should avoid having e.g. a head and a tag with the same name, but
if you do, you can specify one or the other with heads/foo and tags/foo.
Third, you can specify a commit relative to another. The simplest
one is "the parent", specified by appending ^ to a name. E.g. HEAD^
or deadbeef^. If there are multiple parents, then ^ is the same as ^1,
and the others are ^2, ^3, etc.
So the last few commits you've made are HEAD, HEAD^, HEAD^^, HEAD^^^, etc.
After a while, counting carets becomes annoying, so you can abbreviate
^^^^ as ~4. Note that this only lets you specify the first parent.
If you want to follow a side branch, you have to specify something like
"master~305^2~22".
* Converting between names
Git has two helpers (programs designed mainly for use in shell scripts)
to convert between global object IDs and human-readable names.
The first is git-rev-parse. This is a general git shell script helper,
which validates the command line and converts object names to absolute
object IDs. Its man page has a detailed description of the object
name syntax.
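To show what that conversion amounts to for the ^ and ~ suffixes, here
is a toy resolver over a hand-made history. The commit names, the HEADS
table and the resolve function are all invented for the example, and
the real git-rev-parse handles many more forms than this.

```python
import re

# Toy history: each commit maps to its ordered list of parents.
# "M" is a merge whose second parent is the side-branch commit "s2".
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "s1": ["b"],
    "s2": ["s1"],
    "M": ["c", "s2"],
    "d": ["M"],
}
HEADS = {"master": "d"}


def resolve(expr: str) -> str:
    """Resolve a name followed by any mix of ^, ^N and ~N suffixes."""
    m = re.match(r"[^~^]+", expr)
    if m is None:
        raise ValueError(f"no base name in {expr!r}")
    base, rest = m.group(0), expr[m.end():]
    if not re.fullmatch(r"(?:[~^]\d*)*", rest):
        raise ValueError(f"bad suffix in {expr!r}")
    commit = HEADS.get(base, base)
    if commit not in PARENTS:
        raise ValueError(f"unknown name {base!r}")
    for op, num in re.findall(r"([~^])(\d*)", rest):
        n = int(num) if num else 1
        if op == "^":
            if n > 0:                        # ^0 leaves the commit as it is
                commit = PARENTS[commit][n - 1]   # ^N: the Nth parent
        else:
            for _ in range(n):               # ~N: first parent, N times
                commit = PARENTS[commit][0]
    return commit


print(resolve("master^^"), resolve("master~2"), resolve("master~1^2~1"))
```

Here "master~1^2~1" steps back one commit to the merge, takes its second
parent to get onto the side branch, and then steps back once more.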
The second is git-name-rev, which converts the other way around. It's
particularly useful for seeing which tags a given commit falls between.
* Working with branches, the trivial cases.
By convention, the local "trunk" of git development is called "master".
This is just the name of the branch it creates when you start an empty
repository. You can delete it if you don't like the name.
If you create your repository by cloning someone else's repository, the
remote "master" branch is copied to a local branch named "origin". You
get your own "master" branch which is not tied to the remote repository.
There is always a current head, known as HEAD. (This is actually a
symbolic link, .git/HEAD, to a file like refs/heads/master.) Git requires
that this always point to the refs/heads directory.
Minor technical details:
1) HEAD used to be a Unix symlink, and can still be thought of that
way, but for Microsoft support, this is now what's called a
"symbolic reference" or symref, and is a plain file containing
"ref: refs/heads/master". Git treats it just like a symlink.
There's a git-symbolic-ref helper which reads and writes these.
2) While HEAD must point to refs/heads, it's legal for it to
point to a file that doesn't exist. This is what happens
before the first commit in a brand new repository.
When you do "git commit", a new commit object is created with the old
HEAD as a parent, and the new commit is written to the current head
(pointed to by HEAD).
* The three uses of "git checkout"
Git checkout can do three separate things:
1) Change to a new head
git checkout [-f|-m] <branch>
This makes <branch> the new HEAD, and copies its state to the index
and the working directory.
If a file has unsaved changes in the working directory, this tries
to preserve them. This is a simple attempt, and requires that the
modified file(s) are not altered between the old and new HEADs.
In that case, the version in the working directory is left untouched.
A more aggressive option is -m, which will try to do a three-way
(intra-file) merge. This can fail, leaving unmerged files in the
index.
An alternative is to use -f, which will overwrite any unsaved changes
in the working directory. This option can be used with no <branch>
specified (defaults to HEAD) to undo local edits.
2) Revert changes to a small number of files.
git checkout [<revision>] [--] <paths>
will copy the version of the <paths> from the index to the working
directory. If a <revision> is given, the index for those paths will
be updated from the given revision before copying from the index to
the working tree.
Unlike the version with no <paths> specified, this does NOT update
HEAD, even if <paths> is ".".
3) Create a branch.
git checkout [-f|-m] -b <branch> [<revision>]
will create, and switch to, a new branch with the given name.
This is equivalent to
git branch <branch> [<revision>]
git checkout [-f|-m] <branch>
If <revision> is omitted, it defaults to the current HEAD, in which
case no working directory files are altered.
This is the usual way that one checks out a revision that does not
have an existing head pointing to it.
* Deleting branches
"git branch -d <head>" is safe. It deletes the given <head>, but first
it checks that the commit is reachable some other way. That is, you
merged the branch in somewhere, or you never did any edits on that branch.
It's a good idea to create a "topic branch" when you're working on
anything bigger than a one-liner, but it's also a good idea to delete
it when you're done. The work is still there in the history.
* Doing rude things to heads: git reset
If you need to overwrite the current HEAD for some reason, the tool to
do it with is "git reset". There are three levels of reset:
git reset --soft <head>
This overwrites the current HEAD with the contents of <head>.
If you omit <head>, it defaults to HEAD, so this does nothing.
git reset [<head>]
git reset --mixed [<head>]
These overwrite the current HEAD, and copy it to the index,
undoing any git-update-index commands you may have executed.
If you omit <head>, it defaults to HEAD, so there is no change
to the current branch, but all index changes are undone.
git reset --hard [<head>]
This does everything mentioned above, and updates the
working directory. This throws away all of your in-progress
edits and gets you a clean copy. This is also commonly
used without an explicit <head>, in which case the current
HEAD is used.
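The three levels can be pictured with a toy model in which a repository
is just three things: the commit the current branch points to, the index,
and the working directory. The Repo class, the TREES table and the commit
names below are invented for illustration; this is a sketch of the
behaviour described above, not of how git is implemented.

```python
from dataclasses import dataclass

# Invented snapshots: commit name -> {path: content}.
TREES = {
    "c1": {"README": "v1"},
    "c2": {"README": "v2"},
}


@dataclass
class Repo:
    head: str        # commit the current branch points to
    index: dict      # staged contents
    worktree: dict   # files in the working directory


def reset(repo: Repo, target: str, mode: str = "mixed") -> None:
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError(mode)
    repo.head = target                        # every level moves the head
    if mode in ("mixed", "hard"):
        repo.index = dict(TREES[target])      # --mixed also resets the index
    if mode == "hard":
        repo.worktree = dict(TREES[target])   # --hard also resets the files


def fresh() -> Repo:
    """A clean repository sitting on commit c2."""
    return Repo("c2", dict(TREES["c2"]), dict(TREES["c2"]))


soft, mixed, hard = fresh(), fresh(), fresh()
reset(soft, "c1", "soft")
reset(mixed, "c1", "mixed")
reset(hard, "c1", "hard")
print(soft, mixed, hard, sep="\n")
```

After the soft reset, the index still holds the c2 contents, which is
why the undone commit shows up as staged changes.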
* Using git-reset to fix mistakes
"Oh, no! I didn't mean to commit *that*! How do I undo it?"
If you just want to undo a commit, then you can use "git reset HEAD^"
to return the current HEAD to the previous version. If you want to leave
the commit in the index (this only applies to you if you are familiar with
using the index; see below), then you can use "git reset --soft HEAD^".
And if you want to blow away every record of the changes you made,
you can use "git reset --hard HEAD^".
If you just made a stupid trivial mistake and want to replace the most
recent commit with a corrected one, "git commit --amend" is your friend.
It makes a new commit with HEAD^ rather than HEAD as its parent.
* Fixing mistakes without git-reset
git-reset has the problem that it doesn't preserve hacking in progress
in the working directory. It can leave the working directory alone
(making everything a "hack in progress"), but it can't merge changes
like git checkout.
So, suppose you've been trying something that should have been simple, and
made three commits before realizing that the problem is harder than you
thought and you want your work so far to be on a new branch of its own;
committing them on the current HEAD (I'll call it "old") was a mistake.
You don't want to erase anything, just rename it. Make "new" a copy of
the current "old" and move old back to HEAD^^^ (three commits ago).
While there are ways to do that using git-reset, it is far better
to use "git branch -f":
git checkout -b new
Create (and switch to) the "new" branch.
git branch -f old HEAD^^^
Forcibly move "old" back three versions.
(You could also use old~3 or new^^^ or any synonymous name.)
You can use a similar trick to rename a branch. If it's the current
HEAD, then:
git checkout -b newname
git branch -d oldname
and if it's not, then
git branch newname oldname
git branch -d oldname
An alternative in the latter case is to just use mv on the raw
.git/refs/heads/oldname file.
* How do I check out an old version?
A very common beginning question is how to check out an old version.
Say you need to compile an old release for test purposes. "git checkout
v1.2" gives a funny error message. What's going on?
Well, "git checkout" makes the current HEAD point to the head that
you specify. And, as previously mentioned, git requires that it point
to something in the .git/refs/heads directory. So you can't do that.
If you're busy doing things in your working directory, and don't want to
overwrite your work with an old version, then you can get a snapshot with
the (old) git-tar-tree or (new) git-archive commands. These produce a
tar file (git-archive can also produce a zip file) which is a snapshot
of any version you like. You can then unpack this file in a different
directory and build it.
However, if you haven't got any edits in progress, and want to check out
the old version into your working directory, just create a temp branch!
git checkout -b temp v1.2
Will do what you want. This will also do what you want if you have a
local edit (such as a "#define DEBUG 1") that you want to preserve
while working on the old version.
You'll see this in use if you ever use the (highly recommended) git-bisect
tool. It creates a branch called "bisect" for the duration of the bisect.
(Yes, I have to confess, I sometimes wish that git would enforce the
"HEAD must point to .git/refs/heads" rule when committing (checking in)
rather than when checking out, but that's the way git has grown up.)
Note that if you want *exactly* an old version, with no local hacks,
make sure there are none (with "git status") when doing this. It's more
convenient if you do it before the checkout, but you'll get the same
answer if you ask afterwards.
Now, what about the complex case: you have local hacks that you
want to keep, but not have polluting the old version?
Well, one way or the other, you'll have to commit it. If you don't mind
committing your changes to the current branch ("git commit -a"), do that.
If they're not ready to commit, you can commit them anyway, and back
them out when you're done:
git commit -a -m "Temp commit"
git checkout -b temp v1.2
make ; make test ; whatever
git checkout master
git branch -d temp
git reset HEAD^
This leaves both the working directory and the master head in the states
they were in at the beginning.
If you don't like committing to the master branch, you can make a new one.
In this example, it's "work in progress", a.k.a. "wip":
git checkout -b wip
git commit -a -m "Temp commit"
git checkout -b temp v1.2
make ; make test ; whatever
git checkout wip
git branch -d temp
git reset master
git checkout master # Won't change working directory
git branch -d wip
* Examining history: git-log and git-rev-list
In another example of docs being better on the first command written,
the all-purpose utility for examining history is "git log", but all of
the examples of clever ways to use it are in the git-rev-list man page.
And git-log also has most of git-diff's options.
Other utilities, notably the gitk and qgit GUIs, also use the git-rev-list
command-line options, so it's well worth learning them.
git-rev-list gives you a filtered subset of the repository history.
There are two basic ways that you can do the filtering:
1) By ancestry. You specify a set of commits to include all the
ancestors of, and another set to exclude all the ancestors of.
(For this purpose, a commit is considered an ancestor of itself.)
So if you want to see all commits between v1.1 and v1.2, you
can specify
git log ^v1.1 v1.2
or, with a more convenient syntax
git log v1.1..v1.2
However, there are times when you want to specify something more
complex. For example, if a big branch that had been in progress since
v1.0.7 was merged between v1.1 and v1.2, but you don't want to see it,
you could specify any of:
git log v1.2 ^v1.1 ^bigbranch
git log ^bigbranch v1.1..v1.2
git log ^v1.1 bigbranch..v1.2
They're all equivalent. Another special syntax that's sometimes
handy is
git log branch1...branch2
Note the three dots. This generates the symmetric difference between
the two: the commits that are reachable from one of the heads but not
from both.
"git log" by default pipes its output through less(1), and generates
its output from newest to oldest on the fly, so there's no great
speed penalty to not specifying a starting place. It'll generate a
few screenfuls more than you look at, but not waste any more effort
than that.
2) By path name. This is a feature which appears to be unique to git.
If you give git-rev-list (or git-log, or gitk, or qgit) a list of
pathname prefixes, it will list only commits which touch those
paths. So "git log drivers/scsi include/scsi" will list only
commits which alter a file whose name begins with drivers/scsi
or include/scsi.
(If there's any possible ambiguity between a path name and a commit
name, git-rev-list will refuse to proceed. You can resolve it by
including "--" on the command line. Everything before that is a
commit name; everything after is a path.)
This filter is in addition to the ancestry filter. It's also rather
clever about omitting unnecessary detail. In particular, if there's
a side branch which does not touch drivers/scsi, then the entire branch,
and the merge at the end, will be removed from the log.
You can additionally limit the commits to a certain number, or by date,
author, committer, and so on.
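The ancestry filter is easy to picture as set arithmetic. The sketch
below uses an invented history in which "bigbranch" was merged between
v1.1 and v1.2; the function names are made up, and it only models which
commits are selected, not the order or format git prints them in.

```python
# Toy history (invented names): commit -> list of parents.
PARENTS = {
    "r": [],
    "v1.1": ["r"],
    "x": ["v1.1"],
    "big1": ["r"],
    "bigbranch": ["big1"],
    "m": ["x", "bigbranch"],   # the merge of bigbranch
    "v1.2": ["m"],
}


def ancestors(commit):
    """All commits reachable from `commit`, including the commit itself."""
    seen, todo = set(), [commit]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(PARENTS[c])
    return seen


def rev_list(include, exclude=()):
    """Ancestors of the included commits minus ancestors of the excluded."""
    keep = set().union(*(ancestors(c) for c in include))
    drop = set().union(*(ancestors(c) for c in exclude))
    return keep - drop


def symmetric(a, b):
    """The three-dot form: reachable from one side but not from both."""
    return ancestors(a) ^ ancestors(b)


print(sorted(rev_list(["v1.2"], ["v1.1"])))               # v1.1..v1.2
print(sorted(rev_list(["v1.2"], ["v1.1", "bigbranch"])))  # also ^bigbranch
print(sorted(symmetric("x", "bigbranch")))                # x...bigbranch
```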
By default, "git log" only shows the commit messages, so it's important to
write good ones. Other tools compress commit messages down to
the first line, so try to make that as informative as possible.
* History diagrams
When talking about various situations involving multiple branches,
people often find it handy to draw pictures. Gitk draws nice pictures
vertically, but for e-mail, ASCII art drawn horizontally is often easier.
Commits are shown as "o", and the links between them with lines drawn with
- / and \. Time goes left to right, and heads may be labelled with names.
For example:
        o--o--o <-- Branch A
       /
o--o--o <-- master
       \
        o--o--o <-- Branch B
If someone needs to talk about a particular commit, the character "o"
may be replaced with another letter or number.
* Trivial merges: fast-forward and already up-to-date.
There are two kinds of merge that are particularly simple, and you will
encounter them in git a great deal. They are mirror images.
Suppose that you are working on branch A and merge in branch B, but no
work has been done to branch B since the last time you merged, or since
you spawned branch A from it. That is, the history looks like
o--o--o--o <-- B
          \
           o--o--o <-- A
or
o--o--o--o--o--o <-- B
    \           \
     o--o--o--o--o <-- A
If you then merge B into A, A is described as "already up to date".
It is already a strict superset of B, and the merge does nothing.
In particular, git will not create a dummy commit to record the fact that
a merge was done. It turns out that there are a number of bad things that
would happen if it did this, but for now, I'll just say that git doesn't do it.
Now, the opposite scenario is the "fast-forward" merge. Suppose you
merge A into B. Again, A is a strict superset of B.
In this case, git will simply change the head B to point to the same
commit as A and say that it did a "fast-forward" merge. Again, no commit
object is created to reflect this fact.
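Both trivial cases come down to one question: is one head an ancestor of
the other? Here is a sketch of that test on a made-up linear history,
with B at commit "3" and A at commit "5". The function names and the
extra commit "x" are invented for the example.

```python
# Toy history (invented names): commit -> list of parents.
PARENTS = {
    "1": [],
    "2": ["1"],
    "3": ["2"],   # head B
    "4": ["3"],
    "5": ["4"],   # head A
    "x": ["3"],   # a hypothetical extra commit made on top of B
}


def reachable(tip):
    """Every commit reachable from `tip`, including `tip` itself."""
    seen, todo = set(), [tip]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(PARENTS[c])
    return seen


def classify_merge(current, other):
    """Say what merging `other` into `current` would amount to."""
    if other in reachable(current):
        return "already up to date"   # current is a superset of other
    if current in reachable(other):
        return "fast-forward"         # just move current up to other
    return "real merge"               # histories have diverged


print(classify_merge("5", "3"))   # merging B into A
print(classify_merge("3", "5"))   # merging A into B
print(classify_merge("5", "x"))   # B has gained work of its own
```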
The effect is to unclutter the git history. If I create a topic branch to
work on a feature, do some hacking, and then merge the result back into
the (untouched!) master, the history will look just like I did all the
work on the master directly. If I then delete the topic branch (because
I'm done using it), the repository state is truly indistinguishable.
While the topic branch existed, you could have done something to the
master branch, in which case the final merge would have been non-trivial,
but if that didn't happen, git produces a simple, easy-to-follow linear
history.
Some people used to heavyweight branches find this confusing; they
think a merge is a big deal and it should be memorialized, but there
are actually excellent reasons for doing this.
The most important one is that a fit of merging back and forth will
eventually end. Suppose that branches A and B are maintained by separate
developers who like to track each other's work closely.
If the fast-forward case did create a commit, then merging A into B
would produce
o--o--o--o---------o <-- B
          \       /
           o--o--o <-- A
then merging B into A would produce:
o--o--o--o---------o <-- B
          \       / \
           o--o--o---o <-- A
and further merges would produce more and more dummy commits, all without
ever reaching a steady state, and without making it obvious that the
two heads are actually identical.
Since history lasts forever, cluttering it up with unimportant stuff is a
burden to all future users, and not a good idea. Allowing the merge of a
branch to be seamless in the simple case encourages lightweight branches.
If you _might_ need a separate branch, create it. If it turned out that
you didn't, it won't make a difference.
* Exchanging work with other repositories
The basic tools for exchanging work with other repositories are "git
fetch" and "git push". The fact that "git pull" is not the opposite of
"git push" is often confusing to beginners (it's a superset of git fetch),
but that's the terminology that has grown up.
The unit of sharing in git is the branch. If you've used branches in
CVS, you'll be familiar with using "cvs update" to pull changes from your
"current branch" in the repository into your working directory.
In Git, you don't pull into the working directory, but rather into a
tracking branch. You set up a branch in your repository which will be
a copy of the branch in the remote repository. For example, if you use
"git clone", then the remote "master" branch is tracked by the local
"origin" branch.
Then, when you do a "git fetch", git fetches all of the new commits
and sets the origin head to point to the newly fetched head of the
remote branch.
By default, git checks that this is a trivial fast-forward merge, that
is, that the update does not throw away history. If it finds something like:
o--o--o--o--o--o <-- remote master
       \
        o <-- Local origin
It will complain and abort the fetch. This is usually a warning that
something has gone wrong - in particular, you forgot that this was
supposed to be a tracking branch and committed some work to it - and it
aborts before throwing your work away.
However, sometimes the remote git user will have a branch name that they
delete and re-create frequently. There are plenty of reasons to do this.
The most common is doing a "test merge" between various branches in
progress. They're all unfinished, so the developer of branch A doesn't
want to merge in all the new bugs in branch B, but a tester might want
to create a merged version with both sets of bugs for testing.
The merged version is not intended to be a permanent part of history -
it'll get deleted after the test - but it can still be useful to have
a draft copy.
In this case, you can mark the source branch with a leading "+", to
disable this sanity check. (See the git-fetch man page for details.)
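The sanity check and its "+" override can be sketched as follows. The
history, the refs dictionary and the function names are all invented;
the force flag stands in for the leading "+", and the error raised stands
in for git complaining and aborting the fetch.

```python
# Toy history (invented names): commit -> list of parents.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],       # where the remote master is now
    "local": ["b"],   # a commit made by mistake on the tracking branch
}


def is_ancestor(old, new):
    """True if `old` is reachable from `new` (a commit reaches itself)."""
    todo, seen = [new], set()
    while todo:
        c = todo.pop()
        if c == old:
            return True
        if c not in seen:
            seen.add(c)
            todo.extend(PARENTS[c])
    return False


def update_tracking_branch(refs, name, remote_tip, force=False):
    """Move refs[name] to remote_tip unless that would discard commits."""
    old = refs.get(name)
    if old is not None and not force and not is_ancestor(old, remote_tip):
        raise ValueError(f"{name}: not a fast-forward, refusing to update")
    refs[name] = remote_tip


refs = {"origin": "local"}
try:
    update_tracking_branch(refs, "origin", "c")
    refused = False
except ValueError:
    refused = True

update_tracking_branch(refs, "origin", "c", force=True)   # the "+" case
print(refused, refs)
```

Note that the forced update silently drops the "local" commit from the
branch, which is exactly what the default check is there to prevent.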
Note that in this case, you should specifically avoid merging from such
a branch into any non-test branches of your own. It is, as mentioned,
not intended to be a permanent part of history, so don't make it part
of your permanent history. (You still might want to test-merge it with
your work in progress, of course.)
The fact that you should know to treat such branches specially is why
git doesn't try to automatically cope with them.
* Alternate branch naming
The original git scheme mixes tracking branches with all the other heads.
This requires that you remember which branches are tracking branches and
which aren't. Hopefully, you remember what all your branches are for,
but if you track a lot of remote repositories, you might not remember
what every remote branch is for and what you called it locally.
* Remotes files
You can specify what to fetch on the git-fetch command line. However,
if you intend to monitor another repository on an ongoing basis,
it's generally easier to set up a short-cut by placing the options in
.git/remotes/<name>.
The syntax is explained in the git-fetch man page. When this is set
up, "git fetch <name>" will retrieve all the branches listed in the
.git/remotes/<name> file. The ability to fetch multiple branches at
once (such as release, beta, and development) is an advantage of using
a remotes file.
You can also create the remotes file "origin" (not necessarily any
relation to the branch named "origin"), which is the default for
git-fetch. If you have a single primary "upstream" repository that
you sync to, place it in the origin remotes file, and you can just type
"git fetch" to get all the latest changes.
Note that branches to fetch are identified by "Pull: " lines in the
remotes file. This is another example of the fetch/pull confusion.
git-pull will be explained eventually.
* Remote tags
TODO: Figure out how remote tags work, under what circumstances
they are fetched, and what git does if there are conflicts.
* Exchanging work with other repositories, part II: git-push
It's simpler to set up git sharing on a pull basis. If your source
code isn't secret, you can set up a public read-only server very easily
(see the git-daemon man page for details), and have others fetch from that.
However, N developers all pulling from each other is an N^2 mess.
Some centralization helps.
One way is to have a central coordinator (like Linus) who pulls from
all of the developers, and who they in turn pull from.
The other is to have a central repository that people can push to.
This generally requires an ssh login on the server. You can use git-shell
as the login shell if all you want to allow the account to do is git
fetch and push. (You can use the hook scripts to enforce rules about
who's allowed to do what to which branch.)
Git-push to the remote machine works exactly like git-fetch from the
remote machine. The objects are moved over, and the branches pushed to
are fast-forwarded. If fast-forward is impossible, you get an error.
So if you have multiple people committing to a branch on the server,
you will not be allowed to push if someone has pushed more to that branch
since last time you fetched it.
You have to merge the changes locally, and re-try the push when you've
got a new head that includes the most recently pushed work as an ancestor.
This is exactly like "cvs commit" not working if your recent checkout
wasn't the (current) tip of the branch, but git can upload more than
one commit.
The simplest way to resolve the conflict is to merge the remote head with
your local head. This is easiest if you have different local branches
for fetching the remote repository and for pushing to it.
That is, you have one head that just tracks the master repository's
main branch, and another that you add your work to, and push from.
This makes merging simpler when there are conflicts.
Another use for git-push, even for a solo developer, is sharing your work
with the world. You can set up a public git server on a high-bandwidth
machine (possibly rented from a hosting service) and then push to it to
publish something.
* Merging (finally!)
I went through everything else first because the most common merge case
is local changes with remote changes. Not that you can't merge two
branches of your own, but you don't need to do that nearly as often.
The primitive that does the merging is called (guess what?) git-merge.
And you can use that if you want. If you want to create a so-called
octopus merge, with more than two parents, you have to.
However, it's usually easier to use the git-pull wrapper. This merges
the changes from some other branch into the current HEAD and generates
a commit message automatically.
git-merge lets you specify the commit message (rather than generating it
automatically) and use a non-HEAD destination branch, but those options
are usually more annoying than useful.
The basic git-pull syntax is
git-pull <repository> <branch>
The repository can be any URL that git supports. Including, particularly,
a local file. So to do a simple local merge, you just type
git-pull . <branch>
So after doing some hacking on branch "foo", you would
git checkout master
git pull . foo
and ba-boom, all is done.
Now, you can also specify a remote repository to merge from, using a
git://, http:// or git+ssh:// URL. This is what Linus does all day long,
and why the git-pull tool is optimized to allow that. It uses git-fetch
to fetch the remote branch without assigning it a branch name (it gets
the special name FETCH_HEAD temporarily), and then merges it into the
current HEAD directly.
There is absolutely nothing wrong with doing that, but beginners often
find it confusing to have a single short command do quite so much.
And if you are working closely with someone, it's often more convenient
and less confusing to keep local tracking branches. Then you can
git fetch upstream # Fetches 'origin'
git pull . origin
It's also possible to give just a single remotes file name to git-pull:
git pull upstream
That does a git fetch, updating all of the listed branches as usual,
then merges the _first_ listed branch into HEAD.
By the way: don't blink, you might miss it! As I mentioned, pulling is
a very big part of Linus's daily routine, and he's made sure it's fast.
(Actually, it produces a fair bit of output, so you'll see.)
Just to clarify, because people often get confused:
git-pull is a MERGING tool. It always does a merge, as well as an optional
fetch. If you just want to LOOK at a remote branch, use git-fetch.
* Undoing a merge
If you discover that a merge was a mistake, it can be undone just like
any other commit. The HEAD you merged to is the first parent, so just do
git reset --hard HEAD^
This is why Linus likes a git-pull command that does so much in one shot -
if he doesn't like what he pulls, it's easy to undo.
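A small sketch of the undo, assuming the merge produced a real merge commit (with a fast-forward there is no merge commit, and HEAD^ is not where you started). The repository, file and branch names are invented for the example:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch -M master

git checkout -q -b foo
echo "foo work" > foo.txt
git add foo.txt
git commit -q -m "hack on foo"

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "work on master"

before=$(git rev-parse HEAD)           # where master was before the merge
git merge -q --no-edit foo             # the merge we will regret
git reset -q --hard HEAD^              # first parent = the head we merged into
after=$(git rev-parse HEAD)
```

The branch "foo" itself is untouched; only master is moved back.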
* How merging operates
Git uses the basic three-way merge. First, it applies it to whole files,
and then to lines within files.
To do a three-way merge, you need three versions of a file. The versions
A and B you want to merge, and a common ancestor, commonly called O.
That is, history proceeds something like:
o--o--A
/
o--o--O
\
o--B
The basic idea is "I want the file O, plus all the changes made from O
to A, plus all the changes made from O to B." Since the cases where one
of A or B is a direct ancestor of the other have already been disposed
of, the three commits must be different.
For each file, there are a few cases that are trivial, and git gets
these out of the way immediately:
- If A and B are identical, the merged result is obvious.
- If O and A are the same, then the result should be B.
- If O and B are the same, then the result should be A.
In the completely trivial case when O, A and B are the same, then
all three rules apply, they all produce the same obvious result.
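The whole-file rules above can be sketched as a tiny shell function. This is only an illustration of the rules, not git's actual code; the function name and the idea of passing three content IDs (blob hashes, say) are my own assumptions:

```shell
# Apply the trivial whole-file three-way rules to three content IDs.
# Prints the ID of the merged result, or "conflict" when all three
# differ and a line-level merge is needed.
merge_file_level() {
    o=$1 a=$2 b=$3
    if [ "$a" = "$b" ]; then
        echo "$a"          # both sides agree
    elif [ "$o" = "$a" ]; then
        echo "$b"          # only B changed the file
    elif [ "$o" = "$b" ]; then
        echo "$a"          # only A changed the file
    else
        echo "conflict"    # both changed it, differently
    fi
}

only_b_changed=$(merge_file_level v1 v1 v2)
only_a_changed=$(merge_file_level v1 v2 v1)
same_change=$(merge_file_level v1 v2 v2)
all_differ=$(merge_file_level v1 v2 v3)
```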
The "merge base" version O is generally the most recent common ancestor
of A and B. The only problem is, that's not necessarily unique!
The classic confusing case is called a "criss-cross merge", and looks
like this:
o--b-o-o--B
/ \ /
o--o--o X
\ / \
o--a-o-o--A
There are two common ancestors of A and B, marked a and b in the graph
above. And they're not the same. You could use either one and get
reasonable results, but how to choose?
The details are too advanced for this discussion, but the default
"recursive" merge strategy that git uses solves the problem by merging
a and b into a temporary commit and using *that* as the merge base.
Of course, a and b could have the same problem, so merging them could
require another merge of still-older commits. This is why the algorithm
is called "recursive." It's been tested with pathological conditions,
but multiply nested criss-cross merges are very rare, so the recursion
isn't a performance limit in practice.
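If you want to see a criss-cross for yourself, here is a sketch that builds one in a scratch repository and asks "git merge-base --all" for the common ancestors. The branch names "left" and "right" and the file names are invented; the commits a and b play the roles of a and b in the graph above:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

git commit -q --allow-empty -m "root"
git branch -M left                     # current branch becomes "left"
git branch right                       # "right" starts at the same commit

echo a > a.txt
git add a.txt
git commit -q -m "a"
a=$(git rev-parse HEAD)                # commit a, on left

git checkout -q right
echo b > b.txt
git add b.txt
git commit -q -m "b"
b=$(git rev-parse HEAD)                # commit b, on right

git merge -q --no-edit "$a"            # right merges a
git checkout -q left
git merge -q --no-edit "$b"            # left merges b: the criss-cross

# Neither a nor b is an ancestor of the other, so both are reported.
bases=$(git merge-base --all left right | sort)
expected=$(printf '%s\n%s\n' "$a" "$b" | sort)
```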
If all three of a given file in O, A, B are different, then the three
versions are pulled into the index file, called "stage 1", "stage 2",
and "stage 3", and a merge strategy driver is called to resolve the mess.
Git then uses the classic line-based three-way merge, looking for isolated
changes and applying the same rules as for files when two of the source
files are the same in some range.
* Alternate merge strategies
In every version control system prior to git, the merging algorithm was
buried deep in the bowels of the software, and very difficult to change.
One of the particularly nice things that git did was allow for easily
replaceable "merge strategies". Indeed, you can try multiple merge
strategies, and the fallback - print an error message and let the user
sort it out - can be thought of as just another merge strategy.
Enabling this is why the index is so important to git. It provides a
place to store an unfinished merge, so you can try various strategies
(including hand-editing) to finish it.
Generally, git's default merge strategies are just fine. There is,
however, one special case that is occasionally useful, specified with the
"-s ours" strategy.
That strategy instructs git that the merged result should be the same
as the current HEAD. Any other branches are recorded as parents, but
their contents are ignored.
What the heck is the use of that? Well, it lets you record the fact
that some work has been done in the history, and that it shouldn't be
merged again. For example, say you write and share a popular patch set.
People are always merging it in to their local source trees. But then
you discover a much better way to achieve the goal of that patch set, and
you want to publish the fact that the new patch supersedes the old one.
If you developed the new set starting from the old one, that would happen
automatically. But another way to achieve the same goal is to merge the
old branch in using the "ours" strategy. Everyone else's git will
notice that the patch is already included, and stop trying to merge it in.
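A sketch of what "-s ours" does, in a scratch repository with invented branch and file names ("old" stands for the superseded patch set). The tree of the merge commit should be identical to the tree before the merge, while "old" becomes part of the history:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch -M master

git checkout -q -b old                 # the superseded work
echo "old approach" > old.txt
git add old.txt
git commit -q -m "old patch set"

git checkout -q master                 # the better replacement
echo "new approach" > new.txt
git add new.txt
git commit -q -m "new patch set"

tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q --no-edit -s ours old     # record old as a parent, ignore its contents
tree_after=$(git rev-parse 'HEAD^{tree}')
```

Because "old" is now an ancestor of master, a later merge of "old" into anything based on this master has nothing left to do.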
* When merging goes wrong
This is the fun part. Git's default recursive-merge strategy is pretty
clever, but sometimes changes truly do conflict and need manual fix-up.
When git is unable to complete a merge, it leaves the three different
versions in the index and places a file with CVS-style conflict markers
in the working directory.
As long as there is a "staged" file in the index, you will not be able
to commit. You must resolve the conflict, and update the index with the
resolved versions. You can do this one at a time with git-update-index,
or at the end by giving the files as arguments to git-commit.
Doing them one at a time is probably safest; checking in a file which still
has conflict markers makes a bit of a mess. Note that git will still
use the automatically generated commit message when you finally commit.
(It's in .git/MERGE_MSG, if you care.)
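Here is a sketch of the whole cycle: provoke a conflict, look at the three staged versions, resolve by hand, and commit. The file greeting.txt and the branch names are invented, and I use "git add", which in current git does the job of git-update-index for this purpose:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'Hello, world!\n' > greeting.txt
git add greeting.txt
git commit -q -m "base"
git branch -M master

git checkout -q -b A
printf 'Hello, world.\n' > greeting.txt
git commit -q -a -m "A: calmer punctuation"

git checkout -q -b B master
printf 'Goodbye, cruel world!\n' > greeting.txt
git commit -q -a -m "B: angst"

git merge --no-edit A >/dev/null 2>&1 || true    # expected to stop with a conflict
stages=$(git ls-files -u greeting.txt | wc -l)   # one line each for stage 1, 2, 3

printf 'Goodbye, cruel world.\n' > greeting.txt  # the hand-made resolution
git add greeting.txt                             # replaces the staged versions
unmerged=$(git ls-files -u | wc -l)

git commit -q --no-edit                          # uses the generated merge message
parents=$(git rev-list --parents -n 1 HEAD | wc -w)
```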
Note that "git diff" knows how to be useful with a staged file.
By default, it displays a multi-way diff. For example, suppose I take a
(slightly buggy) hello.c:
--- hello.c ---
#include <stdio.h>
int main(void)
{
printf("Hello, world!");
}
--- end ---
Now, suppose that in branch A, I fix some bugs - add the missing newline
and "return 0;". In branch B, I display my angst and change it to
"Goodbye, cruel world!". When I try to merge A into B, obviously I'll
get a conflict. The resultant file, with conflict markers, looks like:
--- hello.c ---
#include <stdio.h>
int
main(void)
{
<<<<<<< HEAD/hello.c
printf("Goodbye, cruel world!");
=======
printf("Hello, world!\n");
return 0;
>>>>>>> edadc53fc7a8aef2a672a4fa9d09aa16f4e14706/hello.c
}
--- end ---
and the result of "git diff" is
diff --cc hello.c
index 4b7f550,948a5f8..0000000
--- a/hello.c
+++ b/hello.c
@@@ -3,5 -3,6 +3,10 @@@
int
main(void)
{
++<<<<<<< HEAD/hello.c
+ printf("Goodbye, cruel world!");
++=======
+ printf("Hello, world!\n");
+ return 0;
++>>>>>>> edadc53fc7a8aef2a672a4fa9d09aa16f4e14706/hello.c
}
Notice how this is not a standard diff! It has two columns of diff
symbols, and shows the difference from each of the ancestors to the
current hello.c contents. I can also use "git diff -1" to compare
against the common ancestor, or "-2" or "-3" to compare against each of
the merged copies individually.
* Alternatives to merging
The bigger and more active your source tree, the more important it is to
keep the history reasonably clean. Just because git can do a merge in
under a second doesn't mean that you should do one daily. When you look
back at a feature's development history, you'd like to see meaningful
changes recorded and not a lot of meaningless ones.
Now, once you have shared a commit with others, and they have incorporated
it into their development, it becomes impossible to undo. But git
provides tools that are useful for "rewriting history" before public
release. These can be used to edit a commit for publication.
* Test merging
One way to keep the history clean is to simply not merge other branches
into your development branch. If you want to use your new features and
other people's code changes, make a test merge and use that, but don't
make that merge part of your branch.
This is slightly more work (you have to change to a test branch and do
your merging there), but not very much.
Sometimes, when doing this, a conflict appears between your changes and
someone else's development. If you get tired of fixing the same conflict
every time you do a test merge, have a look at the git-rerere tool.
This remembers resolved conflicts and tries to apply the same resolution
patch the next time.
It's written specifically to help you not do an extra merge unnecessarily.
Although its man page is well worth reading, you never invoke git-rerere
explicitly; it's invoked automatically by the merge and patch tools if
you create a .git/rr-cache directory.
* Cherry picking
If you have a series of patches on a branch, but you want a subset
of them, or in a different order, there's a handy utility called
"git-cherry-pick" which will find the diff and apply it as a patch to
the current HEAD. It automatically recycles the commit message from
the original commit.
If the patch can't be applied, it leaves the versions in the index and
conflict markers in the working directory just like a failed merge.
And just like a merge, it remembers the commit message and provides it
as a default when I finally commit.
Note that this can only work on a chain of single-parent commits.
If a commit has multiple parents, there's no single patch to apply.
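A sketch of picking one commit out of the middle of a branch, using a scratch repository with invented names. Only the change made by the chosen commit is applied, and the commit message comes along:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch -M master

git checkout -q -b topic               # three independent patches
for name in one two three; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "add $name"
done

git checkout -q master
pick=$(git rev-parse topic~1)          # the commit that added two.txt
git cherry-pick "$pick" >/dev/null     # apply just that patch to HEAD
subject=$(git log -1 --format=%s)      # message recycled from the original
```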
You van get the list of commits on a branch with git-log or git-rev-list,
but for more complex cases, the git-cherry tool is designed to generate
the list of commits to merge. It has a rather neat approximate-match
function built in which identifies patches that appear to already be
present in the target branch.
* Rebasing
A special case of cherry-picking is if you want to move a whole branch
to a newer "base" commit. This is done by git-rebase. You specify
the branch to move (default HEAD) and where to move it to (no default),
and git cherry-picks every patch out of that branch, applies it on top
of the target, and moves the refs/heads/<branch> pointer to the newly
created commits.
By default, "the branch" is every commit back to the last common
ancestor of the branch head and the target, but you can override that
with command-line arguments.
If you want to avoid merge conflicts due to the master code changing out
from under your edits, but not have "cleanup" merges in your history,
git-rebase is the tool to use.
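A sketch of the simplest case, with invented branch and file names: "topic" was started from an older master, and rebasing moves it on top of the current one without leaving a merge commit behind:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch -M master

git checkout -q -b topic               # my work, based on the old master
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "topic work"

git checkout -q master                 # master changes underneath me
echo "upstream work" > upstream.txt
git add upstream.txt
git commit -q -m "upstream work"

git checkout -q topic
git rebase master >/dev/null 2>&1      # replay topic on top of the new master

merges=$(git rev-list --merges master..topic | wc -l)
base=$(git merge-base topic master)
```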
Git-rebase will also use git-rerere if enabled ("mkdir .git/rr-cache").
If rebasing encounters a conflict it can't resolve, it will stop halfway
and ask you to resolve the problem by hand. However, it still knows it
has a job to finish! The unapplied patches are remembered until you do
one of
git-rebase --continue
This will check in the current index. You should
run git-update-index <files> on the files whose conflicts
you resolve, but NOT do an actual git-commit.
git-rebase --continue will do the commit.
git-rebase --skip
This will skip the conflicting patch. You
don't have to resolve the conflicts; git will
just back up and try the next patch in the series.
git-rebase --abort
This will abandon the whole rebase operation (including
any half-done work) and return you to where you began.
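To get a feel for the safety net, here is a sketch that forces a rebase conflict and then backs out with --abort. Names are invented; both branches edit the same line of file.txt so the rebase has to stop:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "original" > file.txt
git add file.txt
git commit -q -m "base"
git branch -M master

git checkout -q -b topic
echo "topic version" > file.txt
git commit -q -a -m "topic edit"

git checkout -q master
echo "master version" > file.txt
git commit -q -a -m "master edit"

git checkout -q topic
before=$(git rev-parse HEAD)
git rebase master >/dev/null 2>&1 || true   # stops halfway on the conflict
git rebase --abort                          # abandon it and go back
after=$(git rev-parse HEAD)
branch=$(git symbolic-ref --short HEAD)
```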
Git-rebase can also help you divide up work. Suppose you've mixed up
development of two features in the current HEAD, a branch called "dev".
You want to divide them up into "dev1" and "dev2". Assuming that HEAD
is a branch off master, then you can either look through
git log master..HEAD
or just get a raw list of the commits with
git rev-list master..HEAD
Either way, suppose you figure out a list of commits that you want in
dev1 and create that branch:
git checkout -b dev1 master
for i in `cat commit_list`; do
git-cherry-pick $i
done
You can use the other half of the list you edited to generate the dev2
branch, but if you're not sure if you forgot something, or just don't
feel like doing that manual work, then you can use git-rebase to do it
for you...
git checkout -b dev2 dev # Create dev2 branch
git-rebase --onto master dev1 # Subtract dev1 and rebase
This will find all patches that are in dev and not in dev1,
apply them on top of master, and call the result dev2.
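Here is the whole splitting procedure as a sketch in a scratch repository. The feature names f and g, the file names, and the use of path-limited rev-list to build the commit list are assumptions for the example. It relies on rebase skipping patches that already appear in dev1, which is my understanding of the documented behavior:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"       # scratch repository
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

git commit -q --allow-empty -m "base"
git branch -M master

# "dev" mixes two features: commits f1, g1, f2, g2, one new file each.
git checkout -q -b dev
for name in f1 g1 f2 g2; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "add $name"
done

# dev1: cherry-pick only the "f" commits onto master, oldest first.
git checkout -q -b dev1 master
for c in $(git rev-list --reverse master..dev -- f1.txt f2.txt); do
    git cherry-pick "$c" >/dev/null
done

# dev2: start from dev, then subtract what dev1 already contains.
git checkout -q -b dev2 dev
git rebase --onto master dev1 >/dev/null 2>&1

dev2_commits=$(git rev-list master..dev2 | wc -l)
```

Before, dev holds f1, g1, f2, g2 in one line on top of master. Afterwards dev1 holds copies of f1 and f2, dev2 holds copies of g1 and g2, both sit directly on master, and dev is unchanged.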
* Experimenting with merging
To play with non-trivial merging, get an existing git repository of
a non-trivial project (git itself and the Linux kernel are readily
available). Fire up gitk to look at history, find some interesting-looking
merges, and redo them yourself on a test branch.
As long as you do everything on test branches, you aren't going to screw
anything up. So play!
You can use gitk to search for "Conflicts:" in the commit comments to
find merges that didn't go smoothly and see what happens. (Or you can
search in "git log" output. gitk just draws prettier pictures.)
You can also set up two repositories on the same machine and try pulling
and pushing between them.
To identify arbitrary commits, the 40-byte raw hex ID is probably easiest;
you can cut-and-paste them from the gitk window.
For example, in the git repository,
3f69d405d749742945afd462bff6541604ecd420
looks like an interesting merge. Its parents are
Parent: 7d55561986ffe94ca7ca22dc0a6846f698893226
Parent: 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
Let's try doing that manually:
$ git checkout -b test 7d55561986ffe94ca7ca22dc0a6846f698893226
$ git pull . 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
error: no such remote ref refs/heads/097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
Fetch failure: .
Cool! I didn't know that wasn't allowed. (I'll have to ask why it's
not; perhaps it's because it uses the branch name in the automatic
commit message.) I could do it by hand with git-merge, but I'll just
give it a branch name:
$ git branch test2 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
$ git pull . test2
Merging HEAD with 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
Merging:
7d55561986ffe94ca7ca22dc0a6846f698893226 Merge branch 'jc/dirwalk-n-cache-tree' into jc/cache-tree
097dc3d8c32f4b85bf9701d5e1de98999ac25c1c Remove "tree->entries" tree-entry list from tree parser
found 2 common ancestor(s):
d9b814cc97f16daac06566a5340121c446136d22 Add builtin "git rm" command
288c0384505e6c25cc1a162242919a0485d50a74 Merge branch 'js/fetchconfig'
Merging:
d9b814cc97f16daac06566a5340121c446136d22 Add builtin "git rm" command
288c0384505e6c25cc1a162242919a0485d50a74 Merge branch 'js/fetchconfig'
found 1 common ancestor(s):
63dffdf03da65ddf1a02c3215ad15ba109189d42 Remove old "git-grep.sh" remnants
Auto-merging Makefile
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in Makefile
Auto-merging builtin.h
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in builtin.h
Auto-merging cache.h
Removing check-ref-format.c
Auto-merging git.c
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in git.c
Auto-merging read-cache.c
Auto-merging update-index.c
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in update-index.c
Renaming apply.c => builtin-apply.c
Auto-merging builtin-apply.c
Renaming read-tree.c => builtin-read-tree.c
Auto-merging builtin-read-tree.c
Auto-merging .gitignore
Auto-merging Makefile
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in Makefile
Auto-merging builtin.h
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in builtin.h
Auto-merging cache.h
Auto-merging fsck-objects.c
Removing git-format-patch.sh
Auto-merging git.c
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in git.c
Auto-merging update-index.c
Automatic merge failed; fix conflicts and then commit the result.
$ git status
Hey, look, lots of interesting stuff. Particularly, see
# Changed but not updated:
# (use git-update-index to mark for commit)
#
# unmerged: Makefile
# modified: Makefile
# unmerged: builtin.h
# modified: builtin.h
# unmerged: git.c
# modified: git.c
The "unmerged" (a.k.a. "staged") files are ones that need manual resolution.
(I notice that update-index.c isn't listed, despite being mentioned
as a conflict in the message. Can someone explain that?)
Fixing those is easy, but as you can see from the original commit comment
and diffs, there were some additional changes that were necessary to
make that compile.
You can test before committing the change, or do it the git way - commit
anyway, then test and "git commit --amend" with the fixes, if any.
Unlike a centralized VCS, committing is not the same as pushing upstream.
You can use test branches in the repository to save as much work as
you like. While it's still nice to keep the public repository clean,
you don't have to worry about "breaking the tree" every time you commit.
You can do all kinds of stuff in test branches, and clean it up later.
This is why all the git merge tools do the commit without waiting for
you to test it. The merge is usually okay, and it saves time. If not,
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
@ 2006-11-16 23:47 ` Junio C Hamano
2006-11-17 1:13 ` linux
2006-11-17 1:09 ` Junio C Hamano
` (6 subsequent siblings)
7 siblings, 1 reply; 66+ messages in thread
From: Junio C Hamano @ 2006-11-16 23:47 UTC (permalink / raw)
To: linux; +Cc: git
linux@horizon.com writes:
> I know it took me a while to get used to playing with branches, and I
> still get nervous when doing something creative. So I've been trying
> to get more comfortable, and wrote the following to document what I've
> learned.
>
> It's a first draft - I just finished writing it, so there are probably
> some glaring errors - but I thought it might be of interest anyway.
This is the greatest write-up I've seen for the past several
months. I find it very balanced to point out the quirks people
would find difficult and explain why things are so by including
historical notes in appropriate places when needed. Definitely
Documentation/ material when copyediting is done.
I have finished only the first half because it's not my git day
today, but so far...
> * Naming revisions
>...
> Second, you can refer to a head or tag name. Git looks in the
> following places, in order, for a head:
> 1) .git
> 2) .git/refs
> 3) .git/refs/heads
> 4) .git/refs/tags
You might want to check this with the array in sha1_name.c::get_sha1_basic().
I think tags comes earlier than heads.
> 2) Revert changes to a small number of files.
>
> git checkout [<revision>] [--] <paths>
> will copy the version of the <paths> from the index to the working
> directory. If a <revision> is given, the index for those paths will
> be updated from the given revision before copying from the index to
> the working tree.
>
> Unlike the version with no <paths> specified, this does NOT update
> HEAD, even if <paths> is ".".
It's great that you talk correctly about the latest feature-fix
that is queued for maint but not yet pushed out.
> 2) By path name. This is a feature which appears to be unique to git.
> If you give git-rev-list (or git-log, or gitk, or qgit) a list of
> pathname prefixes, it will list only commits which touch those
> paths. So "git log drivers/scsi include/scsi" will list only
> commits which alters a file whose name begins with drivers/scsi
> or include/scsi.
>
> (If there's any possible ambiguity between a path name and a commit
> name, git-rev-list will refuse to proceed. You can resolve it by
> including "--" on the command line. Everything before that is a
> commit name; everything after is a path.)
>
> This filter is in addition to the ancestry filter. It's also rather
> clever about omitting unnecessary detail. In particular, if there's
> a side branch which does touch drivers/scsi, then the entire branch,
> and the merge at the end, will be removed from the log.
"If there's a side branch which does NOT touch the paths..." I think.
> * Alternate branch naming
>
> The original git scheme mixes tracking branches with all the other heads.
> This requires that you remember which branches are tracking branches and
> which aren't. Hopefully, you remember what all your branches are for,
> but if you track a lot of remote repositories, you might not remember
> what every remote branch is for and what you called it locally.
I think you wanted to mention .git/refs/remotes hierarchy and
separate-remote here, but haven't elaborated yet...
> * Remote tags
>
> TODO: Figure out how remote tags work, under what circumstances
> they are fetched, and what git does if there are conflicts.
refs/tags namespace is not policed at all by git and is treated
as a global namespace, controlled mostly by social convention
that your "upstream" (or central distribution point) supplies
tags for people who use it to synchronize to share. Also, since
there is no guarantee that tags point at commits (v2.6.11-tree
tag is a pointer to a tree object, for example), there is no
fast-forward check performed for them.
The rule we use to autofollow tags currently is:
When you use shorthand fetch (or pull), we find tags that do not
exist locally, and if the objects they point at are already found
in the repository then we fetch them automatically. So for
example, if you are only tracking my 'maint' and not 'master'
nor 'next', and if you have tags up to v1.4.3.2, your "git fetch
origin" would update your copy of 'maint' and bring the commits
reachable from the tip of my 'maint'. After that it notices
that v1.4.3.3, v1.4.3.4, v1.4.3.5 tags are in my repository but
missing from yours. It also notices that now you have
v1.4.3.3^{}, v1.4.3.4^{} and v1.4.3.5^{} in your repository, so
it issues another round of "git fetch" internally to fetch these
three tags. At the same time it would also notice that I have
v1.4.4 tag that you do not have, but v1.4.4^0 commit is not
something you would get by fetching 'maint', so it would not
fetch it automatically.
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
2006-11-16 23:47 ` Junio C Hamano
@ 2006-11-17 1:09 ` Junio C Hamano
2006-11-17 3:17 ` linux
2006-11-17 9:37 ` Jakub Narebski
` (5 subsequent siblings)
7 siblings, 1 reply; 66+ messages in thread
From: Junio C Hamano @ 2006-11-17 1:09 UTC (permalink / raw)
To: linux; +Cc: git
linux@horizon.com writes:
> One outstanding problem with git's man pages is that often the most detail
> is in the command page that was written first, not the user-friendly
> one that you should use.
This is a very important point to remember not for users but for
us in git community. Thanks for writing it down.
> * Git's representation of history
>...
> And then there are commits which have multiple parents. Two is most
> common, but git allows many more. (There's a limit of sixteen in the
> source code, and the most anyone's ever used in real life is 12, and
> that was generally regarded as overdoing it. Google on "doedecapus"
> for discussion of it.)
Dodecapus would find a few, no hits on doedecapus ;-).
> * Deleting branches
>
> "git branch -d <head>" is safe. It deletes the given <head>, but first
> it checks that the commit is reachable some other way. That is, you
> merged the branch in somewhere, or you never did any edits on that branch.
It is not "somewhere" but "in the current branch", so in a sense
it is a bit stricter than that. While on 'master' "branch -d
topic" would not remove the topic branch head if it is not fully
merged to my 'master' so that is a reasonable safety measure,
but when I am on 'next' it will happily remove it. It is
recoverable because it is reachable from 'next', though.
> * Remote tags
>
> TODO: Figure out how remote tags work, under what circumstances
> they are fetched, and what git does if there are conflicts.
One bug in my previous response is that I said we do this only
when the command was invoked with shorthand remote name. Not
so. We do this only when we use tracking branches.
The reason is because 'git pull $url $branch' (typical Linus's
use) and 'git pull' (defaulting to 'origin' and using the
tracking branch mapping stored in .git/remotes/origin prepared
by git-clone) are signs of very different workflows. The former
tends to be a one-shot event while the latter is most often
synchronizing with either an upstream or a common distribution
point (i.e. shared central repository). When you are fetching
from somebody in a one-shot manner, most likely as a part of
'pull', you do not want to get the tag the other person has made
to mark his private work in progress. But in the latter case,
the other end is where everybody who works in the same area
fetches from, and sharing the tags found there among the
developers by default is desirable, and more importantly there
is no risk of accidentally getting private tags, since the other
end is a public distribution point and by definition should not
have private tags that would clutter your refs/tags hierarchy.
> * Exchanging work with other repositories, part II: git-push
>...
> You have to merge the changes locally, and re-try the push when you've
> got a new head that includes the most recently pushed work as an ancestor.
>
> This is exactly like "cvs commit" not working if your recent checkout
> wasn't the (current) tip of the branch, but git can upload more than
> one commit.
>
> The simplest way to resolve the conflict is to merge the remote head with
> your local head. This is easiest if you have different local branches
> for fetching the remote repository and for pushing to it.
>
> That is, you have one head that just tracks the master repository's
> main branch, and another that you add your work to, and push from.
> This makes merging simpler when there are conflicts.
Here you _might_ want to mention an alternative workflow that
uses rebase, which seems to be the way Wine folks run their
project. Talking about all the different possibilities tends to
cloud things and may not add value to the document, so I am just
mentioning it as a possibility but I do not know if talking
about rebase is useful in the context of this document.
> * Merging (finally!)
>
> I went through everything else first because the most common merge case
> is local changes with remote changes. Not that you can't merge two
> branches of your own, but you don't need to do that nearly as often.
>
> The primitive that does the merging is called (guess what?) git-merge.
> And you can use that if you want. If you want to create a so-called
> octopus merge, with more than two parents, you have to.
This is not true; "git pull . topicA topicB topicC" works as
expected. But we probably would not want to even talk about
Octopus in a document like this. It is a curiosity, and
sometimes tends to make histories even less cluttered, but
otherwise it does not add much value.
> However, it's usually easier to use the git-pull wrapper. This merges
> the changes from some other branch into the current HEAD and generates
> a commit message automatically.
>
> git-merge lets you specify the commit message (rather than generating it
> automatically) and use a non-HEAD destination branch, but those options
> are usually more annoying than useful.
I haven't tried for a long time, but I do not think non-HEAD
destination even works at all. It might be better not to even
mention git-merge at this point of the document.
> * How merging operates
>...
> If all three of a given file in O, A, B are different, then the three
> versions are pulled into the index file, called "stage 1", "stage 2",
> and "stage 3", and a merge strategy driver is called to resolve the mess.
> Git then uses the classic line-based three-way merge, looking for isolated
> changes and applying the same rules as for files when two of the source
> files are the same in some range.
You might also want to mention that recursive first 3-way merges
the renames. If O->A renames a path while O->B keeps it, the
resulting stages are written under the new name.
> * When merging goes wrong
>...
> and the result of "git diff" is
>
> diff --cc hello.c
> index 4b7f550,948a5f8..0000000
> --- a/hello.c
> +++ b/hello.c
> @@@ -3,5 -3,6 +3,10 @@@
> int
> main(void)
> {
> ++<<<<<<< HEAD/hello.c
> + printf("Goodbye, cruel world!");
> ++=======
> + printf("Hello, world!\n");
> + return 0;
> ++>>>>>>> edadc53fc7a8aef2a672a4fa9d09aa16f4e14706/hello.c
> }
>
> Notice how this is not a standard diff! It has two columns of diff
> symbols, and shows the difference from each of the ancestors to the
> current hello.c contents. I can also use "git diff -1" to compare
> against the common ancestor, or "-2" or "-3" to compare against each of
> the merged copies individually.
Another tool to help the user decide how the mess should be
sorted out is "git log --merge -- $path". It gives the logs of
commits that touched the path while the two branches were forked.
> * Cherry picking
>...
> You van get the list of commits on a branch with git-log or git-rev-list,
s/van/can/
> * Rebasing
>...
> Git-rebase can also help you divide up work. Suppose you've mixed up
> development of two features in the current HEAD, a branch called "dev".
Ancestry graph before and after this procedure would help the
reader a lot here.
>...
> This will find all patches that are in dev and not in dev1,
> apply them on top of master, and call the result dev2.
> * Experimenting with merging
>...
> $ git status
> Hey, look, lots of interesting stuff. Particularly, see
> # Changed but not updated:
> # (use git-update-index to mark for commit)
> #
> # unmerged: Makefile
> # modified: Makefile
> # unmerged: builtin.h
> # modified: builtin.h
> # unmerged: git.c
> # modified: git.c
>
> The "unmerged" (a.k.a. "staged") files are ones that need manual resolution.
>
> (I notice that update-index.c isn't listed, despite being mentioned
> as a conflict in the message. Can someone explain that?)
They were conflicts during the virtual ancestor computation by
recursive (the merge between 'a' and 'b' commits in your earlier
example). When a virtual ancestor is created, it can textually
have conflicted merge, but that is recorded along with conflict
markers without manual resolving for obvious reasons. If two
branches that use the virtual ancestor modify the conflicted
region the same way (because they needed to resolve that
conflict in their branch), the final 3-way merge that uses the
virtual ancestor as the merge-base will replace that conflicted
region with their changes. This "even conflict markers can be
eliminated by a merge resolution" behaviour is what inspired
git-rerere, by the way.
If you are using this particular commit as an example, you might
also want to tell your readers about:
git show -M 3f69d405
(-M is there to make the output more readable, because this
merge involved a few renames).
* Re: [DRAFT] Branching and merging with git
2006-11-16 23:47 ` Junio C Hamano
@ 2006-11-17 1:13 ` linux
2006-11-17 1:31 ` Junio C Hamano
0 siblings, 1 reply; 66+ messages in thread
From: linux @ 2006-11-17 1:13 UTC (permalink / raw)
To: junkio, linux; +Cc: git
> I find it very balanced to point out the quirks people
> would find difficult and explain why things are so by including
> historical notes in appropriate places when needed.
I'm trying; I've been following git since day 1, so occasionally an
obsolete fact gets stuck in my head.
If anyone has any advice on how and why one would invoke git-merge
directly (the one reason I know is to do a >2-way merge), that would
be appreciated.
> I have finished only the first half because it's not my git day
> today, but so far...
Well, thank you for your time!
>> * Naming revisions
>>...
>> Second, you can refer to a head or tag name. Git looks in the
>> following places, in order, for a head:
>> 1) .git
>> 2) .git/refs
>> 3) .git/refs/heads
>> 4) .git/refs/tags
>
> You might want to check this with the array in sha1_name.c::get_sha1_basic().
> I think tags comes earlier than heads.
Quite right. It's
static const char *fmt[] = {
"%.*s",
"refs/%.*s",
"refs/tags/%.*s",
"refs/heads/%.*s",
"refs/remotes/%.*s",
"refs/remotes/%.*s/HEAD",
NULL
};
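To make that order concrete, here is a minimal sketch of the lookup in shell. It is not git's actual implementation: resolve_ref is a hypothetical helper, it only looks at loose ref files under a given git directory, and it ignores packed refs and abbreviated object names.

```shell
# Hypothetical helper: resolve a short ref name by trying each pattern
# from the fmt[] table above in order and taking the first match.
# Only loose ref files are considered (no packed refs, no SHA-1 prefixes).
resolve_ref() {
    git_dir=$1
    name=$2
    for candidate in \
        "$name" \
        "refs/$name" \
        "refs/tags/$name" \
        "refs/heads/$name" \
        "refs/remotes/$name" \
        "refs/remotes/$name/HEAD"
    do
        if [ -f "$git_dir/$candidate" ]; then
            # Print the full ref name that matched.
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

With both a tag and a head called "v1", "resolve_ref .git v1" prints the tag, which is the practical consequence of tags coming before heads in the table.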
>> 2) Revert changes to a small number of files.
>>
>> git checkout [<revision>] [--] <paths>
>> will copy the version of the <paths> from the index to the working
>> directory. If a <revision> is given, the index for those paths will
>> be updated from the given revision before copying from the index to
>> the working tree.
>>
>> Unlike the version with no <paths> specified, this does NOT update
>> HEAD, even if <paths> is ".".
>
> It's great that you talk correctly about the latest feature-fix
> that is queued for maint but not yet pushed out.
Um... there's a fix in there? I thought that's how it always worked.
> "If there's a side branch which does NOT touch the paths..." I think.
Ah, yes, I added include/scsi to the example to illustrate how
multiple paths worked and didn't update the later paragraph.
>> * Alternate branch naming
>>
>> The original git scheme mixes tracking branches with all the other heads.
>> This requires that you remember which branches are tracking branches and
>> which aren't. Hopefully, you remember what all your branches are for,
>> but if you track a lot of remote repositories, you might not remember
>> what every remote branch is for and what you called it locally.
>
> I think you wanted to mention .git/refs/remotes hierarchy and
> separate-remote here, but haven't elaborated yet...
Yes, sorry. I meant to research that and update this (I've never used
it before), but I forgot.
>> * Remote tags
>>
>> TODO: Figure out how remote tags work, under what circumstances
>> they are fetched, and what git does if there are conflicts.
>
> refs/tags namespace is not policed at all by git and is treated
> as a global namespace, controlled mostly by social convention
> that your "upstream" (or central distribution point) supplies
> tags for people who use it to synchronize to share. Also, since
> there is no guarantee that tags point at commits (v2.6.11-tree
> tag is a pointer to a tree object, for example), there is no
> fast-forward check performed for them.
>
> The rule we use to autofollow tags currently is:
>
> When you use shorthand fetch (or pull), we find tags that do not
> exist locally, and if the object they point at are already found
> in the repository then we fetch them automatically. So for
> example, if you are only tracking my 'maint' and not 'master'
> nor 'next', and if you have tags up to v1.4.3.2, your "git fetch
> origin" would update your copy of 'maint' and bring the commits
> reachable from the tip of my 'maint'. After that it notices
> that v1.4.3.3, v1.4.3.4, v1.4.3.5 tags are in my repository but
> missing from yours. It also notices that now you have
> v1.4.3.3^{}, v1.4.3.4^{} and v1.4.3.5^{} in your repository, so
> it issues another round of "git fetch" internally to fetch these
> three tags. At the same time it would also notice that I have
> v1.4.4 tag that you do not have, but v1.4.4^0 commit is not
> something you would get by fetching 'maint', so it would not
> fetch it automatically.
Ah, okay. Actually, v2.6.11-tree is a tag object
(5dc01c595e6c6ec9ccda4f6f69c131c0dd945f8c) which points
to a tree object (c39ae07f393806ccf406ef966e9a15afc43cc36a).
I was wondering if git only shared refs/tags that pointed to
heavyweight tag objects and not lightweight tags.
That appears to be the case:
mkdir a b
cd a
git-init-db
echo "Hello, world" > hello
git add hello
git commit -m "Initial commit"
git tag light
git tag -a -m "Test tag" heavy
cd ../b
git-init-db
echo "URL: ../a" > .git/remotes/a
echo "Pull: master:origin" >> .git/remotes/a
git fetch a
But! It only fetches tags if you specify a destination branch name.
I hadn't noticed that before, but "git-fetch <url> foo" and
"git-fetch <url> foo:foo" do different things on the receiver.
Didn't they use to be synonyms?
(I think it's a net gain in flexibility.)
Oh! Also, the git-pull man page says that multiple branch names are
allowed, even though the SYNOPSIS line says no.
I also need to mention that if you want to pull a remote tag,
you need to prefix it with "tags/". For some reason, the search
* Re: [DRAFT] Branching and merging with git
2006-11-17 1:13 ` linux
@ 2006-11-17 1:31 ` Junio C Hamano
0 siblings, 0 replies; 66+ messages in thread
From: Junio C Hamano @ 2006-11-17 1:31 UTC (permalink / raw)
To: linux; +Cc: git
linux@horizon.com writes:
> If anyone has any advice on how and why one would invoke git-merge
> directly (the one reason I know is to do a >2-way merge), that would
> be appreciated.
I use "git pull . topicA topicB" for a tetrapus, so that is not
a reason for me, but when the older parts of topicA are ready to
be in 'next' while the later parts are not yet, I often do (on 'next'):
git merge "Merge early part of branch 'topicA'" HEAD topicA~3
Also I used to do
git merge fast HEAD someTopicIknowIsAFastForward
because it felt faster than "git pull . someTopicIknowisAFastForward"
but I do not do that these days and I would not recommend it to anybody.
>>> 2) Revert changes to a small number of files.
>>>
>>> git checkout [<revision>] [--] <paths>
>>> will copy the version of the <paths> from the index to the working
>>> directory. If a <revision> is given, the index for those paths will
>>> be updated from the given revision before copying from the index to
>>> the working tree.
>>>
>>> Unlike the version with no <paths> specified, this does NOT update
>>> HEAD, even if <paths> is ".".
>>
>> It's great that you talk correctly about the latest feature-fix
>> that is queued for maint but not yet pushed out.
>
> Um... there's a fix in there? I thought that's how it always worked.
I do not think naming a directory (say, ".") to mean "revert
everything underneath this directory" worked until the patch I
sent out post 1.4.4 release.
> I also need to mention that if you want to pull a remote tag,
> you need to prefix it with "tags/".
Yes, recent -mm announce message says "git pull ... tag v2.x-mmY".
"tag v2.x-mmy" is a shorthand for "refs/tags/v2.x-mmY:refs/tags/v2.x-mmY"
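As a small illustration of that shorthand, the expansion can be written as a shell function. expand_tag_shorthand is a hypothetical name used only for this sketch; git does the expansion internally and this is not its code.

```shell
# Hypothetical helper: expand the "tag <name>" shorthand into the full
# refspec described above; any other arguments are passed through as-is.
expand_tag_shorthand() {
    if [ "${1-}" = "tag" ] && [ -n "${2-}" ]; then
        echo "refs/tags/$2:refs/tags/$2"
    else
        echo "$*"
    fi
}
```

So "expand_tag_shorthand tag v2.x-mmY" prints the long refspec with the tag name on both sides of the colon, which is why the tag ends up under the same name locally.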
* Re: [DRAFT] Branching and merging with git
2006-11-17 1:09 ` Junio C Hamano
@ 2006-11-17 3:17 ` linux
2006-11-17 5:55 ` Junio C Hamano
0 siblings, 1 reply; 66+ messages in thread
From: linux @ 2006-11-17 3:17 UTC (permalink / raw)
To: junkio, linux; +Cc: git
Overall, thank you. I'm trying to merge all your comments
into the document to make it better, but there are enough that
it's taking me a while.
>> One outstanding problem with git's man pages is that often the most detail
>> is in the command page that was written first, not the user-friendly
>> one that you should use.
>
> This is a very important point to remember not for users but for
> us in git community. Thanks for writing it down.
There's a great example coming up, in the git-show
example you gave me. That's a very sparse man page...
> Dodecapus would find a few, no hits on doedecapus ;-).
Wups, thanks.
>> * Deleting branches
>>
>> "git branch -d <head>" is safe. It deletes the given <head>, but first
>> it checks that the commit is reachable some other way. That is, you
>> merged the branch in somewhere, or you never did any edits on that branch.
>
> It is not "somewhere" but "in the current branch", so in a sense
> it is a bit stricter than that. While on 'master' "branch -d
> topic" would not remove the topic branch head if it is not fully
> merged to my 'master' so that is a reasonable safety measure,
> but when I am on 'next' it will happily remove it. It is
> recoverable because it is reachable from 'next', though.
Oh! Thanks for the info. The limitation makes a certain
amount of sense, and as I'd never run into it, I'm not going to
complain.
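A minimal sketch of the check as described above, assuming it amounts to "the branch tip is an ancestor of the current HEAD". is_merged_into_head is a hypothetical helper, not what "git branch -d" literally runs, and real git may apply additional rules.

```shell
# Hypothetical helper: succeed if <branch> is fully merged into the
# current HEAD, i.e. the merge base of the two is the branch tip itself.
is_merged_into_head() {
    branch=$1
    # Resolve the branch to a commit ID; exit status 2 if it does not exist.
    tip=$(git rev-parse --verify "$branch^{commit}") || return 2
    # If there is no common ancestor at all, the branch is not merged.
    base=$(git merge-base HEAD "$tip") || return 1
    [ "$base" = "$tip" ]
}
```

Run on 'master' with an unmerged topic it fails; run on a branch that already contains the topic it succeeds, which matches the 'master' versus 'next' behaviour described in the quoted text.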
> The reason is because 'git pull $url $branch' (typical Linus's
> use) and 'git pull' (defaulting to 'origin' and using the
> tracking branch mapping stored in .git/remotes/origin prepared
> by git-clone) are sign of very different workflows. The former
> tends to be a one-shot event while the latter is most often
> synchronizing with either an upstream or a common distribution
> point (i.e. shared central repository). When you are fetching
> from somebody in a one-shot manner, most likely as a part of
> 'pull', you do not want to get the tag the other person has made
> to mark his private work in progress. But in the latter case,
> the other end is where everybody who works in the same area
> fetches from, and sharing the tags found there among the
> developers by default is desirable, and more importantly there
> is no risk of accidentally getting private tags, since the other
> end is a public distribution point and by definition should not
> have private tags that would clutter your refs/tags hierarchy.
I'll work this in somehow, thanks.
> Here you _might_ want to mention an alternative workflow that
> uses rebase, which seems to be the way Wine folks run their
> project. Talking about all the different possibilities tends to
> cloud things and may not add value to the document, so I am just
> mentioning it as a possibility but I do not know if talking
> about rebase is useful in the context of this document.
Done, thanks.
>> The primitive that does the merging is called (guess what?) git-merge.
>> And you can use that if you want. If you want to create a so-called
>> octopus merge, with more than two parents, you have to.
>
> This is not true; "git pull . topicA topicB topicC" works as
> expected. But we probably would not want to even talk about
> Octopus in a document like this. It is a curiosity, and
> sometimes tends to make histories even less cluttered, but
> otherwise it does not add much value.
Sorry; the SYNOPSIS line for git-pull had me fooled.
>> However, it's usually easier to use the git-pull wrapper. This merges
>> the changes from some other branch into the current HEAD and generates
>> a commit message automatically.
>>
>> git-merge lets you specify the commit message (rather than generating it
>> automatically) and use a non-HEAD destination branch, but those options
>> are usually more annoying than useful.
>
> I haven't tried for a long time, but I do not think non-HEAD
> destination even works at all. It might be better not to even
> mention git-merge at this point of the document.
Well, I want to at least say "you'd think so, wouldn't you?"
>> * How merging operates
>
> You might also want to mention that recursive first 3-way merges
> the renames. If O->A renames a path while O->B keeps it, the
> resulting stages are written under the new name.
Thanks! I wondered how that happened!
>> * When merging goes wrong
> Another tool to help the user decide how the mess should be
> sorted out is "git log --merge -- $path". It gives the logs of
> commits that touched the path while the two branches were forked.
The things I never knew about...
>> Git-rebase can also help you divide up work. Suppose you've mixed up
>> development of two features in the current HEAD, a branch called "dev".
>
> Ancestry graph before and after this procedure would help the
> reader a lot here.
I figured the (excellent) pictures in git-rebase would save me the
trouble, but yeah, I suppose so.
> They were conflicts during the virtual ancestor computation by
> recursive (the merge between 'a' and 'b' commits in your earlier
> example). When a virtual ancestor is created, it can textually
> have conflicted merge, but that is recorded along with conflict
> markers without manual resolving for obvious reasons. If two
> branches that use the virtual ancestor modifies the conflicted
> region the same way (because they needed to resolve that
> conflict in their branch), the final 3-way merge that uses the
> virtual ancestor as the merge-base will replace that conflicted
> region with their changes. This "even conflict markers can be
> eliminated by a merge resolution" behaviour is what inspired
> git-rerere, by the way.
Cool, thanks! I added mention of this.
> If you are using this particular commit as an example, you might
> also want to tell your readers about:
>
> git show -M 3f69d405
>
> (-M is there to make the output more readable, because this
> merge involved a few renames).
I'm wondering what the heck that does! I get a super-short diff with
no mention of any renames at all. Is this passed on to git-diff-tree?
What does "detect renames" mean if it doesn't tell me about them?
I'm actually confused.
diff --cc builtin-read-tree.c
index ec40d01,99e7c75..716f792
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@@ -9,9 -9,9 +9,10 @@@
#include "object.h"
#include "tree.h"
+ #include "cache-tree.h"
#include <sys/time.h>
#include <signal.h>
+#include "builtin.h"
static int reset = 0;
* Re: [DRAFT] Branching and merging with git
2006-11-17 3:17 ` linux
@ 2006-11-17 5:55 ` Junio C Hamano
0 siblings, 0 replies; 66+ messages in thread
From: Junio C Hamano @ 2006-11-17 5:55 UTC (permalink / raw)
To: linux; +Cc: git
linux@horizon.com writes:
>>> * When merging goes wrong
>
>> Another tool to help the user decide how the mess should be
>> sorted out is "git log --merge -- $path". It gives the logs of
>> commits that touched the path while the two branches were forked.
>
> The things I never knew about...
Of course "git log -p --merge -- $path" would give the patch
text as well.
>> If you are using this particular commit as an example, you might
>> also want to tell your readers about:
>>
>> git show -M 3f69d405
>>
>> (-M is there to make the output more readable, because this
>> merge involved a few renames).
>
> I'm wondering what the heck that does! I get a super-short diff with
> no mention of any renames at all. Is this passed on to git-diff-tree?
> What does "detect renames" mean if it doesn't tell me about them?
> I'm actually confused.
"show $merge" is really "diff-tree --cc -p $merge". So first I
should talk about the three ways to describe a merge commit with
textual diffs (which is not to say that you should pass all of
this on to the readers of this document).
(1) N independent diffs between each of the parents and the child.
We could get this with
git diff-tree -m -p $merge
but it is mostly useless, because very often many paths in a
merge are truly trivial and the version from one of the
parents is taken verbatim, whole file. When looking at the
development history, the real reason of the change is found
in earlier log for that parent, and not in the merge in
question.
The diff between the child and its first parent is somewhat
useful, because it represents the damage inflicted on his
branch the person saw when he made the merge. For this
reason, "--stat" gives the graph for the first-parent diff
for a merge. But otherwise "diff-tree -p" by default stays
silent about merges because it is not that useful, and that
is why the above asks for the "useless output ;-)" with an
explicit "-m".
(2) Uncompressed "combined diff" between all parents and the child.
We can get this with:
git diff-tree -c -p $merge
This gives a combined diff that shows all the files on which
the parents and the child disagreed (in other words, if the
resulting file matches one of the parents verbatim, it is not shown).
This is already useful by reducing the clutter of truly
trivial merges, compared to (1) above, but most clean merges
take either first or second parent's version verbatim for
each hunk (but not necessarily taking all hunks from the
same parent) and these hunks are not very interesting.
Because "-c" explicitly tells something special to be done
for a merge, you do not need to say "-m" for the above
command (giving -m does not hurt, but is not necessary).
Uncompressed combined diff is still a per-file affair, so by
default (without "-p") the output would be in the "raw" format,
and that is why the above command still says "-p".
(3) Compressed "combined diff". This is what "show" gives by
default, and we can get this with:
git diff-tree --cc $merge
The difference from "-c -p" output is that this reduces the
clutter further by dropping uninteresting hunks from the
output. If all the changes in a hunk are from only one
parent, or the changes are the same from all but one parent,
the hunk is dropped from the output (looking at dodecapus
with --cc is interesting for this reason).
Like "-c", this explicitly asks for magic to be done on a
merge, so "-m" is implied. Unlike "-c", this operation is
per-hunk and "raw" format (which is inherently per file)
does not make any sense. Because "raw" is impossible, "-p"
is also implied.
Both (2) and (3) are "combined". It combines the diffs with
each parent; for N-parent merge, it combines N diffs into one.
What -M/-C does is to see which path in each parent is used to
diff against a path in the child. For example, the stat part
of:
git diff-tree --stat --cc -p -M 3f69d405
shows us that builtin-apply.c had a few insertions and deletions
(remember, this is the diff between the first parent and the
child -- "damage given to the first parent due to this merge").
If you run the above without -M, you will see a huge combined
diff for builtin-apply.c because the second parent (i.e. the
branch that was merged) did not have builtin-apply.c -- it still
had the file under its old name, apply.c.
So what ends up getting combined without -M is diff between
builtin-apply.c of the first parent and the child, and diff
between /dev/null and builtin-apply.c of the child.
But with -M, what it combines is diff between builtin-apply.c of
the first parent and the child, and diff between apply.c of the
second parent and builtin-apply.c of the child. This obviously
produces a lot more reasonable output -- actually the merge for
this particular path is so textually trivial that it does not even
show in the --cc output. You can still view what it was by
looking at:
git diff-tree -c -p -M 3f69d405
Side note: I am not sure if the --cc hunk droppage logic is
doing the right thing for the first hunk for builtin-apply.c
case. I think it is an "interesting" hunk but somehow --cc
output does not show it. The second and later hunks are
definitely uninteresting, though.
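The three views described above can be compared side by side with a small wrapper. merge_views is a hypothetical helper name; the diff-tree options it passes are the ones listed in (1), (2) and (3), and any extra arguments (for example -M) are handed through unchanged.

```shell
# Hypothetical helper: print the three descriptions of a merge commit,
# from most to least verbose.
# Usage: merge_views <merge-commit> [extra diff options, e.g. -M]
merge_views() {
    merge=$1
    shift
    echo "== (1) one diff per parent (-m -p) =="
    git diff-tree -m -p "$@" "$merge"
    echo "== (2) combined diff (-c -p) =="
    git diff-tree -c -p "$@" "$merge"
    echo "== (3) compressed combined diff (--cc) =="
    git diff-tree --cc "$@" "$merge"
}
```

Running "merge_views 3f69d405" and then "merge_views 3f69d405 -M" in a git.git clone should make the effect of rename detection on each view easy to see.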
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
2006-11-16 23:47 ` Junio C Hamano
2006-11-17 1:09 ` Junio C Hamano
@ 2006-11-17 9:37 ` Jakub Narebski
2006-11-17 9:41 ` Jakub Narebski
` (4 subsequent siblings)
7 siblings, 0 replies; 66+ messages in thread
From: Jakub Narebski @ 2006-11-17 9:37 UTC (permalink / raw)
To: git
linux@horizon.com wrote:
> Either way, they're just a 41-byte file that contains a 40-byte hex
> object ID, plus a newline. Tags are stored in .git/refs/tags, and heads
> are stored in .git/refs/heads. Creating a new branch is literally just
> picking a file name and writing the ID of an existing commit into it.
This is an implementation detail, and is not true in a repository with
packed refs, although usually (by default) only tags are packed.
But it remains true that a ref (be it a branch or a tag) is just a name and an ID.
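A minimal sketch of what "loose file or packed" means for a reader of the repository, assuming the packed-refs layout of one "<id> <refname>" pair per line. read_ref is a hypothetical helper and does not follow symbolic refs.

```shell
# Hypothetical helper: look up the object ID for a full ref name,
# checking the loose file first and then the packed-refs file.
read_ref() {
    git_dir=$1
    ref=$2
    if [ -f "$git_dir/$ref" ]; then
        # Loose ref: the file holds the 40-character ID plus a newline.
        cat "$git_dir/$ref"
        return 0
    fi
    if [ -f "$git_dir/packed-refs" ]; then
        # Packed refs: lines starting with '#' (header) or '^' (peeled
        # tag target) carry no ref name and are skipped.
        while read -r id name; do
            case $id in '#'*|'^'*) continue ;; esac
            if [ "$name" = "$ref" ]; then
                echo "$id"
                return 0
            fi
        done < "$git_dir/packed-refs"
    fi
    return 1
}
```

The loose file is consulted first on purpose: a ref that has been updated since packing has a fresh loose file, and the packed entry may be stale.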
> The git programs enforce the immutability of tags, but that's a safety
> feature, not something fundamental. You can rename a tag to the heads
> directory and go wild.
You can have only refs to commit objects in the heads directory (and I hope
this is verified by fsck-objects); in the tags directory you can have refs
to tag objects (heavyweight tags), to commits (lightweight tags), to blobs
(for example a public PGP key used for signing tags), and to trees (I guess unused).
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
` (2 preceding siblings ...)
2006-11-17 9:37 ` Jakub Narebski
@ 2006-11-17 9:41 ` Jakub Narebski
2006-11-17 10:37 ` Jakub Narebski
` (3 subsequent siblings)
7 siblings, 0 replies; 66+ messages in thread
From: Jakub Narebski @ 2006-11-17 9:41 UTC (permalink / raw)
To: git
linux@horizon.com wrote:
> There is always a current head, known as HEAD. (This is actually a
> symbolic link, .git/HEAD, to a file like refs/heads/master.)
Usually this is a symref, not a symlink, i.e. .git/HEAD (or rather
$GIT_DIR/HEAD) is a file which contains a single line like this:
ref: refs/heads/master
There is talk about relaxing the HEAD restriction to allow it to contain
a ref to a tag, or a bare SHA1 id for "seeking"; you would be forbidden to
commit in such a state.
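A minimal sketch of telling the two forms apart by reading the file, assuming the "ref: " prefix shown above. describe_head is a hypothetical helper; for an old-style symlink HEAD it would simply read through the link and report the ID.

```shell
# Hypothetical helper: report what a HEAD file contains.
# Prints "symref <target>" for a symbolic ref, "id <value>" otherwise.
describe_head() {
    head_file=$1
    # Read the first line; tolerate a missing trailing newline.
    IFS= read -r line < "$head_file" || [ -n "$line" ]
    case $line in
        "ref: "*) echo "symref ${line#ref: }" ;;
        *)        echo "id $line" ;;
    esac
}
```

For a freshly initialized repository, "describe_head .git/HEAD" would be expected to print "symref refs/heads/master".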
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
` (3 preceding siblings ...)
2006-11-17 9:41 ` Jakub Narebski
@ 2006-11-17 10:37 ` Jakub Narebski
2006-11-17 15:32 ` Theodore Tso
` (2 subsequent siblings)
7 siblings, 0 replies; 66+ messages in thread
From: Jakub Narebski @ 2006-11-17 10:37 UTC (permalink / raw)
To: git
linux@horizon.com wrote:
> * Remotes files
>
> You can specify what to fetch on the git-fetch command line. However,
> if you intend to monitor another repository on an ongoing basis,
> it's generally easier to set up a short-cut by placing the options in
> .git/remotes/<name>.
You can also set this up in the config file (remote and branch sections)
in modern git.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
` (4 preceding siblings ...)
2006-11-17 10:37 ` Jakub Narebski
@ 2006-11-17 15:32 ` Theodore Tso
2006-11-17 15:57 ` Sean
2006-11-17 18:21 ` J. Bruce Fields
2006-11-17 17:44 ` [DRAFT] Branching and merging with git J. Bruce Fields
2007-01-03 17:04 ` Theodore Tso
7 siblings, 2 replies; 66+ messages in thread
From: Theodore Tso @ 2006-11-17 15:32 UTC (permalink / raw)
To: linux; +Cc: git
On Thu, Nov 16, 2006 at 05:17:01PM -0500, linux@horizon.com wrote:
> I know it took me a while to get used to playing with branches, and I
> still get nervous when doing something creative. So I've been trying
> to get more comfortable, and wrote the following to document what I've
> learned.
>
> It's a first draft - I just finished writing it, so there are probably
> some glaring errors - but I thought it might be of interest anyway.
This is really, really good stuff that you've written! Have you any
thoughts or suggestions about where this text should end up?
Personally, I think this information is actually more important to an
end-user than the current "part two" of the tutorial, which discusses
the object database and the index file. Perhaps this should be "part
2", and the object database and index file should become "part 3"?
It might also be good to consider moving some of the "discussion"
portion of the top-level git(7) man page into the object database and
index file discussion. Right now, the best way to introduce git's
concepts (IMHO), is to start with part 1 of the tutorial, then go
into your draft branch/merging with git, then the current part 2
of the tutorial, and then direct folks to read the "discussion"
section of git(7). Only then do they really have enough background
understanding of the fundamental concepts of git that they won't get
confused when they start talking to other git users, on the git
mailing list, for example.
It would be nice if there was an easy way to direct users through the
documentation in a way which makes good pedagogical sense. Right now,
one of the reasons why life gets hard for new users is that the
current tutorials aren't enough for them to really understand what's
going on at a conceptual level. And if users start using "everyday
git" as a crutch, without the right background concepts, the human
brain naturally tries to intuit what's happening in the background,
but without reading the background docs, git is different enough that
they will probably get it wrong, which means more stuff that they have
to unlearn later.
> * Git's representation of history
>
> As you recall from Git 101, there are exactly four kinds of objects in
> Git's object database. All of them have globally unique 40-character hex
> names made by hashing their type and contents. Blob objects record file
> contents; they contain bytes. Tree objects record directory contents;
> they contain file names, permissions, and the associated tree or blob
> object names. Tag objects are shareable pointers to other objects;
> they're generally used to store a digital signature.
Hmm... this assumes that you've read the Git(7) discussion first.
There is enough information here though that maybe you don't need to
say "as you recall". It might be enough to give a quick summary of
the concepts that are needed to understand the rest of your tutorial,
and then point to git(7) Discussion section for people who need to
learn more details.
> * Remotes files
>
> Note that branches to fetch are identified by "Pull: " lines in the
> remotes file. This is another example of the fetch/pull confusion.
> git-pull will be explained eventually.
Maybe we should change git so that a "Fetch: " line in the remotes
file works the same way as "Pull: ", and then recommend that people
use "Fetch: " in order to reduce confusion, as opposed to simply
explaining it away as "yet another example of the historical
fetch/pull confusion"?
Thanks,
* Re: [DRAFT] Branching and merging with git
2006-11-17 15:32 ` Theodore Tso
@ 2006-11-17 15:57 ` Sean
2006-11-17 16:19 ` Nguyen Thai Ngoc Duy
2006-11-17 18:21 ` J. Bruce Fields
1 sibling, 1 reply; 66+ messages in thread
From: Sean @ 2006-11-17 15:57 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux, git, Petr Baudis
On Fri, 17 Nov 2006 10:32:46 -0500
Theodore Tso <tytso@mit.edu> wrote:
> It would be nice if there was an easy way to direct users through the
> documentation in a way which makes good pedagogical sense. Right now,
> one of the reasons why life gets hard for new users is that the
> current tutorials aren't enough for them to really understand what's
> going on at a conceptual level. And if users start using "everyday
> git" as a crutch, without the right background concepts, the human
> brain naturally tries to intuit what's happening in the background,
> but without reading the background docs, git is different enough that
> they will probably get it wrong, which means more stuff that they have
> to unlearn later.
It would be nice to post this information on the Git website and not
have it overshadowed by Cogito examples with paragraphs explaining how
Cogito makes things easier. The current website distracts users away
from learning Git or ever reading about this kind of information.
Maybe we can pass a hat around for some funds for a separate Cogito
website. ;o)
> Maybe we should change git so that a "Fetch: " line in the remotes
> file works the same way as "Pull: ", and then recommend that people
> use "Fetch: " in order to reduce confusion, as opposed to simply
> explaining it away as "yet another example of the historical
> fetch/pull confusion"?
That's quite a good idea. The name was fixed when the option to move
this info into the config file was added (remote.<name>.fetch). So
another option would be to show new users the config file method and
just damn the remotes file to a historical footnote.
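For anyone moving an existing setup over, here is a rough sketch of the mapping. remotes_to_config is a hypothetical helper that only prints the "git config" commands for review rather than running them. It copies refspecs verbatim, so whether your git version accepts a short form such as "master:origin" in the config file is something to check, and it does no quoting of unusual URLs.

```shell
# Hypothetical helper: print the "git config" commands that would record
# the same information as an old-style .git/remotes/<name> file.
remotes_to_config() {
    name=$1
    file=$2
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            "URL: "*)  echo "git config remote.$name.url ${line#URL: }" ;;
            "Pull: "*) echo "git config --add remote.$name.fetch ${line#Pull: }" ;;
            "Push: "*) echo "git config --add remote.$name.push ${line#Push: }" ;;
        esac
    done < "$file"
}
```

Note how the "Pull: " line becomes a "fetch" key, which is the naming fix mentioned above.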
* Re: [DRAFT] Branching and merging with git
2006-11-17 15:57 ` Sean
@ 2006-11-17 16:19 ` Nguyen Thai Ngoc Duy
2006-11-17 16:25 ` Marko Macek
` (2 more replies)
0 siblings, 3 replies; 66+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-11-17 16:19 UTC (permalink / raw)
To: Sean; +Cc: git, Petr Baudis
On 11/17/06, Sean <seanlkml@sympatico.ca> wrote:
> It would be nice to post this information on the Git website and not
> have it overshadowed by Cogito examples with paragraphs explaining how
> Cogito makes things easier. The current website distracts users away
> from learning Git or ever reading about this kind of information.
> Maybe we can pass a hat around for some funds for a separate Cogito
> website. ;o)
Or.. find a way to merge cogito back to git :-)
/me runs into a nearest bush.
--
* Re: [DRAFT] Branching and merging with git
2006-11-17 16:19 ` Nguyen Thai Ngoc Duy
@ 2006-11-17 16:25 ` Marko Macek
2006-11-17 16:33 ` Petr Baudis
2006-11-17 16:34 ` Sean
[not found] ` <20061117113404.810fd4ea.seanlkml@sympatico.ca>
2 siblings, 1 reply; 66+ messages in thread
From: Marko Macek @ 2006-11-17 16:25 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git, Petr Baudis
Nguyen Thai Ngoc Duy wrote:
> On 11/17/06, Sean <seanlkml@sympatico.ca> wrote:
>> It would be nice to post this information on the Git website and not
>> have it overshadowed by Cogito examples with paragraphs explaining how
>> Cogito makes things easier. The current website distracts users away
>> from learning Git or ever reading about this kind of information.
>> Maybe we can pass a hat around for some funds for a separate Cogito
>> website. ;o)
>
> Or.. find a way to merge cogito back to git :-)
> /me runs into a nearest bush.
I agree, this would certainly be the best solution. But it would imply
hiding the 'index' by default, which would probably be an incompatible change.
The alternative would be to explain that git is a low level tool suitable
mostly for integrators like Linus (that, and that Cogito and/or StGit should
be used by developers/contributors).
* Re: [DRAFT] Branching and merging with git
2006-11-17 16:25 ` Marko Macek
@ 2006-11-17 16:33 ` Petr Baudis
0 siblings, 0 replies; 66+ messages in thread
From: Petr Baudis @ 2006-11-17 16:33 UTC (permalink / raw)
To: Marko Macek; +Cc: Nguyen Thai Ngoc Duy, git
On Fri, Nov 17, 2006 at 05:25:25PM CET, Marko Macek wrote:
> Nguyen Thai Ngoc Duy wrote:
> >On 11/17/06, Sean <seanlkml@sympatico.ca> wrote:
> >>It would be nice to post this information on the Git website and not
> >>have it overshadowed by Cogito examples with paragraphs explaining how
> >>Cogito makes things easier. The current website distracts users away
> >>from learning Git or ever reading about this kind of information.
> >>Maybe we can pass a hat around for some funds for a separate Cogito
> >>website. ;o)
> >
> >Or.. find a way to merge cogito back to git :-)
> >/me runs into a nearest bush.
I think we have been trying to figure that out over the last few days in
those mammoth threads. UI-wise with no big breakthroughs so far I guess,
though.
> The alternative would be to explain that git is a low level tool suitable
> mostly for integrators like Linus (that, and that Cogito and/or StGit
> should be used by developers/contributors).
This is in essence what many people (including Junio) are saying. I'm
not saying it's a totally great situation, hence the previous paragraph.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
* Re: [DRAFT] Branching and merging with git
2006-11-17 16:19 ` Nguyen Thai Ngoc Duy
2006-11-17 16:25 ` Marko Macek
@ 2006-11-17 16:34 ` Sean
[not found] ` <20061117113404.810fd4ea.seanlkml@sympatico.ca>
2 siblings, 0 replies; 66+ messages in thread
From: Sean @ 2006-11-17 16:34 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git, Petr Baudis
On Fri, 17 Nov 2006 23:19:23 +0700
"Nguyen Thai Ngoc Duy" <pclouds@gmail.com> wrote:
> Or.. find a way to merge cogito back to git :-)
> /me runs into a nearest bush.
Pasky has already given a lot to Git, and it would be great to see even
more merged back into Git where a consensus can be reached. In fact
Pasky has said that his plan is to push a lot more towards Git and
make Cogito a thinner UI layer. Either way, there's absolutely nothing
wrong with people choosing to use Cogito rather than Git. It's just
that the separate Cogito tool shouldn't have a place on the Git website
any more prominent than say StGit does.
The Git website should be a place where Git makes the best case
it can for _itself_, not for its sister tools. It's a distraction
and gets in the way of promoting Git as a stand alone tool. At
least one new user has complained that it was confusing.
Personally I have nothing against Cogito, I just think Pasky should
separate his role as Git webmaster from his role as Cogito author.
If people have good ideas for Git documentation, the website would be
a natural place for it, and it shouldn't have to compete with Cogito
tutorials etc.
* Re: [DRAFT] Branching and merging with git
[not found] ` <20061117113404.810fd4ea.seanlkml@sympatico.ca>
@ 2006-11-17 16:53 ` Petr Baudis
2006-11-17 17:01 ` Sean
[not found] ` <20061117120154.3eaf5611.seanlkml@sympatico.ca>
0 siblings, 2 replies; 66+ messages in thread
From: Petr Baudis @ 2006-11-17 16:53 UTC (permalink / raw)
To: Sean; +Cc: Nguyen Thai Ngoc Duy, git
On Fri, Nov 17, 2006 at 05:34:04PM CET, Sean wrote:
> It's just that the separate Cogito tool shouldn't have a place on the
> Git website any more prominent than say StGit does.
It doesn't - look at the "Maintaining external patches" crash course.
Porcelains are an integral part of the Git environment. I think several
people have already tried to explain it before.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
* Re: [DRAFT] Branching and merging with git
2006-11-17 16:53 ` Petr Baudis
@ 2006-11-17 17:01 ` Sean
[not found] ` <20061117120154.3eaf5611.seanlkml@sympatico.ca>
1 sibling, 0 replies; 66+ messages in thread
From: Sean @ 2006-11-17 17:01 UTC (permalink / raw)
To: Petr Baudis; +Cc: Nguyen Thai Ngoc Duy, git
On Fri, 17 Nov 2006 17:53:33 +0100
Petr Baudis <pasky@suse.cz> wrote:
> On Fri, Nov 17, 2006 at 05:34:04PM CET, Sean wrote:
> > It's just that the separate Cogito tool shouldn't have a place on the
> > Git website any more prominent than say StGit does.
>
> It doesn't - look at the "Maintaining external patches" crash course.
>
> Porcelains are integral part of the Git environment. I think several
> people have already tried to explain it before.
>
There is enough native Git documentation and hopefully more coming
that third party tools should be pushed behind the scenes a bit.
At least on the GIT website.
Of course there is nothing wrong with having information there, but
the main thrust should be about Git and how to use it directly without
porcelains. Especially in the light that people have recently
expressed a desire to advocate and document the use of native Git
more strongly.
Having a link to Cogito off the front page of the Git website that
says... Cogito makes things "easier", no matter how much you
personally believe it, isn't the way everyone feels and is at
odds with the native-git message and improvement effort.
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
` (5 preceding siblings ...)
2006-11-17 15:32 ` Theodore Tso
@ 2006-11-17 17:44 ` J. Bruce Fields
2006-11-17 18:16 ` Jakub Narebski
2007-01-03 17:04 ` Theodore Tso
7 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2006-11-17 17:44 UTC (permalink / raw)
To: linux; +Cc: git
This has some useful material that fills gaps in the existing
documentation. We need to think a little more about the intended
audience, and about how to fit it in with existing documentation.
On Thu, Nov 16, 2006 at 05:17:01PM -0500, linux@horizon.com wrote:
> * A brief digression on command names.
>
> Originally, all git commands were named "git-foo". When there got to
> be over a hundred, people started complaining about the clutter in
> /usr/bin. After some discussion, the following solution was reached:
>
> - It's now possible to place all of the git-foo commands into a separate
> directory. (Despite the complaints, not too many people are doing it
> yet.)
> - One option for git users is to add that directory to their $PATH.
> - Another is provided by a wrapper called just "git". It's intended to
> live in a public directory like /usr/bin, and knows the location of
> the separate directory. When you type "git foo", it finds and executes
> "git-foo".
> - Some simple commands are built into the git wrapper. When you type
> "git add", it just does it internally. (On the git mailing list,
> you will see patches like "make git diff a builtin"; this is what
> they're talking about.)
> - For compatibility, for each builtin, there is a "git-add" file,
> which is just a link to the "git" wrapper. It looks at the name it
> was invoked as to figure out what it should do.
>
> The one confusing thing is that, although people usually type "git foo"
> in examples, they're interchangeable in practice. I go back and forth
> for no good reason. The main caveat is that to get the man page, you
> still need to type "man git-foo". Fortunately, there are two other ways
> to get the man page:
>
> 1) "git help foo"
> 2) "git foo --help"
>
> Git doesn't have a specialized built-in help system; it just shows you
> the man pages.
Who's the audience for the above? I can see that it's useful for
administrators, who may need help deciding how to install stuff, and for
developers, who need to know where the heck the code for "git-add" came
from. But the case I'm most interested in is the user whose
distribution installs git for them, in which case I think the above
could be distilled down to:
- "git-foo" and "git foo" can be used interchangeably.
- Documentation for the command foo is available from any of
- man git-foo
- git help foo
- git foo --help
Then the additional details above could be postponed to a later part of
the documentation.
> One outstanding problem with git's man pages is that often the most detail
> is in the command page that was written first, not the user-friendly
> one that you should use. For example, there are a number of special
> cases of the "git diff" command that were written first, and the man
> pages for these commands (git-diff-index, git-diff-files, git-diff-tree,
> and git-diff-stages) are considerably more informative than the page for
> plain git-diff, even though that's the command that you should use 99%
> of the time.
I agree that that's helpful. Though we should probably also be working
on the man pages to make this organization clearer.
> As you recall from Git 101
Obviously a more specific reference would be more useful here--if
there's nothing useful to point to among the existing documentation, we
should figure out how to fix that problem.
That might also remove the need for some of the recap that follows.
> there are exactly four kinds of objects in
> Git's object database. All of them have globally unique 40-character hex
....
> Finally, there are references, stored in the .git/refs directory.
> These are the human-readable names associated with commits, and the
> "root set" from which all other commits should be reachable.
This is good; a comprehensive discussion of references will fill a gap
in the current documentation.
....
> * Naming revisions
>
> CVS encourages you to tag like crazy, because the only other way to
> find a given revision is by date. Git makes it a lot easier, so most
> revisions don't need names.
>
> You can find a full description in the git-rev-parse man page, but here's
> a summary.
This has a lot more overlap with existing documentation. The extra
detail is useful, but we need to decide what our audience and goal is
here, to decide exactly what niche we're trying to fill between the
brief stuff that's in the tutorial part I and the details in
"man git-rev-parse".
> * Converting between names
>
> Git has two helpers (programs designed mainly for use in shell scripts)
> to convert between global object IDs and human-readable names.
>
> The first is git-rev-parse. This is a general git shell script helper,
> which validates the command line and converts object names to absolute
> object IDs. Its man page has a detailed description of the object
> name syntax.
>
> The second is git-name-rev, which converts the other way around. It's
> particularly useful for seeing which tags a given commit falls between.
Also discuss git-describe?
> * The three uses of "git checkout"
Obviously there's a lot of overlap here with "man git-checkout". What's
the goal here? Maybe this should just be worked in to a revision of
that man page?
> * Deleting branches
>
> "git branch -d <head>" is safe. It deletes the given <head>, but first
> it checks that the commit is reachable some other way. That is, you
> merged the branch in somewhere, or you never did any edits on that branch.
It only checks whether the head of the branch to delete is reachable
from the *current* branch. The man page could be clearer here.
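A minimal sketch of that behaviour, assuming a throwaway repository in
which "topic" has no upstream configured. The branch name, file name and
commit messages are all made up for illustration:

```shell
#!/bin/sh
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

echo base > file.txt
git add file.txt
git commit -q -m "base"
start=$(git symbolic-ref --short HEAD)   # whatever the initial branch is called

# Make a commit that exists only on "topic".
git checkout -q -b topic
echo change >> file.txt
git commit -q -a -m "topic work"
git checkout -q "$start"

# The topic head is not reachable from the current branch, so -d refuses.
if git branch -d topic >/dev/null 2>&1; then
    unmerged_delete=allowed
else
    unmerged_delete=refused
fi

# After the merge the topic head is reachable from the current branch.
git merge -q topic
if git branch -d topic >/dev/null 2>&1; then
    merged_delete=allowed
else
    merged_delete=refused
fi

echo "before merge: $unmerged_delete, after merge: $merged_delete"
```

Note that the check is made from wherever you happen to be standing: the
same "git branch -d topic" run from a different branch that has not merged
topic would still refuse.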
....
> * Examining history: git-log and git-rev-list
Yep, we should definitely have a good long chapter just devoted to
history examination. Most of it could be just cool examples, so it
would be fun.
Note some of this is done in the last half of cvs-migration.txt; we
should mine that section for whatever's useful and then replace by a
reference to the new chapter.
> * History diagrams
...
> * Trivial merges: fast-forward and already up-to-date.
These two sections are useful, yep.
> * Exchanging work with other repositories, part II: git-push
There's a lot of overlap here with cvs-migration.txt. Maybe some better
organization is needed to make that more prominent.
> The details are too advanced for this discussion, but the default
> "recursive" merge strategy that git uses solves the answer by merging
> a and b into a temporary commit and using *that* as the merge base.
I'm tempted to ignore any description of the merge strategy, or postpone
it till later; as a first pass I think it's better just to say "obvious
cases will be handled automatically, and you'll be prompted for
comments." Only other SCM developers are going to wonder how you handle
the corner cases.
> * When merging goes wrong
But yes, I think people could use more help on how to resolve merges.
> * Test merging
...
> * Cherry picking
...
> * Rebasing
Yup, I agree that that's good material to cover together.
* Re: [DRAFT] Branching and merging with git
2006-11-17 17:44 ` [DRAFT] Branching and merging with git J. Bruce Fields
@ 2006-11-17 18:16 ` Jakub Narebski
0 siblings, 0 replies; 66+ messages in thread
From: Jakub Narebski @ 2006-11-17 18:16 UTC (permalink / raw)
To: git
J. Bruce Fields wrote:
> This has some useful material that fills gaps in the existing
> documentation. We need to think a little more about the intended
> audience, and about how to fit it in with existing documentation.
>
> On Thu, Nov 16, 2006 at 05:17:01PM -0500, linux@horizon.com wrote:
>> * A brief digression on command names.
> But the case I'm most interested in is the user whose
> distribution installs git for them, in which case I think the above
> could be distilled down to:
>
> - "git-foo" and "git foo" can be used interchangeably.
But it is encouraged (also for example by git-completion.bash) to use the
"git foo" form on the command line (because git commands might not be in
the PATH, although usually they are), and the "git-foo" form in scripts
(if possible).
>> The details are too advanced for this discussion, but the default
>> "recursive" merge strategy that git uses solves the answer by merging
>> a and b into a temporary commit and using *that* as the merge base.
>
> I'm tempted to ignore any description of the merge strategy, or postpone
> it till later; as a first pass I think it's better just to say "obvious
> cases will be handled automatically, and you'll be prompted for
> comments." Only other SCM developers are going to wonder how you handle
> the corner cases.
See below...
>> * When merging goes wrong
>
> But yes, I think people could use more help on how to resolve merges.
It would be useful to cover all non-reducible cases of recursive merge
strategy (the default merge strategy for two-head merges) conflicts:
contents (covered), add/add, rename/modify etc.
So some info about the recursive merge strategy would be useful.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [DRAFT] Branching and merging with git
2006-11-17 15:32 ` Theodore Tso
2006-11-17 15:57 ` Sean
@ 2006-11-17 18:21 ` J. Bruce Fields
2006-11-18 0:13 ` linux
2006-11-19 17:50 ` J. Bruce Fields
1 sibling, 2 replies; 66+ messages in thread
From: J. Bruce Fields @ 2006-11-17 18:21 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux, git
On Fri, Nov 17, 2006 at 10:32:46AM -0500, Theodore Tso wrote:
> Personally, I think this information is actually more important to an
> end-user than the current "part two" of the tutorial, which discusses
> the object database and the index file. Perhaps this should be "part
> 2", and the object database and index file should become "part 3"?
Yeah, the really difficult problem here is figuring out how to organize
the documentation. There are a few needs:
1. Quick-start/task-based documentation
- We want to "sell" git to the beginning user by getting
them up and running as quickly as possible.
- We need to help people with some limited needs--
testers who just need to download the latest linux git
tree, or bisect, or whatever.
- It's also a fun way to demonstrate the richness of
some git features (e.g. history explanation).
2. Conceptual background
- People need to understand the commit graph, branches,
merging, the index file (gack), pack files, etc.--some of
that can be put off a little while, some of it can't.
3. Reference documentation
The man pages do most of #3, but maybe they could be better organized--I
think people aren't finding stuff there that they should be.
Numbers 1 and 2 are scattered around git(7), the two-part tutorial, the
git-core tutorial, etc.
> It might also be good to consider moving some of the "discussion"
> portion of the top-level git(7) man page into the object database and
> index file discussion. Right now, the best way to introduce git's
> concepts (IMHO), is to start with the part 1 of the tutorial, then go
> into the your draft branch/merging with git, then the current part 2
> of the tutorial, and then direct folks to read the "discussion"
> section of git(7). Only then do they really have enough background
> understanding of the fundamental concepts of git that they won't get
> confused when they start talking to other git users, on the git
> mailing list, for example.
>
> It would be nice if there was an easy way to direct users through the
> documentation in a way which makes good pedagogical sense. Right now,
> one of the reasons why life gets hard for new users is that the
> current tutorials aren't enough for them to really understand what's
> going on at a conceptual level. And if users start using "everyday
> git" as a crutch, without the right background concepts, the human
> brain naturally tries to intuit what's happening in the background,
> but without reading the background docs, git is different enough that
> they will probably get it wrong, which means more stuff that they have
> to unlearn later.
I agree. Unfortunately, people who need to use git but aren't
study-the-manual-first types *are* going to just dive in whether we want
them to or not, so we have to make it easy for them to pick up what they
need as they go.
How about this as a strawman "git user's manual" outline:
I. Quick-start: drawn from the tutorial part I and everyday.txt?
II. Basic git concepts, drawn from "discussion" in git(7) (the
README), tutorial part II, this branching-and-merging tutorial, etc.:
1. The commit graph and the object database
2. References
3. Fetching and pulling, remotes
4. The index file
III. Using git:
1. History exploration
2. merge resolution
3. pack files, fsck, repository maintenance
4. pushing, setting up a public repo
IV. Advanced examples: drawn from the howto directories,
cvs-migration.txt,...
1. More complicated commandline magic, scripting
(history exploration with git-rev-list, etc.)
2. History re-writing: cherry-picking, rebasing,...
3. Setting up a shared public repo?
4. Migration to/from other SCM's.
V. Technical details: core-tutorial.txt, plumbing, code tours, etc.
Chapter II is the prerequisite for everything else, so a lot of thought
has to be given to treating exactly what's necessary there and no more.
Maybe more of it could be mixed into chapter I.
It has to be readable in order by the 10% of people who actually like to
read manuals, and easy to pick up in the middle for the 90% who will
just dive into the section they were told they need to read to
understand some particular problem.
In particular, ideally only I and II would really be sequential, and
the rest would be readable in any order.
* Re: [DRAFT] Branching and merging with git
[not found] ` <20061117120154.3eaf5611.seanlkml@sympatico.ca>
@ 2006-11-17 21:31 ` Petr Baudis
2006-11-17 22:36 ` Chris Riddoch
2006-11-17 23:30 ` Sean
0 siblings, 2 replies; 66+ messages in thread
From: Petr Baudis @ 2006-11-17 21:31 UTC (permalink / raw)
To: Sean; +Cc: Nguyen Thai Ngoc Duy, git
On Fri, Nov 17, 2006 at 06:01:54PM CET, Sean wrote:
> There is enough native Git documentation and hopefully more coming
> that third party tools should be pushed behind the scenes a bit.
> At least on the GIT website.
It's not about documentation but ease of use. I agree and sympathise
very much with the effort of making core Git easier to use and
obsoleting Cogito, but until it gets there we should have what's nicest
to the users.
> Of course there is nothing wrong with having information there, but
> the main thrust should be about Git and how to use it directly without
> porcelains. Especially in the light that people have recently
> expressed a desire to advocate and document the use of native Git
> more strongly.
If someone writes a crash course in pure Git covering the same grounds
as the current ones (possibly by just extending/retouching the tutorial)
(it does not necessarily need to be a "refugee" crash course, it can
build up from scratch), I can add it on the web. If it becomes as easy
to use and with as mild learning curve as Cogito, it means Cogito got
mostly obsolete and I'll happily remove the Cogito crash courses from
the web.
> Having a link to Cogito off the front page of the Git website that
> says... Cogito makes things "easier", no matter how much you
> personally believe it, isn't the way everyone feels and is at
> odds with the native-git message and improvement effort.
If you disagree about that fact, can you provide some specific
argumentation?
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
* Re: [DRAFT] Branching and merging with git
2006-11-17 21:31 ` Petr Baudis
@ 2006-11-17 22:36 ` Chris Riddoch
2006-11-17 22:50 ` Petr Baudis
2006-11-17 23:30 ` Sean
1 sibling, 1 reply; 66+ messages in thread
From: Chris Riddoch @ 2006-11-17 22:36 UTC (permalink / raw)
To: Petr Baudis; +Cc: Sean, Nguyen Thai Ngoc Duy, git
On 11/17/06, Petr Baudis <pasky@suse.cz> wrote:
> If someone writes a crash course in pure Git covering the same grounds
> as the current ones (possibly by just extending/retouching the tutorial)
> (it does not necessarily need to be a "refugee" crash course, it can
> build up from scratch), I can add it on the web. If it becomes as easy
> to use and with as mild learning curve as Cogito, it means Cogito got
> mostly obsolete and I'll happily remove the Cogito crash courses from
> the web.
As a relatively new user myself, I ran into the same confusion when I
came to the website for the first time. One of the most prominent
things on the front page is the "Git Crash Courses." Clicking on that
gives me the crash courses, all of which are about Cogito, not for
Git. So why doesn't the front page say "Cogito Crash Courses"
instead?
And I don't think it matters much whether Cogito makes things easier
or not -- the Git website really should make Git's documentation more
prominent than Cogito's. I'd expect the opposite of Cogito's website.
It *is* unnecessarily confusing.
--
epistemological humility
* Re: [DRAFT] Branching and merging with git
2006-11-17 22:36 ` Chris Riddoch
@ 2006-11-17 22:50 ` Petr Baudis
0 siblings, 0 replies; 66+ messages in thread
From: Petr Baudis @ 2006-11-17 22:50 UTC (permalink / raw)
To: Chris Riddoch; +Cc: Sean, Nguyen Thai Ngoc Duy, git
On Fri, Nov 17, 2006 at 11:36:25PM CET, Chris Riddoch wrote:
> On 11/17/06, Petr Baudis <pasky@suse.cz> wrote:
> >If someone writes a crash course in pure Git covering the same grounds
> >as the current ones (possibly by just extending/retouching the tutorial)
> >(it does not necessarily need to be a "refugee" crash course, it can
> >build up from scratch), I can add it on the web. If it becomes as easy
> >to use and with as mild learning curve as Cogito, it means Cogito got
> >mostly obsolete and I'll happily remove the Cogito crash courses from
> >the web.
>
> As a relatively new user myself, I ran into the same confusion when I
> came to the website for the first time. One of the most prominent
> things on the front page is the "Git Crash Courses." Clicking on that
> gives me the crash courses, all of which are about Cogito, not for
> Git. So why doesn't the front page say "Cogito Crash Courses"
> instead?
>
> And I don't think it matters much whether Cogito makes things easier
> or not -- the Git website really should make Git's documentation more
> prominent than Cogito's. I'd expect the opposite of Cogito's website.
I think the difference here is the Git _tool_ vs. the Git version
control system. Cogito is an element of the second: To use Git, you can
either use the Git tool or the Cogito tool or the StGIT tool or even
just the qgit tool (which also lets you inspect the working copy and
commit). I believe the tool best suited for general usage by newbies _at
this point_ is Cogito, so that's what I use for introduction to Git. I'm
not saying this is an ideal situation, and I and others are/will be working
to fix it.
I'm all for making it more obvious what's going on at the website; I
think the current wording is better. Also, if people believe that a
crash course for core Git would help things, I'm all for it as well.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
* Re: [DRAFT] Branching and merging with git
2006-11-17 21:31 ` Petr Baudis
2006-11-17 22:36 ` Chris Riddoch
@ 2006-11-17 23:30 ` Sean
1 sibling, 0 replies; 66+ messages in thread
From: Sean @ 2006-11-17 23:30 UTC (permalink / raw)
To: Petr Baudis; +Cc: Nguyen Thai Ngoc Duy, git
On Fri, 17 Nov 2006 22:31:26 +0100
Petr Baudis <pasky@suse.cz> wrote:
> It's not about documentation but ease of use. I agree and sympathise
> very much with the effort of making core Git easier to use and
> obsoleting Cogito, but until it gets there we should have what's nicest
> to the users.
As some new users have already tried to tell you, it's confusing for
_them_ when they're trying to learn Git to be confronted with Cogito
documentation.
The way we're going to get Git to be better is to expose new people
to it and respond to their comments, complaints and ideas about how
to make it better and easier to understand as they get up to speed.
Having Cogito plastered all over the Git website as the _easy_
alternative is counterproductive to that effort. We need fresh
blood looking at the Git documentation and trying to learn Git.
By using the GIT webpage to promote Cogito as the "easy" alternative
you make it look like the entire GIT community is recommending
new users should use Cogito instead. That does not represent
the views of the entire GIT community. You should be very careful
to represent the entire community in your role as GIT webmaster.
If people go to a Cogito website, _that's_ where they should learn
about your opinions about why someone should use Cogito in
place of Git. Cogito isn't "nicest" for users who don't need
its extra functionality, or for getting new users involved in
the improvement effort of native Git.
* Re: [DRAFT] Branching and merging with git
2006-11-17 18:21 ` J. Bruce Fields
@ 2006-11-18 0:13 ` linux
2006-11-18 0:32 ` Jakub Narebski
2006-11-18 0:40 ` Junio C Hamano
2006-11-19 17:50 ` J. Bruce Fields
1 sibling, 2 replies; 66+ messages in thread
From: linux @ 2006-11-18 0:13 UTC (permalink / raw)
To: bfields, tytso; +Cc: git, linux
I'm working on incorporating all of the comments I've received, so
thank you all!
(BTW, the reason I didn't document git-describe is that I didn't
know about it! You fixed the latter, so I'll fix the former.)
I'm glad if others like it, but I was really scratching my own
itch. I'm still wrapping my head around how to work with git, and
writing this was my own learning experience.
Even writing it out in full rather than as rougher notes wasn't
an entirely unselfish act; it ensures:
1) I don't leave some important assumption unstated; that's the
type most likely to be wrong, and
2) If I can get it good enough to post publicly, I'll get all the
experts to fact-check it for me.
As for the target audience, it's basically someone who's read git(1)
and knows what a VCS is supposed to do, but has a CVS/SVN mindset.
The emphasis is on branching and merging because that's the big
"mental model" difference in the way that git works.
For anyone else documenting git, I recommend describing "what if I
make a mistake" early. It was a bit of a revelation to realize
that there's not much point to "git pull --no-commit" because
it's so easy to undo.
Just a couple of questions:
We seem to have developed a consensus on the desirability of allowing
HEAD to point outside refs/heads, postponing the check until
commit/merge time. (At least, junkio and Linus seemed to like it.)
I proposed it, so I get to write it, but you notice I have a whole
section on how to work around the lack of that feature, so if someone
feels like picking up the baton while I'm still writing docs, it would
simplify things.
I'd like to learn more about the zillion options to git-log.
If people feel like sharing useful incantations, it would be
very helpful to give a concrete example of its usefulness,
preferably within the git history itself.
(Are there any octopus merges in git's history? If not, could I ask
for one for pedagogical value?)
* Re: [DRAFT] Branching and merging with git
2006-11-18 0:13 ` linux
@ 2006-11-18 0:32 ` Jakub Narebski
2006-11-18 0:40 ` Junio C Hamano
1 sibling, 0 replies; 66+ messages in thread
From: Jakub Narebski @ 2006-11-18 0:32 UTC (permalink / raw)
To: git
linux@horizon.com wrote:
[...]
> (Are there any octopus merges in git's history? If not, could I ask
> for one for pedagogical value?)
See commit d425142e2a045a9dd7879d028ec68bd748df48a3 (most legged octopus
I found in git.git repository). Doing git-rev-list --parents --all, or
git log --all and grepping for merges is a good idea to find octopi.
The commit is both v1.1.2-gd425142 (git describe) and tags/v1.2.0^0~143
(git name-rev --tags)
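To make the two naming directions concrete, here is a small sketch on a
throwaway repository. The tags v1.0 and v2.0 and the file name are
hypothetical, and the exact output format has changed between git
versions, so treat the shapes described below as approximate:

```shell
#!/bin/sh
# Throwaway repository; tag and file names are made up for illustration.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

# Four commits in a row; the first and the last get annotated tags.
for n in 1 2 3 4; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
    case $n in
        1) git tag -a -m "first release" v1.0 ;;
        4) git tag -a -m "second release" v2.0 ;;
    esac
done

target=$(git rev-parse HEAD~2)           # this is "commit 2"

# git describe names the commit relative to the nearest tag *behind* it.
described=$(git describe "$target")

# git name-rev names it relative to a tag *ahead* of it.
named=$(git name-rev --tags --name-only "$target")

echo "describe: $described"
echo "name-rev: $named"
```

The describe result starts from v1.0 and ends in "-g" plus an abbreviated
object ID, while the name-rev result counts back from v2.0 with the "~N"
syntax from git-rev-parse.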
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [DRAFT] Branching and merging with git
2006-11-18 0:13 ` linux
2006-11-18 0:32 ` Jakub Narebski
@ 2006-11-18 0:40 ` Junio C Hamano
2006-11-18 1:11 ` Junio C Hamano
1 sibling, 1 reply; 66+ messages in thread
From: Junio C Hamano @ 2006-11-18 0:40 UTC (permalink / raw)
To: linux; +Cc: git
linux@horizon.com writes:
> We seem to have developed a consensus on the desirability of allowing
> HEAD to point outside refs/heads, postponing the check until
> commit/merge time. (At least, junkio and Linus seemed to like it.)
Yes, and I am actually interested in at least doing the initial
damage assessment myself but people are welcome to beat me to
it. The easiest part would be to just try writing a bare SHA-1
to .git/HEAD with:
H=$(git-rev-parse --verify HEAD)
echo $H >.git/HEAD
and see what breaks and start picking up the pieces from there.
> I'd like to learn more about the zillion options to git-log.
> If people feel like sharing useful incantations, it would be
> very helpful to give a concrete example of its usefulness,
> preferably within the git history itself.
>
> (Are there any octopus merges in git's history? If not, could I ask
> for one for pedagogical value?)
git.git itself is full of them, but the very first octopus (it
actually is a pentapus) is rather nice to watch in gitk:
211232bae64bcc60bbf5d1b5e5b2344c22ed767e
You can look for them with:
git rev-list --parents HEAD | grep '..* ..* ..* ..* ..* ..*'
Repeat as many " ..*" as the number of parents you would want to require.
I knew the very first one was pentapus (I did it) so I wrote six ..*
there (one for the commit, one each for parents).
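The same search can be written by counting fields instead of repeating
" ..*". The following is only a sketch on a throwaway repository with
hypothetical branch names (topic-a, topic-b, topic-c); the awk filter is
an alternative to the grep above, not something taken from git itself:

```shell
#!/bin/sh
# Throwaway repository; branch and file names are made up for illustration.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

echo base > base.txt
git add base.txt
git commit -q -m "O"
start=$(git symbolic-ref --short HEAD)

# Three independent one-commit branches, each touching its own file.
for b in a b c; do
    git checkout -q -b "topic-$b" "$start"
    echo "$b" > "$b.txt"
    git add "$b.txt"
    git commit -q -m "$b"
done

# Merging several heads in one go produces an octopus.
git checkout -q "$start"
git merge -q -m "X" topic-a topic-b topic-c >/dev/null 2>&1

# Each output line is "<commit> <parent>...", so NF - 1 is the parent count.
octopi=$(git rev-list --parents HEAD | awk 'NF - 1 >= 3 { print NF - 1 }')
echo "parent counts of octopus merges: $octopi"
```

Changing the 3 in the awk condition plays the role of adding or removing
a " ..*" in the grep pattern.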
Len's dodecapus in linux-2.6.git is this one:
9fdb62af92c741addbea15545f214a6e89460865
It is very interesting to watch it with "git show". Len has
another one in August:
da547d775fa9ba8d9dcaee7bc4e960540e2be576
* Re: [DRAFT] Branching and merging with git
2006-11-18 0:40 ` Junio C Hamano
@ 2006-11-18 1:11 ` Junio C Hamano
2006-11-20 23:51 ` [DRAFT 2] " linux
2006-11-22 11:51 ` [DRAFT] " Junio C Hamano
0 siblings, 2 replies; 66+ messages in thread
From: Junio C Hamano @ 2006-11-18 1:11 UTC (permalink / raw)
To: linux; +Cc: git
Junio C Hamano <junkio@cox.net> writes:
>> (Are there any octopus merges in git's history? If not, could I ask
>> for one for pedagogical value?)
>
> git.git itself is full of them, but the very first octopus (it
> actually is a pentapus) is rather nice to watch in gitk:
>
> 211232bae64bcc60bbf5d1b5e5b2344c22ed767e
Having said that, I think it is not a good idea to talk about
octopus in introductory documents. The 'feature' may be unique
to git and some people might even find it cool, but new people
should not be encouraged to use it until they understand the
ramifications.
The first ever octopus merge was just a bundle of five forked
development branches, each of which had only one commit since it
forked from the common parent.
.-a-.
.--b--.
O---c---X
'--d--'
'-e-'
They were independent, un-overlapping changes. "diff-tree -c"
would not show anything, and there was no inherent reason that
one change should come before the others, so in that sense,
presenting this as an octopus was making the history more
truthful than pretending one happened before another.
But octopus has a negative effect on bisecting performance.
Suppose commit X was bad and commit O was good. Because X
bundles five branches into one, and we know one of them
(hopefully) is what introduced the regression, our task is to
find the guilty one commit among five commits. But in order to
do so, we would end up having to test four commits. That is,
knowing that a, b and c are Ok does not give us any useful
information to determine which of d or e is the bad one (after
learning that a, b and c are Ok, we still need to test d and if
it turns out to be Ok then we can finally say e is the bad one).
If I did not do an octopus and laid out the commit ancestry
graph this way when I gave them to Linus:
O--a--b--c--d--e--X
the same bisect would have asked us to check c first. If it is
good, then we do not even have to test a or b. The linear part
of the history is what bisect takes advantage of to cut the
search space efficiently, and an octopus actively defeats that.
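To see the linear case in action, here is a rough sketch on a scratch repository. Everything in it is hypothetical: the commits are named after the diagram, and the "regression" is simply commit d adding the word "bug" to file.txt.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "O"
git tag O
for c in a b c d e X; do
    # Commit "d" is the one that introduces the (made-up) regression.
    if [ "$c" = d ]; then echo "bug" >> file.txt; else echo "$c" >> file.txt; fi
    git add file.txt
    git commit -q -m "$c"
    git tag "commit-$c"
done

# HEAD (X) is known bad, O is known good; let bisect drive the test.
git bisect start HEAD O >/dev/null
# The test command exits 0 (good) when "bug" is absent, 1 (bad) when present.
git bisect run sh -c '! grep -q bug file.txt' >/dev/null
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
git log -1 --format='first bad commit: %s' "$first_bad"
```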
So doing an octopus is a wrong thing to do, if there is a
possibility that something wrong is found later. So people
should not do an octopus unless the component changes are all
truly trivial.
If you want an esoteric topic for an introductory documentation,
it would be more useful to talk about evil merges (an evil merge
is a merge commit whose result does not match any of its
parents). A good example is found in
git show v1.0.0
* Re: [DRAFT] Branching and merging with git
2006-11-17 18:21 ` J. Bruce Fields
2006-11-18 0:13 ` linux
@ 2006-11-19 17:50 ` J. Bruce Fields
2006-11-19 17:59 ` Git manuals Petr Baudis
2006-11-26 4:01 ` [PATCH] Documentation: add a "git user's manual" J. Bruce Fields
1 sibling, 2 replies; 66+ messages in thread
From: J. Bruce Fields @ 2006-11-19 17:50 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux, git
On Fri, Nov 17, 2006 at 01:21:57PM -0500, bfields wrote:
> On Fri, Nov 17, 2006 at 10:32:46AM -0500, Theodore Tso wrote:
> > It would be nice if there was an easy way to direct users through the
> > documentation in a way which makes good pedagogical sense.
....
> How about this as a strawman "git user's manual" outline:
In fact, I'm tempted to submit a patch that just assigns a chapter
number to everything under Documentation/, slaps a single table of
contents on the front, and calls the result "the git user's manual."
Of course, the moment people started trying to read the thing they'd
complain that it was a mess--some stuff referenced without being
introduced, other stuff introduced too many times. But then over time
maybe that'd force us to mold it into some sort of logical sequence.
* Git manuals
2006-11-19 17:50 ` J. Bruce Fields
@ 2006-11-19 17:59 ` Petr Baudis
2006-11-19 18:16 ` Jakub Narebski
2006-11-19 19:36 ` J. Bruce Fields
2006-11-26 4:01 ` [PATCH] Documentation: add a "git user's manual" J. Bruce Fields
1 sibling, 2 replies; 66+ messages in thread
From: Petr Baudis @ 2006-11-19 17:59 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Theodore Tso, linux, git
On Sun, Nov 19, 2006 at 06:50:40PM CET, J. Bruce Fields wrote:
> On Fri, Nov 17, 2006 at 01:21:57PM -0500, bfields wrote:
> > On Fri, Nov 17, 2006 at 10:32:46AM -0500, Theodore Tso wrote:
> > > It would be nice if there was an easy way to direct users through the
> > > documentation in a way which makes good pedagogical sense.
> ....
> > How about this as a strawman "git user's manual" outline:
(I was briefly discussing Git Book with Junio at OLS, I think the result
was "yeah, would be nice, perhaps we can start poking it soon". I
started to think about it once again in the last few weeks.)
> In fact, I'm tempted to submit a patch that just assigns a chapter
> number to everything under Documentation/, slaps a single table of
> contents on the front, and calls the result "the git user's manual."
>
> Of course, the moment people started trying to read the thing they'd
> complain that it was a mess--some stuff referenced without being
> introduced, other stuff introduced too many times. But then over time
> maybe that'd force us to mold it into some sort of logical sequence.
Sequencing isn't the only problem. A _manual_ is different from
_reference documentation_ in that it does not usually describe command
after command, but rather concept after concept. So instead of slamming
git-*-pack commands together, you have a section "Handling Packs" where
you try to coherently describe the commands together.
Your approach is fine for something you would call "Git Reference
Manual", but it is something really different from "The Git Book" or
"Git User's Manual".
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
The meaning of Stonehenge in Traflamadorian, when viewed from above, is:
"Replacement part being rushed with all possible speed."
* Re: Git manuals
2006-11-19 17:59 ` Git manuals Petr Baudis
@ 2006-11-19 18:16 ` Jakub Narebski
2006-11-19 19:50 ` Robin Rosenberg
2006-11-19 19:36 ` J. Bruce Fields
1 sibling, 1 reply; 66+ messages in thread
From: Jakub Narebski @ 2006-11-19 18:16 UTC (permalink / raw)
To: git
Petr Baudis wrote:
> Your approach is fine for something you would call "Git Reference
> Manual", but it is something really different from "The Git Book" or
> "Git User's Manual".
By the way, does AsciiDoc support conversion to texinfo (and then to info)
format? It would be nice to have "The Git Book" aka "GUM - Git User's
Manual" in texinfo, HTML and perhaps also PDF (HTML and PDF with
graphs/images).
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Git manuals
2006-11-19 17:59 ` Git manuals Petr Baudis
2006-11-19 18:16 ` Jakub Narebski
@ 2006-11-19 19:36 ` J. Bruce Fields
1 sibling, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2006-11-19 19:36 UTC (permalink / raw)
To: Petr Baudis; +Cc: Theodore Tso, linux, git
On Sun, Nov 19, 2006 at 06:59:52PM +0100, Petr Baudis wrote:
> On Sun, Nov 19, 2006 at 06:50:40PM CET, J. Bruce Fields wrote:
> > In fact, I'm tempted to submit a patch that just assigns a chapter
> > number to everything under Documentation/, slaps a single table of
> > contents on the front, and calls the result "the git user's manual."
> >
> > Of course, the moment people started trying to read the thing they'd
> > complain that it was a mess--some stuff referenced without being
> > introduced, other stuff introduced too many times. But then over time
> > maybe that'd force us to mold it into some sort of logical sequence.
>
> Sequencing isn't the only problem. A _manual_ is different from
> _reference documentation_ in that it does not usually describe command
> after command, but rather concept after concept. So instead of slamming
> git-*-pack commands together, you have a section "Handling Packs" where
> you try to coherently describe the commands together.
>
> Your approach is fine for something you would call "Git Reference
> Manual", but it is something really different from "The Git Book" or
> "Git User's Manual".
Yeah, of course, but I wasn't actually thinking of the man pages so much
as:
everyday.txt
tutorial.txt
tutorial-2.txt
core-tutorial.txt
howto/
hooks.txt
README
glossary.txt
etc.
* Re: Git manuals
2006-11-19 18:16 ` Jakub Narebski
@ 2006-11-19 19:50 ` Robin Rosenberg
0 siblings, 0 replies; 66+ messages in thread
From: Robin Rosenberg @ 2006-11-19 19:50 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
On Sunday 19 November 2006 19:16, Jakub Narebski wrote:
> Petr Baudis wrote:
> > Your approach is fine for something you would call "Git Reference
> > Manual", but it is something really different from "The Git Book" or
> > "Git User's Manual".
>
> By the way, does AsciiDoc support conversion to texinfo (and then to info)
> format? It would be nice to have "The Git Book" aka "GUM - Git User's
> Manual" in texinfo, HTML and perhaps also PDF (HTML and PDF with
> graphs/images).
Not sure if there is a texinfo translator, but HTML, PDF and anything else
that DocBook can be used for is possible.
* [DRAFT 2] Branching and merging with git
2006-11-18 1:11 ` Junio C Hamano
@ 2006-11-20 23:51 ` linux
2006-11-22 11:02 ` [Patch to DRAFT 2 (1/2)] " Junio C Hamano
` (3 more replies)
2006-11-22 11:51 ` [DRAFT] " Junio C Hamano
1 sibling, 4 replies; 66+ messages in thread
From: linux @ 2006-11-20 23:51 UTC (permalink / raw)
To: git; +Cc: linux
I tried to incorporate all the suggestions. There are still a few things
I have to research, and now I'm worried it's getting too long. Sigh.
Generally, it's wonderful when the whole is greater than the sum of
the parts. But trying to explain that is difficult, because you have
to explain all the parts before you can explain how they work together
to deliver a feature.
Oh, well. Perhaps I can rearrange this to talk about remote branches
*after* local merging?
Thanks to everyone who commented.
* Branching and merging in git
In CVS, branches are difficult and awkward to use, and generally
considered an advanced technique. Many people use CVS for a long time
without departing from the trunk.
Git is very different. Branching and merging are central to effective use
of git, and if you aren't comfortable with them, you won't be comfortable
with git. In particular, they are required to share work with other
people.
The only things that are a bit confusing are some of the names.
In particular, at least when beginning:
- You create new branches with "git checkout -b".
"git branch" should only be used to list and delete branches.
- You share work with "git fetch" and "git push". These are opposites.
- You merge with "git pull", not "git merge". "git pull" can also do a
"git fetch", but that's optional. What's not optional is the merge.
Also, a good habit is to never commit directly to your main "master"
branch. Do all work in temporary "topic" branches, and then merge them
into the master. Most experienced users don't bother to be quite this
purist, but you should err on the side of using separate topic branches,
so it's excellent practice.
* A brief digression on command names
All git commands can be invoked as "git-foo" and "git foo". This document
uses them interchangeably. But you have to ask for the "git-foo" man page.
Git provides a few other ways to get the man page as well:
man git-foo
git help foo
git foo --help
Git doesn't have a specialized built-in help system; it just shows you
the man pages.
One outstanding problem with git's man pages is that often the most detail
is in the command page that was written first, not the user-friendly
one that you should use. For example, there are a number of special
cases of the "git diff" command that were written first, and the man
pages for these commands (git-diff-index, git-diff-files, git-diff-tree,
and git-diff-stages) are considerably more informative than the page for
plain git-diff, even though that's the command that you should use 99%
of the time.
Likewise, the man page for the incredibly useful "git show" command is a
sterile wasteland; you have to refer to the git-rev-list and git-diff-tree
man pages to find the many interesting options it supports.
"git-foo --help" often gives you the man page as well, but sometimes
gives a different help listing. (Try "git-rebase --help" and compare
it with "git rebase --help".)
If you care, here's why:
There are over 100 git-* commands. This has led to some complaints
about the clutter in /usr/bin. So it's now possible to move the git-*
commands into a separate directory, and you can either add that directory
to your $PATH, or always use the "git" wrapper to find and invoke them.
This is kind of academic as, despite the complaints, I don't know anyone
who has actually removed the git-* forms from the default $PATH.
Since the wrapper was developed, some simple commands have been made
"builtin", so for example, "git diff" is done internally. There's a
git-diff link to retain compatibility.
* Git's representation of history
As you recall from Git 101 (or the git(7) man page), git's largest
data structure is the object database, holding exactly four kinds
of objects. Each of them has a globally unique 40-character hex name
(a.k.a. object ID) made by hashing its type and contents. Since this is
an (effectively unforgeable) cryptographic hash, the name of an object
identifies its contents uniquely. This is what is meant when people
say that git's history is immutable; you can erase it and rewrite it,
but any alterations will have a different name, and it will be obvious
to anyone looking that something has been changed.
Blob objects are the simplest: they record file contents, and contain
uninterpreted bytes.
Tree objects record directory contents; they contain file names,
permissions, and the associated tree or blob object names.
Tag objects are shareable pointers to other objects; they're generally
used to store a digital signature, and generally point to commits.
(Although you can tag any object, including another tag.)
Finally, there are commit objects. Every commit points to (contains the
name of) an associated tree object which records the state of the source
code at the time of the commit, and some descriptive data (time, author,
committer, commit comment) about the commit.
Because every commit is associated with one tree, it's considered
"tree-ish", and almost anywhere that git expects a tree object, you can
supply a commit, and it'll understand. (Tags on commits are likewise
considered "commit-ish", and can be used almost anywhere a commit or
tree is required.)
Most importantly, each commit contains a list of "parent commits", older
commits from which this one is derived. These pointers are what produce
the history graph.
Typically only one commit (the initial commit) has zero parents. It's
possible to have more than one such commit (if you merge two projects
with separate histories), but that's unusual.
Many commits have exactly one parent. These are made by a normal commit
after editing. From a branching and merging point of view, they're not
too exciting.
And then there are commits which have multiple parents. Two is most
common, but git allows many more. There's a limit of sixteen in
the source code, and the most anyone's ever used in real life is 12,
which was generally regarded as overdoing it. The famous "dodecapus"
is commit 9fdb62af in the linux kernel repository.
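The four object types described above can be inspected directly with git-cat-file. A minimal sketch on a scratch repository follows; the file name and tag name are invented for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "first commit"
git tag -a -m "release 1.0" v1.0     # -a creates a real tag object

# Ask for the type of each kind of object.
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
blob_type=$(git cat-file -t HEAD:greeting.txt)
tag_type=$(git cat-file -t v1.0)
echo "$commit_type $tree_type $blob_type $tag_type"

# Pretty-print the commit (tree, author, committer, message)
# and its tree (mode, type, object name, file name).
git cat-file -p HEAD
git cat-file -p 'HEAD^{tree}'
```

The first commit has no "parent" line in its output; later commits list one line per parent.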
Finally, there are references, stored in the .git/refs directory.
These are the human-readable names associated with commits, and the
"root set" from which all other commits should be reachable.
These references are generally divided into two types, although
there is no fundamental difference:
- Tags are references that are intended to be immutable.
A tag like "v1.2" is a historical record. Tag references may or may not
point to tag objects! If they do, this is called a "heavyweight tag";
the tag can hold a digital signature and can be shared between repositories.
"Lightweight tags" point to commits directly, and are not automatically
shared.
- Heads are references that are intended to be updated.
They are invariably "lightweight." "Head" is actually synonymous with
"branch", although one emphasizes the tip more, while the other directs
your attention to the entire path that got there.
Either way, they're just a 41-byte file that contains a 40-byte hex
object ID, plus a newline. Tags are stored in .git/refs/tags, and heads
are stored in .git/refs/heads. Creating a new branch is literally just
picking a file name and writing the ID of an existing commit into it.
The git programs enforce the immutability of tags, but that's a safety
feature, not something fundamental. You can rename a tag to the heads
directory and go wild. (But the git-update-ref helper takes care of
a few corner cases involving symlinks.)
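A quick way to convince yourself that a head is just a small file is to look at one. This sketch assumes a fresh repository with the default loose-file ref storage and SHA-1 object names; packed refs or other storage formats keep the same information elsewhere, which is why the file check is guarded. The branch name "experiment" is made up.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "first commit"

# With loose refs, the head is a text file: 40 hex digits plus a newline.
if [ -f .git/refs/heads/master ]; then
    cat .git/refs/heads/master
    wc -c < .git/refs/heads/master
fi

# Creating a branch amounts to writing a commit ID under refs/heads;
# git-update-ref performs that write safely.
git update-ref refs/heads/experiment "$(git rev-parse HEAD)"
git branch
new_head=$(git rev-parse experiment)
```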
The only limit on branches is clutter. A number of git commands have
ways to operate on "all heads", and if you have too many, it can get
annoying. If you're not using a branch, either delete it, or move it
somewhere (like the tags directory) where it won't clutter up the list of
"currently active heads".
(Note that CVS doesn't have this all-heads default, so people tend to
use longer branch names and keep them around after they've been merged
into the trunk. Old CVS repositories converted to git generally need
an old-branch cleanup.)
Another thing that's worth mentioning is that head and tag names can
contain slashes; i.e. you're allowed to make subdirectories in the
.git/refs/heads and .git/refs/tags directories. See the man page
for "git-check-ref-format" for full details of legal names.
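If you are unsure whether a name is legal, git-check-ref-format will tell you through its exit status. A small sketch, with two invented names:

```shell
# A name containing a slash is fine; ".." inside a name is rejected.
if git check-ref-format "refs/heads/topic/fix-typo"; then
    ok_name=valid
else
    ok_name=invalid
fi
if git check-ref-format "refs/heads/bad..name"; then
    bad_name=valid
else
    bad_name=invalid
fi
echo "topic/fix-typo: $ok_name, bad..name: $bad_name"
```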
* Naming revisions
CVS encourages you to tag like crazy, because the only other way to
find a given revision is by date. Git makes it a lot easier, so most
revisions don't need names.
You can find a full description in the git-rev-parse man page, but here's
a summary.
First of all, every commit has a globally unique name, its 40-digit hex
object ID. It's a bit long and awkward, but always works. This is useful
for talking about a specific commit on a mailing list. You can abbreviate
it to a unique prefix; most people find about 8 digits sufficient.
(Subversion is easier yet, because it assigns a sequential number to each
commit. However, that isn't possible in a distributed system like git.)
Second, you can refer to a head or tag name. Git looks in the
following places, in order, for a reference:
1) .git/<name>
2) .git/refs/<name>
3) .git/refs/tags/<name>
4) .git/refs/heads/<name>
5) .git/refs/remotes/<name>
6) .git/refs/remotes/<name>/HEAD
You should avoid having e.g. a head and a tag with the same name, but
if you do, you can specify one or the other with heads/foo and tags/foo.
Third, you can specify a commit relative to another. The simplest
one is "the parent", specified by appending ^ to a name. E.g. HEAD^
or deadbeef^. If there are multiple parents, then ^ is the same as ^1,
and the others are ^2, ^3, etc.
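The parent syntax is easiest to understand on a tiny history with one merge. The following is only a sketch; the commit names (c1, c2, c3, s1) and the "save" helper are made up for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

# Helper: commit a new file named after the commit.
save() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
subject() { git log -1 --format=%s "$1"; }

save c1
save c2
git checkout -q -b side
save s1
git checkout -q master
save c3
git merge -q -m "merge side" side     # HEAD now has two parents

first_parent=$(subject HEAD^)      # same as HEAD^1: the branch merged into
second_parent=$(subject HEAD^2)    # the branch that was merged in
grandparent=$(subject HEAD^^)      # first parent of the first parent
short_form=$(subject HEAD~2)       # HEAD~2 is shorthand for HEAD^^
echo "$first_parent $second_parent $grandparent $short_form"
```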
So the last few commits you've made are HEAD, HEAD^, HEAD^^, HEAD^^^, etc.
After a while, counting carets becomes annoying, so you can abbreviate
^^^^ as ~4. Note that this only lets you specify the first parent.
If you want to follow a side branch, you have to specify something like
"master~305^2~22".
* Converting between names
Git has three helpers (programs designed mainly for use in shell scripts)
to convert between global object IDs and human-readable names.
The first is git-rev-parse. This is a general git shell script helper,
which validates the command line and converts object names to absolute
object IDs. Its man page has a detailed description of the object
name syntax.
The second is git-name-rev, which converts the other way around, giving
a name for the specified commit in terms of the existing references.
This searches the references for the closest descendant of the given
commit. If called with the --tags option, it'll search only tags,
showing you the next tag on the development path.
Third is git-describe. This is something like git-name-rev, but
backwards: it finds the closest reference that is an ancestor of the
specified commit. Its output is not acceptable git input, but takes
the form of either
v1.2 (tag name), or
v1.2-g12345678 (commit 12345678, whose nearest ancestor is v1.2)
git-describe's output is intended to be used as a software version
number, something like the "rcsid" feature in RCS and CVS.
By default, git-describe uses only heavyweight tags for its naming,
so they are globally unique, but you can ask for it to be more liberal.
Git has chosen not to implement anything like the RCS/CVS "keyword
substitution" feature, but you can invoke git-describe from a Makefile
and incorporate its output into an executable to achieve a similar effect.
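Here is a small sketch of git-describe on a scratch repository, with made-up file contents. Note that newer git versions also insert the number of commits since the tag, so the output looks like v1.2-1-g<hash> rather than the shorter form shown above.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "release commit"
git tag -a -m "version 1.2" v1.2     # a tag object, as git-describe wants

on_tag=$(git describe)               # exactly on the tag: just the tag name

echo "two" > file.txt
git commit -q -a -m "work after the release"
after_tag=$(git describe)            # tag name plus an abbreviated commit ID

echo "$on_tag"
echo "$after_tag"
```

In a Makefile you would capture the same output in a variable and pass it to the compiler as a version string.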
* Working with branches, the trivial cases.
By convention, the local "trunk" of git development is called "master".
This is just the name of the branch it creates when you start an empty
repository. You can delete it if you don't like the name.
If you create your repository by cloning someone else's repository, the
remote "master" branch is copied to a local branch named "origin". You
get your own "master" branch which is not tied to the remote repository.
There is always a current head, known as HEAD. (This is actually a
symbolic link, .git/HEAD, to a file like refs/heads/master.) Git requires
that this always point to the refs/heads directory.
Minor technical details:
1) HEAD used to be a Unix symlink, and can still be thought of that
way, but for Microsoft support, this is now what's called a
"symbolic reference" or symref, and is a plain file containing
"ref: refs/heads/master". Git treats it just like a symlink.
There's a git-symbolic-ref helper which reads and writes these.
2) While HEAD must point to refs/heads, it's legal for it to
point to a file that doesn't exist. This is what happens
before the first commit in a brand new repository.
When you do "git commit", a new commit object is created with the old
HEAD as a parent, and the new commit is written to the current head
(pointed to by HEAD).
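Both technical details can be observed with git-symbolic-ref and git-rev-parse. A minimal sketch, assuming a brand new scratch repository:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # point HEAD at a branch named master
git config user.name "Example"
git config user.email "example@example.com"

# Before the first commit, HEAD names a branch that does not exist yet.
target=$(git symbolic-ref HEAD)
if git rev-parse -q --verify HEAD >/dev/null; then
    before_commit=exists
else
    before_commit=missing
fi

git commit -q --allow-empty -m "first commit"

# After the commit, the head that HEAD points to holds the new commit.
head_commit=$(git rev-parse HEAD)
branch_commit=$(git rev-parse refs/heads/master)
echo "$target $before_commit $head_commit"
```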
* The three uses of "git checkout"
Git checkout can do three separate things:
1) Change to a new head
git checkout [-f|-m] <branch>
This changes the HEAD symref to point to <branch>, and copies its
state to the index and the working directory.
If a file has unsaved changes in the working directory, this tries
to preserve them. This is a simple attempt, and requires that the
modified file(s) are not altered between the old and new HEADs.
In that case, the version in the working directory is left untouched.
A more aggressive option is -m, which will try to do a three-way
(intra-file) merge. This can fail, leaving unmerged files in the
index; dealing with this is described later.
An alternative is to use -f, which will overwrite any unsaved changes
in the working directory. This option can be used with no <branch>
specified (defaults to HEAD) to undo local edits.
2) Revert changes to a small number of files.
git checkout [<revision>] [--] <paths>
will copy the version of the <paths> from the index to the working
directory. If a <revision> is given, the index for those paths will
be updated from the given revision before copying from the index to
the working tree.
Unlike the version with no <paths> specified, this does NOT change
HEAD, even if <paths> is ".".
The "--" is only required if the first <path> could be
mistaken for a revision.
3) Create a branch.
git checkout [-f|-m] -b <branch> [revision]
will create, and switch to, a new branch with the given name.
This is equivalent to
git branch <branch> [<revision>]
git checkout [-f|-m] <branch>
If <revision> is omitted, it defaults to the current HEAD, in which
case no working directory files are altered.
This is the usual way that one creates a new branch to start work
on it, or checks out a revision that does not have an existing head
pointing to it.
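The three uses can be exercised in a few lines. This is a sketch on a scratch repository; the tag v1, the branch old-work and the file contents are all invented for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "v1"
git tag v1
echo "two" > file.txt
git commit -q -a -m "v2"

# Use 3: create a branch at an older revision and switch to it.
git checkout -q -b old-work v1
on_branch=$(git symbolic-ref HEAD)
content_old=$(cat file.txt)

# Use 1: switch back to an existing head.
git checkout -q master

# Use 2: restore one file from the index, discarding a local edit.
echo "scribble" > file.txt
git checkout -- file.txt
content_restored=$(cat file.txt)
branch_after=$(git symbolic-ref HEAD)    # HEAD is not moved by this form
echo "$on_branch $content_old $content_restored $branch_after"
```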
Note that you can create a branch at any time up until the git-commit;
* Deleting branches
"git branch -d <head>" is safe. It deletes the given <head>, but first
it checks that the commit is reachable from the current HEAD. That is,
you merged the branch in to the current HEAD, or you never did any edits
on that branch.
It's a good idea to create a "topic branch" when you're working on
anything bigger than a one-liner, but it's also a good idea to delete
them when you're done. The commits are still there in the history.
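The safety check is easy to demonstrate. In this sketch (branch and file names are hypothetical) the delete is refused while "topic" holds an unmerged commit, and succeeds once it has been merged.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git checkout -q -b topic
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature"
git checkout -q master

# topic has a commit that master lacks, so the safe delete refuses.
if git branch -d topic 2>/dev/null; then
    before_merge=deleted
else
    before_merge=refused
fi

git merge -q topic
if git branch -d topic >/dev/null; then
    after_merge=deleted
else
    after_merge=refused
fi
echo "$before_merge $after_merge"
```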
* Doing rude things to heads: git reset
If you need to overwrite the current HEAD for some reason, the tool to
do it with is "git reset". There are three levels of reset:
git reset --soft [<head>]
This overwrites the current HEAD with the contents of <head>.
If you omit <head>, it defaults to HEAD, so this does nothing.
git reset [<head>]
git reset --mixed [<head>]
These overwrite the current HEAD, and copy it to the index,
undoing any git-update-index commands you may have executed.
If you omit <head>, it defaults to HEAD, so there is no change
to the current branch, but all index changes are undone.
git reset --hard [<head>]
This does everything mentioned above, and updates the
working directory. This throws away all of your in-progress
edits and gets you a clean copy. This is also commonly
used without an explicit <head>, in which case the current
HEAD is used.
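The difference between the three levels shows up in what is left in the index and the working directory afterwards. A sketch on a scratch repository, with a made-up file:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "v1"
echo "two" > file.txt
git commit -q -a -m "v2"

# --soft: move the head only; the index still holds the "two" version.
git reset -q --soft HEAD^
staged_after_soft=$(git diff --cached --name-only)

# --mixed (the default): also reset the index; the working file is kept.
git reset -q --mixed HEAD
staged_after_mixed=$(git diff --cached --name-only)
worktree_after_mixed=$(cat file.txt)

# --hard: also overwrite the working directory.
git reset -q --hard HEAD
worktree_after_hard=$(cat file.txt)
echo "$staged_after_soft / $worktree_after_mixed / $worktree_after_hard"
```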
* Using git-reset to fix mistakes
"Oh, no! I didn't mean to commit *that*! How do I undo it?"
If you just want to undo a commit, then you can use "git reset HEAD^"
to return the current HEAD to the previous version. If you want to leave
the commit in the index (this only applies to you if you are familiar with
using the index; see below), then you can use "git reset --soft HEAD^".
And if you want to blow away every record of the changes you made,
you can use "git reset --hard HEAD^"
If you just made a stupid trivial mistake and want to replace the most
recent commit with a corrected one, "git commit --amend" is your friend.
It makes a new commit with HEAD^ rather than HEAD as its parent.
It's often much easier to use --amend than fiddle around with git-reset.
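A sketch of --amend fixing a typo follows; the file name and commit messages are invented. The point to notice is that the replacement commit has the same parent as the one it replaces, but a different ID.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

echo "teh fix" > notes.txt          # oops, a typo got committed
git add notes.txt
git commit -q -m "Add notes"
parent_before=$(git rev-parse HEAD^)
old_commit=$(git rev-parse HEAD)

echo "the fix" > notes.txt          # correct it and replace the commit
git commit -q -a --amend -m "Add notes"
parent_after=$(git rev-parse HEAD^)
new_commit=$(git rev-parse HEAD)
echo "replaced $old_commit with $new_commit"
```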
* Fixing mistakes without git-reset
git-reset has the problem that it doesn't preserve hacking in progress
in the working directory. It can leave the working directory alone
(making everything a "hack in progress"), but it can't merge in-progress
changes like git checkout.
So, suppose you've been trying something that should have been simple,
and made three commits to your main branch before realizing that the
problem is harder than you thought, and you want your work so far to be
on a new branch of its own; committing them on the current HEAD (I'll
call it "old") was a mistake.
You don't want to erase anything, just rename it. Make "new" a copy of
the current "old" and move old back to HEAD^^^ (three commits ago).
While there are ways to do that using git-reset, it is far better
to use "git branch -f":
git checkout -b new
Create (and switch to) the "new" branch.
git branch -f old HEAD^^^
Forcibly move "old" back three versions.
(You could also use old~3 or new^^^ or any synonymous name.)
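The whole rescue can be replayed on a scratch repository. This sketch uses the branch names "old" and "new" from the text; the three "try" commits and the save helper are made up.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/old      # call the initial branch "old"
git config user.name "Example"
git config user.email "example@example.com"

save() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

save base
base_commit=$(git rev-parse HEAD)
save try1
save try2
save try3                      # three commits that belong elsewhere

git checkout -q -b new         # "new" now holds all the work
git branch -f old HEAD^^^      # move "old" back three commits

old_commit=$(git rev-parse old)
moved=$(git rev-list --count old..new)   # commits on "new" but not on "old"
echo "old is back at base; $moved commits live only on new"
```

Because "new" is checked out when "old" is moved, nothing in the working directory changes.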
You can use a similar trick to rename a branch. If it's the current
HEAD, then:
git checkout -b newname
git branch -d oldname
and if it's not, then
git branch newname oldname
git branch -d oldname
An alternative in the latter case is to just use mv on the raw
.git/refs/heads/oldname file.
* How do I check out an old version?
A very common beginning question is how to check out an old version.
Say you need to compile an old release for test purposes. "git checkout
v1.2" gives a funny error message. What's going on?
Well, "git checkout" makes the current HEAD point to the head that
you specify. And, as previously mentioned, git requires that it point
to something in the .git/refs/heads directory. So you can't do that.
If you're busy doing things in your working directory, and don't want to
overwrite your work with an old version, then you can get a snapshot with
the (old) git-tar-tree or (new) git-archive commands. These produce a
tar file (git-archive can also produce a zip file) which is a snapshot
of any version you like. You can then unpack this file in a different
directory and build it.
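For instance, the snapshot approach might look like the sketch below. The tag, the project-1.2/ prefix and the file contents are hypothetical; the work in progress in the working directory is left untouched.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"
git config user.email "example@example.com"

echo "release" > README
git add README
git commit -q -m "release 1.2"
git tag v1.2
echo "work in progress" > README     # uncommitted local edit

# Export the tagged version as a tar stream and unpack it elsewhere.
snapshot=$(mktemp -d)
git archive --format=tar --prefix=project-1.2/ v1.2 | tar -xf - -C "$snapshot"

snapshot_content=$(cat "$snapshot/project-1.2/README")
working_content=$(cat README)
echo "snapshot: $snapshot_content; working copy: $working_content"
```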
However, if you haven't got any edits in progress, and want to check out
the old version into your working directory, just create a temp branch!
git checkout -b temp v1.2
Will do what you want. This will also do what you want if you have a
local edit (like the "#define DEBUG 1" mentioned above) that you want
to preserve while working on the old version.
You'll see this in use if you ever use the (highly recommended) git-bisect
tool. It creates a branch called "bisect" for the duration of the bisect.
(Yes, I have to confess, I sometimes wish that git would enforce the
"HEAD must point to .git/refs/heads" rule when committing (checking in)
rather than when checking out, but that's the way git has grown up.)
Note that if you want *exactly* an old version, with no local hacks,
make sure there are none (with "git status") when doing this. It's more
convenient if you do it before the checkout, but you'll get the same
answer if you ask afterwards.
Now, what about the complex case: you have local hacks that you
want to keep, but not have polluting the old version?
Well, one way or the other, you'll have to commit it. If you don't mind
committing your changes to the current branch ("git commit -a"), do that.
If they're not ready to commit, you can commit them anyway, and back
them out when you're done:
git commit -a -m "Temp commit"
git checkout -b temp v1.2
make ; make test ; whatever
git checkout master
git branch -d temp
git reset HEAD^
This leaves both the working directory and the master head in the states
they were in at the beginning.
If you don't like committing to the master branch, you can make a new one.
In this example, it's "work in progress", a.k.a. "wip":
git checkout -b wip
git commit -a -m "Temp commit"
git checkout -b temp v1.2
make ; make test ; whatever
git checkout wip
git branch -d temp
git reset master
git checkout master # Won't change working directory
git branch -d wip
* Examining history: git-log and git-rev-list
In another example of docs being better on the first command written,
the all-purpose utility for examining history is "git log", but all of
the examples of clever ways to use it are in the git-rev-list man page.
And git-log also has most of git-diff's options.
Other utilities, notably the gitk and qgit GUIs, also use the git-rev-list
command-line options, so it's well worth learning them.
git-rev-list gives you a filtered subset of the repository history.
There are two basic ways that you can do the filtering:
1) By ancestry. You specify a set of commits to include all the
ancestors of, and another set to exclude all the ancestors of.
(For this purpose, a commit is considered an ancestor of itself.)
So if you want to see all commits between v1.1 and v1.2, you
can specify
git log ^v1.1 v1.2
or, with a more convenient syntax
git log v1.1..v1.2
However, there are times when you want to specify something more
complex. For example, if a big branch that had been in progress since
v1.0.7 was merged between v1.1 and v1.2, but you don't want to see it,
you could specify any of:
git log v1.2 ^v1.1 ^bigbranch
git log ^bigbranch v1.1..v1.2
git log ^v1.1 bigbranch..v1.2
They're all equivalent. Another special syntax that's sometimes
handy is
git log branch1...branch2
Note the three dots. This generates the symmetric difference between
the two: the commits that are reachable from either branch, but not
from both.
"git log" by default pipes its output through less(1), and generates
its output from newest to oldest on the fly, so there's no great
speed penalty to not specifying a starting place. It'll generate a
few screen fulls more than you look at, but not waste any more effort
than that.
2) By path name. This is a feature which appears to be unique to git.
If you give git-rev-list (or git-log, or gitk, or qgit) a list of
pathname prefixes, it will list only commits which touch those
paths. So "git log drivers/scsi include/scsi" will list only
commits which alter a file whose name begins with drivers/scsi
or include/scsi.
(If there's any possible ambiguity between a path name and a commit
name, git-rev-list will refuse to proceed. You can resolve it by
including "--" on the command line. Everything before that is a
commit name; everything after is a path.)
This filter is in addition to the ancestry filter. It's also rather
clever about omitting unnecessary detail. In particular, if there's
a side branch which does not touch the given paths, then the entire
branch, and the merge at the end, will be removed from the log.
You can additionally limit the commits to a certain number, or by date,
author, committer, and so on.
By default, "git log" only shows the commit messages, so it's important to
write good ones. Other tools compress commit messages down to
the first line, so try to make that as informative as possible.
("git show" is the standard tool for examining a single commit,
and it does show the diff as well as the commit message.)
* History diagrams
When talking about various situations involving multiple branches,
people often find it handy to draw pictures. Gitk draws nice pictures
vertically, but for e-mail, ASCII art drawn horizontally is often easier.
Commits are shown as "o", and the links between them with lines drawn with
- / and \. Time goes left to right, and heads may be labelled with names.
For example:
         o--o--o <-- Branch A
        /
 o--o--o <-- master
        \
         o--o--o <-- Branch B
If someone needs to talk about a particular commit, the character "o"
may be replaced with another letter or number.
* Trivial merges: fast-forward and already up-to-date.
There are two kinds of merge that are particularly simple, and you will
encounter them in git a great deal. They are mirror images.
Suppose that you are working on branch A and merge in branch B, but no
work has been done to branch B since the last time you merged, or since
you spawned branch A from it. That is, the history looks like
o--o--o--o <-- B
          \
           o--o--o <-- A
or
o--o--o--o--o--o <-- B
    \           \
     o--o--o--o--o <-- A
If you then merge B into A, A is described as "already up to date".
It is already a strict superset of B, and the merge does nothing.
In particular, git will not create a dummy commit to record the fact that
a merge was done. It turns out that there are a number of bad things that
would happen if it did, but for now, I'll just say that git doesn't do it.
Now, the opposite scenario is the "fast-forward" merge. Suppose you
merge A into B. Again, A is a strict superset of B.
In this case, git will simply change the head B to point to the same
commit as A and say that it did a "fast-forward" merge. Again, no commit
object is created to reflect this fact.
The effect is to unclutter the git history. If I create a topic branch to
work on a feature, do some hacking, and then merge the result back into
the (untouched!) master, the history will look just like I did all the
work on the master directly. If I then delete the topic branch (because
I'm done using it), the repository state is truly indistinguishable.
While the topic branch existed, you could have done something to the
master branch, in which case the final merge would have been non-trivial,
but if that didn't happen, git produces a simple, easy-to-follow linear
history.
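Both trivial cases can be tried in a scratch repository. This is only a sketch: it assumes a recent git, where "git merge <branch>" does the local merge, and the branch name "topic" is made up.

```shell
# Throwaway repo; identity comes from the environment, so no config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git -c init.defaultBranch=master init -q .

git commit -q --allow-empty -m base
git checkout -q -b topic
git commit -q --allow-empty -m "topic work 1"
git commit -q --allow-empty -m "topic work 2"

# topic is already a superset of master, so this merge does nothing.
git merge master

# master is an ancestor of topic, so this merge only moves the master head.
git checkout -q master
git merge topic

# No merge commit was created in either case.
merges=$(git rev-list --count --merges master)
echo "merge commits on master: $merges"
```

After this, master and topic name the same commit, and deleting topic leaves no trace that it ever existed.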
Some people used to heavyweight branches find this confusing; they
think a merge is a big deal and it should be memorialized, but there
are actually excellent reasons for not recording a trivial merge.
The most important one is that a fit of merging back and forth will
eventually end. Suppose that branches A and B are maintained by separate
developers who like to track each other's work closely.
If the fast-forward case did create a commit, then merging A into B
would produce
o--o--o--o---------o <-- B
          \       /
           o--o--o <-- A
then merging B into A would produce:
o--o--o--o---------o <-- B
          \       / \
           o--o--o---o <-- A
and further merges would produce more and more dummy commits, all without
ever reaching a steady state, and without making it obvious that the
two heads are actually identical.
Since history lasts forever, cluttering it up with unimportant stuff is a
burden to all future users, and not a good idea. Allowing the merge of a
branch to be seamless in the simple case encourages lightweight branches.
If you _might_ need a separate branch, create it. If it turned out that
you didn't, it won't make a difference.
* Exchanging work with other repositories
The basic tools for exchanging work with other repositories are "git
fetch" and "git push". The fact that "git pull" is not the opposite of
"git push" is often confusing to beginners (it's a superset of git fetch),
but that's the terminology that has grown up.
The unit of sharing in git is the branch. If you've used branches in
CVS, you'll be familiar with using "cvs update" to pull changes from your
"current branch" in the repository into your working directory.
In Git, you don't pull into the working directory, but rather into a
tracking branch. You set up a branch in your repository which will be
a copy of the branch in the remote repository. For example, if you use
"git clone", then the remote "master" branch is tracked by the local
"origin" branch.
Then, when you do a "git fetch", git fetches all of the new commits
and sets the origin head to point to the newly fetched head of the
remote branch.
By default, git checks that this is a trivial fast-forward merge, that
is, that it is not throwing away history. If it finds something like:
o--o--o--o--o--o <-- remote master
       \
        o <-- Local origin
it will complain and abort the fetch. This is usually a warning that
something has gone wrong - in particular, you forgot that this was
supposed to be a tracking branch and committed some work to it - and it
aborts before throwing your work away.
However, sometimes the remote git user will have a branch name that they
delete and re-create frequently. There are plenty of reasons to do this.
The most common is doing a "test merge" between various branches in
progress. They're all unfinished, so the developer of branch A doesn't
want to merge in all the new bugs in branch B, but a tester might want
to create a merged version with both sets of bugs for testing.
The merged version is not intended to be a permanent part of history -
it'll get deleted after the test - but it can still be useful to have
a draft copy.
In this case, you can mark the source branch with a leading "+", to
disable this sanity check. (See the git-fetch man page for details.)
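Here is a sketch of the sanity check and of the "+" override, using two scratch repositories on the local disk. The directory names and the head name "tracking" are made up, and a recent git is assumed.

```shell
# Two throwaway repos; identity comes from the environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m base
git -C upstream commit -q --allow-empty -m draft
git -c init.defaultBranch=master init -q local
cd local

# First fetch: creates the local head "tracking" as a copy of the remote master.
git fetch -q ../upstream master:refs/heads/tracking

# Upstream throws away "draft" and commits something else in its place.
git -C ../upstream reset -q --hard HEAD^
git -C ../upstream commit -q --allow-empty -m replacement

# A plain fetch is refused, because it would not be a fast-forward.
if git fetch -q ../upstream master:refs/heads/tracking 2>/dev/null
then plain=accepted
else plain=refused
fi
echo "plain fetch: $plain"

# A leading "+" on the source disables the check.
git fetch -q ../upstream +master:refs/heads/tracking
```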
Note that in this case, you should specifically avoid merging from such
a branch into any non-test branches of your own. It is, as mentioned,
not intended to be a permanent part of history, so it would be rude
to make it part of your permanent history. (You still might want to
test-merge it with your work in progress, of course.)
The fact that you should know to treat such branches specially is why
git doesn't try to automatically cope with them.
* Alternate branch naming
The original git scheme mixes tracking branches with all the other heads.
This requires that you remember which branches are tracking branches and
which aren't. Hopefully, you remember what all your branches are for,
but if you track a lot of remote repositories, you might not remember
what every remote branch is for and what you called it locally.
An alternative convention has been developed (which may become the
default in future) that places copies of the remote servers' heads
under .git/refs/remotes/<server>/<branch>.
If you use "git-clone --use-separate-remote", it sets up a copy of
the remote server's heads under .git/refs/remotes/origin/<name>.
Then you can refer to "origin/<branch>" whenever you want.
Because these branch names are not under .git/refs/heads, the git
tools will not let you commit to them.
* Remotes files
You can specify what to fetch on the git-fetch command line.
That is, you can type
git fetch <url> src1:dst1 src2:dst2
to fetch the given remote heads src1 and src2 into the local heads
dst1 and dst2. However,
if you intend to monitor another repository on an ongoing basis,
it's generally easier to set up a short-cut by placing the options in
.git/remotes/<name>.
The syntax is explained in the git-fetch man page. When this is set
up, "git fetch <name>" will retrieve all the branches listed in the
.git/remotes/<name> file. The ability to fetch multiple branches at
once (such as release, beta, and development) is an advantage of using
a remotes file.
git-fetch with no argument uses the default file .git/remotes/origin.
If you have a single primary "upstream" repository that you sync to,
place it in the origin remotes file, and you can just type "git fetch"
to get all the latest changes.
Note that branches to fetch are identified by "Pull: " lines in the
remotes file. This is another example of the fetch/pull confusion.
git-pull will be explained eventually.
If you want to follow the "separate remotes" convention when manually
adding branches to track, just supply .git/-relative branch names:
Pull: refs/heads/master:refs/remotes/origin/master
Pull: refs/heads/beta:refs/remotes/origin/beta
Pull: refs/heads/alpha:refs/remotes/origin/alpha
[TODO: Explain the git config file alternative. I need to write a section
on the config file itself first.]
* Cloning
If you want to watch a project that's hosted on a git server, the easiest
way is to use "git clone".
git-clone creates a new repository, sets up a remotes file to track
every branch in the remote repository, and fetches all those branches.
By default, it maps them to local heads as follows:
- The remote "master" is tracked by the local "origin"
- The local "master" is made a copy of that.
- The remote "origin" is not tracked at all.
- All other heads are tracked by local heads of the same name.
If you use git-clone --use-separate-remote, then a different,
simpler convention for remote heads will be used:
- Every remote <head> is tracked by .git/refs/remotes/origin/<head>.
Either way, the fetch information is all placed in the .git/remotes/origin
file, which (as mentioned above) is the default used by git-fetch if no
argument is supplied.
* Fetching without tracking (advanced)
Whenever a fetch is done, the fetched heads are also stored in
.git/FETCH_HEAD. This is for use by later merging (coming up
really soon, promise!). In fact, it's possible to fetch without
writing to any local heads at all. If you just
git fetch <url> src1 src2
Then the fetch will be done, but the results will be written nowhere
but .git/FETCH_HEAD. This is actually the earliest form of git-fetch
implemented; everything else is a later addition. It's not something
you'd do on purpose much, except as part of a script that uses FETCH_HEAD,
but it's worth mentioning it in case you type it and wonder what the
heck happened.
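A quick sketch of what that looks like, again with two scratch repositories whose names are made up:

```shell
# Two throwaway repos; identity comes from the environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m base
git -c init.defaultBranch=master init -q local
cd local

# Source only, no ":dst" part, so no local head is written.
git fetch -q ../upstream master

fetched=$(git rev-parse FETCH_HEAD)   # the commit that was fetched
local_refs=$(git for-each-ref)        # every local head and tag: none
cat .git/FETCH_HEAD
```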
* Remote tags
When you fetch to a tracking branch, git-fetch also fetches every
heavyweight tag (one that involves an actual tag object) that
points to a commit reachable from the branch head and installs a
copy locally.
It will never overwrite a pre-existing tag of the same name.
This means that if you publish a heavyweight tag, and then try
to change it, people who fetched the old tag won't see the change!
This is perhaps an excessively liberal policy, but it has worked
well in practice so far. Still, it does mean that you should
think about who you fetch code from.
Since tags can be PGP-signed, one option would be to verify them
before installing them locally.
In case it helps, the primitive to get a list of remote heads and tags
is git-ls-remote. It's used by git-clone to get the list of heads to
track, and can be used manually to see what's been added since then that
you might be interested in tracking.
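For example, against a scratch repository (the branch "beta" and the tag "v1.0" are invented for this sketch):

```shell
# Throwaway repo; identity comes from the environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m base
git -C upstream branch beta
git -C upstream tag -a v1.0 -m "first release"   # heavyweight (annotated) tag

# Lists HEAD, every head and every tag. A heavyweight tag shows up twice:
# once for the tag object and once, with a ^{} suffix, for the tagged commit.
git ls-remote ./upstream

heads=$(git ls-remote --heads ./upstream | grep -c refs/heads/)
tags=$(git ls-remote --tags ./upstream | grep -c 'refs/tags/v1.0$')
echo "heads: $heads, tag objects named v1.0: $tags"
```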
* Git network protocols
There are a few basic ways to share a git repository over the 'net, which
break down into "dumb" protocols that just copy files, and the "smart"
git native protocol that can deliver just the objects that you need.
1) http protocol. This is sub-optimal, but may be all you have.
The challenge with this is that you have to pack your repository
into chunks that are neither too big (because you have to download
all of a chunk to download any of it) nor too small (because the
http request overhead would kill you). So think carefully about
how often you run git-repack, and don't run git-repack -a.
http clients also need some extra index information to help them
find which pack files they need. git-update-server-info is the
command that generates these files, but it's run automatically
from git-repack, so it's not too important to know.
2) rsync protocol. This is basically an alternative to http, and
has the same strictures and limitations.
3) The git protocol. This is represented with a git:// URL, and talks
to a dedicated git daemon (see the git-daemon man page) on the
remote machine. It uses TCP port 9418 by default. This is a smart
protocol that understands the git format and does sophisticated
wire compression.
It uses more CPU on the server, but less bandwidth. And it doesn't
require any special care when repacking. git-daemon is purposefully
written to provide read-only service.
4) The git protocol over ssh. This is a git+ssh:// URL; ssh:// is
accepted as a synonym. It has the same efficiency issues as
plain git. If you want to limit ssh users to just the git commands
necessary to share work, git provides a git-shell command that can
be used as a very limited login shell.
It's been reported that using http or rsync for the initial clone is
faster than the smart git protocol, because the smarts are wasted
when you just need to download everything. Then you can go into
the remotes/origin file and edit the URL line to git:// for later
smart access.
But unless you want to bother with that very minor optimization,
just remember to use the git native protocol whenever possible.
* Exchanging work with other repositories, part II: git-push
It's simpler to set up git sharing on a pull basis. If your source
code isn't secret, you can set up a public read-only server very easily
(see the git-daemon man page for details), and have others fetch from that.
However, N developers all pulling from each other is an N^2 mess.
Some centralization helps.
One way is to have a central coordinator (like Linus) who pulls from
all of the developers, and who they in turn pull from.
The other is familiar to users of centralized VCSs: have a central
repository that people can push to. This generally requires an ssh login
on the server. You can use git-shell as the login shell if all you want
to allow the account to do is git fetch and push. (You can use the hook
scripts to enforce rules about who's allowed to do what to which branch.)
Git-push to the remote machine works exactly like git-fetch from the
remote machine. The objects are moved over, and the branches pushed to
are fast-forwarded. If fast-forward is impossible, you get an error.
So if you have multiple people committing to a branch on the server,
you will not be allowed to push if someone has pushed more to that branch
since last time you fetched it.
You have to fix the problem locally, and re-try the push when you've
got a new head that includes the most recently pushed work as an ancestor.
This is exactly like "cvs commit" not working if your recent checkout
wasn't the (current) tip of the branch, but git can upload more than
one commit.
The simplest way to resolve the conflict is to merge the remote head with
your local head. Alternatively, you can rebase your work to the new
head. (See below for details.)
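The rejected push and the merge-then-retry fix can be rehearsed locally. This is a sketch under a few assumptions: a recent git (which tracks the remote as "origin/master"), a bare repository standing in for the server, and two invented developers, alice and bob.

```shell
# Throwaway repos; identity comes from the environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q seed
git -C seed commit -q --allow-empty -m base
git clone -q --bare seed central.git     # plays the part of the server
git clone -q central.git alice
git clone -q central.git bob

# Alice gets her work in first.
git -C alice commit -q --allow-empty -m "alice's work"
git -C alice push -q origin master

# Bob's push cannot fast-forward the server's master, so it is refused.
git -C bob commit -q --allow-empty -m "bob's work"
if git -C bob push -q origin master 2>/dev/null
then first_push=accepted
else first_push=rejected
fi
echo "bob's first push: $first_push"

# Fix it locally: fetch, merge the new remote head, then push again.
git -C bob fetch -q origin
git -C bob merge -q --no-edit origin/master
git -C bob push -q origin master
```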
In either case, this is easiest if you have different local branches
for fetching the remote repository and for pushing to it.
That is, you have one head that just tracks the master repository's
main branch, and another that you add your work to, and push from.
It makes resolving conflicts between the two much easier if you
have a name for each of them.
Like git-fetch, you can specify everything on the command line:
git-push <url> src1:dst1 src2:dst2 src3:dst3
to upload the local branches src1, src2 and src3 to the corresponding
remote branches, or you can create a .git/remotes/<name> file.
The URL: and any Push: lines in the file are then used by
"git push <name>".
(If you need different ssh options for different hosts you push to,
set them up in your ~/.ssh/config file. You can even have different
options for the same host, as ~/.ssh/config lets you specify "Host alias"
and "HostName real.dom.ain".)
Another use for git-push, even for a solo developer, is sharing your work
with the world. You can set up a public git server on a high-bandwidth
machine (possibly rented from a hosting service) and then push to it to
publish something. This also helps keep your private development
branches private; git-daemon doesn't have a way to limit which
branches it exposes to which people.
* Merging (finally!)
I went through everything else first because the most common merge case
is merging local changes with remote changes. Not that you can't merge two
branches of your own, but beginners will encounter the local/remote
case first.
The primitive that does the merging is called (guess what?) git-merge.
And there's nothing too terribly wrong with using it.
However, it's usually easier to use the git-pull wrapper. This merges
the changes from some other branch into the current HEAD and generates
a commit message automatically.
git-merge lets you specify the commit message (rather than generating it
automatically), but that's about it.
The basic git-pull syntax is
git-pull <repository> <branch>
The repository can be any URL that git supports, including, in particular,
a local path. So to do a simple local merge, you just type
git-pull . <branch>
So after doing some hacking on branch "foo", you would
git checkout master
git pull . foo
and ba-boom, all is done.
Now, you can also specify a remote repository to merge from, using a
git://, http:// or git+ssh:// URL. This is what Linus does all day
long, and why the git-pull tool is optimized to allow that. It uses
git-fetch to fetch the remote branch without assigning it a branch name
(as mentioned above, it gets the magic name FETCH_HEAD), and then merges
it into the current HEAD directly.
There is absolutely nothing wrong with doing that, but beginners often
find it confusing to have a single short command do quite so much.
And if you are working closely with someone, it's often more convenient
and less confusing to keep local tracking branches. Then you can
git fetch # Fetches 'origin'
git pull . origin
It's also possible to give just a single remotes file name to git-pull:
git pull origin
That does a git fetch, updating all of the listed branches as usual,
then merges the _first_ listed branch into HEAD. It would be more
consistent to merge all the branches, but that's almost never what
you want.
By the way: don't blink, you might miss it! As I mentioned, pulling is
a very big part of Linus's daily routine, and he's made sure it's fast.
(Actually, you can't miss all the output it produces.)
Just to clarify, because people often get confused:
git-pull is a MERGING tool. It always does a merge, as well as an optional
fetch. If you just want to LOOK at a remote branch, use git-fetch.
* Undoing a merge
If you discover that a merge was a mistake, it can be undone just like
any other commit. The HEAD you merged to is the first parent, so just do
git reset --hard HEAD^
Since your local git repository is private, it's easy to un-commit;
it's your own private infinite undo. It's only when you publish
your branch that undoing a commit becomes a problem for other people.
This is why Linus likes a git-pull command that does so much in one shot;
if he doesn't like what he pulls, it's easy to undo.
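A sketch of the undo, assuming a recent git where "git merge foo" stands in for "git pull . foo":

```shell
# Throwaway repo; identity comes from the environment, so no config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git -c init.defaultBranch=master init -q .

git commit -q --allow-empty -m base
git checkout -q -b foo
git commit -q --allow-empty -m "foo work"
git checkout -q master
git commit -q --allow-empty -m "master work"
before=$(git rev-parse HEAD)

git merge -q --no-edit foo      # a real merge: master now has a merge commit
git reset -q --hard HEAD^       # first parent = where master was before
after=$(git rev-parse HEAD)
[ "$before" = "$after" ] && echo "master is back where it started"
```

The branch foo is untouched by all of this; only the master head moves.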
* How merging operates
Git uses the basic three-way merge. First, it applies it to file names,
then to whole files, and then to lines within files.
To do a three-way merge, you need three versions of a file. The versions
A and B you want to merge, and a common ancestor, commonly called O.
That is, history proceeds something like:
         o--o--A
        /
 o--o--O
        \
         o--B
The basic idea is "I want the file O, plus all the changes made from O
to A, plus all the changes made from O to B." Since the cases where one
of A or B is a direct ancestor of the other have already been disposed
of, the three commits must be different.
For each file, there are a few cases that are trivial, and git gets
these out of the way immediately:
- If A and B are identical, the merged result is obvious.
- If O and A are the same, then the result should be B.
- If O and B are the same, then the result should be A.
In the completely trivial case when O, A and B are the same, then
all three rules apply, they all produce the same obvious result.
Git automatically finds the merge base O as the most recent
common ancestor of the heads A and B to be merged.
When doing a merge, git uses the above 2-out-of-3 merging rules
three times:
First, the rules are applied to file renames. So if foo.c is renamed
to bar.c on branch A, then branch B's foo.c will be merged with it
to produce the result.
Second, they're applied to whole files. If two out of three entire
files are identical, there's nothing more to do. Since the name of
a blob object uniquely identifies its contents, this lets git do a
considerable amount of merging without even looking at the file contents.
Finally, any files where all three versions are different are loaded
into the index file, with the same file name but marked "stage 1",
"stage 2", and "stage 3", and the classic line-based three-way merge
is used to resolve the mess. This looks for isolated "hunks" of change
and uses the same 2-out-of-3 rules to resolve each hunk separately.
Only if all three commits have differing hunks that overlap (or come
so close that git can't be sure) is git unable to automatically resolve
the problem. This requires manual correction, as described below.
If the merge goes well, it is automatically committed and the HEAD branch
updated to point to the new commit.
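The line-based step can be tried on its own with "git merge-file", a helper found in later versions of git that runs the three-way merge on three plain files. In this sketch the file names O, A and B simply mirror the labels used above.

```shell
cd "$(mktemp -d)"

# Common ancestor O, and two descendants that change different hunks.
printf 'one\ntwo\nthree\nfour\nfive\n' > O
printf 'ONE\ntwo\nthree\nfour\nfive\n' > A    # A changes the first line
printf 'one\ntwo\nthree\nfour\nFIVE\n' > B    # B changes the last line

# -p prints the result instead of overwriting A.
# For each hunk, two of the three versions agree, so the odd one out wins.
merged=$(git merge-file -p A O B)
echo "$merged"
```

The result carries both changes. Had A and B both changed the same line in different ways, merge-file would have emitted conflict markers instead.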
* Merge subtleties (advanced)
As mentioned before, the merge base is the most recent common ancestor
of A and B. The only problem is, that's not necessarily unique!
The classic confusing case is called a "criss-cross merge", and looks
like this:
         o--b-o-o--B
        /    \ /
 o--o--o      X
        \    / \
         o--a-o-o--A
There are two most-recent common ancestors of A and B, marked a and b
in the graph above. And they're not the same. You could use either
one and get reasonable results, but how to choose?
The details are too advanced for this discussion, but the default
"recursive" merge strategy that git uses solves the problem by merging
a and b into a temporary commit and using *that* as the merge base.
Of course, a and b could have the same problem, so merging them could
require another merge of still-older commits. This is why the algorithm
is called "recursive." It's been tested with pathological conditions,
but multiply nested criss-cross merges are very rare, so the recursion
isn't a performance limit in practice.
(This can lead to occasional confusing messages about merge conflicts
that aren't real. That's because git computes the full merge base,
but if the conflicting file is actually identical in A and B, the
messed-up version in O doesn't matter.)
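If you want to see a criss-cross for yourself, the following sketch builds the smallest one I can think of and asks "git merge-base --all" for every most-recent common ancestor. It assumes a recent git; the shell variables a and b play the roles of the commits marked a and b above.

```shell
# Throwaway repo; identity comes from the environment, so no config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git -c init.defaultBranch=master init -q .

git commit -q --allow-empty -m base
git checkout -q -b A
git commit -q --allow-empty -m "commit a"
git checkout -q -b B master
git commit -q --allow-empty -m "commit b"
a=$(git rev-parse A)
b=$(git rev-parse B)

# Each branch merges the other one's old head: the criss-cross.
git checkout -q A && git merge -q --no-edit "$b"
git checkout -q B && git merge -q --no-edit "$a"

# Without --all only one candidate is printed; with it, both a and b appear.
git merge-base --all A B
bases=$(git merge-base --all A B | grep -c .)
echo "number of merge bases: $bases"
```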
* Alternate merge strategies (advanced)
In every version control system prior to git, the merging algorithm was
buried deep in the bowels of the software, and very difficult to change.
One of the particularly nice things that git did was allow for easily
replaceable "merge strategies". Indeed, you can try multiple merge
strategies, and the fallback - print an error message and let the user
sort it out - can be thought of as just another merge strategy.
Enabling this is why the index is so important to git. It provides a
place to store an unfinished merge, so you can try various strategies
(including hand-editing) to finish it.
There are two non-default strategies that have their uses in special
circumstances.
* Octopus merge (advanced)
The first is the "octopus" strategy. This is special because it can do
a three- or more-way merge. See 5401f304 in the git repository for
an example. (Run gitk, double-click on the "SHA1 ID" box to select
what's already there, enter "5401f304" instead, and click "Goto".)
The octopus strategy is invoked automatically when you specify more
than one branch at a time to merge in with "git pull". It can't handle
complicated overlaps and file renames as well as the 2-way recursive
strategy, but if you have a number of simple, independent changes that you
want to merge together, an octopus merge is the obvious way to document
the fact that they're truly independent.
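A sketch of an octopus merge of three independent topic branches, assuming a recent git where "git merge" accepts several branches at once; the branch and file names are invented.

```shell
# Throwaway repo; identity comes from the environment, so no config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git -c init.defaultBranch=master init -q .

git commit -q --allow-empty -m base
for topic in b c d; do
    git checkout -q -b "$topic" master
    echo "$topic" > "$topic.txt"
    git add "$topic.txt"
    git commit -q -m "add $topic.txt"
done
git checkout -q master
git commit -q --allow-empty -m "master work"

# More than one branch on the command line selects the octopus strategy.
git merge -q --no-edit b c d

# One commit, four parents: the old master plus the three topics.
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of the merge commit: $parents"
```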
The only downside to using an octopus to combine a number of simple
changes is that any merge makes git-bisect's job harder. If you have
a development history like
    /-b-\
   /     \
  | /-c-\ |
  |/     \|
--a---d---g--
  |\     /|
  | \-e-/ |
   \     /
    \-f-/
And you know that a works but g doesn't, there's no way to do a binary
search on b through f; they have to be searched linearly. This is
no harder to bisect, and a lot nicer-looking than the equivalent with
2-way merges:
    /-b
   /   \
  | /-c-g
  |/     \
--a---d---h--i--j--
  |\        /  /
  | \-e----/  /
   \         /
    \-f-----/
But if they were just done one after the other, you'd have
--a--b--c--d--e--f--
Which may imply a non-existent dependency between the changes, but
is a bit simpler. Still, the first octopus merge in git's own development
(211232ba) is of this form. It's a matter of taste.
For the beginner, the important thing to know is that you never *need*
an octopus.
* "Ours" merge (advanced)
The other merge strategy that is surprisingly useful is specified with
the "-s ours" option to git-pull.
This strategy instructs git that the merged result should be the same
as the current HEAD. Any other branches are recorded as parents, but
their contents are ignored.
What the heck is the use of that? Well, it lets you record the fact
that some work has been done in the history, and that it shouldn't be
merged again. For example, say you write and share a popular patch set.
People are always merging it in to their local source trees. But then
you discover a much better way to achieve the goal of that patch set, and
you want to publish the fact that the new patch supersedes the old one.
If you developed the new set starting from the old one, that would happen
automatically. But another way to achieve the same goal is to merge the
old branch in using the "ours" strategy. Everyone else's git will
notice that the patch is already included, and stop trying to merge it in.
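A sketch of the effect, assuming a recent git where the strategy option is also accepted by "git merge"; the branch name "old-patch" and the file name are made up.

```shell
# Throwaway repo; identity comes from the environment, so no config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git -c init.defaultBranch=master init -q .

echo v1 > feature.txt && git add feature.txt && git commit -q -m base
git checkout -q -b old-patch
echo old > feature.txt && git commit -q -a -m "old approach"
git checkout -q master
echo new > feature.txt && git commit -q -a -m "new approach"

# Record old-patch as merged, but keep our own contents untouched.
git merge -q --no-edit -s ours old-patch

content=$(cat feature.txt)
absorbed=no
git merge-base --is-ancestor old-patch HEAD && absorbed=yes
echo "feature.txt says '$content'; old-patch already merged: $absorbed"
```

A normal merge here would have stopped with a conflict in feature.txt. With "ours", the old branch becomes part of history and later merges of it are "already up to date".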
* When merging goes wrong
This is the fun part. Git's default recursive-merge strategy is pretty
clever, but sometimes changes truly do conflict and need manual fix-up.
When git is unable to complete a merge, it leaves the three different
versions in the index and places a file with CVS-style conflict markers
in the working directory.
As long as there is a "staged" file like this in the index, you will
not be able to commit. You must resolve the conflict, and update the
index with the resolved versions. You can do this one at a time with
git-update-index, or at the end by giving the files as arguments to
git-commit.
Doing them one at a time is probably safest; checking in a file which
still has conflict markers makes a bit of a mess. Git will still use
the automatically generated commit message when you finally commit.
(It's stashed in .git/MERGE_MSG, if you care.)
Note that "git diff" knows how to be useful with a staged file.
By default, it displays a multi-way diff. For example, suppose I take a
(slightly buggy) hello.c:
--- hello.c ---
#include <stdio.h>

int
main(void)
{
        printf("Hello, world!");
}
--- end ---
Now, suppose that in branch A, I fix some bugs - add the missing newline
and "return 0;". In branch B, I display my angst and change it to
"Goodbye, cruel world!". When I try to merge A into B, obviously I'll
get a conflict. The resultant file, with conflict markers, looks like:
--- hello.c ---
#include <stdio.h>

int
main(void)
{
<<<<<<< HEAD/hello.c
        printf("Goodbye, cruel world!");
=======
        printf("Hello, world!\n");
        return 0;
>>>>>>> edadc53fc7a8aef2a672a4fa9d09aa16f4e14706/hello.c
}
--- end ---
and the result of "git diff" is
diff --cc hello.c
index 4b7f550,948a5f8..0000000
--- a/hello.c
+++ b/hello.c
@@@ -3,5 -3,6 +3,10 @@@
  int
  main(void)
  {
++<<<<<<< HEAD/hello.c
 +        printf("Goodbye, cruel world!");
++=======
+         printf("Hello, world!\n");
+         return 0;
++>>>>>>> edadc53fc7a8aef2a672a4fa9d09aa16f4e14706/hello.c
  }
Notice how this is not a standard diff! It has two columns of diff
symbols, and shows the difference from each of the ancestors to the
current hello.c contents. I can also use "git diff -1" to compare
against the common ancestor, or "-2" or "-3" to compare against each of
the merged copies individually.
In any case, I have to replace the lines from <<<<<<< to >>>>>>> with
correct code. Then, unless I'm feeling really brave, I should probably
do a test compile. Suppose I fix it to read:
--- hello.c ---
#include <stdio.h>

int
main(void)
{
        printf("Goodbye, cruel world!\n");
        return 0;
}
--- end ---
When I'm done, another "git diff" convinces me that I haven't forgotten
any conflict markers anywhere (such as in a comment that doesn't get
compiled), and I can do
git commit -a
This then prompts me to edit the commit message, but there's a
difference; there's something already written:
--- .git/COMMIT_EDITMSG ---
Merge branch 'A' into B
Conflicts:
hello.c
#
# It looks like you may be committing a MERGE.
# If this is not correct, please remove the file
# .git/MERGE_HEAD
# and try again
#
# Please enter the commit message for your changes.
# (Comment lines starting with '#' will not be included)
# On branch refs/heads/B
# Updated but not checked in:
# (will commit)
#
# modified: hello.c
--- end ---
This is the automatically generated merge message and a reminder
to future readers of what had to be manually fixed.
In many cases, this is fine, and you can save it and complete the
commit. Or you can add something about the merge if it needs saying.
When I'm done, if I don't need branch A any more, I can
git branch -d A
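The whole hello.c episode can be replayed in a scratch repository. This sketch assumes a recent git, so it uses "git merge A" for the merge, "git add" to update the index with the resolved file, and "git ls-files -u" to show the three staged versions; the exact messages and conflict-marker labels will differ from the ones shown above.

```shell
# Throwaway repo; identity comes from the environment, so no config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git -c init.defaultBranch=master init -q .

cat > hello.c <<'EOF'
#include <stdio.h>

int
main(void)
{
    printf("Hello, world!");
}
EOF
git add hello.c && git commit -q -m "Add hello.c"

git checkout -q -b A                 # branch A: fix the bugs
cat > hello.c <<'EOF'
#include <stdio.h>

int
main(void)
{
    printf("Hello, world!\n");
    return 0;
}
EOF
git commit -q -a -m "Fix bugs"

git checkout -q -b B master          # branch B: angst
cat > hello.c <<'EOF'
#include <stdio.h>

int
main(void)
{
    printf("Goodbye, cruel world!");
}
EOF
git commit -q -a -m "Angst"

git merge A || echo "merge stopped, as expected"
staged=$(git ls-files -u | grep -c .)    # one index entry per stage
git --no-pager diff                      # the two-column combined diff

cat > hello.c <<'EOF'
#include <stdio.h>

int
main(void)
{
    printf("Goodbye, cruel world!\n");
    return 0;
}
EOF
git add hello.c                      # mark the conflict as resolved
git commit -q --no-edit              # reuses the generated merge message
git branch -q -d A
```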
* More on fixing broken merges
Often, a merge conflict is a simple textual issue that git's built-in
merge couldn't quite handle, like two #include additions in the same
place, and it doesn't know that the order doesn't actually matter.
But sometimes, you have real conceptual conflicts between two changes,
and it's not clear what to do. There are a number of ways to look
through history to see what's going on.
git log --merge
will show you all the commits that
1) touch files that are unresolved (staged) in the index, and
2) differ between the branches being merged.
By default, this only shows the commit messages, but you can
add all the usual git-log options, like -p to see the patches
themselves.
Note that this doesn't show you which branch each commit is on,
but it's still often useful to see what someone was trying to
do that caused the problem.
You can, of course, add a pathname limiter argument to further restrict
the commits being shown.
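A sketch of "git log --merge" during a conflict, assuming a recent git; the file and branch names are invented. In recent versions, adding --left-right marks each commit with "<" or ">" according to the side it came from.

```shell
# Throwaway repo; identity comes from the environment, so no config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git -c init.defaultBranch=master init -q .

echo base > hello.txt && echo base > notes.txt
git add . && git commit -q -m base
git checkout -q -b A
echo "from A" > hello.txt && git commit -q -a -m "A: edit hello"
git checkout -q -b B master
echo "from B" > hello.txt && git commit -q -a -m "B: edit hello"
echo more >> notes.txt    && git commit -q -a -m "B: edit notes"

git merge -q A || true       # stops with a conflict in hello.txt only

# Only commits that touch the unresolved file are listed;
# "B: edit notes" is left out because notes.txt merged cleanly.
culprits=$(git log --merge --format=%s | sort)
echo "$culprits"
git --no-pager log --merge --left-right --oneline
```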
* Redoing a merge in case of push conflicts.
When pushing to an upstream repository, the usual procedure is
to merge the remote head and your new development into a push head,
and then push it to the remote repository. That is, your work looks
like this ("a" commits are your work, "o" commits are other people's):
--o--o
      \
       a--a--a <-- development
And suppose that the upstream repository has grown to
--o--o--o--o--o <-- upstream
So you fetch it:
--o--o--o--o--o <-- from-upstream
      \
       a--a--a <-- development
And merge it:
--o--o--o--o--o <-- from-upstream
      \        \
       a--a--a--a <-- to-upstream
And now you can push it upstream, which will be a simple
fast-forward merge:
--o--o--o--o--o
      \        \
       a--a--a--a <-- upstream
However, if someone else has pushed in the meantime, your push will fail
because it can't fast-forward:
          b--b--b <-- upstream
         /     /
--o--o--o--o--o
      \        \
       a--a--a--a <-- to-upstream
The obvious solution is to re-merge the upstream branch and push again:
          b--b--b
         /     / \
--o--o--o--o--o   \
      \        \   \
       a--a--a--a---a <-- to-upstream
But that puts more merge commits in the history than necessary.
Better is to re-fetch and re-do the merge:
          b--b--b
         /     / \
--o--o--o--o--o   \
      \            \
       a--a--a------a <-- to-upstream
This is just a matter of (starting from the to-upstream branch)
git fetch # Fetches "from-upstream"
git-reset --hard from-upstream
git pull . development
git push
* Alternatives to merging
The bigger and more active your source tree, the more important it is to
keep the history reasonably clean. Just because git can do a merge in
under a second doesn't mean that you should do one daily. When you look
back at a feature's development history, you'd like to see meaningful
changes recorded and not a lot of meaningless ones.
Another nice thing to keep out of the published history is commits that
don't compile or are catastrophically buggy. These make git-bisect
harder to use, and once you've experienced the joys of git-bisect when
tracking down a newly introduced bug, you'll appreciate why it's good
to keep the public history clean.
Now, once you have shared a commit with others, and they have incorporated
it into their development, it becomes impossible to undo. But git
provides tools that are useful for "rewriting history" before public
release. These can be used to edit a commit for publication.
* Test merging
One way to keep the history clean is to simply not merge other branches
into your development branch. If you want to use your new features and
other people's code changes, make a test merge and use that, but don't
make that merge part of your branch.
This is slightly more work (you have to change to a test branch and do
your merging there), but not very much.
Sometimes, when doing this, a conflict appears between your changes and
someone else's development. If you get tired of fixing the same conflict
every time you do a test merge, have a look at the git-rerere tool.
This remembers resolved conflicts and tries to apply the same resolution
patch the next time.
It's written specifically to help you not do an extra merge unnecessarily.
Although its man page is well worth reading, you never invoke git-rerere
explicitly; it's invoked automatically by the merge and patch tools if
you create a .git/rr-cache directory.
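To watch rerere at work, here is a small experiment in a scratch
repository. The branch and file names are invented for the example,
and "git merge" is used instead of "git pull ." for brevity. The
second test merge hits the same conflict, and the recorded resolution
should be re-applied to the working file.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "original" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b topic
echo "topic version" > file.txt
git commit -q -am "topic change"

git checkout -q master
echo "master version" > file.txt
git commit -q -am "master change"

# Enable rerere the way the text describes.
mkdir -p .git/rr-cache

# First test merge: resolve the conflict by hand and commit.
git checkout -q -b test master
git merge topic || true
echo "resolved version" > file.txt
git add file.txt
git commit -q -m "test merge"

# Throw the test merge away and do it again later.
git reset -q --hard master
git merge topic || true
cat file.txt
```

The merge still stops and reports the conflict the second time; rerere
only fixes up the working file, so you still review it and mark it
resolved yourself.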
* Cherry picking
If you have a series of patches on a branch, but you want a subset
of them, or in a different order, there's a handy utility called
"git-cherry-pick" which will find the diff and apply it as a patch to
the current HEAD. It automatically recycles the commit message from
the original commit.
If the patch can't be applied, it leaves the versions in the index and
conflict markers in the working directory just like a failed merge.
And just like a merge, it remembers the commit message and provides it
as a default when you finally commit.
Note that this can only work on a chain of single-parent commits.
If a commit has multiple parents, there's no single patch to apply.
You can get the list of commits on a branch with git-log or git-rev-list,
but for more complex cases, the git-cherry tool is designed to generate
the list of commits to merge. It has a rather neat approximate-match
function built in which identifies patches that appear to already be
present in the target branch.
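A minimal sketch of picking a subset in a different order, and then
asking git-cherry what is left. The "feature" branch, the file names
and the "commit" helper are assumptions made up for this example.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt o1
git checkout -q -b feature
commit one.txt p1
commit two.txt p2
commit three.txt p3

# Take the third patch first, then the first one; leave p2 behind.
git checkout -q master
git cherry-pick feature >/dev/null
git cherry-pick feature~2 >/dev/null

# "-" marks patches already present in master, "+" marks missing ones.
git cherry master feature
```

Only p2 should be marked with "+", even though the picked commits have
different IDs on master than on feature.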
* Rebasing
A special case of cherry-picking is if you want to move a whole branch
to a newer "base" commit. This is done by git-rebase. You specify
the branch to move (default HEAD) and where to move it to (no default),
and git cherry-picks every patch out of that branch, applies it on top
of the target, and moves the refs/heads/<branch> pointer to the newly
created commits.
By default, "the branch" is every commit back to the last common
ancestor of the branch head and the target, but you can override that
with command-line arguments.
If you want to avoid merge conflicts due to the master code changing out
from under your edits, but not have "cleanup" merges in your history,
git-rebase is the tool to use.
Git-rebase will also use git-rerere if enabled ("mkdir .git/rr-cache").
If rebasing encounters a conflict it can't resolve, it will stop halfway
and ask you to resolve the problem by hand. However, it still knows it
has a job to finish! The unapplied patches are remembered until you do
one of
git-rebase --continue
This will check in the current index. You should
run git-update-index <files> on the files whose conflicts
you resolve, but NOT do an actual git-commit.
git-rebase --continue will do the commit.
git-rebase --skip
This will skip the conflicting patch. You
don't have to resolve the conflicts; git will
just back up and try the next patch in the series.
git-rebase --abort
This will abandon the whole rebase operation (including
any half-done work) and return you to where you began.
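Here is a sketch of a rebase that stops on a conflict, first abandoned
with --abort and then completed with --continue. The repository
contents are invented for the example; "git add" is used to mark the
file resolved (an assumption: it does the same job here as the
git-update-index call described above), and GIT_EDITOR=true just
accepts the existing commit message without opening an editor.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "line" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b topic
echo "topic edit" > file.txt
git commit -q -am "topic edit"
echo "extra" > other.txt
git add other.txt
git commit -q -m "topic extra"

git checkout -q master
echo "master edit" > file.txt
git commit -q -am "master edit"

# First attempt: the rebase stops on "topic edit"; give up.
git checkout -q topic
before=$(git rev-parse topic)
git rebase master || true
git rebase --abort
after_abort=$(git rev-parse topic)

# Second attempt: resolve by hand, mark it resolved, and carry on.
git rebase master || true
echo "merged edit" > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue
```

After the abort, topic is back exactly where it started; after the
continue, both topic commits sit on top of master.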
* Rebasing 2: splitting a branch
Git-rebase can also help you divide up work. Suppose you've mixed up
development of two features in the current HEAD, a branch called "dev".
You want to divide them up into "dev1" and "dev2". Assuming that HEAD
is a branch off master, then you can either look through
git log master..HEAD
or just get a raw list of the commits with
git rev-list master..HEAD
Drawing a picture, suppose we start with:
o--o--o--o--o <-- master
\
x--y--y--x--x--y <-- dev
And want to end up with
x--x--x <-- dev1
/
o--o--o--o--o <-- master
\
y--y <-- dev2
Either way, you'll have to manually figure out a list of commits that
you want in dev1 and create that branch:
git checkout -b dev1 master
for i in `cat commit_list`; do
git-cherry-pick $i
done
You can use the other half of the list you edited to generate the dev2
branch, but if you're not sure if you forgot something, or just don't
feel like doing that manual work, then you can use git-rebase to do it
for you...
git checkout -b dev2 dev # Create dev2 branch
git-rebase --onto master dev1 # Subtract dev1 and rebase
This will find all patches that are in dev and not in dev1,
apply them on top of master, and call the result dev2.
That is, after you've manually picked out the dev1 branch commits:
x--x--x <-- dev1
/
o--o--o--o--o <-- master
\
x--y--y--x--x--y <-- dev, dev2
this will automatically produce:
x--x--x <-- dev1
/
o--o--o--o--o <-- master
| \
| y--y <-- dev2
\
x--y--y--x--x--y <-- dev
If you had omitted the "--onto master" part, it would have produced instead
y--y <-- dev2
/
x--x--x <-- dev1
/
o--o--o--o--o <-- master
\
x--y--y--x--x--y <-- dev
git-rebase abandons the original branch, since (think about it!) it never
loses a significant change. In the example here, the branch is
still accessible via the "dev" name, but if you had skipped creating
a "dev2" branch and just called it "dev", that branch would be gone.
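The whole split can be rehearsed in a scratch repository. In this
sketch the "x" commits only touch x.txt and the "y" commits only touch
y.txt, which is an assumption that makes the commit list easy to
generate with a path-limited rev-list; in real life you would edit the
list by hand. The "commit" helper is hypothetical.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt o1
git checkout -q -b dev
commit x.txt x1
commit y.txt y1
commit y.txt y2
commit x.txt x2
git checkout -q master
commit base.txt o2

# Build dev1 by hand from the "x" commits, oldest first.
git checkout -q -b dev1 master
for c in $(git rev-list --reverse master..dev -- x.txt); do
    git cherry-pick "$c" >/dev/null
done

# Let git-rebase work out the rest: everything in dev but not in dev1.
git checkout -q -b dev2 dev
git rebase -q --onto master dev1
git log --format=%s master..dev2
```

The rebase recognizes the picked "x" patches as already present in
dev1 and leaves them out, so dev2 should end up with just the "y"
commits on top of master, while dev itself is untouched.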
* Cherry picking and rebasing: Merging changes
Suppose that you accidentally committed a broken source tree,
and forgot to use "git-commit --amend" when committing the following
one-liner. Or say you didn't notice the brokenness immediately
and made the fix a few commits later. But now you want to
rewrite history with the fix merged into the original commit.
Here's a simple way to do it, assuming that you're fixing the
"dev" branch, and the commits to merge are <commit1> and <commit2>:
1--o--o--2--o--o <-- dev
/
--o--o
git status # Make sure there's no uncommitted work
git checkout -b temp <commit1>
git cherry-pick -n <commit2>
git commit --amend -a
1--o--o--2--o--o <-- dev
/
--o--o
\
1+2 <-- temp
Now, we can git-rebase the remainder. The only complication is
that git can't tell that the change from commit2 has been applied,
since the combined 1+2 change isn't "the same". One option is to
trust that there will be a merge conflict when you try and just do:
git checkout dev
git rebase --onto temp <commit1>
# Which will stop with a conflict
git rebase --skip
git branch -d temp
The other is to explicitly do the two stretches separately:
git checkout -b temp2 <commit2>^
git rebase --onto temp <commit1>
git checkout dev
git rebase --onto temp2 <commit2>
git branch -d temp2
git branch -d temp
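The two-stretch variant can be checked end to end in a scratch
repository. This is a sketch under assumptions: the commit subjects
and file names are invented, "--no-edit" is added so the amend does
not open an editor, and the second stretch is rebased onto temp2 so
that the commits between the two merged ones are kept.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt base
git checkout -q -b dev
echo "broken" > a.txt
git add a.txt
git commit -q -m "commit1"
commit b.txt o1
commit c.txt o2
echo "fixed" > a.txt
git commit -q -am "commit2 (fix)"
commit d.txt o3
commit e.txt o4
c1=$(git rev-parse dev~5)
c2=$(git rev-parse dev~2)

# Fold the fix into the original commit on a temporary branch.
git checkout -q -b temp "$c1"
git cherry-pick -n "$c2"
git commit -q --amend -a --no-edit

# First stretch: the commits between commit1 and commit2.
git checkout -q -b temp2 "$c2^"
git rebase -q --onto temp "$c1"

# Second stretch: everything after commit2.
git checkout -q dev
git rebase -q --onto temp2 "$c2"
git branch -d temp2 temp
git log --format=%s master..dev
```

The rewritten dev has one commit fewer than before, and a.txt is
already fixed in the very first commit of the branch.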
[TODO: are there any better tools for reordering patches? Maybe explain
how to create an mbox file and mess with it?]
* Dirty tricks: evil merges (advanced)
Generally, a merge is supposed to include all the changes made on
both contributing branches, and that's all. But sometimes, people
slip additional changes in. This is known as an "evil merge", because
it can be very misleading to someone reading the history.
But occasionally there are good reasons to do that. For example, look
at the output of "git show v1.0.0" in the git repository itself.
(Recall that git-show uses "git diff --cc", which only shows
hunks that are not trivially taken from one parent or the other.
Any line that starts with "++" is not taken from either parent.)
You'll see that, as the merge was the last thing required for a 1.0.0
release, the git maintainer also bumped the version number and updated
the changelog. It could have been a separate commit, but didn't seem
worth it.
This was done by forcing the git-pull to not commit:
git pull --no-commit . <branch>
(edit as desired)
git commit
The commit message should be edited to explain that this is not just
a normal merge, as was done in this case.
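A small sketch of the same trick in a scratch repository. It uses
"git merge --no-commit" rather than "git pull --no-commit ." (both
stop before committing); the VERSION file, branch name and commit
message are invented for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "1.0-rc" > VERSION
git add VERSION
git commit -q -m "base"

git checkout -q -b topic
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"

git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "other work"

# Merge, but stop before the commit is made.
git merge --no-commit topic

# Slip in the extra change, then commit with an honest message.
echo "1.0" > VERSION
git add VERSION
git commit -q -m "Merge topic and bump version to 1.0 (not a plain merge)"
git show --format=%s HEAD
```

The final "git show" should display the VERSION change as part of the
merge commit, since it was taken from neither parent.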
* Experimenting with git
The best way to learn how git works is to try it. Once you understand
the basic concept well enough to not delete anything by accident, it's
quite hard to hurt anything.
* Experimenting with fetching
Remember that fetching from a repository on the same machine is both
possible and fast. So if you want to play around, just make a new
directory, run git-init-db, and give it a try. You can't hurt the source
repository, and deleting the destination is as easy as "rm -rf".
(Of course, you can hurt things with "rm -rf", so make certain you're
in the right directory before executing that!)
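For instance, something along these lines sets up a source repository
and a playground next to it, then fetches from one into the other.
The directory names are made up, and "git init" is used here as the
current spelling of git-init-db.

```shell
set -e
cd "$(mktemp -d)"

# A small source repository to fetch from.
git init -q source
cd source
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial"
cd ..

# The playground: a brand-new, empty repository.
mkdir playground
cd playground
git init -q

# Fetch without naming a local branch; the result lands in FETCH_HEAD.
git fetch -q ../source HEAD
git log --format=%s FETCH_HEAD
```

Nothing in "source" is modified by the fetch, so the playground can be
deleted and recreated as often as you like.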
* Experimenting with merging
To play with non-trivial merging, get an existing git repository of
a non-trivial project (git itself and the Linux kernel are readily
available). Fire up gitk to look at history, find some interesting-looking
merges, and redo them yourself on a test branch.
As long as you do everything on test branches, you aren't going to screw
anything up. So play!
You can use gitk to search for "Conflicts:" in the commit comments to
find merges that didn't go smoothly and see what happens. (Or you can
search in "git log" output. gitk just draws prettier pictures.)
You can also set up two repositories on the same machine and try pulling
and pushing between them.
To identify arbitrary commits, the 40-byte raw hex ID is probably easiest;
you can cut-and-paste them from the gitk window.
For example, in the git repository,
3f69d405d749742945afd462bff6541604ecd420
looks like an interesting merge. Its parents are
Parent: 7d55561986ffe94ca7ca22dc0a6846f698893226
Parent: 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
Let's try doing that manually:
$ git checkout -b test 7d55561986ffe94ca7ca22dc0a6846f698893226
$ git pull . 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
error: no such remote ref refs/heads/097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
Fetch failure: .
Cool! I didn't know that wasn't allowed. (I'll have to ask why it's
not; perhaps it's because it uses the branch name in the automatic
commit message.) I could do it by hand with git-merge, but I'll just
give it a branch name:
$ git branch test2 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
$ git pull . test2
Merging HEAD with 097dc3d8c32f4b85bf9701d5e1de98999ac25c1c
Merging:
7d55561986ffe94ca7ca22dc0a6846f698893226 Merge branch 'jc/dirwalk-n-cache-tree' into jc/cache-tree
097dc3d8c32f4b85bf9701d5e1de98999ac25c1c Remove "tree->entries" tree-entry list from tree parser
found 2 common ancestor(s):
d9b814cc97f16daac06566a5340121c446136d22 Add builtin "git rm" command
288c0384505e6c25cc1a162242919a0485d50a74 Merge branch 'js/fetchconfig'
Merging:
d9b814cc97f16daac06566a5340121c446136d22 Add builtin "git rm" command
288c0384505e6c25cc1a162242919a0485d50a74 Merge branch 'js/fetchconfig'
found 1 common ancestor(s):
63dffdf03da65ddf1a02c3215ad15ba109189d42 Remove old "git-grep.sh" remnants
Auto-merging Makefile
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in Makefile
Auto-merging builtin.h
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in builtin.h
Auto-merging cache.h
Removing check-ref-format.c
Auto-merging git.c
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in git.c
Auto-merging read-cache.c
Auto-merging update-index.c
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in update-index.c
Renaming apply.c => builtin-apply.c
Auto-merging builtin-apply.c
Renaming read-tree.c => builtin-read-tree.c
Auto-merging builtin-read-tree.c
Auto-merging .gitignore
Auto-merging Makefile
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in Makefile
Auto-merging builtin.h
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in builtin.h
Auto-merging cache.h
Auto-merging fsck-objects.c
Removing git-format-patch.sh
Auto-merging git.c
merge: warning: conflicts during merge
CONFLICT (content): Merge conflict in git.c
Auto-merging update-index.c
Automatic merge failed; fix conflicts and then commit the result.
$ git status
Hey, look, lots of interesting stuff. Particularly, see
# Changed but not updated:
# (use git-update-index to mark for commit)
#
# unmerged: Makefile
# modified: Makefile
# unmerged: builtin.h
# modified: builtin.h
# unmerged: git.c
# modified: git.c
The "unmerged" (a.k.a. "staged") files are ones that need manual resolution.
(I notice that update-index.c isn't listed, despite being mentioned
as a conflict in the message. Can someone explain that?)
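If you just want the list of files that still need manual resolution,
without the rest of the status output, the index can be asked
directly. A sketch in a scratch repository with one deliberately
conflicting file (all names are invented):

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b topic
echo "topic" > file.txt
git commit -q -am "topic change"

git checkout -q master
echo "master" > file.txt
git commit -q -am "master change"

# This merge is expected to fail with a conflict in file.txt.
git merge topic || true

# One line per index stage (base, ours, theirs) of each unmerged path.
git ls-files -u

# Just the names of the unmerged paths.
unmerged=$(git diff --name-only --diff-filter=U)
echo "$unmerged"
```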
Fixing those is easy, but as you can see from the original commit comment
and diffs, there were some additional changes that were necessary to
make that compile.
You can test before committing the change, or do it the git way - commit
anyway, then test and "git commit --amend" with the fixes, if any.
Unlike a centralized VCS, committing is not the same as pushing upstream.
You can use test branches in the repository to save as much work as
you like. While it's still nice to keep the public repository clean,
you don't have to worry about "breaking the tree" every time you commit.
You can do all kinds of stuff in test branches, and clean it up later.
This is why all the git merge tools do the commit without waiting for
you to test it. The merge is usually okay, and it saves time. If not,
* [Patch to DRAFT 2 (1/2)] Branching and merging with git
2006-11-20 23:51 ` [DRAFT 2] " linux
@ 2006-11-22 11:02 ` Junio C Hamano
2006-11-22 11:02 ` [Patch to DRAFT 2 (2/2)] " Junio C Hamano
` (2 subsequent siblings)
3 siblings, 0 replies; 66+ messages in thread
From: Junio C Hamano @ 2006-11-22 11:02 UTC (permalink / raw)
To: linux; +Cc: git
This installment is ispell only.
--- a/doc
+++ b/doc
@@ -81,7 +81,7 @@
* A brief digression on command names
All git commands can be invoked as "git-foo" and "git foo". This document
-uses them interchangably. But you have to ask for the "git-foo" man page.
+uses them interchangeably. But you have to ask for the "git-foo" man page.
Git provides a few other ways to get the man page as well:
man git-foo
git help foo
@@ -120,13 +120,13 @@
Since the wrapper was developed, some simple commands have been made
"builtin", so for example, "git diff" is done internally. There's a
-git-diff link to retain compatability.
+git-diff link to retain compatibility.
* Git's representation of history
As you recall from Git 101 (or the git(7) man page), git's largest
-data strucure is the object database, holding exactly four kidnds
+data structure is the object database, holding exactly four kinds
of objects. Each of them has a globally unique 40-character hex name
(a.k.a. object IDs) made by hashing its type and contents. Since this is
an (effectively unforgeable) cryptographic hash, the name of an object
@@ -141,11 +141,11 @@
Tree objects record directory contents; they contain file names,
permissions, and the associated tree or blob object names.
-Tag objects are shareable pointers to other objects; they're generally
+Tag objects are sharable pointers to other objects; they're generally
used to store a digital signature, and generally point to commits.
(Although you can tag any object, including another tag.)
-Finaly, there are commit objects. Every commit points to (contains the
+Finally, there are commit objects. Every commit points to (contains the
name of) an associated tree object which records the state of the source
code at the time of the commit, and some descriptive data (time, author,
committer, commit comment) about the commit.
@@ -172,7 +172,7 @@
common, but git allows many more. There's a limit of sixteen in
the source code, and the most anyone's ever used in real life is 12,
which was generally regarded as overdoing it. The famous "dodecapus"
-is commit 9fdb62af in the linux kernel repository.
+is commit 9fdb62af in the Linux kernel repository.
Finally, there are references, stored in the .git/refs directory.
@@ -184,7 +184,7 @@
- Tags are references that are intended to be immutable.
A tag like "v1.2" is a historical record. Tag references may or may not
point to tag objects! If they do, this is called a "heavyweight tag";
- the tag can hold a digital signature and can be shared between repositores.
+ the tag can hold a digital signature and can be shared between repositories.
"Lightweight tags" point to commits directly, and are not automatically
shared.
- Heads are references that are intended to be updated.
@@ -808,7 +808,7 @@
it's generally easier to set up a short-cut by placing the options in
.git/remotes/<name>.
-The syntax is explained in the git-fetch man page. When this is st
+The syntax is explained in the git-fetch man page. When this is set
up, "git fetch <name>" will retrieve all the branches listed in the
.git/remotes/<name> file. The ability to fetch multiple branches at
once (such as release, beta, and development) is an advantage of using
@@ -838,7 +838,7 @@
If you want to watch a project that's hosted on a git server, the easiest
way is to use "git clone".
-git-clone creates a new repository, sets up a remotes file to reack
+git-clone creates a new repository, sets up a remotes file to track
every branch in the remote repository, and fetches all those branches.
By default, it maps them to local heads as follows:
@@ -868,7 +868,7 @@
Then the fetch will be done, but the results will be written nowhere
but .git/FETCH_HEAD. This is actually the earliest form of git_fetch
-impllemented; everything else is a later addition. It's not something
+implemented; everything else is a later addition. It's not something
you'd do on purpose much, except as part of a script that uses FETCH_HEAD,
but it's worth mentioning it in case you type it and wonder what the
heck happened.
@@ -920,8 +920,8 @@
has the same strictures and limitations.
3) The git protocol. This is represented with a git:// URL, and talks
- to a dedicated git daemon (see the git-daemin man page) on the
- remote machine. It uses TP port 9418 by default. This is a smart
+ to a dedicated git daemon (see the git-daemon man page) on the
+ remote machine. It uses TCP port 9418 by default. This is a smart
protocol that understands the git format and does sophisticated
wire compression.
@@ -1066,7 +1066,7 @@
That does a git fetch, updating all of the listed branches as usual,
then merges the _first_ listed branch into HEAD. It would be more
-cinsistent to merge all the branches, but that's almost never what
+consistent to merge all the branches, but that's almost never what
you want.
By the way: don't blink, you might miss it! As I mentioned, pulling is
@@ -1125,7 +1125,7 @@
In the completely trivial case when O, A and B are the same, then
all three rules apply, they all produce the same obvious result.
-Git automatically finds the masrge base O as the most recent
+Git automatically finds the merge base O as the most recent
common ancestor of the heads A and B to be merged.
When doing a merge, git uses the above 2-out-of-3 merging rules
@@ -1147,10 +1147,10 @@
and uses the same 2-out-of-3 rules to resolve each hunk separately.
Only if all three commits have differing hunks that overlap (or come
-so glose that git can't be sure) is git unable to automatically resolve
+so close that git can't be sure) is git unable to automatically resolve
the problem. This requires manual correction, as described below.
-If the merge goes well, it is automatically comitted and the HEAD branch
+If the merge goes well, it is automatically committed and the HEAD branch
updated to point to the new commit.
@@ -1207,7 +1207,7 @@
* Octopus merge (advanced)
-The first is the "octopus" stratgy. This is special because it can do
+The first is the "octopus" strategy. This is special because it can do
a three- or more-way merge. See 5401f304 in the git repository for
an example. (Run gitk, double-click on the "SHA1 ID" box to select
what's already there, enter "5401f304" instead, and click "Goto".)
@@ -1641,7 +1641,7 @@
\
y--y <-- dev2
-Either way, you'll have to manually figure out a list of vommits that
+Either way, you'll have to manually figure out a list of commits that
you want in dev1 and create that branch:
git checkout -b dev1 master
@@ -1695,8 +1695,8 @@
* Cherry picking and rebasing: Merging changes
-Suppose that you accidentally ommitted a broken source tree,
-and forgot to use "git-commit --amend" when comitting the following
+Suppose that you accidentally committed a broken source tree,
+and forgot to use "git-commit --amend" when committing the following
one-liner. Or say you didn't notice the brokenness immediately
and made the fix a few commits later. But now you want to
rewrite history with the fix merged into the original commit.
@@ -1722,7 +1722,7 @@
Now, we can git-rebase the remainder. The only complication is
that git can't tell that the change from commit2 has been applied,
since the combined 1+2 change isn't "the same". One option is to
-trust that there will be amerge conflict when you try and just do:
+trust that there will be a merge conflict when you try and just do:
git checkout dev
git rebase --onto temp <commit1>
@@ -1786,10 +1786,10 @@
Remember that fetching from a repository on the same machine is both
possible and fast. So if you want to play around, just make a new
directory, run git-init-db, and give it a try. You can't hurt the source
-repoistory, and deleting the destination is as easy as "rm -rf".
+repository, and deleting the destination is as easy as "rm -rf".
(Of course, you can hurt things with "rm -rf", so make certain you're
-in the right directory before excecuting that!)
+in the right directory before executing that!)
* Experimenting with merging
* [Patch to DRAFT 2 (2/2)] Branching and merging with git
2006-11-20 23:51 ` [DRAFT 2] " linux
2006-11-22 11:02 ` [Patch to DRAFT 2 (1/2)] " Junio C Hamano
@ 2006-11-22 11:02 ` Junio C Hamano
2006-11-22 13:36 ` Rene Scharfe
2006-12-04 1:19 ` [DRAFT 2] " J. Bruce Fields
2006-12-15 21:38 ` Jakub Narebski
3 siblings, 1 reply; 66+ messages in thread
From: Junio C Hamano @ 2006-11-22 11:02 UTC (permalink / raw)
To: linux; +Cc: git
This comes on top of the ispell'ed one to correct technical
details:
* We made describe output to be a valid object name some time ago.
* With recent addition to take directory names and path patterns,
it is not limited to "small number of files" case anymore.
* The original about separate-remote was full of half sentences
so I stitched them together to make them make some sense.
* Sorry, I recently applied the same "fix" as Cogito got quite
some time ago, and both lightweight and annotated tags are
now followed upon a tracking fetch.
* rsync has been deprecated for quite some time.
* The official party line for git-native-over-ssh is host:path
* Octopus should be discouraged unless talking about truly
trivial merges. Explain its downside better.
* Amend is a lot handier than --no-commit, as you do not have to
plan ahead. We should encourage "Pull/merge normally and if
the result is not what you like, amend it" workflow.
--- a/doc
+++ b/doc
@@ -278,11 +278,14 @@
Third is git-describe. This is something like git-name-rev, but
backwards: it finds the closest reference that is an ancestor of the
-specified commit. Its output is not acceptable git input, but takes
-the form of either
+specified commit. It takes the form of either
+
v1.2 (tag name), or
v1.2-g12345678 (commit 12345678, whose nearest ancestor is v1.2)
+and it is accepted as an input if the abbreviated object name that follows
+"tagname-g" prefix is unambiguous.
+
git-describe's output is intended to be used as a software version
number, something like the "rcsid" feature in RCS and CVS.
By default, git-describe uses only heavyweight tags for its naming,
@@ -345,13 +348,14 @@
in the working directory. This option can be used with no <branch>
specified (defaults to HEAD) to undo local edits.
-2) Revert changes to a small number of files.
+2) Revert changes to the files in the working tree.
git checkout [<revision>] [--] <paths>
will copy the version of the <paths> from the index to the working
directory. If a <revision> is given, the index for those paths will
be updated from the given revision before copying from the index to
- the working tree.
+ the working tree. <paths> can name directories, and/or contain
+ glob patterns to revert many files.
Unlike the version with no <paths> specified, this does NOT change
HEAD, even if <paths> is ".".
@@ -782,18 +786,12 @@
default in future), that places copies of the remote servers' heads
under .git/refs/remotes/<server>/<branch>.
-Then you can refer to
-
-If you use "git-line --use-separate-remote", it will set
-
-If you want to use the "separate remotes" tracking branch
-There's an alternate way, using the --use-separate-remote option
-to git-clone. This sets up a copy of the remote server's heads
-under .git/refs/remotes/origin/<name>. Then you can refer to
-"origin/<branch>" whenever you want.
+If you use "git-clone --use-separate-remote", it sets up a copy of
+the remote server's heads under .git/refs/remotes/origin/<name>.
+Then you can refer to "origin/<branch>" whenever you want.
Because the branch names are got under .git/refs/heads, the git
-tools will not let you commit to the branch.
+tools will not let you commit to the remote branch.
* Remotes files
@@ -877,7 +875,7 @@
* Remote tags
When you fetch to a tracking branch, git-fetch also fetches every
-heavyweight tag (one that involves an actual tag object) that
+tag under .git/refs/tags/ in the remote repository that
points to a commit reachable from the branch head and installs a
copy locally.
@@ -913,11 +911,19 @@
http clients also need some extra index information to help them
find which pack files they need. git-update-server-info is the
- command that generates these files, but it's run automatically
- from git-repack, so it's not too important to know.
+ command that generates these files, and it is important to keep
+ them up-to-date. A recommended practice is to have this command in
+ .git/hooks/update so that every time you push into the repository
+ they are automatically updated. You can enable the hook (which
+ is installed when the repository is initialized) with "chmod +x".
+
+ In addition to http://, https:// and ftp:// URL are allowed and
+ handled by the same backend that uses cURL library.
2) rsync protocol. This is basically an alternative to http, and
- has the same strictures and limitations.
+ has the same strictures and limitations. This is deprecated and
+ its use has been discouraged for quite some time, although it still
+ works.
3) The git protocol. This is represented with a git:// URL, and talks
to a dedicated git daemon (see the git-daemon man page) on the
@@ -929,8 +935,9 @@
require any special care when repacking. git-daemon is purposefully
written to provide read-only service.
-4) The git protocol over ssh. This is a git+ssh:// URL; ssh:// is
- accepted as a synonym. It has the same efficiency issues as
+4) The git protocol over ssh. This is spelled as "host:path" like scp
+ command, and ssh:// URL and git+ssh:// are
+ accepted as synonyms. It has the same efficiency issues as
plain git. If you want to limit ssh users to just the git commands
necessary to share work, git provides a git-shell command that can
be used as a very limited login shell.
@@ -1046,7 +1053,7 @@
Now, you can also specify a remote repository to merge from, using a
-git://, http:// or git+ssh:// URL. This is what Linus does all day
+git:// or http:// URLs or host:path syntax. This is what Linus does all day
long, and why the git-pull tool is optimized to allow that. It uses
git-fetch to fetch the remote branch without assigning it a branch name
(as mentioned above, it gets the magic name FETCH_HEAD), and them merges
@@ -1213,15 +1220,15 @@
what's already there, enter "5401f304" instead, and click "Goto".)
The octopus strategy is invoked automatically when you specify more
-than one branch at a time to merge in with "git pull". It can't handle
-complicated overlaps and file renames as well as the 2-way recursive
+than one branch at a time to merge in with "git pull". It refuses to
+handle complicated overlaps and file renames as well as the 2-way recursive
strategy, but if you have a number of simple, independent changes that you
want to merge together, an octopus merge is the obvious way to document
the fact that they're truly independent.
-The only downside to using an octopus to combine a number of simple
-changes is that any merge makes git-bisect's job harder. If you have
-a development history like
+A major downside to using an octopus to combine a number of
+changes is that an octopus merge makes git-bisect's job harder.
+If you have a development history like
/-b-\
/ \
@@ -1234,9 +1241,15 @@
\-f-/
And you know that a works but g doesn't, there's no way to do a binary
-search on b through f; they have to be searched linearly. This is
-no harder to bisect, and a lot nicer-looking than the equivalent with
-2-way merges:
+search on b through f; they have to be searched linearly. In
+addition, a merge tends to become more error prone as it has more
+parents. Your bisect could show that all of b, c, d, e, f are good
+and the error is in a mismerge at g. This is why the octopus strategy
+refuses to do anything other than a very simple merge.
+
+Although an octopus is somewhat nicer-looking than the equivalent with
+a series of 2-parent merges, the latter is a lot more efficient to
+bisect:
/-b
/ \
@@ -1248,7 +1261,12 @@
\ /
\-f-----/
-But if they were just done one after the other, you'd have
+With this structure, if bisecting at h proves that it was Ok,
+then you do not have to check b, c, d, g (the error must be in e, f
+or mismerge at i or j).
+
+Even simpler to bisect is if they were just done one after the other.
+In such a case, you'd have:
--a--b--c--d--e--f--
@@ -1410,6 +1428,9 @@
In many cases, this is fine, and you can save it and complete the
commit. Or you can add something about the merge if it needs saying.
+If the merge was complex, it will turn out to be useful to describe
+how you choose to resolve conflicts, and that is the primary reason
+the boilerplate lists conflicted files.
When I'm done, if I don't need branch A any more, I can
@@ -1763,11 +1784,11 @@
the changelog. It could have been a separate commit, but didn't seem
worth it.
-This was done by forcing the git-pull to not commit:
+This was done by amending the merge commit.
- git pull --no-commit . <branch>
+ git pull . <branch>
(edit as desired)
- git commit
+ git commit --amend -a
The commit message should be edited to explain that this is not just
a normal merge, as was done in this case.
* Re: [DRAFT] Branching and merging with git
2006-11-18 1:11 ` Junio C Hamano
2006-11-20 23:51 ` [DRAFT 2] " linux
@ 2006-11-22 11:51 ` Junio C Hamano
1 sibling, 0 replies; 66+ messages in thread
From: Junio C Hamano @ 2006-11-22 11:51 UTC (permalink / raw)
To: linux; +Cc: git
Junio C Hamano <junkio@cox.net> writes:
> If you want an esoteric topic for an introductory documentation,
> it would be more useful to talk about evil merges (an evil merge
> is a merge commit whose result does not match any of its
> parents). A good example is found in
>
> git show v1.0.0
I actually remembered a better one.
Subject: Necessity of "evil" merge and topic branches
Date: Wed, 17 May 2006 23:25:55 -0700
Message-ID: <7vy7wz6e8c.fsf@assigned-by-dhcp.cox.net>
This talks about a real-world evil merge and the reason why it
was necessary, and speculates a possible way to make life
easier. I actually later used the "third branch to remember the
evil merge between two topics" technique I talked about in the
message to merge in another pair of topics, and it turned out
that it worked rather well.
There were two logically independent topics:
- lt/setup. Two commits, changing the calling convention of
setup_git_directory() function -- the final tip of the topic
was at a633fca0.
- js/mv. Three commits, making git-mv a built-in after
refactoring some code from other parts of the system -- the
final tip of the topic was at ac64a722.
They were not "obviously correct" when they started, so a topic
branch was used for each. They had some textual and semantic
conflicts, and if they were to progress at different paces,
there was a need for an evil merge when the later one is merged
to master.
So I created another branch to merge the two topics together and
resolved their conflicts while my reading of their code was
still fresh.
git checkout -b __/setup-n-mv js/mv
git pull . lt/setup
git checkout next
git pull . __/setup-n-mv
Later js/mv became ready to be merged first. So I merged it to
'master'.
git checkout master
git pull . js/mv
I was planning to cook lt/setup a bit longer but eventually
decided to merge it to 'master' as well after a short while.
git checkout master
git pull . __/setup-n-mv
I could have pulled lt/setup into master but then I would have
had to resolve the conflict between the two branches. Since I
recorded the resolution earlier by making the merge, and pulled
that branch (which contained all of lt/setup already) into
'master', I did not have to remember what I needed to adjust when
I did so. If lt/setup had further updates on its own after the
"third branch __/setup-n-mv" was made, I would have then pulled
the tip of lt/setup into 'master' to complete the merge, and
that would have also resulted in a simple, non-conflicting merge.
This would have worked equally well if lt/setup were to graduate
first.
This might look too complex at first glance, but I thought it
might be an interesting topic in the "hints for managing your
topic branches" section.
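The sequence above can be tried in a scratch repository. What follows is
only a minimal sketch under stated assumptions: the branch names topic-a,
topic-b and both are made up and stand in for lt/setup, js/mv and
__/setup-n-mv, the file contents are invented, and "git merge X" is used
where the message says "git pull . X" (a recent git is assumed). It also
exercises the hypothetical case where the first topic gains a further
commit after the third branch was made.

```shell
#!/bin/sh
# Sketch only: throwaway repository, made-up branch names
# (topic-a ~ lt/setup, topic-b ~ js/mv, both ~ __/setup-n-mv).
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Common starting point.
printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on configuration

# Two topics that change the same line in different ways.
git checkout -q -b topic-a "$main"
printf 'one\ntwo from topic-a\nthree\n' > file.txt
git commit -q -a -m "topic-a: change line two"

git checkout -q -b topic-b "$main"
printf 'one\ntwo from topic-b\nthree\n' > file.txt
git commit -q -a -m "topic-b: change line two"

# Third branch: merge the two topics and record the resolution now.
git checkout -q -b both topic-b
git merge -q topic-a >/dev/null 2>&1 || true   # expected to stop with a conflict
printf 'one\ntwo from topic-a and topic-b\nthree\n' > file.txt
git add file.txt
git commit -q -m "Merge topic-a into both, resolving line two"

# topic-b becomes ready first and goes to the main branch.
git checkout -q "$main"
git merge -q topic-b                # fast-forward

# Meanwhile topic-a gains one more commit after "both" was made.
git checkout -q topic-a
echo 'extra' > other.txt
git add other.txt
git commit -q -m "topic-a: further update"

# Later: bring in topic-a by way of the recorded merge, then its newer tip.
git checkout -q "$main"
git merge -q both                   # no conflict to resolve a second time
git merge -q --no-edit topic-a      # picks up the further update cleanly

cat file.txt
```

The point of the sketch is the last three commands: the conflict is
resolved exactly once, on the third branch, and neither later merge into
the main branch stops to ask for it again.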
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Patch to DRAFT 2 (2/2)] Branching and merging with git
2006-11-22 11:02 ` [Patch to DRAFT 2 (2/2)] " Junio C Hamano
@ 2006-11-22 13:36 ` Rene Scharfe
0 siblings, 0 replies; 66+ messages in thread
From: Rene Scharfe @ 2006-11-22 13:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: linux, git
Junio C Hamano schrieb:
> -tools will not let you commit to the branch.
> +tools will not let you commit to the remote branchbranch.
s/branchbranch/branch/
^ permalink raw reply [flat|nested] 66+ messages in thread
* [PATCH] Documentation: add a "git user's manual"
2006-11-19 17:50 ` J. Bruce Fields
2006-11-19 17:59 ` Git manuals Petr Baudis
@ 2006-11-26 4:01 ` J. Bruce Fields
1 sibling, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2006-11-26 4:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Theodore Tso, linux, git
On Sun, Nov 19, 2006 at 12:50:40PM -0500, J. Bruce Fields wrote:
> In fact, I'm tempted to submit a patch that just assigns a chapter
> number to everything under Documentation/, slaps a single table of
> contents on the front, and calls the result "the git user's manual."
Something like this, as a start?:
Add a manual.txt file which generates a "git user's manual" by including
a bunch of preexisting files under Documentation and declaring each to
be a chapter.
The result is a disorganized mess, because the documentation itself is a
disorganized mess. This is intended to call attention to that fact
rather than fix it. Hopefully we can massage it into a better order
over time. And hopefully we can encourage anyone that adds new
documentation to think about where in the order it should be inserted.
Not built or installed by default for now.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
---
Documentation/Makefile | 7 ++++++-
Documentation/manual.conf | 2 ++
Documentation/manual.txt | 30 ++++++++++++++++++++++++++++++
3 files changed, 38 insertions(+), 1 deletions(-)
diff --git a/Documentation/Makefile b/Documentation/Makefile
index c00f5f6..684dacf 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -85,9 +85,14 @@ clean:
%.1 %.7 : %.xml
xmlto -m callouts.xsl man $<
-%.xml : %.txt
+%.html : %.txt
asciidoc -b docbook -d manpage -f asciidoc.conf $<
+manual.html: manual.txt
+ a2x -f xhtml --no-icons --asciidoc-opts="-d book -f asciidoc.conf" $<
+
+# a2x -f xhtml --ascidoc-opts="-d book -f asciidoc.conf" $<
+
git.html: git.txt README
glossary.html : glossary.txt sort_glossary.pl
diff --git a/Documentation/manual.conf b/Documentation/manual.conf
new file mode 100644
index 0000000..0d0cfad
--- /dev/null
+++ b/Documentation/manual.conf
@@ -0,0 +1,2 @@
+[titles]
+underlines="__","==","--","~~","^^"
diff --git a/Documentation/manual.txt b/Documentation/manual.txt
new file mode 100644
index 0000000..5512212
--- /dev/null
+++ b/Documentation/manual.txt
@@ -0,0 +1,30 @@
+Git User's manual
+_________________
+
+include::tutorial.txt[]
+
+include::tutorial-2.txt[]
+
+Git design overview
+===================
+
+include::README[]
+
+include::everyday.txt[]
+
+include::cvs-migration.txt[]
+
+include::howto-index.txt[]
+
+include::hooks.txt[]
+
+include::diffcore.txt[]
+
+include::repository-layout.txt[]
+
+include::core-tutorial.txt[]
+
+Glossary of git terms
+=====================
+
+include::glossary.txt[]
--
1.4.4.rc1.g83ee9
^ permalink raw reply related [flat|nested] 66+ messages in thread
* Re: [DRAFT 2] Branching and merging with git
2006-11-20 23:51 ` [DRAFT 2] " linux
2006-11-22 11:02 ` [Patch to DRAFT 2 (1/2)] " Junio C Hamano
2006-11-22 11:02 ` [Patch to DRAFT 2 (2/2)] " Junio C Hamano
@ 2006-12-04 1:19 ` J. Bruce Fields
2006-12-04 7:23 ` J. Bruce Fields
2006-12-15 21:38 ` Jakub Narebski
3 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2006-12-04 1:19 UTC (permalink / raw)
To: linux; +Cc: git
On Mon, Nov 20, 2006 at 06:51:36PM -0500, linux@horizon.com wrote:
> I tried to incorporate all the suggestions. There are still a few things
> I have to research, and now I'm worried it's getting too long. Sigh.
If you made another pass for it asking whether each sentence was really
absolutely necessary you'd be able to cut quite a bit without
compromising on content. One example:
> In CVS, branches are difficult and awkward to use, and generally
> considered an advanced technique. Many people use CVS for a long time
> without departing from the trunk.
Lots of people have CVS experience, but not everyone does, and this
paragraph isn't really necessary. Cut it out, and the following
paragraph (minus first sentence) stands just fine on its own:
> Git is very different. Branching and merging are central to effective use
> of git, and if you aren't comfortable with them, you won't be comfortable
> with git. In particular, they are required to share work with other
> people.
Note also "if you aren't comfortable with them..." just repeats
something you've already said. So now we're down to just:
"Branching and merging are central to effective use of git. In
particular, they are required to share work with other people."
which is short and to the point. Neat!
I'm not sure of the ordering. For example:
> The only things that are a bit confusing are some of the names.
> In particular, at least when beginning:
> - You create new branches with "git checkout -b".
> "git branch" should only be used to list and delete branches.
> - You share work with "git fetch" and "git push". These are opposites.
> - You merge with "git pull", not "git merge". "git pull" can also do a
> "git fetch", but that's optional. What's not optional is the merge.
>
> Also, a good habit is to never commit directly to your main "master"
> branch. Do all work in temporary "topic" branches, and then merge them
> into the master. Most experienced users don't bother to be quite this
> purist, but you should err on the side of using separate topic branches,
> so it's excellent practice.
We're diving in here without explaining what checkout, fetch, push,
pull, or merge are yet, or what the master branch is.
The document seems to be targeted at someone who has read some
scattered git documentation, gotten confused, and needs help putting it
all together. This is understandable--there are a lot of people like
that right now! But if we're going to get the documentation in some
sort of sensible order then we need to think about how to start with
someone who is a blank slate and lead them step by step to what they
most need to know.
That doesn't mean *you* need to do everything from scratch, but it would
be helpful to figure out where this would fit in with the other
documentation in a logical progression. As a start, the first paragraph
could say "before reading this, we assume you've read X, Y, and Z", and
then the rest of the document could be audited to make sure that it
didn't assume anything that isn't in X, Y, and Z.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT 2] Branching and merging with git
2006-12-04 1:19 ` [DRAFT 2] " J. Bruce Fields
@ 2006-12-04 7:23 ` J. Bruce Fields
2006-12-04 10:56 ` Johannes Schindelin
0 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2006-12-04 7:23 UTC (permalink / raw)
To: linux; +Cc: git
On Sun, Dec 03, 2006 at 08:19:58PM -0500, J. Bruce Fields wrote:
> That doesn't mean *you* need to do everything from scratch, but it would
> be helpful to figure out where this would fit in with the other
> documentation in a logical progression. As a start, the first paragraph
> could say "before reading this, we assume you've read X, Y, and Z", and
> then the rest of the document could be audited to make sure that it
> didn't assume anything that isn't in X, Y, and Z.
By the way, I have some draft rough work on getting that introductory
documentation organized at
git://linux-nfs.org/~bfields/git.git
See Documentation/user-manual.txt and Documentation/quick-start.txt. I
think I've stolen a small amount of text from you--hope that's OK!
I have two ideas in mind:
- The tutorial is supposed to be a very quick "look what git can
do" document, but people also want it to really explain git,
prepare people to read the man pages, etc., which will make it
much longer over time. So I'm trying to split out an
extremely concise "quick-start" guide (modelled partly on
Mercurial's) that doesn't even pretend to explain anything, and
a "user manual" that's much more verbose and tries to cover
the basics comprehensively.
- A lot of people don't actually need to do commits or merges at
all--they just need to know how to clone a repository, check
out a few versions, etc. (Witness the number of web pages
with "how to check out our latest code from CVS" out
there....) I'm also assuming most people are joining an
ongoing project instead of creating a new one. So instead of
starting right away with init-db/add/commit, I'm putting off
actual "development" stuff till pretty late:
1. clone
2. checking out old versions, basic branch management
3. keeping up-to-date with fetch
4. bisect
5. archaeology (commits DAG, git-log, ...)
6. creating commits, index file
7. resolving merges, pull
8. publishing a public repository, push
etc. I'm hoping you'd be interested in working together on
the last parts (7 and 8 especially).
Comments welcomed...
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT 2] Branching and merging with git
2006-12-04 7:23 ` J. Bruce Fields
@ 2006-12-04 10:56 ` Johannes Schindelin
0 siblings, 0 replies; 66+ messages in thread
From: Johannes Schindelin @ 2006-12-04 10:56 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux, git
Hi,
On Mon, 4 Dec 2006, J. Bruce Fields wrote:
> So I'm trying to split out an extremely concise "quick-start" guide
> (modelled partly on Mercurial's) that doesn't even pretend to explain
> anything,
you might want to look at the QuickStart page in Git's wiki...
> 1. clone
> 2. checking out old versions, basic branch management
> 3. keeping up-to-date with fetch
> 4. bisect
> 5. archaeology (commits DAG, git-log, ...)
> 6. creating commits, index file
> 7. resolving merges, pull
> 8. publishing a public repository, push
Another approach would be to illustrate short stories of a failed merge,
or "how I put up a public repository", etc. Like, more example-based (and
of course short enough that people actually read through it).
Ciao,
Dscho
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT 2] Branching and merging with git
2006-11-20 23:51 ` [DRAFT 2] " linux
` (2 preceding siblings ...)
2006-12-04 1:19 ` [DRAFT 2] " J. Bruce Fields
@ 2006-12-15 21:38 ` Jakub Narebski
2006-12-15 21:41 ` J. Bruce Fields
3 siblings, 1 reply; 66+ messages in thread
From: Jakub Narebski @ 2006-12-15 21:38 UTC (permalink / raw)
To: git
linux@horizon.com wrote:
> I tried to incorporate all the suggestions. There are still a few things
> I have to research, and now I'm worried it's getting too long. Sigh.
Tutorials can be (and usually are) long, don't worry.
Could you resend this as a patch creating Documentation/tutorial-3.txt?
This way it would be in the repository, and people would be able to correct
it (I guess that it would first appear in the 'next' branch)...
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT 2] Branching and merging with git
2006-12-15 21:38 ` Jakub Narebski
@ 2006-12-15 21:41 ` J. Bruce Fields
0 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2006-12-15 21:41 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
On Fri, Dec 15, 2006 at 10:38:05PM +0100, Jakub Narebski wrote:
> linux@horizon.com wrote:
>
> > I tried to incorporate all the suggestions. There are still a few things
> > I have to research, and now I'm worried it's getting too long. Sigh.
>
> Tutorials can be (and usually are) long, don't worry.
>
>
> Could you resend this as a patch creating Documentation/tutorial-3.txt?
> This way it would be in the repository, and people would be able to correct
> it (I guess that it would first appear in the 'next' branch)...
Yes please. Even if it has some problems or isn't really a perfect fit
in the current tutorial sequence, we can fix that later.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
` (6 preceding siblings ...)
2006-11-17 17:44 ` [DRAFT] Branching and merging with git J. Bruce Fields
@ 2007-01-03 17:04 ` Theodore Tso
2007-01-03 17:08 ` Junio C Hamano
2007-01-07 23:44 ` J. Bruce Fields
7 siblings, 2 replies; 66+ messages in thread
From: Theodore Tso @ 2007-01-03 17:04 UTC (permalink / raw)
To: linux; +Cc: git
On Thu, Nov 16, 2006 at 05:17:01PM -0500, linux@horizon.com wrote:
> I know it took me a while to get used to playing with branches, and I
> still get nervous when doing something creative. So I've been trying
> to get more comfortable, and wrote the following to document what I've
> learned.
What ever happened to this document? There was some talk of getting
this integrated into the git tree as Documentation/tutorial-3.txt.
IMHO it would be really, really good to do this before 1.5.0, since I
think a lot of users would find it really useful. Some of the text
may need to be moved to other locations, but it might go faster if we
get the base document into the tree first, and then we can submit
patches to move text around to integrate it into the other
documentation files.
I'm certainly willing to help out submitting patches to improve the
documentation, and I think this would be a big step towards helping
new users to git become much more quickly proficient.
- Ted
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-03 17:04 ` Theodore Tso
@ 2007-01-03 17:08 ` Junio C Hamano
2007-01-04 5:28 ` linux
2007-01-07 23:44 ` J. Bruce Fields
1 sibling, 1 reply; 66+ messages in thread
From: Junio C Hamano @ 2007-01-03 17:08 UTC (permalink / raw)
To: linux; +Cc: git, Theodore Tso
Theodore Tso <tytso@mit.edu> writes:
> On Thu, Nov 16, 2006 at 05:17:01PM -0500, linux@horizon.com wrote:
>> I know it took me a while to get used to playing with branches, and I
>> still get nervous when doing something creative. So I've been trying
>> to get more comfortable, and wrote the following to document what I've
>> learned.
>
> What ever happened to this document? There was some talk of getting
> this integrated into the git tree as Documentation/tutorial-3.txt.
> IMHO it would be really, really good to do this before 1.5.0, since I
> think a lot of users would find it really useful.
Seconded. Can I have the latest round?
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-03 17:08 ` Junio C Hamano
@ 2007-01-04 5:28 ` linux
2007-01-04 6:11 ` Junio C Hamano
0 siblings, 1 reply; 66+ messages in thread
From: linux @ 2007-01-04 5:28 UTC (permalink / raw)
To: junkio, linux; +Cc: git, tytso
> Seconded. Can I have the latest round?
Uh... can it wait a day or two? I'm leaving for a camping trip
tomorrow and won't have much keyboard access...
Sorry about that.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-04 5:28 ` linux
@ 2007-01-04 6:11 ` Junio C Hamano
0 siblings, 0 replies; 66+ messages in thread
From: Junio C Hamano @ 2007-01-04 6:11 UTC (permalink / raw)
To: linux; +Cc: git
linux@horizon.com writes:
>> Seconded. Can I have the latest round?
>
> Uh... can it wait a day or two? I'm leaving for a camping trip
> tomorrow and won't have much keyboard access...
No worries. Have fun.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-03 17:04 ` Theodore Tso
2007-01-03 17:08 ` Junio C Hamano
@ 2007-01-07 23:44 ` J. Bruce Fields
2007-01-08 0:24 ` Junio C Hamano
2007-01-08 0:40 ` Theodore Tso
1 sibling, 2 replies; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-07 23:44 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux, git
On Wed, Jan 03, 2007 at 12:04:11PM -0500, Theodore Tso wrote:
> What ever happened to this document? There was some talk of getting
> this integrated into the git tree as Documentation/tutorial-3.txt.
Just to throw more fuel on the fire....
I have a draft attempt at a complete "git user's manual" at
http://www.fieldses.org/~bfields/
The goals are:
- Readable from beginning to end in order without having read
any other git documentation beforehand.
- Helpful section names and cross-references, so it's not too
hard to skip around some if you need to.
- Organized to allow it to grow much larger (unlike the
tutorials)
It's more leisurely than tutorial.txt, but tries to stay focused on
practical how-to stuff. It adds a discussion of how to resolve merge
conflicts, and partial instructions on setting up and dealing with a
public repository.
I've lifted a little bit from "branching and merging" (e.g., some of the
discussion of history diagrams), and could probably steal more if that's
OK. (Similarly anyone should of course feel free to reuse bits of this
if any parts seem more useful than the whole.)
There's a lot of detail on managing branches and using git-fetch, just
because those are essential even to people needing read-only access
(e.g., kernel testers). I think those sections will be much shorter
once the new "git remote" command and the disconnected checkouts are
taken into account.
I do feel bad about adding yet another piece of documentation, but I think we
need something that goes through all the basics in a logical order, and
I wasn't seeing how to grow the tutorials into that.
Opinions?
--b.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-07 23:44 ` J. Bruce Fields
@ 2007-01-08 0:24 ` Junio C Hamano
2007-01-08 2:35 ` J. Bruce Fields
2007-01-08 0:40 ` Theodore Tso
1 sibling, 1 reply; 66+ messages in thread
From: Junio C Hamano @ 2007-01-08 0:24 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux, git
"J. Bruce Fields" <bfields@fieldses.org> writes:
> I do feel bad about adding yet another piece of documentation, but I think we
> need something that goes through all the basics in a logical order, and
> I wasn't seeing how to grow the tutorials into that.
>
> Opinions?
I was having the feeling that we need to start over the
documentation from a clean slate by first coming up with a
coherent presentation order and then filling sections in it,
instead of tweaking existing documents here and there. The
existing documents were written in different development stages
of git, and each document tries to be more or less independent
from others in the area it wants to talk about, and reading all
of them in _any_ order is not the best way to learn git because
of duplication. Also I suspect some information in older
documents, while being still valid and technically correct,
predates invention of a better/simpler alternative.
In other words, I think we have enough information in the
tutorial documents, but the problem is not the lack of
information -- the problem is the lack of organization.
I think this effort of yours is wonderful because it directly
tackles that problem.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-07 23:44 ` J. Bruce Fields
2007-01-08 0:24 ` Junio C Hamano
@ 2007-01-08 0:40 ` Theodore Tso
2007-01-08 0:46 ` J. Bruce Fields
1 sibling, 1 reply; 66+ messages in thread
From: Theodore Tso @ 2007-01-08 0:40 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux, git
On Sun, Jan 07, 2007 at 06:44:11PM -0500, J. Bruce Fields wrote:
> On Wed, Jan 03, 2007 at 12:04:11PM -0500, Theodore Tso wrote:
> > What ever happened to this document? There was some talk of getting
> > this integrated into the git tree as Documentation/tutorial-3.txt.
>
> Just to throw more fuel on the fire....
>
> I have a draft attempt at a complete "git user's manual" at
>
> http://www.fieldses.org/~bfields/
Is that the right URL? That gets me to "Not Bruce's Webpage" and I
don't see an obvious link to git documentation...
- Ted
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 0:40 ` Theodore Tso
@ 2007-01-08 0:46 ` J. Bruce Fields
2007-01-08 1:22 ` Jakub Narebski
` (2 more replies)
0 siblings, 3 replies; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-08 0:46 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux, git
On Sun, Jan 07, 2007 at 07:40:06PM -0500, Theodore Tso wrote:
> On Sun, Jan 07, 2007 at 06:44:11PM -0500, J. Bruce Fields wrote:
> > On Wed, Jan 03, 2007 at 12:04:11PM -0500, Theodore Tso wrote:
> > > What ever happened to this document? There was some talk of getting
> > > this integrated into the git tree as Documentation/tutorial-3.txt.
> >
> > Just to throw more fuel on the fire....
> >
> > I have a draft attempt at a complete "git user's manual" at
> >
> > http://www.fieldses.org/~bfields/
>
> Is that the right URL? That gets me to "Not Bruce's Webpage" and I
> don't see an obvious link to git documentation...
Crap:
http://www.fieldses.org/~bfields/git-user-manual.html
Sorry about that.--b.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 0:46 ` J. Bruce Fields
@ 2007-01-08 1:22 ` Jakub Narebski
2007-01-08 1:46 ` Horst H. von Brand
2007-01-08 12:38 ` Guilhem Bonnefille
2 siblings, 0 replies; 66+ messages in thread
From: Jakub Narebski @ 2007-01-08 1:22 UTC (permalink / raw)
To: git
J. Bruce Fields wrote:
> On Sun, Jan 07, 2007 at 07:40:06PM -0500, Theodore Tso wrote:
>> On Sun, Jan 07, 2007 at 06:44:11PM -0500, J. Bruce Fields wrote:
>>>
>>> I have a draft attempt at a complete "git user's manual" at
>>>
>>> http://www.fieldses.org/~bfields/
>>
>> Is that the right URL? That gets me to "Not Bruce's Webpage" and I
>> don't see an obvious link to git documentation...
>
> Crap:
>
> http://www.fieldses.org/~bfields/git-user-manual.html
Added to
http://git.or.cz/gitwiki/GitDocumentation
http://git.or.cz/gitwiki/GitLinks
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 0:46 ` J. Bruce Fields
2007-01-08 1:22 ` Jakub Narebski
@ 2007-01-08 1:46 ` Horst H. von Brand
2007-01-08 2:22 ` J. Bruce Fields
2007-01-08 12:38 ` Guilhem Bonnefille
2 siblings, 1 reply; 66+ messages in thread
From: Horst H. von Brand @ 2007-01-08 1:46 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Theodore Tso, linux, git
J. Bruce Fields <bfields@fieldses.org> wrote:
[...]
> http://www.fieldses.org/~bfields/git-user-manual.html
A git repo? People want to rummage around in it...
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria +56 32 2654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 1:46 ` Horst H. von Brand
@ 2007-01-08 2:22 ` J. Bruce Fields
0 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-08 2:22 UTC (permalink / raw)
To: Horst H. von Brand; +Cc: Theodore Tso, linux, git
On Sun, Jan 07, 2007 at 10:46:34PM -0300, Horst H. von Brand wrote:
> J. Bruce Fields <bfields@fieldses.org> wrote:
>
> [...]
>
> > http://www.fieldses.org/~bfields/git-user-manual.html
>
> A git repo? People want to rummage around in it...
git://git.linux-nfs.org/~bfields/git.git
Note that I'm clueless about asciidoc, docbook, and friends, so I'm just
using whatever hack I could figure out to get the html looking OK.
And in general suggestions are welcomed.
--b.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 0:24 ` Junio C Hamano
@ 2007-01-08 2:35 ` J. Bruce Fields
2007-01-08 13:04 ` David Kågedal
2007-01-08 14:03 ` Theodore Tso
0 siblings, 2 replies; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-08 2:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: linux, git
On Sun, Jan 07, 2007 at 04:24:08PM -0800, Junio C Hamano wrote:
> "J. Bruce Fields" <bfields@fieldses.org> writes:
> In other words, I think we have enough information in the
> tutorial documents, but the problem is not the lack of
> information -- the problem is the lack of organization.
>
> I think this effort of yours is wonderful because it directly
> tackles that problem.
OK, thanks for the vote of confidence.... My tentative organization
(which I'm totally open to argument about) is:
chapters 1 and 2: "Read-only" operations:
clone, fetch, the commit DAG, etc.; material that could be
useful to a linux kernel tester, for example. This also
includes lots of stuff about branch manipulation and fetching,
just because that's necessary to keep a repo up to date and
check out random commits. Once we have "git remote" and
disconnected checkouts most of this could be postponed till
later.
Chapter 3: "Read-write" operations:
Read-write stuff: creating commits (basic mention of index),
handling merges, git-gc, ending with distributed stuff:
importing and exporting patches, pull and push, etc.
Chapter 4 (unwritten): interactions with other VCS's
cvs, subversion. Also, some of us track projects with git
even when all we've got is a sequence of release tarballs to
track, and that might be worth documenting.
Chapter 5 (unwritten): rewriting history
rebasing, cherry-picking, managing patch series, etc.
Chapter 6 (unwritten): git internals
I intend to just do a wholesale import of either tutorial-2.txt,
core-tutorial.txt, or the README, or some combination thereof,
but can't decide which.
--b.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 0:46 ` J. Bruce Fields
2007-01-08 1:22 ` Jakub Narebski
2007-01-08 1:46 ` Horst H. von Brand
@ 2007-01-08 12:38 ` Guilhem Bonnefille
2007-01-09 4:17 ` J. Bruce Fields
2 siblings, 1 reply; 66+ messages in thread
From: Guilhem Bonnefille @ 2007-01-08 12:38 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Theodore Tso, linux, git
On 1/8/07, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Sun, Jan 07, 2007 at 07:40:06PM -0500, Theodore Tso wrote:
> > On Sun, Jan 07, 2007 at 06:44:11PM -0500, J. Bruce Fields wrote:
> > > On Wed, Jan 03, 2007 at 12:04:11PM -0500, Theodore Tso wrote:
> > > > What ever happened to this document? There was some talk of getting
> > > > this integrated into the git tree as Documentation/tutorial-3.txt.
> > >
> > > Just to throw more fuel on the fire....
> > >
> > > I have a draft attempt at a complete "git user's manual" at
> > >
> > > http://www.fieldses.org/~bfields/
> >
> > Is that the right URL? That gets me to "Not Bruce's Webpage" and I
> > don't see an obvious link to git documentation...
>
> Crap:
>
> http://www.fieldses.org/~bfields/git-user-manual.html
Nice work.
My only 2 cents: the SVN book is really a good book, as it contains
both simple user and advanced hacker info. As it is under a free licence,
perhaps it could be possible to "port" the book to Git. I saw that the
SVK book is such a port. But it's a DocBook document.
http://svnbook.red-bean.com/
--
Guilhem BONNEFILLE
-=- #UIN: 15146515 JID: guyou@im.apinc.org MSN: guilhem_bonnefille@hotmail.com
-=- mailto:guilhem.bonnefille@gmail.com
-=- http://nathguil.free.fr/
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 2:35 ` J. Bruce Fields
@ 2007-01-08 13:04 ` David Kågedal
2007-01-08 14:03 ` Theodore Tso
1 sibling, 0 replies; 66+ messages in thread
From: David Kågedal @ 2007-01-08 13:04 UTC (permalink / raw)
To: git
"J. Bruce Fields" <bfields@fieldses.org> writes:
> OK, thanks for the vote of confidence.... My tentative organization
> (which I'm totally open to argument about) is:
>
> chapters 1 and 2: "Read-only" operations:
> Chapter 3: "Read-write" operations:
> Chapter 4 (unwritten): interactions with other VCS's
I think this should be considered more peripheral, since it is really
an independent piece, and nobody needs to read it to learn how git
works. So I would probably move it to the end.
> Chapter 5 (unwritten): rewriting history
> Chapter 6 (unwritten): git internals
--
David Kågedal
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [DRAFT] Branching and merging with git
2007-01-08 2:35 ` J. Bruce Fields
2007-01-08 13:04 ` David Kågedal
@ 2007-01-08 14:03 ` Theodore Tso
2007-01-09 2:41 ` J. Bruce Fields
1 sibling, 1 reply; 66+ messages in thread
From: Theodore Tso @ 2007-01-08 14:03 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Junio C Hamano, linux, git
On Sun, Jan 07, 2007 at 09:35:11PM -0500, J. Bruce Fields wrote:
> chapters 1 and 2: "Read-only" operations:
>
> clone, fetch, the commit DAG, etc.; material that could be
> useful to a linux kernel tester, for example. This also
> includes lots of stuff about branch manipulation and fetching,
> just because that's necessary to keep a repo up to date and
> check out random commits. Once we have "git remote" and
> disconnected checkouts most of this could be postponed till
> later.
I would add a QuickStart Chapter before you start going into the
"read-only" operations. It would show how to create a completely
empty repository, and add a few commits. It would also demonstrate
how to clone an example repository (with a fixed set of contents,
stored at git://git.kernel.org/pub/scm/git/example) and add a commit
using "git commit -a".
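As a rough sketch of what such a QuickStart could walk through (this is an illustration, not text from any existing document): the directory names, file name, and identity below are placeholders, and a local path stands in for the example repository URL proposed above, since that URL is only a suggestion here.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir"

# 1. Create a completely empty repository.
git init -q project
cd project
# Set a placeholder identity locally so commits succeed on a fresh machine.
git config user.name "Example User"
git config user.email "user@example.com"

# 2. Add a few commits.
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
echo "world" >> greeting.txt
git commit -q -a -m "Extend greeting"

# 3. Clone a repository (a local path is used in place of a real URL)
#    and add a commit in the clone with "git commit -a".
cd "$workdir"
git clone -q project copy
cd copy
git config user.name "Example User"
git config user.email "user@example.com"
echo "again" >> greeting.txt
git commit -q -a -m "Commit made in the clone"

# Show how many commits the clone now has.
commit_count=$(git rev-list --count HEAD)
echo "commits in clone: $commit_count"
```

Note that "git commit -a" only picks up changes to files git already tracks, which is why the very first commit still needs an explicit "git add".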
The basic idea is to show the user that git really isn't that hard,
*before* you start diving into a lot of details. If you don't tell a
user how to make a commit until Chapter 3, he/she will assume it's
because it's Really Hard, and you may end up losing them before that.
> Chapter 3: "Read-write" operations:
>
> Read-write stuff: creating commits (basic mention of index),
> handling merges, git-gc, ending with distributed stuff:
> importing and exporting patches, pull and push, etc.
At least some discussion of branches needs to happen here; it's
really important to talk about different workflows, and how you use
branches as part of your read-write operations. Some folks might or
might not use topic branches, but the concept of using temporary
branches to try things out is critical.
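The temporary-branch idea can be sketched roughly as follows. This is only an illustration under assumed names: the repository "scratch", the branch "try-idea", and the file "notes.txt" are all hypothetical, and the experiment is simply thrown away at the end.

```shell
#!/bin/sh
set -e

# Throwaway repository with one commit on its starting branch.
workdir=$(mktemp -d)
cd "$workdir"
git init -q scratch
cd scratch
git config user.name "Example User"
git config user.email "user@example.com"
echo "stable" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Remember the starting branch; its name depends on git configuration.
main_branch=$(git symbolic-ref --short HEAD)

# Try an idea out on a temporary branch.
git checkout -q -b try-idea
echo "experiment" >> notes.txt
git commit -q -a -m "Experiment"

# The idea did not work out: switch back and delete the branch.
# -D is needed because the experiment was never merged.
git checkout -q "$main_branch"
git branch -D try-idea > /dev/null

# The working tree is back to the pre-experiment contents.
cat notes.txt
```

Had the experiment worked, the last step would instead be a merge of "try-idea" into the starting branch, after which the temporary branch could be deleted with the safer "git branch -d".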
> Chapter 4 (unwritten): interactions with other VCS's
>
> cvs, subversion. Also some of us track projects with git
> even when all we've got is a sequence of release tarballs to
> track, and that might be worth documenting.
>
> Chapter 6 (unwritten): git internals
>
> I intend to just do a wholesale import of either tutorial-2.txt,
> core-tutorial.txt, or the README, or some combination thereof,
> but can't decide which.
You might want to consider putting these two chapters into appendices.
- Ted
* Re: [DRAFT] Branching and merging with git
2007-01-08 14:03 ` Theodore Tso
@ 2007-01-09 2:41 ` J. Bruce Fields
2007-01-09 8:46 ` Andreas Ericsson
0 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-09 2:41 UTC (permalink / raw)
To: Theodore Tso; +Cc: Junio C Hamano, linux, git
On Mon, Jan 08, 2007 at 09:03:05AM -0500, Theodore Tso wrote:
> I would add a QuickStart Chapter before you start going into the
> "read-only" operations. It would show how to create a completely
> empty repository, and add a few commits. It would also demonstrate
> how to clone an example repository (with a fixed set of contents,
> stored at git://git.kernel.org/pub/scm/git/example) and add a commit
> using "git commit -a".
>
> The basic idea is to show the user that git really isn't that hard,
> *before* you start diving into a lot of details. If you don't tell a
> user how to make a commit until Chapter 3, he/she will assume it's
> because it's Really Hard, and you may end up losing them before that.
Yeah, I agree. I just haven't been able to decide quite what to choose
for that purpose. Some choices:
- We could just pare down the tutorial a bit and drag it in as
chapter one.
- I tried writing something modeled loosely on the hg quick
start. It's a little out of date now, but that could be
fixed:
http://www.fieldses.org/~bfields/git-quick-start.html
- Or maybe a revised everyday.txt would do the job?
Any opinions?
> At least some discussion of branches needs to happen here;
The basic nuts-and-bolts (how to create and delete branches, etc.)
should all be covered, of course, but....
> it's really important to talk about different workflows, and how you
> use branches as part of your read-write operations. Some folks might
> or might not use topic branches, but the concept of using temporary
> branches to try things out is critical.
.... Maybe it'd be fun to have a section called just "examples" at the
end of each chapter. The sort of thing you're describing could fit in
well there. I'd need some help collecting interesting examples.
--b.
* Re: [DRAFT] Branching and merging with git
2007-01-08 12:38 ` Guilhem Bonnefille
@ 2007-01-09 4:17 ` J. Bruce Fields
0 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-09 4:17 UTC (permalink / raw)
To: Guilhem Bonnefille; +Cc: Theodore Tso, linux, git
On Mon, Jan 08, 2007 at 01:38:19PM +0100, Guilhem Bonnefille wrote:
> Nice work.
Thanks!
> My only 2 cents: the SVN book is really a good book, as it contains
> info for both simple users and advanced hackers. As it is under a free
> licence, perhaps it would be possible to "port" the book to Git. I saw
> that the SVK book is such a port. But it's a DocBook document.
> http://svnbook.red-bean.com/
Thanks, yes, that does look very polished.
If there's any part you'd be particularly interested in seeing "ported",
I'd be happy to help incorporate your work.
--b.
* Re: [DRAFT] Branching and merging with git
2007-01-09 2:41 ` J. Bruce Fields
@ 2007-01-09 8:46 ` Andreas Ericsson
2007-01-09 15:49 ` J. Bruce Fields
2007-01-09 16:58 ` Theodore Tso
0 siblings, 2 replies; 66+ messages in thread
From: Andreas Ericsson @ 2007-01-09 8:46 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Theodore Tso, Junio C Hamano, linux, git
J. Bruce Fields wrote:
> On Mon, Jan 08, 2007 at 09:03:05AM -0500, Theodore Tso wrote:
>> I would add a QuickStart Chapter before you start going into the
>> "read-only" operations. It would show how to create a completely
>> empty repository, and add a few commits. It would also demonstrate
>> how to clone an example repository (with a fixed set of contents,
>> stored at git://git.kernel.org/pub/scm/git/example) and add a commit
>> using "git commit -a".
>>
>> The basic idea is to show the user that git really isn't that hard,
>> *before* you start diving into a lot of details. If you don't tell a
>> user how to make a commit until Chapter 3, he/she will assume it's
>> because it's Really Hard, and you may end up losing them before that.
>
> Yeah, I agree. I just haven't been able to decide quite what to choose
> for that purpose. Some choices:
>
> - We could just pare down the tutorial a bit and drag it in as
> chapter one.
>
> - I tried writing something modeled loosely on the hg quick
> start. It's a little out of date now, but that could be
> fixed:
>
> http://www.fieldses.org/~bfields/git-quick-start.html
>
I like this, although fetch should probably have "--force" instead of
the "+branch" notation. --force stands out more and users are familiar
with --force possibly destroying things (rm -rf, anyone?).
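To make the two spellings concrete, here is a small self-contained sketch. It is an illustration only: the repositories "upstream" and "downstream" and the branches "mirror-plus" and "mirror-force" are hypothetical names, and the upstream history is rewritten with "git commit --amend" purely to provoke a non-fast-forward update.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)

# An "upstream" repository with two commits.
git init -q "$workdir/upstream"
cd "$workdir/upstream"
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "First commit"
echo two >> file.txt
git commit -q -a -m "Second commit"
branch=$(git symbolic-ref --short HEAD)

# A clone that keeps two extra local copies of the upstream branch.
git clone -q "$workdir/upstream" "$workdir/downstream"
cd "$workdir/downstream"
git branch --no-track mirror-plus "origin/$branch"
git branch --no-track mirror-force "origin/$branch"

# Upstream rewrites its latest commit, so its history no longer
# contains the commit that the mirrors point at.
cd "$workdir/upstream"
echo changed >> file.txt
git commit -q -a --amend -m "Second commit, rewritten"

cd "$workdir/downstream"
# A plain refspec refuses the non-fast-forward update.
if git fetch -q origin "$branch:mirror-plus" 2>/dev/null; then
    plain_result=accepted
else
    plain_result=rejected
fi

# Either spelling forces the update.
git fetch -q origin "+$branch:mirror-plus"
git fetch -q --force origin "$branch:mirror-force"

echo "plain fetch: $plain_result"
```

One difference worth keeping in mind: the "+" applies only to the refspec it prefixes, while "--force" applies to every refspec given on that command line.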
> - Or maybe a revised everyday.txt would do the job?
>
> Any opinions?
>
I think the document is fine as it is, but could probably start off with
a link to the tutorial, quickstart or a revised version of everyday.txt,
stating that "here's something you might want to read if you prefer to
experiment. If you think something goes wrong, come back here and find
out why".
>> At least some discussion of branches needs to happen here;
>
> The basic nuts-and-bolts (how to create and delete branches, etc.)
> should all be covered, of course, but....
>
I found it quite sufficient. Perhaps it would be nice to include some
more advanced examples, like octopus merges and things like that,
although I feel such things could well live in an appendix to keep all
the easy operations up front. Most people I know will most likely
*never* use octopus merges. 90% of the merges we do here at work result
in fast-forwards, so a real merge is already considered a bit odd.
>> it's really important to talk about different workflows, and how you
>> use branches as part of your read-write operations. Some folks might
>> or might not use topic branches, but the concept of using temporary
>> branches to try things out is critical.
>
> .... Maybe it'd be fun to have a section called just "examples" at the
> end of each chapter. The sort of thing you're describing could fit in
> well there. I'd need some help collecting interesting examples.
>
Indeed. I for one like examples that tell me
# type this
# this will happen
# you can see what you just did with this, this, and this command
# this is because...
Not only is it good for learning the how and the why, but it also trains
the fingers right from the start. Hopefully the UI is stabilized enough
by now that we can reliably tell users how to accomplish a certain
thing. UI changes must almost certainly be listed at whatever official
site git has. As Junio has already pointed out, the members of the git
mailing list are now in the minority among git users, so some other
place has to hold the user-visible changes as well, and the location of
that site should probably be published along with the tools.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
* Re: [DRAFT] Branching and merging with git
2007-01-09 8:46 ` Andreas Ericsson
@ 2007-01-09 15:49 ` J. Bruce Fields
2007-01-09 16:58 ` Theodore Tso
1 sibling, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-09 15:49 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Theodore Tso, Junio C Hamano, linux, git
On Tue, Jan 09, 2007 at 09:46:06AM +0100, Andreas Ericsson wrote:
> J. Bruce Fields wrote:
> > - I tried writing something modeled loosely on the hg quick
> > start. It's a little out of date now, but that could be
> > fixed:
> >
> > http://www.fieldses.org/~bfields/git-quick-start.html
> >
>
> I like this, although fetch should probably have "--force" instead of
> the "+branch" notation. --force stands out more and users are familiar
> with --force possibly destroying things (rm -rf, anyone?).
I started out writing it that way (for the reasons you give), then
changed it on the theory that starting out with the "+" notation would
make it simpler to explain how to do the remote configuration.
Now that there's git-remote, and less need to manipulate the remote
configuration by hand, maybe that's less important.
> I think the document is fine as it is, but could probably start off with
> a link to the tutorial, quickstart or a revised version of everyday.txt,
> stating that "here's something you might want to read if you prefer to
> experiment. If you think something goes wrong, come back here and find
> out why".
Sounds sensible.
> Indeed. I for one like examples that tell me
>
> # type this
> # this will happen
> # you can see what you just did with this, this, and this command
> # this is because...
>
> Not only is it good for learning the how and the why, but it also trains
> the fingers right from the start.
OK. This is a place where I'd really appreciate any contributions.
--b.
* Re: [DRAFT] Branching and merging with git
2007-01-09 8:46 ` Andreas Ericsson
2007-01-09 15:49 ` J. Bruce Fields
@ 2007-01-09 16:58 ` Theodore Tso
2007-01-10 4:15 ` J. Bruce Fields
1 sibling, 1 reply; 66+ messages in thread
From: Theodore Tso @ 2007-01-09 16:58 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: J. Bruce Fields, Junio C Hamano, linux, git
On Tue, Jan 09, 2007 at 09:46:06AM +0100, Andreas Ericsson wrote:
> I think the document is fine as it is, but could probably start off with
> a link to the tutorial, quickstart or a revised version of everyday.txt,
> stating that "here's something you might want to read if you prefer to
> experiment. If you think something goes wrong, come back here and find
> out why".
If what we're going to do is a "git user's manual", I'd recommend
keeping the 2-3 pages in the manual, rather than doing it via a link to
some other document. One of the issues with the git documentation is
that it's *too* branchy, and some of the branches go off to some truly
scary low-level implementation detail. If we are going to assume that
isn't going to change (and I am glad that the low-level details are
documented, and am not advocating that they be deleted), then keeping
a user-friendly QuickStart in the main document might not be a bad
decision.
- Ted
* Re: [DRAFT] Branching and merging with git
2007-01-09 16:58 ` Theodore Tso
@ 2007-01-10 4:15 ` J. Bruce Fields
0 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2007-01-10 4:15 UTC (permalink / raw)
To: Theodore Tso; +Cc: Andreas Ericsson, Junio C Hamano, linux, git
On Tue, Jan 09, 2007 at 11:58:28AM -0500, Theodore Tso wrote:
> If what we're going to do is a "git user's manual", I'd recommend
> keeping the 2-3 pages in the manual, rather than doing it via a link to
> some other document. One of the issues with the git documentation is
> that it's *too* branchy, and some of the branches go off to some truly
> scary low-level implementation detail. If we are going to assume that
> isn't going to change (and I am glad that the low-level details are
> documented, and am not advocating that they be deleted), then keeping
> a user-friendly QuickStart in the main document might not be a bad
> decision.
Sounds reasonable.
I'll probably set this aside a few days, then do some more work on it
this weekend. (Patches welcomed, though--source is in the master branch
of git://linux-nfs.org/~bfields/git.git.)
--b.
end of thread, other threads:[~2007-01-10 4:15 UTC | newest]
Thread overview: 66+ messages
2006-11-16 22:17 [DRAFT] Branching and merging with git linux
2006-11-16 23:47 ` Junio C Hamano
2006-11-17 1:13 ` linux
2006-11-17 1:31 ` Junio C Hamano
2006-11-17 1:09 ` Junio C Hamano
2006-11-17 3:17 ` linux
2006-11-17 5:55 ` Junio C Hamano
2006-11-17 9:37 ` Jakub Narebski
2006-11-17 9:41 ` Jakub Narebski
2006-11-17 10:37 ` Jakub Narebski
2006-11-17 15:32 ` Theodore Tso
2006-11-17 15:57 ` Sean
2006-11-17 16:19 ` Nguyen Thai Ngoc Duy
2006-11-17 16:25 ` Marko Macek
2006-11-17 16:33 ` Petr Baudis
2006-11-17 16:34 ` Sean
[not found] ` <20061117113404.810fd4ea.seanlkml@sympatico.ca>
2006-11-17 16:53 ` Petr Baudis
2006-11-17 17:01 ` Sean
[not found] ` <20061117120154.3eaf5611.seanlkml@sympatico.ca>
2006-11-17 21:31 ` Petr Baudis
2006-11-17 22:36 ` Chris Riddoch
2006-11-17 22:50 ` Petr Baudis
2006-11-17 23:30 ` Sean
2006-11-17 18:21 ` J. Bruce Fields
2006-11-18 0:13 ` linux
2006-11-18 0:32 ` Jakub Narebski
2006-11-18 0:40 ` Junio C Hamano
2006-11-18 1:11 ` Junio C Hamano
2006-11-20 23:51 ` [DRAFT 2] " linux
2006-11-22 11:02 ` [Patch to DRAFT 2 (1/2)] " Junio C Hamano
2006-11-22 11:02 ` [Patch to DRAFT 2 (2/2)] " Junio C Hamano
2006-11-22 13:36 ` Rene Scharfe
2006-12-04 1:19 ` [DRAFT 2] " J. Bruce Fields
2006-12-04 7:23 ` J. Bruce Fields
2006-12-04 10:56 ` Johannes Schindelin
2006-12-15 21:38 ` Jakub Narebski
2006-12-15 21:41 ` J. Bruce Fields
2006-11-22 11:51 ` [DRAFT] " Junio C Hamano
2006-11-19 17:50 ` J. Bruce Fields
2006-11-19 17:59 ` Git manuals Petr Baudis
2006-11-19 18:16 ` Jakub Narebski
2006-11-19 19:50 ` Robin Rosenberg
2006-11-19 19:36 ` J. Bruce Fields
2006-11-26 4:01 ` [PATCH] Documentation: add a "git user's manual" J. Bruce Fields
2006-11-17 17:44 ` [DRAFT] Branching and merging with git J. Bruce Fields
2006-11-17 18:16 ` Jakub Narebski
2007-01-03 17:04 ` Theodore Tso
2007-01-03 17:08 ` Junio C Hamano
2007-01-04 5:28 ` linux
2007-01-04 6:11 ` Junio C Hamano
2007-01-07 23:44 ` J. Bruce Fields
2007-01-08 0:24 ` Junio C Hamano
2007-01-08 2:35 ` J. Bruce Fields
2007-01-08 13:04 ` David Kågedal
2007-01-08 14:03 ` Theodore Tso
2007-01-09 2:41 ` J. Bruce Fields
2007-01-09 8:46 ` Andreas Ericsson
2007-01-09 15:49 ` J. Bruce Fields
2007-01-09 16:58 ` Theodore Tso
2007-01-10 4:15 ` J. Bruce Fields
2007-01-08 0:40 ` Theodore Tso
2007-01-08 0:46 ` J. Bruce Fields
2007-01-08 1:22 ` Jakub Narebski
2007-01-08 1:46 ` Horst H. von Brand
2007-01-08 2:22 ` J. Bruce Fields
2007-01-08 12:38 ` Guilhem Bonnefille
2007-01-09 4:17 ` J. Bruce Fields