git.vger.kernel.org archive mirror
* svn user trying to recover from brain damage
@ 2007-05-09 15:30 Joshua Ball
  2007-05-09 16:02 ` Carl Worth
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Joshua Ball @ 2007-05-09 15:30 UTC (permalink / raw)
  To: git

Hi all,

The git page says that this mailing list is for "bug reports, feature
requests, comments and patches". Is there a mailing list for new users
crying out for help? If so, forward me there.

OK, I'm feeling very frustrated right now, so let me just say that git
documentation sucks. All the documentation I can find anywhere falls
into two categories:

1. Tutorials for people brand new to version control, with just enough
information for them to "obey the rules", but completely empty of any
information that could help them exploit the real power of
decentralized version control.
2. Technical documentation which assumes pre-obtained knowledge.

Now that I've insulted you and am probably not on your good side...

What the heck do these terms mean? The glossary on the Git wiki was
unhelpful (I'll explain later). BTW, what is wrong with the wiki?
(Particularly the excessive [grayed-out text [no match, add rest:
"used by any common UNIX command. The fact that it is a
mispronunciation of "]]. Is this some new kind of spam, or a buggy
wiki feature?)

HEAD
HEAD REF
working tree
object
branch
merge
master
commit (as in the phrase "bring the working tree to a given commit")

While the Git wiki does in fact define all of these, it doesn't answer
any of my questions about those terms:

Is there a difference between HEAD and the working tree?
Does HEAD change when I cg-switch/git-checkout?
What is an object? Is it a set of patches? A tree snapshot?
What the heck is a branch? (Why does it have so many different
definitions? I feel like every time I come across "branch" in the man
pages, it means something different.)

More on branches: The wiki says that a group of commits linked
together form a DAG. Does that mean every fork/clone/branch-create
possibly doubles the number of branches? So if I fork and then
remerge, do I have two branches?

A -> B -> D
A -> C -> D

Would D be the head of this branch? If so, then heads do not uniquely
identify a branch?

Is there a standard revision notation? (Where my definition of
"revision" is a tree snapshot. In SVN, it would be identified by a
number.) `cg-diff -r A..B` works fine if A and B are branches, but how
do I diff from an older revision to a newer revision? Can I diff
between two revisions which haven't shared the same parent since 2006?

What about the master branch? Is there anything special about it? By
special I mean, do any of the git or cogito commands implicitly assume
that you are working with master? If git is truly decentralized, then
wouldn't master be on an equal footing with all other branches?

What is a merge? My understanding of merge comes from the SVN book,
where it was described as diff+apply. Diff takes 2 arguments, and
apply takes 1 argument (if the patch is implicit). However, cg-merge
only appears to take one branch. (There again a use of the word
branch! Wouldn't commit or revision be a more accurate term?) Why does
cg-merge only take one argument? Even if I use the -b switch, I'm
still only up to two arguments. Where is the hidden argument?

Lastly, the most important question of all, which may answer many of
the questions above:

Can you fill in the missing pieces, making corrections where
necessary? (recommend unispace font)

Command     |   Reads               |   Writes
cg-fetch    | remote branch         | corresponding branch in local repository
cg-commit   | working copy          | HEAD
cg-update   | remote branch         | working copy AND HEAD
cg-merge    | branch & working copy | working copy
cg-diff     | arguments             | STDOUT
cg-push     |                       | remote branch (usually origin)
cg-pull     | remote branch         |
cg-restore  |                       |

Perhaps the Reads column should be split into two, like ReadInfo and ReadSafety.
ReadInfo would say which revision/branch/commit/object is being read for actual
content, while ReadSafety is only read to make sure that nothing will be lost
after running the command. (e.g., cg-update reads the working copy to make sure
that you are not in a partial merge, but once it knows that it is safe, it
ignores the contents of working directory. I may have this totally wrong.)

On cg-fetch, is the remote branch necessarily remote? Or can you fetch
from local cg-switch-branches? What does "corresponding branch in local
repository" mean? Does cg-fetch touch your working copy?

What is the difference between cg-restore and cg-seek?

Please reply even if you can only answer one of my many questions! If
I can grab just one fact and say about it, "This is truth", then it
gives me a rock to stand on amidst all the term-mashing out there.

In the words of Dijkstra, "Since breaking out of bad habits, rather
than acquiring new ones, is the toughest part of learning, we must
expect from that system permanent mental damage for most ... exposed
to it."

May you lead me to a quick recovery. Hail to decentralized version control.

Josh "Ua" Ball

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: svn user trying to recover from brain damage
  2007-05-09 15:30 svn user trying to recover from brain damage Joshua Ball
@ 2007-05-09 16:02 ` Carl Worth
  2007-05-09 16:12   ` Karl Hasselström
  2007-05-09 16:22 ` Petr Baudis
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: Carl Worth @ 2007-05-09 16:02 UTC (permalink / raw)
  To: Joshua Ball; +Cc: git


On Wed, 9 May 2007 10:30:18 -0500, "Joshua Ball" wrote:
> Is there a difference between HEAD and the working tree?

Yes. HEAD is a pointer to committed state. HEAD is an alias for the
current branch.

> Does HEAD change when I cg-switch/git-checkout?

Yes.

> What is an object?

It's a low-level aspect of git's unified storage model. The various
object types (blob, tree, commit, tag) are defined quite clearly in
the documentation.

> What the heck is a branch?

Simply a pointer to a commit in the DAG that moves as new commits are
created while "on" that branch.

> More on branches: The wiki says that a group of commits linked
> together form a DAG.

Yes.

>                      Does that mean every fork/clone/branch-create
> possibly doubles the number of branches?

No. Creating a new branch simply references some existing commit in
the DAG already.

>                                           So if I fork and then
> remerge, do I have two branches?
>
> A -> B -> D
> A -> C -> D

No. If you merged you would have history that looks like this:

 /-> B -\
A        D
 \-> C -/

> Is there a standard revision notation?

Yes. See the documentation for git-rev-parse and the section on
specifying revisions (hint: when you run a command like "git log" and
see a long sequence of hex characters, you can use those, or
abbreviated versions of those, to name revisions). You can also do many
other things as described in the documentation.

[Beyond this point you bring in too many svn misconceptions to make
the questions easy to answer.]

-Carl


* Re: svn user trying to recover from brain damage
  2007-05-09 16:02 ` Carl Worth
@ 2007-05-09 16:12   ` Karl Hasselström
  0 siblings, 0 replies; 8+ messages in thread
From: Karl Hasselström @ 2007-05-09 16:12 UTC (permalink / raw)
  To: Carl Worth; +Cc: Joshua Ball, git

On 2007-05-09 09:02:29 -0700, Carl Worth wrote:

> [Beyond this point you bring in too many svn misconceptions to make
> the questions easy to answer.]

But do feel free to come with follow-up questions, Joshua! We usually
try our best to be friendly and helpful here, even if it's hard
sometimes not to say mean things about other SCMs. :-)

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: svn user trying to recover from brain damage
  2007-05-09 15:30 svn user trying to recover from brain damage Joshua Ball
  2007-05-09 16:02 ` Carl Worth
@ 2007-05-09 16:22 ` Petr Baudis
  2007-05-09 20:16   ` Jan Hudec
  2007-05-09 16:57 ` Linus Torvalds
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: Petr Baudis @ 2007-05-09 16:22 UTC (permalink / raw)
  To: Joshua Ball; +Cc: git

  Hi,

On Wed, May 09, 2007 at 05:30:18PM CEST, Joshua Ball wrote:
> The git page says that this mailing list is for "bug reports, feature
> requests, comments and patches". Is there a mailing list for new users
> crying out for help? If so, forward me there.

  I think it fits in the "comments" category. :-)

> OK, I'm feeling very frustrated right now, so let me just say that git
> documentation sucks. All the documentation I can find anywhere falls
> into two categories:
> 
> 1. Tutorials for people brand new to version control, with just enough
> information for them to "obey the rules", but completely empty of any
> information that could help them exploit the real power of
> decentralized version control.
> 2. Technical documentation which assumes pre-obtained knowledge.
> 
> Now that I've insulted you and am probably not on your good side...
> 
> What the heck do these terms mean? The glossary on the Git wiki was
> unhelpful (I'll explain later). BTW, what is wrong with the wiki?
> (Particularly the excessive [grayed-out text [no match, add rest:
> "used by any common UNIX command. The fact that it is a
> mispronunciation of "]]. Is this some new kind of spam, or a buggy
> wiki feature?)

  Sorry, it was a bug-ridden wiki. I was desperately trying to debug
some weird behaviour; a few moments ago I finally nailed it down and
all should be fine now.

> HEAD
> HEAD REF
> working tree
> object
> branch
> merge
> master
> commit (as in the phrase "bring the working tree to a given commit")
> 
> While the Git wiki does in fact define all of these, it doesn't answer
> any of my questions about those terms:
> 
> Is there a difference between HEAD and the working tree?

  This is (unfortunately) case-sensitive:

  HEAD identifies the commit (our slightly confusing name for a
revision) that corresponds to your working tree - usually the latest
commit in your current branch (by default 'master').

  head is just the latest commit in a branch, any branch.

> Does HEAD change when I cg-switch/git-checkout?

  Yes, HEAD changes (starts pointing to your new branch) when you
cg-switch or git-checkout -b.

> What is an object? Is it a set of patches? A tree snapshot?

  An object is the basic unit of stored data Git works with. An object
may be one of:

	"blob" - file at a particular point of time
	"tree" - list of files, corresponding to a particular directory
	         (again at a particular point of time)
	"commit" - one revision in the project history; contains
	           information about the parent commit(s), who did
	           the commit, the commit message and link to the
	           corresponding root tree object
	"tag" - links to another object, with additional information
	        like who/when made the tag and the tag comment

  Git does not store patches on a conceptual level, only snapshots. (At
the implementation level, Git uses "patches" for more optimized storage,
but that's not so important.)
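
  To make the "unit of stored data" idea a bit more concrete: every
object is named by a hash of its type, size and contents. The Python
sketch below is only an illustration, not git's own code. The helper
name `git_object_id` is made up here, and it assumes a repository that
uses SHA-1 object names.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 name of an object with the given type and raw body."""
    # The hash covers a short header ("<type> <size>" plus a NUL byte)
    # followed by the body, so identical content always gets the same name.
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


empty_blob = git_object_id("blob", b"")
hello_blob = git_object_id("blob", b"hello\n")
print(empty_blob)
print(hello_blob)
```

Because the name depends only on the content, two files with identical
bytes are stored as the same blob no matter what they are called.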

> What the heck is a branch? (Why does it have so many different
> definitions? I feel like every time I come across "branch" in the man
> pages, it means something different.)

  Because it's hard to define. :-)

  To make a cyclical definition, a branch is the set of commits
referenced by a given head. Hmm, I'll have to think up some cute
non-confusing definition of branch; I'll follow up unless someone beats
me to it.

> More on branches: The wiki says that a group of commits linked
> together form a DAG. Does that mean every fork/clone/branch-create
> possibly doubles the number of branches? So if I fork and then
> remerge, do I have two branches?
> 
> A -> B -> D
> A -> C -> D
> 
> Would D be the head of this branch? If so, then heads do not uniquely
> identify a branch?

  A branch is a much looser concept than you seem to assume. A branch is
really just a fancy name for a 'head', so let's redefine 'head'. Let's
just say for now that a 'head' is a named commit reference.

  This means that when you create a "new branch" 'foo' from branch
'master', the _only_ thing you really did was to copy the commit
reference 'master'.

> Is there a standard revision notation? (Where my definition of
> "revision" is a tree snapshot. In SVN, it would be identified by a
> number.) `cg-diff -r A..B` works fine if A and B are branches, but how
> do I diff from an older revision to a newer revision? Can I diff
> between two revisions which haven't shared the same parent since 2006?

  You can diff between any two revisions. The ultimately "standard"
notation is to use the id of the revision (the long string of
hexadecimal digits), but the syntax is quite rich - see the SPECIFYING
REVISIONS section of git-rev-parse(1).

  If you specify a branch where a revision is expected, it means that the
latest commit (revision) on the branch is used.

> What about the master branch? Is there anything special about it? By
> special I mean, do any of the git or cogito commands implicitly assume
> that you are working with master? If git is truly decentralized, then
> wouldn't master be on an equal footing with all other branches?

  'master' is just the default name for the first branch in a
repository, but in theory you can name it any way you wish and use as
many branches as you want, all are equal.

  When fetching from a remote repository, some commands might assume in
certain conditions that 'master' is the primary branch of the remote
repository, but I'm not sure about the details or in which cases this
still holds true.

> What is a merge? My understanding of merge comes from the SVN book,
> where it was described as diff+apply. Diff takes 2 arguments, and
> apply takes 1 argument (if the patch is implicit). However, cg-merge
> only appears to take one branch. (There again a use of the word
> branch! Wouldn't commit or revision be a more accurate term?) Why does
> cg-merge only take one argument? Even if I use the -b switch, I'm
> still only up to two arguments. Where is the hidden argument?

  The hidden argument is your current branch. So cg-merge x will merge
the branch 'x' to your current branch: symbolically, kind of

	base=-b argument | base(HEAD, x)
	apply(HEAD, diff(base, x))

  The word 'branch' is used in an attempt to make it all less confusing
:-). But in fact, you can give cg-merge just the id of a commit; it does
not have to be a branch name.
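
  The symbolic apply(HEAD, diff(base, x)) above can be sketched as a
three-way merge over whole files. This is a simplified Python model
under stated assumptions, not what cg-merge runs: snapshots are plain
dicts mapping path to content, `three_way_merge` is a hypothetical
name, and changes inside a single file are not merged line by line.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict):
    """Merge two snapshots (path -> content) relative to their common base.

    Returns (merged, conflicts). A missing path is treated as None, so
    additions and deletions follow the same rules as modifications.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (including "both deleted")
            result = o
        elif o == b:      # only their side changed: take theirs
            result = t
        elif t == b:      # only our side changed: keep ours
            result = o
        else:             # both sides changed differently
            conflicts.append(path)
            result = o    # placeholder: keep ours until resolved by hand
        if result is not None:
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1"}                    # HEAD changed a.txt
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "new"}  # x changed b.txt, added c.txt
merged, conflicts = three_way_merge(base, ours, theirs)
print(merged, conflicts)
```

The base is what lets the merge tell "they changed it" apart from "we
changed it"; with only two snapshots every difference would look like a
conflict.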

> Lastly, the most important question of all, which may answer many of
> the questions above:
> 
> Can you fill in the missing pieces, making corrections where
> necessary? (recommend unispace font)
> 
> Command     |   Reads               |   Writes
> cg-fetch    | remote branch         | corresponding branch in local repository
> cg-commit   | working copy          | HEAD
> cg-update   | remote branch         | working copy AND HEAD
> cg-merge    | branch & working copy | working copy
> cg-diff     | arguments             | STDOUT
> cg-push     |                       | remote branch (usually origin)
> cg-pull     | remote branch         |
> cg-restore  |                       |

  Yes, mostly right. cg-merge calls cg-commit unless there are
conflicts, so it should be "working copy AND HEAD" too. cg-push reads
local branch (HEAD or -r argument). There is no cg-pull since people
coming from different VCSes have different ideas about what pull is; git
pull is equivalent to cg-update.

> Perhaps the Reads column should be split into two, like ReadInfo and 
> ReadSafety.
> ReadInfo would say which revision/branch/commit/object is being read for 
> actual
> content, while ReadSafety is only read to make sure that nothing will be 
> lost
> after running the command. (e.g., cg-update reads the working copy to make 
> sure
> that you are not in a partial merge, but once it knows that it is safe, it
> ignores the contents of working directory. I may have this totally wrong.)

  It actually does some magic so that you can do a merge while having
uncommitted changes in your working tree. ;-)

> On cg-fetch, is the remote branch necessarily remote? Or can you fetch
> from local
> cg-switch-branches? What does "corresponding branch in local
> repository" mean? Does cg-fetch touch your working copy?

  Fetch means that a remote branch's content is transferred to the local
repository; furthermore, all the remote branches have their local
counterparts that "reflect" how the branch looked in the remote
repository at a particular point of time. So e.g. when you clone a
repository, the remote default branch is mirrored locally as branch
'origin' - you can't switch to it (technically you could but that would
be very confusing), but you can merge it.

> What is the difference between cg-restore and cg-seek?

  cg-seek will temporarily bring your tree to a different commit to
explore the state back then, but you cannot make commits in this state;
your HEAD points to the seeked commit. On the other hand, cg-restore
only changes files in your working tree - it works on the individual
files, does not touch HEAD and does not make the tree "read-only".

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Ever try. Ever fail. No matter. // Try again. Fail again. Fail better.
		-- Samuel Beckett


* Re: svn user trying to recover from brain damage
  2007-05-09 15:30 svn user trying to recover from brain damage Joshua Ball
  2007-05-09 16:02 ` Carl Worth
  2007-05-09 16:22 ` Petr Baudis
@ 2007-05-09 16:57 ` Linus Torvalds
  2007-05-09 17:43 ` J. Bruce Fields
  2007-05-11 17:29 ` Jakub Narebski
  4 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2007-05-09 16:57 UTC (permalink / raw)
  To: Joshua Ball; +Cc: git



On Wed, 9 May 2007, Joshua Ball wrote:
> 
> What the heck do these terms mean?
> 
> HEAD
> HEAD REF

These are the same thing. HEAD is basically a special local branch, which 
usually (but not always) points to one of the local branches. I say 
"usually", because you *can* make it an independent branch in its own 
right, in which case you are using what is now called a "detached" HEAD.

But even when HEAD is "detached", and it's thus really an independent 
branch in its own right, it's still special: it's the branch that your 
current working tree is associated with.

So if you think of "HEAD" as just "current branch", you'll be in good 
shape.
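
As a rough illustration of the two cases, here is a Python sketch of how 
the contents of the HEAD file can be read. It assumes HEAD holds either 
a line of the form "ref: refs/heads/<name>" or a bare commit id; the 
function name `resolve_head` and the short ids are invented for the 
example.

```python
def resolve_head(head_file: str, refs: dict):
    """Return (ref_name_or_None, commit_id) for the text stored in HEAD."""
    content = head_file.strip()
    if content.startswith("ref: "):
        # Normal case: HEAD names a branch, and the branch names the commit.
        ref = content[len("ref: "):]
        return ref, refs[ref]
    # Detached case: HEAD holds a commit id directly.
    return None, content


refs = {"refs/heads/master": "1111111"}
on_branch = resolve_head("ref: refs/heads/master\n", refs)
detached = resolve_head("2222222\n", refs)
print(on_branch)
print(detached)
```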

> working tree

This is just your files - both untracked (ie you may be building stuff in 
the working tree) and tracked (ie the ones that git knows about).

It's *not* necessarily going to match the state that HEAD describes: HEAD 
describes the last *committed* state, while your working tree obviously 
can have changes to the tracked files (along with files that aren't 
tracked at all), but the working tree state is certainly _associated_ with 
HEAD, in that HEAD would point to the most recent commit that the working 
tree is all about.

> object

This is just an internal git term. It's how git stores all revision 
history information - as a set of objects in a content-addressable 
filesystem. As a pure user, you generally never need to worry about this 
term, although you might notice in the case you have corruption, and run 
"git fsck" and it starts talking about corrupt or missing objects.

> branch

A "branch" is just any "tip of development". It's *literally* defined by 
its name (which git doesn't care about, but you do), and the name of the 
top-most commit (the SHA1) of that development series. That SHA1 is all 
that git really cares about - the name is purely for your enjoyment and to 
clarify what the branch is about.

Git can track an arbitrary number of branches, but your working tree would 
be associated with just one of them, and HEAD points to that branch. The  
default branch is called "master", but that doesn't really have any 
special meaning per se, and as mentioned, HEAD might even be a detached 
branch and not associated with any "real" branch at all!

> merge

That's the act of bringing in the contents of another branch (possibly 
from an external repository) into your current branch. If you merge from 
something external, you need to "fetch" that other branch first, and the 
combination of "fetch+merge" is called a "pull".

It sounds like you may have never worked with branches before, in which 
case you can just ignore *all* of this. Git will set up one branch for you 
at "git init" time (the "master" branch) and you don't actually ever have 
to use any more than that one branch, in which case you can literally 
ignore everything about branches and merging.

> master

See above: it's just the default name of the initial branch. It has no 
other meaning - git itself doesn't care about branch names at all, and 
it's literally nothing more than a "default branch name".

> commit (as in the phrase "bring the working tree to a given commit")

Any development series is just a series of "commits". They point to the 
"parent" commit(s) and can thus form a series (or more generally a DAG: 
directed acyclic graph). 

So a "branch" is really just a named pointer to a commit, and that commit 
will in turn point to its parent commit, which will point to its parent 
etc. Which is why I started by explaining a branch as a "tip of 
development", because you'd see a branch as the top-most commit that it 
points to, and you'd normally *change* the branch by committing to it, 
which will create a new commit (and move the branch to point to it), and 
make that new commit point to the old commit as its parent.

One of the best ways to visualize this is probably to just do

 - clone git itself if you haven't already

	git clone git://git.kernel.org/pub/scm/git/git.git git

 - use "gitk" to see the commit history and see the branches as pointers 
   into that commit history. With "--all -d", it will show you all 
   branches, and the "-d" shows the commit history in date order, so while 
   it's a bit messier than the default cleaned-up format that tries to 
   show branches on their own, it's perhaps also a bit more instructive:

	gitk --all -d

In particular, you should see a commit that has both a green-boxed 
"master" pointer pointing to it, and a "remotes/origin/master" (colored in 
a mixed brown/green box). Those are examples of branches: the "master" 
branch is your local (and normally current) branch, while the 
"remotes/origin/master" thing is a so-called "remote branch", which means 
that you cannot check it out, but you can see it and you can update it by 
fetching new versions from the remote.

> Is there a difference between HEAD and the working tree?

Yes, see above.

> Does HEAD change when I cg-switch/git-checkout?

Yes. But it switches by making it point to a different branch, while 
something like "git reset" will *also* potentially change HEAD, but do so 
by still staying on the same branch, but making that branch "reset" (aka 
jump) to another point in history.

So you can literally change HEAD two fundamentally different ways:

 - by switching branches (which includes making HEAD be a detached branch 
   of its own)

 - by changing the state of the current branch (the most common form of 
   this is just "git commit" - it will update HEAD by creating a new 
   commit, but as mentioned, "git reset" can also do this by jumping 
   around in history, and that's how you'd undo work entirely, for 
   example).
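
The two ways can be modelled in a few lines of Python. This is a toy 
sketch, not how git is implemented: the `Repo` class, its method names 
and the commit ids "c1", "c2", ... are all made up for illustration, and 
the working tree is left out entirely.

```python
class Repo:
    def __init__(self):
        self.parents = {"c1": []}          # commit id -> list of parent ids
        self.branches = {"master": "c1"}   # branch name -> tip commit id
        self.head = "master"               # a branch name, or a commit id if detached
        self._next = 2

    def head_commit(self):
        """The commit HEAD currently resolves to."""
        return self.branches.get(self.head, self.head)

    def checkout(self, target):
        """Way 1: point HEAD somewhere else; no branch is modified."""
        self.head = target

    def commit(self):
        """Way 2: create a commit on top of HEAD and move the current branch."""
        new = f"c{self._next}"
        self._next += 1
        self.parents[new] = [self.head_commit()]
        self._move(new)
        return new

    def reset(self, commit_id):
        """Way 2 again: make the current branch jump to another commit."""
        self._move(commit_id)

    def _move(self, commit_id):
        if self.head in self.branches:
            self.branches[self.head] = commit_id   # the branch moves, HEAD follows
        else:
            self.head = commit_id                  # detached: HEAD itself moves


repo = Repo()
repo.branches["topic"] = repo.head_commit()  # a new branch is just a new pointer
second = repo.commit()                       # master now points at the new commit
repo.checkout("topic")                       # HEAD switches; master is untouched
print(repo.head, repo.head_commit(), repo.branches)
```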

> What is an object? Is it a set of patches? A tree snapshot?

An object is the lowest-level of git information. It's an indivisible and 
unchanging "thing", that can potentially point to other objects. You 
can kind of think of it as an "inode" in a UNIX filesystem, and like an 
inode, it can point to file data or be a directory (but unlike an inode, 
it's immutable by design, and it can also be a "commit" or a "tag" 
object).

So internally, git does have "tree snapshots" (not patches - git is 
*purely* based on snapshotting states of the project), but they are not a 
single object, they are built up from "tree objects" that point to other 
tree objects or to "blob objects".

And a commit is literally a "commit object" that points to the snapshot 
(the "tree object") that it's associated with, and the previous commits 
(the "parents") that build up the history.
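
A small Python model of that structure, for illustration only: the ids 
"blob1", "tree_root" and so on stand in for real SHA1 names, and the 
`objects` table and `snapshot` helper are invented for this sketch.

```python
# Each entry is (type, payload). Trees map names to other object ids.
objects = {
    "blob1": ("blob", "hello\n"),
    "blob2": ("blob", "int main() {}\n"),
    "tree_src": ("tree", {"main.c": "blob2"}),
    "tree_root": ("tree", {"README": "blob1", "src": "tree_src"}),
    "commit1": ("commit", {"tree": "tree_root", "parents": []}),
}


def flatten(tree_id, prefix=""):
    """Walk a tree object recursively and return {path: file content}."""
    files = {}
    _, entries = objects[tree_id]
    for name, child_id in entries.items():
        kind, payload = objects[child_id]
        if kind == "tree":
            files.update(flatten(child_id, prefix + name + "/"))
        else:
            files[prefix + name] = payload
    return files


def snapshot(commit_id):
    """The full project state a commit object points to."""
    _, commit = objects[commit_id]
    return flatten(commit["tree"])


print(snapshot("commit1"))
```

A second commit that changes only src/main.c would need a new blob, a 
new "src" tree and a new root tree, but could keep pointing at the same 
README blob.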

> What the heck is a branch? (Why does it have so many different
> definitions? I feel like every time I come across "branch" in the man
> pages, it means something different.)

Ok, hope I clarified that.

> More on branches: The wiki says that a group of commits linked
> together form a DAG. Does that mean every fork/clone/branch-create
> possibly doubles the number of branches? So if I fork and then
> remerge, do I have two branches?

Yes and no. When you do a clone, you do get your totally own set of 
branches, but a branch is just a *pointer*. So it does _not_ duplicate 
history in any way, you do *not* get:

> A -> B -> D
> A -> C -> D

But instead you get

	A -> B -> C -> D

as commits, and you now have a new pointer to D.

So creating a branch *literally* just creates a new pointer.

In fact, you can still create a new branch manually by doing

	echo "sha1-of-branch-goes-here" > .git/refs/heads/my-new-branch

and that is how the git scripts literally used to do it (well, slightly 
simplified: verifying that the SHA1 is valid, and that the branch didn't 
already exist).

So the branch really *is* just a named commit.
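
The same idea as the echo command, sketched in Python with the "branch 
didn't already exist" check added. It writes into a scratch directory 
rather than a real repository, does not verify that the id names an 
existing commit, and the function names and the 40-character id are 
placeholders.

```python
import os
import tempfile


def create_branch(git_dir, name, commit_id):
    """Create a branch by writing a commit id into refs/heads/<name>."""
    path = os.path.join(git_dir, "refs", "heads", name)
    if os.path.exists(path):
        raise FileExistsError(f"branch {name!r} already exists")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(commit_id + "\n")
    return path


def read_branch(git_dir, name):
    """Return the commit id a branch file points to."""
    with open(os.path.join(git_dir, "refs", "heads", name)) as f:
        return f.read().strip()


git_dir = tempfile.mkdtemp()    # stand-in for a real .git directory
fake_id = "0123456789abcdef0123456789abcdef01234567"
create_branch(git_dir, "my-new-branch", fake_id)
tip = read_branch(git_dir, "my-new-branch")
print(tip)
```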

> Would D be the head of this branch? If so, then heads do not uniquely
> identify a branch?

A branch uniquely identifies a particular commit, but many branches can 
point to the same commit (and the branches are considered "identical" when 
they do that - you can have two different branches, but if they point to 
the same thing they are identical in all respects except for naming).

> Is there a standard revision notation? (Where my definition of
> "revision" is a tree snapshot. In SVN, it would be identified by a
> number.) `cg-diff -r A..B` works fine if A and B are branches, but how
> do I diff from an older revision to a newer revision? Can I diff
> between two revisions which haven't shared the same parent since 2006?

The "standard" revision notation is the SHA1 of the commit, but quite 
frankly, you'd never use it.

If you have two branches named A and B, you'd generate the diff with

	git diff A..B

and it doesn't matter if they share a parent since yesterday, since five 
years ago or whether they are related AT ALL. Git will happily diff 
totally unrelated branches (if you imported two tar-balls independently, 
they may not have any common history at all, but you may still want to 
diff them if they are from the same project!)
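
Why history doesn't matter here: a diff only compares two snapshots. A 
minimal Python sketch, assuming each snapshot is a dict of path to 
content (the name `diff_snapshots` is made up, and it only classifies 
files instead of producing a patch):

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Classify every path that differs between two snapshots."""
    changes = {}
    for path in sorted(set(old) | set(new)):
        if path not in old:
            changes[path] = "added"
        elif path not in new:
            changes[path] = "deleted"
        elif old[path] != new[path]:
            changes[path] = "modified"
    return changes


# Two imports with no shared history still compare fine.
tarball_a = {"README": "v1\n", "Makefile": "all:\n", "old.c": "x\n"}
tarball_b = {"README": "v2\n", "Makefile": "all:\n", "new.c": "y\n"}
changes = diff_snapshots(tarball_a, tarball_b)
print(changes)
```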

> What about the master branch? Is there anything special about it? By
> special I mean, do any of the git or cogito commands implicitly assume
> that you are working with master? If git is truly decentralized, then
> wouldn't master be on an equal footing with all other branches?

Correct. 

The only thing that is special about master is that it's the one that is 
created by "git init" (or "git clone", for that matter).

> What is a merge? My understanding of merge comes from the SVN book,

Forget SVN merges. SVN cannot do merges (SVN also cannot really do 
branches - what SVN calls branches is some abhorrent and stupid copy of a 
working tree with copying of the limited notion of history that SVN 
knows about).

> where it was described as diff+apply. Diff takes 2 arguments, and
> apply takes 1 argument (if the patch is implicit). However, cg-merge
> only appears to take one branch. (There again a use of the word
> branch! Wouldn't commit or revision be a more accurate term?)

(You're likely better off using just "raw git" rather than cg these days, 
so I'll talk about "git merge").

A "git merge" actually does have two branches: the current one, aka HEAD, 
and the one you want to merge _into_ the current one.

So when you do

	git merge other-branch

it will merge 'other-branch' into the current branch (HEAD).

And no, it's not a "diff+apply" (although early and *very* broken versions 
of cg implemented the data part that way), it's a much more interesting 
operation that figures out the last common point from the history, and 
does a series of three-way merges (especially if there were *multiple* 
independent common history points), and then records the set of parents 
in the result.

That, btw, is why SVN cannot do merges. It really *does* do a fancy 
"diff+apply" that probably involves three-way operations too, but since it 
doesn't actually remember the resulting history, it cannot be considered a 
"merge". It didn't really merge the history - it just smushed the 
*contents* of two branches together, and then totally threw out all the 
really important bits.
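
To show why recording the parents matters, here is a toy Python model 
of the history part of a merge. It is a sketch only: commits are single 
letters, `parents`, `merge_bases` and `record_merge` are invented names, 
and the content merge itself is left out.

```python
parents = {"A": [], "B": ["A"], "C": ["A"]}   # commit -> list of parents


def ancestors(commit):
    """All commits reachable from `commit`, including the commit itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge_bases(ours, theirs):
    """Common ancestors that are not themselves ancestors of another one."""
    common = ancestors(ours) & ancestors(theirs)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}


def record_merge(new_id, ours, theirs):
    """A merge commit remembers both sides as its parents."""
    parents[new_id] = [ours, theirs]


first_base = merge_bases("B", "C")   # the last common point of B and C
record_merge("D", "B", "C")          # D = merge of C into B
later_base = merge_bases("D", "C")   # C is now part of D's history
print(first_base, later_base)
```

Because D records C as a parent, a later merge from the same branch 
starts from C instead of from A, so already-merged work is not merged 
again. Without the second parent the base would stay at A forever.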

> Lastly, the most important question of all, which may answer many of
> the questions above:
> 
> Can you fill in the missing pieces, making corrections where
> necessary? (recommend unispace font)
> 
> Command     |   Reads               |   Writes
> cg-fetch    | remote branch         | corresponding branch in local repository
> cg-commit   | working copy          | HEAD
> cg-update   | remote branch         | working copy AND HEAD
> cg-merge    | branch & working copy | working copy
> cg-diff     | arguments             | STDOUT
> cg-push     |                       | remote branch (usually origin)
> cg-pull     | remote branch         |
> cg-restore  |                       |

I'll use the git names (which are generally the same)

  Command	| reads			| writes
  --------------+-----------------------+-----------
  git fetch	| remote branch(es)	| local branch(es)
  git commit	| local data		| HEAD
  git pull	| remote branch(es)	| HEAD
  git merge	| local branch(es)	| HEAD
  git diff	| local data		|
  git push	| local branch(es)	| remote branch(es)
  git reset	| ---			| HEAD

and everything that writes HEAD implicitly will always also update the 
working tree too (with the obvious exception of "git commit" - since it's 
filling in the HEAD with the current state, it's obviously not going to 
update the working tree).

The "local data" is really a combination of "local branches, staging area 
and working tree": neither "git diff" nor "git commit" really works purely 
on the working tree; they both mix the staging area, the working tree, 
and pure branch information depending on exact flags.

And note that most of the operations really can work on multiple branches 
(that's not true in cg). IOW, you can actually merge multiple branches in 
one go (the end result is called an "octopus merge", because it looks cool 
and has many "legs" when you see the merge history in a bottom-to-top kind 
of thing like gitk).

> On cg-fetch, is the remote branch necessarily remote? Or can you fetch
> from local

You can always consider the local tree to be a remote one: just use ".".

So

	git merge other-branch

is basically the same as

	git pull . other-branch

> cg-switch-branches? What does "corresponding branch in local
> repository" mean? Does cg-fetch touch your working copy?

Confusing cogito terminology.

The pure git stuff is actually clearer. And in git, you can specify what 
the "corresponding" branch is for any local branch. For example, if you 
just do the "git clone" of the git repository, then assuming you have a 
recent enough git, you can look into the ".git/config" file of the result, 
and you should see something like this:

	[remote "origin"]
		url = master.kernel.org:/pub/scm/git/git.git
		fetch = +refs/heads/*:refs/remotes/origin/*

	[branch "master"]
		remote = origin
		merge = refs/heads/master

which describes a remote repository ("origin") and tells you what branches 
should be fetched when you do a "git fetch origin", but it *also* 
describes the local branch "master", and says that when you do a "git 
pull", it should merge the *remote* branch "refs/heads/master" from 
"origin".

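To make that lookup concrete, here is a minimal Python sketch that reads
config text like the snippet above and works out what a bare "git pull"
on "master" would merge. The helpers parse_config() and upstream_of()
are made up for illustration and are not part of git; the parser is
deliberately simplistic (one value per key, no quoting rules).

```python
# Minimal sketch: which remote branch does "git pull" merge for a branch?
# parse_config() and upstream_of() are illustrative helpers, not git APIs.

CONFIG = """
[remote "origin"]
    url = master.kernel.org:/pub/scm/git/git.git
    fetch = +refs/heads/*:refs/remotes/origin/*

[branch "master"]
    remote = origin
    merge = refs/heads/master
"""

def parse_config(text):
    """Return {section header: {key: value}} for simple git-style config text."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue                      # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return sections

def upstream_of(sections, branch):
    """Return (remote, remote ref to merge, local tracking ref or None)."""
    b = sections['branch "%s"' % branch]
    remote, merge_ref = b["remote"], b["merge"]
    # The fetch refspec has the form [+]<source pattern>:<destination pattern>.
    src, dst = sections['remote "%s"' % remote]["fetch"].lstrip("+").split(":")
    tracking = None
    if src.endswith("*") and merge_ref.startswith(src[:-1]):
        # Substitute the part matched by "*" into the destination pattern.
        tracking = dst.replace("*", merge_ref[len(src) - 1:], 1)
    return remote, merge_ref, tracking

print(upstream_of(parse_config(CONFIG), "master"))
```

With the config shown, the remote is "origin", the branch merged is
"refs/heads/master" on that remote, and the fetch line maps it to the
local name "refs/remotes/origin/master".
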
> What is the difference between cg-restore and cg-seek?

Don't use them. Cogito confusion.

			Linus

* Re: svn user trying to recover from brain damage
  2007-05-09 15:30 svn user trying to recover from brain damage Joshua Ball
                   ` (2 preceding siblings ...)
  2007-05-09 16:57 ` Linus Torvalds
@ 2007-05-09 17:43 ` J. Bruce Fields
  2007-05-11 17:29 ` Jakub Narebski
  4 siblings, 0 replies; 8+ messages in thread
From: J. Bruce Fields @ 2007-05-09 17:43 UTC (permalink / raw)
  To: Joshua Ball; +Cc: git

On Wed, May 09, 2007 at 10:30:18AM -0500, Joshua Ball wrote:
> 1. Tutorials for people brand new to version control, with just enough
> information for them to "obey the rules", but completely empty of any
> information that could help them exploit the real power of
> decentralized version control.
> 2. Technical documentation which assumes pre-obtained knowledge.

Would you mind also trying

http://www.kernel.org/pub/software/scm/git/docs/user-manual.html

It *tries* to fill that gap you're referring to, but you might find it
has some more gaps (which we'd like to hear about).

> What the heck do these terms mean? The glossary on the Git wiki was
> unhelpful (I'll explain later).

Hm.  I wonder how that and the glossary included in Documentation/ have
diverged?

--b.

* Re: svn user trying to recover from brain damage
  2007-05-09 16:22 ` Petr Baudis
@ 2007-05-09 20:16   ` Jan Hudec
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Hudec @ 2007-05-09 20:16 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Joshua Ball, git

On Wed, May 09, 2007 at 18:22:59 +0200, Petr Baudis wrote:
> > What is a merge? My understanding of merge comes from the SVN book,
> > where it was described as diff+apply. Diff takes 2 arguments, and
> > apply takes a 1 argument (if the patch is implicit). However, cg-merge
> > only appears to take one branch. (There again a use of the word
> > branch! Wouldn't commit or revision be a more accurate term?) Why does
> > cg-merge only take one argument? Even if I use the -b switch, I'm
> > still only up to two arguments. Where is the hidden argument?
> 
>   The hidden argument is your current branch. So cg-merge x will merge
> the branch 'x' to your current branch: symbolically, kind of
> 
> 	base=-b argument | base(HEAD, x)
> 	apply(HEAD, diff(base, x))
> 
>   The word 'branch' is used in an attempt to make it all less confusing
> :-). But in fact, you can give cg-merge just id of a commit, it does not
> have to be branch name.

I believe the important thing to explain here is the BASE, as that is really
the missing argument.

The Subversion Book describes merge as diff + apply. Diff takes 2 arguments,
OLD and NEW, and apply takes 2 arguments, TARGET and the result of diff. That
gives us 3 arguments in total. 2 of them are passed to merge and the third is
the current state of the working tree.

Now in git (and in any other version control tool), merge is still diff
+ apply[1]. The TARGET is again implied by the working tree. The argument to
git merge is the NEW. So where is the missing OLD?

The answer is simple: it is implied by the history! It is the most recent
common ancestor of NEW and TARGET, or in other words the latest revision that
is a predecessor of both revisions being merged.
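
As an illustration of "most recent common ancestor", here is a small
Python sketch over a toy history. The commit names (T1, T2, N1) and the
helper names are invented for the example, and this is not how git
implements it; it only shows the definition at work.

```python
# Sketch of "most recent common ancestor" on a toy history.
# PARENTS maps each (made-up) commit name to the list of its parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],      # last commit shared by both lines of development
    "T1": ["D"],     # TARGET side (the current branch)
    "T2": ["T1"],
    "N1": ["D"],     # NEW side (the branch being merged)
}

def ancestors(commit, parents=PARENTS):
    """Return the set containing commit and everything reachable from it."""
    seen, todo = set(), [commit]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return seen

def merge_bases(a, b, parents=PARENTS):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return sorted(
        c for c in common
        if not any(c != o and c in ancestors(o, parents) for o in common)
    )

print(merge_bases("T2", "N1"))   # ['D']
```

The function returns a list because a history with criss-crossing
merges can have more than one such ancestor; in the simple case above
there is exactly one, and it plays the role of OLD.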

An important distinction between Subversion and git here is that in git BOTH
NEW and TARGET are recorded as parents of the new commit created by the merge.
This means that repeated merges just work, without any need to look in the
logs or anywhere else for which changes need to be applied.

The OLD, NEW and TARGET argument names are derived from diff/patch
terminology. More common (also in git) is to call them BASE, REMOTE and LOCAL
respectively.

It might be interesting to note that merging is a *symmetrical* operation.
Swapping LOCAL and REMOTE gives the same result, except for the order in
which the parents are recorded in the commit object and the order in which
conflicting sections are written out in case of a conflict.

This property is not in any way special to git. It is a fundamental property
of patches. Git just cares very little about the order.
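
The symmetry can be seen in a toy three-way merge that works on whole
files (a dict mapping path to content). This is only a sketch under
that simplifying assumption; merge3() is a made-up helper, and a real
merge works inside files as well.

```python
# Toy three-way merge on snapshots (dict: path -> content).
# merge3() is an illustrative helper, not git's algorithm.

def merge3(base, local, remote):
    """Return (merged snapshot, list of conflicting paths)."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(path), local.get(path), remote.get(path)
        if l == r:            # both sides agree (or both deleted the file)
            result = l
        elif l == b:          # only REMOTE changed it
            result = r
        elif r == b:          # only LOCAL changed it
            result = l
        else:                 # both changed it, differently
            conflicts.append(path)
            result = l        # keep LOCAL's version as a placeholder
        if result is not None:
            merged[path] = result
    return merged, conflicts

base   = {"README": "v1", "main.c": "old code", "old.txt": "x"}
local  = {"README": "v2", "main.c": "old code", "old.txt": "x"}
remote = {"README": "v1", "main.c": "new code"}   # also deleted old.txt

print(merge3(base, local, remote))
print(merge3(base, remote, local))   # same result with the sides swapped
```

Only the conflict branch treats the two sides differently, which
matches the remark that swapping them changes nothing but how
conflicts are presented.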

[1] The 3-way merge algorithm is not diff+apply internally, but is
    equivalent to diff+apply with full context (the whole file is kept),
    except for the way it marks conflicts.

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>

* Re: svn user trying to recover from brain damage
  2007-05-09 15:30 svn user trying to recover from brain damage Joshua Ball
                   ` (3 preceding siblings ...)
  2007-05-09 17:43 ` J. Bruce Fields
@ 2007-05-11 17:29 ` Jakub Narebski
  4 siblings, 0 replies; 8+ messages in thread
From: Jakub Narebski @ 2007-05-11 17:29 UTC (permalink / raw)
  To: git

[Cc: Joshua Ball <sciolizer@gmail.com>, git@vger.kernel.org]

Joshua Ball wrote:

> What the heck do these terms mean? The glossary on the Git wiki was
> unhelpful (I'll explain later).

The glossary at GitWiki, http://git.or.cz/gitwiki/GitGlossary, is a
wikification of the "GIT Glossary" from Documentation/glossary.txt,
distributed with git-core and usually installed under

  /usr/share/doc/git-core-<version>/glossary.html

but also available at

  http://www.kernel.org/pub/software/scm/git/doc/glossary.html

The wiki page was created so that GitWiki would be self-contained
(so that other pages can link to an anchor in GitGlossary when
referring to a term that needs explanation), and so that links to
wiki pages could be added to the wikified GitGlossary. The wiki page
is probably a bit outdated.

> head
> head ref
> working tree
> object
> branch
> merge
> master
> commit (as in the phrase "bring the working tree to a given commit")
> 
> While the Git wiki does in fact define all of these, it doesn't answer
> any of my questions about those terms:
[cut]

I hope that the following mini-tutorial with some explanations will
help you understand those terms, and clear up some SVN misconceptions.


In git, the history itself is separate from the references to it; when
cloning or fetching from another repository, you get the missing parts of
the history and append them, but the refs on the remote and on the local
side do not need to have the same names. In the ASCII-art graphs below,
objects in the "object database" (the history) are on the left, and
references to the history are on the right.

 /------ object database -----------\  /------- refs --------\

Let's start with the following history (the following repository
structure)

   A <-- B <-- C <-- D <----------------- master <------- HEAD


$ git branch branchA
(does not change working directory)

   A <-- B <-- C <-- D <----------------- master <------- HEAD
                      \
                       \
                        \--------------- branchA

Branching does not create a copy of the revisions so far (not even a
cheap copy, as in Subversion). You can always find the place where
branches diverge; it is recorded in the repository.
"git merge-base master branchA" returns the [id of] revision D.

Creating a branch is just creating a pointer (reference) to some
commit. A head ref, or just a head, is this pointer, e.g. 'branchA'
(it resides in $GIT_DIR/refs/heads/branchA). Commit D is often
called the branch tip. A branch as a non-cyclical graph of revisions
is, in the case of 'branchA', the history of commit D including
that commit, i.e. the A<--B<--C<--D DAG.

HEAD (case sensitive, all uppercase) is the current branch; usually
it is a pointer to some branch head.
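
A toy Python model may help here. It treats branch heads as entries in
a dict and HEAD as a string naming one of them. The function names are
invented for illustration, and real git keeps this information in files
under $GIT_DIR, so take it as a sketch of the idea only.

```python
# Toy model of refs: a branch head is a "name -> commit id" entry, and
# HEAD normally names a branch head instead of holding a commit id.

refs = {"refs/heads/master": "D"}     # branch heads: name -> tip commit
HEAD = "ref: refs/heads/master"       # symbolic ref to the current branch

def resolve(head, refs):
    """Return the commit id that head finally points to."""
    prefix = "ref: "
    if head.startswith(prefix):
        return refs[head[len(prefix):]]
    return head                        # HEAD holding a commit id directly

def create_branch(refs, name, commit):
    """Like 'git branch name': add one pointer; no commit is copied."""
    refs["refs/heads/" + name] = commit

create_branch(refs, "branchA", resolve(HEAD, refs))
print(refs)                  # both heads point at the same commit D
print(resolve(HEAD, refs))   # D
```

Creating 'branchA' adds one dict entry and touches nothing else, which
is why branching is cheap and why both names lead to the same history.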


$ git checkout branchA
(changes working directory, updates HEAD)

   A <-- B <-- C <-- D <----------------- master      /- HEAD
                      \                              /
                       \                            /
                        \--------------- branchA <-/

The two steps above can be combined into a single command:
$ git checkout -b branchA


$ edit; edit; ... (changes working directory)
$ git commit -a
(this creates a new commit object E and updates the branchA ref, i.e.
 the ref pointed to by HEAD, a.k.a. the current branch [head])

   A <-- B <-- C <-- D <----------------- master       /- HEAD
                      \                               /
                       \                             /
                        \- E <----------- branchA <-/

Committing (commit as verb) creates commit object E (commit as noun),
and advances the branch head to the newly created commit.


$ git checkout master
$ edit; edit; ...
$ git commit -a

   A <-- B <-- C <-- D <-- F <------------ master <----- HEAD
                      \
                       \
                        \- E <----------- branchA

Note that "git commit" advances the current branch head (the tip of the
current branch), i.e. the branch pointed to by the HEAD reference.
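
The last few steps can be replayed in a small Python model. Repo is a
made-up class that tracks only parents, branch heads and the name HEAD
points to; the working directory is left out entirely.

```python
# Toy model: "commit" adds a new object whose parent is the current tip,
# then moves the branch that HEAD names. No other branch is touched.

class Repo:
    def __init__(self):
        self.parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
        self.heads = {"master": "D", "branchA": "D"}
        self.head = "master"            # name of the current branch

    def checkout(self, branch):
        self.head = branch              # only HEAD moves

    def commit(self, new_id):
        self.parents[new_id] = [self.heads[self.head]]
        self.heads[self.head] = new_id  # advance the current branch only

repo = Repo()
repo.checkout("branchA")
repo.commit("E")
repo.checkout("master")
repo.commit("F")

print(repo.heads)                              # master at F, branchA at E
print(repo.parents["E"], repo.parents["F"])    # both have D as parent
```

After these steps the two heads have diverged from D, just as in the
graph above.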


$ git merge branchA
This does the equivalent of "diff3 -E F D E", i.e. a 3-way merge at the
file level, or "diff3 -E HEAD $(git merge-base HEAD branchA) branchA"

   A <-- B <-- C <-- D <-- F <-- G <------ master <----- HEAD
                      \        /
                       \      /
                        \- E- <---------- branchA

Merging (merge as verb) creates merge commit [object] G (merge as noun).
Commit object G has commits F and E as parents (more than one parent).
The information that G is the result of a merge is recorded in commit
[object] G.


If you decide that you want to discard the merge, for example because
you don't want to merge yet, i.e. you want to return to the state before
the merge, you can do:

$ git reset --hard ORIG_HEAD
(updates current branch, *does not* update HEAD as a ref,
 contrary to git-checkout)

                           |------\
                           v       \
   A <-- B <-- C <-- D <-- F <-- G  \----- master <----- HEAD
                      \        /
                       \      /
                        \- E- <---------- branchA
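
The difference between the two commands can be sketched in Python with
two dicts. This is a toy model with invented function names; it ignores
the working directory, which both real commands also update.

```python
# Toy comparison of "git reset --hard <commit>" and "git checkout <branch>".

heads = {"master": "G", "branchA": "E"}
state = {"HEAD": "master", "ORIG_HEAD": "F"}   # ORIG_HEAD: tip before the merge

def reset_hard(commit):
    """Move the current branch to commit; HEAD keeps naming the same branch."""
    heads[state["HEAD"]] = commit

def checkout(branch):
    """Make HEAD name another branch; no branch head moves."""
    state["HEAD"] = branch

reset_hard(state["ORIG_HEAD"])
print(state["HEAD"], heads)   # HEAD still names master, which is now at F

checkout("branchA")
print(state["HEAD"], heads)   # HEAD names branchA; the heads are unchanged
```

Commit G is not deleted by the reset in this model either; it simply
has no branch head pointing at it any more.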


Now assume the situation in the graph before the reset (merge commit G
is still in place):

$ git checkout branchA
$ edit; edit; ...
$ git commit -a
(creates commit H)
$ git checkout master
$ git merge branchA
This time, because the merge is recorded as such, the merge base between
the 'master' branch (the branch we merge into) and 'branchA' (the branch
being merged) is commit E. Git knows that it has to merge only the changes
accumulated since the last merge.

   A <-- B <-- C <-- D <-- F <-- G <- J <- master <----- HEAD
                      \        /     /
                       \      /     /
                        \- E-<-- H <------ branchA
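
Why recording both parents matters can be checked with a short Python
sketch. It lists the commits that 'branchA' has and 'master' lacks at
the moment of the second merge (master at G, branchA at H), first with
G recorded as a real merge and then in a hypothetical history where G
had only F as parent. The helper names are made up for illustration.

```python
# Which commits are reachable from branchA (H) but not from master (G)?

def reachable(tip, parents):
    """Return the set containing tip and all of its ancestors."""
    seen, todo = set(), [tip]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return seen

def missing(ours, theirs, parents):
    """Commits reachable from theirs but not from ours."""
    return sorted(reachable(theirs, parents) - reachable(ours, parents))

history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"],
           "E": ["D"], "F": ["D"], "H": ["E"],
           "G": ["F", "E"]}             # merge commit: two parents

print(missing("G", "H", history))       # ['H']

# Hypothetical: the merge result stored as an ordinary one-parent commit.
unrecorded = dict(history, G=["F"])
print(missing("G", "H", unrecorded))    # ['E', 'H']
```

With the merge recorded, only H is left to merge; without it, E would
look unmerged and its changes would be considered again.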

 
> In the words of Dijkstra, "Since breaking out of bad habits, rather
> than acquiring new ones, is the toughest part of learning, we must
> expect from that system permanent mental damage for most ... exposed
> to it."
> 
> May you lead me to a quick recovery. Hail to decentralized version control.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

Thread overview: 8+ messages
2007-05-09 15:30 svn user trying to recover from brain damage Joshua Ball
2007-05-09 16:02 ` Carl Worth
2007-05-09 16:12   ` Karl Hasselström
2007-05-09 16:22 ` Petr Baudis
2007-05-09 20:16   ` Jan Hudec
2007-05-09 16:57 ` Linus Torvalds
2007-05-09 17:43 ` J. Bruce Fields
2007-05-11 17:29 ` Jakub Narebski
