From: Jakub Narebski <jnareb@gmail.com>
To: Thore Husfeldt <thore.husfeldt@gmail.com>
Cc: git@vger.kernel.org, Jakub Narebski <jnareb@gmail.com>
Subject: Re: Git terminology: remote, add, track, stage, etc.
Date: Mon, 18 Oct 2010 14:57:19 -0700 (PDT)
Message-ID: <m3ocar5fmo.fsf@localhost.localdomain>
In-Reply-To: <8835ADF9-45E5-4A26-9F7F-A72ECC065BB2@gmail.com>

Thore Husfeldt <thore.husfeldt@gmail.com> writes:

> I've just learned Git. What a wonderful system, thanks for building
> it.
>
> And what an annoying learning experience.
>
> I promised myself to try to remember what made it all so hard, and to
> write it down in a comprehensive and possibly even constructive
> fashion. Here it is, for what it's worth. Read it as the friendly, but
> somewhat exasperated suggestions of a newcomer. I'd love to help (in
> the form of submitting patches to the documentation or CLI responses),
> but I'd like to test the waters first.

Thank you very much for writing those down. It is very helpful for
those of us who are used to Git and know its sometimes obscure jargon
by heart, and so might not notice that it is hard to understand.

> Remote (tracking) branches
> --------------------------
>
> There are at least two uses of the word *tracking* in Git's
> terminology.
>
> The first, used in the form `git tracks a file' (in the sense that Git
> knows about the file) is harmless enough, and is handled under `git
> add` below.

In this sense of "tracked", i.e. "tracked file", it means that the
given file is versioned / is under version control.

Though I don't think we use `git tracks a file` anywhere in the
documentation and messages (at least I hope so); we use `tracked file`.

I think it is all right for `tracked file` and `"tracked" branch`
to mean different things.

> But the real monster is the *tracking branch*, sometimes called the
> remote branch, the remote-tracking branch, or the remote tracking
> branch. Boy did that ever confuse me. [...]
>
> Please, *please* fix this. It was the single most confusing and
> annoying part of learning Git.
>
> First, the word, "tracking". These branches don't track or follow
> anything. They are standing completely still. Please believe me that
> when first you are led to believe that origin/master tracks a branch
> on the remote (like a hound tracks its quarry, or a radar tracks a
> flight) that it is very difficult to hunt this misunderstanding down:
> I believed for a long time that the tracking branch stayed in sync,
> automagically, with a synonymous branch at the remote.

But those 'remote-tracking branches' are *used* to track where the
branches in the remote repository are.
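
To make this concrete, here is a minimal sketch using two throwaway
repositories on the local disk. The directory names `upstream` and
`clone`, the file name and the commit messages are made up for the
illustration, and a real remote would of course live somewhere else.
It is meant to show that 'origin/master' records where the remote's
'master' was the last time we looked, and moves only when we fetch:

```shell
set -e
tmp=$(mktemp -d)            # scratch directory; every name below is made up
cd "$tmp"

# A stand-in for the remote repository, with one commit on 'master'.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "A U Thor"
git config user.email author@example.com
echo one >file
git add file
git commit -q -m "first"
cd ..

# Cloning creates the remote-tracking branch 'origin/master'.
git clone -q upstream clone

# The remote moves on ...
cd upstream
echo two >>file
git commit -q -a -m "second"
upstream_tip=$(git rev-parse master)
cd ../clone

# ... but 'origin/master' stands still until we ask git to look again.
before_fetch=$(git rev-parse origin/master)
git fetch -q origin
after_fetch=$(git rev-parse origin/master)

echo "upstream master:            $upstream_tip"
echo "origin/master before fetch: $before_fetch"
echo "origin/master after fetch:  $after_fetch"
```

The first two lines of output should differ and the last should match
the first, which is the "standing still" behaviour described above.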

Sidenote: be thankful that you didn't start using git before version
1.5.0, when the so-called "separate remote" layout was made the default
(which means tracking branch 'foo' in remote 'origin' using the
'origin/foo' remote-tracking branch).

[...]
> The hyphenated *remote-tracking* is a lot better terminology already
> (and sometimes even used in the documentation), because at least it
> doesn't pretend to be a remote branch (`git branch -r`, of course,
> still does). So that single hyphen already does some good, and should
> be edited for consistency. [...]

The name 'remote-tracking branch' is the one we arrived at after long
discussions not that long ago, and it is the name that should be used
throughout the documentation. It is an ongoing effort.

> [...] It may be that terminology is slowly converging. (To something
> confusing, but still...)
[...]
> More radically, I am sure some head scratching would be able to find
> useful terminology for master, origin/master, and origin's master. I'd
> love to see suggestions. As I said, I admire how wonderfully simple
> and clean this has been implemented, and the documentation, CLI, and
> terminology should reflect that.

There is also the additional complication that the relation local
branch 'master' has to the 'origin/master' remote-tracking branch can
just as well exist between two local branches.

We nowadays say that 'origin/master' is the "upstream" of the 'master'
branch; we used to say that the 'master' branch "tracks" the
'origin/master' branch (which can still be seen in the name of the
`--track' option to 'git branch').
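
As a small sketch of the two-local-branches case (the repository and
the branch name 'topic' are invented for the example), `--track` can
be given a local branch as the starting point, and the upstream can
then be queried with the `@{upstream}` suffix:

```shell
set -e
tmp=$(mktemp -d)            # scratch directory; names below are made up
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "A U Thor"
git config user.email author@example.com
echo one >file
git add file
git commit -q -m "first"

# 'topic' is a local branch whose upstream is another *local* branch.
git branch --track topic master

upstream_of_topic=$(git rev-parse --abbrev-ref 'topic@{upstream}')
remote_of_topic=$(git config branch.topic.remote)

echo "upstream of topic:   $upstream_of_topic"
echo "branch.topic.remote: $remote_of_topic"
```

Here the configured "remote" is `.`, i.e. the current repository, which
is how the same relation is recorded without any remote being involved.
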
> The staging area
> ----------------
>
> The wonderful and central concept of staging area exists under at
> least three names in Git terminology. And that's really, really
> annoying. The index, the cache, and the staging area are all the same,
> which is a huge revelation to a newcomer.

This inconsistency is the result of historical issues; the concrete
object that is used as a mediator between the working area and the
repository was first called the 'dircache', and is now called 'the
index'.

There was a strong push towards replacing 'index' and 'cache' with
'staging area' (and 'to stage' as a verb), but it has met with some
resistance.

> 2. Introduce the alias `git unstage` for `git reset HEAD` in the
> standard distribution.

That is IMHO a very good idea. The `git unstage <file>` form
describes what we want to achieve (user story), while `git reset HEAD
<file>` requires us to know which operation we must perform in order to
remove staged changes from a file.
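
Until something like that is shipped, anybody can define such an alias
for themselves. The following is only a sketch: the alias is set in a
throwaway repository, `unstage` is not a command that git provides, and
the `-q` and `--` in the alias body are my own additions:

```shell
set -e
tmp=$(mktemp -d)            # scratch directory; names below are made up
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "A U Thor"
git config user.email author@example.com
echo one >file
git add file
git commit -q -m "first"

# Repository-local alias; use 'git config --global' to have it everywhere.
git config alias.unstage 'reset -q HEAD --'

echo two >>file
git add file
staged_before=$(git diff --cached --name-only)

git unstage file            # expands to: git reset -q HEAD -- file
staged_after=$(git diff --cached --name-only)
modified_after=$(git diff --name-only)

echo "staged before:  '$staged_before'"
echo "staged after:   '$staged_after'"
echo "still modified: '$modified_after'"
```

The change is taken out of the index but stays in the working tree.
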
> 3. Duplicate various occurrences of `cached` flags as `staged` (and
> change the documentation and man pages accordingly), so as to have,
> e.g., `git diff --staged`.

Note that it is not as easy as it seems at first glance. There are
*two* such options, which (as you can read in the gitcli(7) manpage)
have slightly different meanings:

 * The `--cached` option is used to ask a command that
   usually works on files in the working tree to *only* work
   with the index. For example, `git grep`, when used
   without a commit to specify from which commit to look for
   strings in, usually works on files in the working tree,
   but with the `--cached` option, it looks for strings in
   the index.

 * The `--index` option is used to ask a command that
   usually works on files in the working tree to *also*
   affect the index. For example, `git stash apply` usually
   merges changes recorded in a stash to the working tree,
   but with the `--index` option, it also merges changes to
   the index as well.

Some commands like `git apply` support both (though not at the same
time).
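
The two examples from the manpage can be tried out as follows. This is
just a sketch in a scratch repository; the file name and the two marker
strings are invented so that it is easy to see which copy of the file
each command looked at:

```shell
set -e
tmp=$(mktemp -d)            # scratch directory; names below are made up
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "A U Thor"
git config user.email author@example.com
echo committed >file
git add file
git commit -q -m "first"

# One change goes to the index, a later one stays in the working tree only.
echo staged-marker >>file
git add file
echo worktree-marker >>file

# Plain 'git grep' looks at the working tree files ...
grep_worktree=$(git grep -c worktree-marker -- file || true)
# ... with --cached it looks *only* at the index, which lacks the last edit.
grep_index=$(git grep --cached -c worktree-marker -- file || true)
grep_index_staged=$(git grep --cached -c staged-marker -- file || true)
echo "working tree: '$grep_worktree', index: '$grep_index'"

# 'git stash apply' restores the changes to the working tree only ...
git stash -q
git stash apply -q
staged_plain=$(git diff --cached --name-only)

# ... with --index it *also* restores what had been staged.
git reset -q --hard
git stash apply -q --index
staged_with_index=$(git diff --cached --name-only)
echo "staged after plain apply: '$staged_plain'"
echo "staged after --index:     '$staged_with_index'"
```

So a blanket rename of `--cached` to `--staged` would have to decide
what to do about `--index`, too.
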
> git status
> ----------
[...]
> 2.
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
>
> should be
>
> Untracked files:
> (use "git track <file>" to track)

To "track a file" means to put the file under version control (to
version-control the file).

Note also that "git track <file>" would be "git add -N <file>"
(where `-N` is `--intent-to-add`), which only marks a file to be
tracked / versioned, but doesn't stage its contents.
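
A quick way to see the difference, sketched in a scratch repository
(the file names are made up): after `git add -N` the file is no longer
reported as untracked, but its contents still show up as a difference
between the working tree and the index, i.e. they are not staged.

```shell
set -e
tmp=$(mktemp -d)            # scratch directory; names below are made up
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "A U Thor"
git config user.email author@example.com
echo one >file
git add file
git commit -q -m "first"

echo draft >new.txt
untracked_before=$(git ls-files --others --exclude-standard)

git add -N new.txt          # -N is --intent-to-add

untracked_after=$(git ls-files --others --exclude-standard)
tracked=$(git ls-files -- new.txt)
# The contents live only in the working tree, so they are still "unstaged".
unstaged=$(git diff --name-only)

echo "untracked before: '$untracked_before'"
echo "untracked after:  '$untracked_after'"
echo "known to git:     '$tracked'"
echo "unstaged changes: '$unstaged'"
```
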
> Adding
> ------
>
> The tutorial tells us that
>
> Many revision control systems provide an add command that tells
> the system to start tracking changes to a new file. Git's add
> command does something simpler and more powerful: git add is used
> both for new and newly modified files, and in both cases it takes
> a snapshot of the given files and stages that content in the
> index, ready for inclusion in the next commit.
>
> This is true, and once you grok how Git actually works it also makes
> complete sense. `Making the file known to Git' (sometimes called
> `tracking the file') and `staging for the next commit' result in the
> exact same operations, from Git's perspective.
>
> But this is a good example of what's wrong with the way the
> documentation thinks: Git's implementation perspective should not
> define how concepts are explained. In particular, *tracking* (in the
> sense of making a file known to git) and *staging* are conceptually
> different things.

But they are not independent. When you stage the contents of a file
which was not known to git, it is automatically made "tracked", i.e. put
under version control. Obvious.

> In fact, the two things remain conceptually
> different later on: un-tracking (removing the file from Git's
> worldview) and un-staging are not the same thing at all, neither
> conceptually nor implementationally. The opposite of staging is `git
> reset HEAD <file>` and the opposite of tracking is -- well, I'm not
> sure, actually. Maybe `git update-index --force-remove <filename>`?

`git rm <filename>` to remove it from both the staging area and the
working area, or `git rm --cached <filename>` to remove it only from the
staging area, which means that it is removed from version control but
kept on disk.
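
For example (a sketch in a scratch repository, with two made-up file
names so that both forms can be shown side by side):

```shell
set -e
tmp=$(mktemp -d)            # scratch directory; names below are made up
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "A U Thor"
git config user.email author@example.com
echo a >a.txt
echo b >b.txt
git add a.txt b.txt
git commit -q -m "first"

git rm -q a.txt             # gone from the staging area and from disk
git rm -q --cached b.txt    # gone from the staging area only

tracked=$(git ls-files)
[ -e a.txt ] && a_on_disk=yes || a_on_disk=no
[ -e b.txt ] && b_on_disk=yes || b_on_disk=no

echo "still tracked: '$tracked'"
echo "a.txt on disk: $a_on_disk"
echo "b.txt on disk: $b_on_disk"
```

Neither file is tracked afterwards, but b.txt is left alone on disk (and
will show up as untracked).
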
[...]
> Fixing this requires no change to the implementation. `git stage` is
> already a synonym for `git add`. It merely requires discipline in
> using the terminology of staging. Note that it is completely valid to
> tell the reader, maybe immediately and in a footnote, that `git add`
> and `git stage` *are* indeed synonyms, because of Git's elegant
> model. In fact, given the amount of documentation cruft one can find
> on the Internet, this would be a welcome footnote.
>
> An even more radical suggestion (which would take all of 20 seconds to
> implement) is to introduce `git track` as another alias for `git
> add`. (See above under `git status`). This would be especially useful
> if tracking *branches* no longer existed.

Well, there is a different suggestion: make `git stage`, `git track` and
`git mark-resolved` *specializations* of `git add`, with added
safety checks: 'git stage' would work only on files known to git /
under version control already, 'git track' would work only on
untracked files (and do what 'git add -N' does), and 'git mark-resolved'
would work only on files which were part of a merge conflict.
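
To show what I mean by safety checks, here is a rough sketch as shell
functions. None of these commands exists in git; the function names,
the error messages and the way the checks are done (`git ls-files
--error-unmatch` for "known to git", `git ls-files -u` for "has a merge
conflict") are only my assumptions about how one could do it:

```shell
set -e

# Hypothetical helpers; none of these is an existing git command.
is_tracked() { git ls-files --error-unmatch -- "$1" >/dev/null 2>&1; }

stage() {           # like 'git add', but only for files git already knows
    is_tracked "$1" || { echo "stage: '$1' is not tracked" >&2; return 1; }
    git add -- "$1"
}

track() {           # like 'git add -N', but only for untracked files
    if is_tracked "$1"; then
        echo "track: '$1' is already tracked" >&2
        return 1
    fi
    git add -N -- "$1"
}

mark_resolved() {   # like 'git add', but only for files with a conflict
    [ -n "$(git ls-files -u -- "$1")" ] ||
        { echo "mark_resolved: '$1' has no merge conflict" >&2; return 1; }
    git add -- "$1"
}

# Try them out in a scratch repository (names are made up).
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "A U Thor"
git config user.email author@example.com
echo one >file
git add file
git commit -q -m "first"
echo draft >new.txt

stage new.txt 2>/dev/null && stage_untracked=accepted || stage_untracked=refused
track new.txt && track_untracked=accepted || track_untracked=refused
track file 2>/dev/null && track_tracked=accepted || track_tracked=refused
stage new.txt && stage_after_track=accepted || stage_after_track=refused
mark_resolved file 2>/dev/null && resolve_clean=accepted || resolve_clean=refused

echo "stage on untracked file:      $stage_untracked"
echo "track on untracked file:      $track_untracked"
echo "track on tracked file:        $track_tracked"
echo "stage after track:            $stage_after_track"
echo "mark_resolved, no conflict:   $resolve_clean"
```
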
> There's another issue with this, namely that added files are
> immediately staged. In fact, I do understand why Git does that, but
> conceptually it's pure evil: one of the conceptual cornerstones of
> Git -- that files can be tracked and changed yet not staged, i.e., the
> staging area is conceptually a first-class citizen -- is violated
> every time a new file is born. Newborn files are *special* until
> their first commit, and that's a shame, because the first thing the
> new file (and, vicariously, the new user) experiences is an
> aberration. I admit that I have not thought this through.
--
Jakub Narebski
Poland
ShadeHawk on #git