From: Neal Kreitzinger <nkreitzinger@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: Federico Galassi <federico.galassi@gmail.com>, git@vger.kernel.org
Subject: Re: Question about your comment on the git parable
Date: Mon, 27 Feb 2012 20:41:45 -0600 [thread overview]
Message-ID: <4F4C3EE9.60709@gmail.com> (raw)
In-Reply-To: <201202261606.25599.jnareb@gmail.com>
On 2/26/2012 9:06 AM, Jakub Narebski wrote:
> On Sun, 26 Feb 2012, Federico Galassi wrote:
>> On 26/feb/2012, at 12:29, Jakub Narebski wrote:
>>
>>> Would you mind if this discussion was moved to git mailing
>>> list (git@vger.kernel.org), of course always with copy directly
>>> to you? There are people there that can answer your questions
>>> better.
>>
>> No problem.
>>
>>> On Sun, 26 Feb 2012, Federico Galassi wrote:
>>>> Hello, i think you're the author of these comments:
>>>> http://news.ycombinator.com/item?id=616610
>>>>
>>>> I'm doing educational work on git based on the parable (talks,
>>>> articles, etc..) and i'd like to improve on the real reason
>>>> for a staging area.
>>>>
>>>> My question basically is: why is it really needed for merging?
>>>> I mean, given the fictional git-like system of the parable,
>>>> if I need to merge 2 snapshots i could:
>>>>
>>>> 1) search the commit tree for a base point
> [...]
>>>> 2) compare the diffs between the snapshots and the base point snapshot
>>>> 3) if a conflict happens (change in the same line), just leave
>>>> something in the working dir to mark the conflict. For example,
>>>> keeping it simple, the system could reject a new commit until
>>>> the markers of the conflict are removed from the conflicting file.
>>>>
>>>> Couldn't it just work this way?
>>>
>>> Well, it could; that is how many if not most of other version control
>>> systems work.
>>>
>>>
>>> There are (at least!) three problems with that approach. First, sometimes
>>> it is not possible to "leave something in the working dir to mark the
>>> conflict". Take for example case where binary file (e.g. image) was
>>> changed, and textual 3-way diff file-merge algorithm wouldn't work.
>>>
>>> Second, what to do in the case of *tree-level* conflict, for example
>>> rename/rename conflict, where one side renamed file to different
>>> name (moved to different place) than the other side. There are no
>>> conflict markers for this...
>>>
>>> Third, what about false positives with detecting conflict markers,
>>> i.e. the case where "rejecting new commit until conflict markers are
>>> removed", for example AsciiDoc files can be falsely detected as having
>>> partial conflict markers, and of course test vectors for testing conflict
>>> would have to have conflict markers in them.
>>
>> Ok, it's clear to me that the markers in file approach is just a little
>> bit too simple. Do you see any concrete advantage in the staging area
>> compared to, say, tree conflict metadata in the working dir and maybe
>> a dedicated smart "resolve conflict" command?
>
> First, for such _local_ information working directory isn't the best place.
> What if you accidentally delete this? It is not and should not be
> committed to repository,so there is no way to undelete it, except redoing
> merge and losing all your progress so far in resolving merge conflicts.
> It is much better to put such information somewhere in administrative
> area[1] of repository.
>
> Second, if we have staging area where we store information about which
> files are tracked, and a bunch of per-file metadata like modification time
> for better performance, why not use it also for storing information about
> merge in progress?
>
> [1]: Name taken from "Version Control by Example" (free e-book) by
> Eric Sink.
>
>
> There is also a thing very specific to Git, namely that "git add" adds
> a current content of a file to object database of a repository (though
> with modern git there is also "git add --intent-to-add" which works
> like add-ing file in other version control systems)... and you have to
> store reference to newly created object somewhere so that it doesn't get
> garbage-collected.
>
>>>> Can you mention other situations in which the pattern "files to be added"
>>>> is either mandatory or really helpful?
>>>
>>> Note that any version control system must have a kind of proto-staging
>>> area to know which files are to be added in next commit.
>>>
>>> If you do
>>>
>>> $ scm add file.c
>>>
>>> then version control system must save somewhere that 'file.c' is to be
>>> tracked (to be added in next commit).
>>
>> Yes, the fictional vcs just tracked all the files in the working dir.
>> Being selective on which file to track is of course another interesting
>> feature.
>
> IRL it is a _necessary_ feature. One of more common, if not most common
> application of version control system is to manage source files for a
> computer program. And there you have object files, executables and other
> _generated_ files which shouldn't be put in version control, not to
> mention backups created by your editor / IDE (e.g. "*~" files in Unix
> world, "*.bak" files in MS Windows world).
>
> Not to mention files which you have added to working directory, but are
> not ready to be added to new commit.
>
In the google tech talk "Contributing to Git":
http://www.youtube.com/watch?v=j45cs5_nY2k , at the 44:00 min mark the
index is discussed. It notes that many scm's have an "index" but hide
it from you. Some of the advantages of git giving you access to the
index are also discussed.
v/r,
neal
prev parent reply other threads:[~2012-02-28 2:41 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <A98A438D-76DD-41B5-B8E1-6FA170B00801@gmail.com>
[not found] ` <201202261303.38957.jnareb@gmail.com>
[not found] ` <4B4C5353-9820-4068-92DA-50665B1011E1@gmail.com>
2012-02-26 14:10 ` Question about your comment on the git parable Jakub Narebski
[not found] ` <201202261229.51199.jnareb@gmail.com>
[not found] ` <1E5ECB5A-595A-4B04-8269-6E35BF3FEA1A@gmail.com>
2012-02-26 15:06 ` Jakub Narebski
2012-02-28 2:41 ` Neal Kreitzinger [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F4C3EE9.60709@gmail.com \
--to=nkreitzinger@gmail.com \
--cc=federico.galassi@gmail.com \
--cc=git@vger.kernel.org \
--cc=jnareb@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).