From: Jakub Narebski <jnareb@gmail.com>
To: Federico Galassi <federico.galassi@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Question about your comment on the git parable
Date: Sun, 26 Feb 2012 16:06:24 +0100
Message-ID: <201202261606.25599.jnareb@gmail.com>
In-Reply-To: <1E5ECB5A-595A-4B04-8269-6E35BF3FEA1A@gmail.com>

On Sun, 26 Feb 2012, Federico Galassi wrote:
> On 26/feb/2012, at 12:29, Jakub Narebski wrote:
>
>> Would you mind if this discussion was moved to git mailing
>> list (git@vger.kernel.org), of course always with copy directly
>> to you? There are people there that can answer your questions
>> better.
>
> No problem.
>
>> On Sun, 26 Feb 2012, Federico Galassi wrote:
>>> Hello, i think you're the author of these comments:
>>> http://news.ycombinator.com/item?id=616610
>>>
>>> I'm doing educational work on git based on the parable (talks,
>>> articles, etc..) and i'd like to improve on the real reason
>>> for a staging area.
>>>
>>> My question basically is: why is it really needed for merging?
>>> I mean, given the fictional git-like system of the parable,
>>> if I need to merge 2 snapshots i could:
>>>
>>> 1) search the commit tree for a base point
[...]
>>> 2) compare the diffs between the snapshots and the base point snapshot
>>> 3) if a conflict happens (change in the same line), just leave
>>> something in the working dir to mark the conflict. For example,
>>> keeping it simple, the system could reject a new commit until
>>> the markers of the conflict are removed from the conflicting file.
>>>
>>> Couldn't it just work this way?
>>
>> Well, it could; that is how many if not most of other version control
>> systems work.
>>
>>
>> There are (at least!) three problems with that approach. First, sometimes
>> it is not possible to "leave something in the working dir to mark the
>> conflict". Take for example case where binary file (e.g. image) was
>> changed, and textual 3-way diff file-merge algorithm wouldn't work.
>>
>> Second, what to do in the case of *tree-level* conflict, for example
>> rename/rename conflict, where one side renamed file to different
>> name (moved to different place) than the other side. There are no
>> conflict markers for this...
>>
>> Third, what about false positives with detecting conflict markers,
>> i.e. the case where "rejecting new commit until conflict markers are
>> removed", for example AsciiDoc files can be falsely detected as having
>> partial conflict markers, and of course test vectors for testing conflict
>> would have to have conflict markers in them.
>
> Ok, it's clear to me that the markers in file approach is just a little
> bit too simple. Do you see any concrete advantage in the staging area
> compared to, say, tree conflict metadata in the working dir and maybe
> a dedicated smart "resolve conflict" command?

First, the working directory isn't the best place for such _local_
information. What if you accidentally delete it? It is not, and should not
be, committed to the repository, so there is no way to undelete it except by
redoing the merge and losing all your progress so far in resolving merge
conflicts. It is much better to put such information somewhere in the
administrative area[1] of the repository.

Second, if we already have a staging area where we store information about
which files are tracked, plus a bunch of per-file metadata such as
modification time for better performance, why not also use it to store
information about a merge in progress?

[1]: Name taken from "Version Control by Example" (free e-book) by
     Eric Sink.
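
To make the point about keeping merge state in the staging area concrete,
here is a minimal sketch using real Git. The throwaway repository, the file
name "file.txt" and the branch name "side" are made up for illustration. It
provokes a conflict, shows that the index holds three entries for the
conflicted path, and shows that deleting the working copy does not lose
that state.

```shell
#!/bin/sh
# Sketch only: repository layout, file and branch names are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# Change the same line on two branches to force a conflict.
git checkout -q -b side
echo "side change" > file.txt
git commit -q -am "side"

git checkout -q "$main_branch"
echo "main change" > file.txt
git commit -q -am "main"

# The merge stops with a conflict; "|| true" keeps "set -e" from aborting.
git merge side >/dev/null 2>&1 || true

# Unmerged index entries: stage 1 = common ancestor, 2 = ours, 3 = theirs.
git ls-files -u
stages=$(git ls-files -u | awk '{printf "%s", $3}')

# Accidentally delete the conflicted file from the working directory...
rm file.txt
stages_after_rm=$(git ls-files -u | awk '{printf "%s", $3}')

# ...and recreate the conflicted version from the index stages.
git checkout -m -- file.txt
```

Because the merge state lives in the index and not in the working
directory, the last command can rebuild the file with its conflict markers
without redoing the merge.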

There is also a thing very specific to Git, namely that "git add" adds
the current content of a file to the object database of the repository
(though with modern git there is also "git add --intent-to-add", which works
like adding a file in other version control systems)... and you have to
store a reference to the newly created object somewhere so that it doesn't
get garbage-collected.
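
A small sketch of this behaviour, assuming a scratch repository and two
made-up files ("file.c" and "other.c"): it checks that the blob does not
exist before "git add", that it exists afterwards, that the index entry
points at it, and that "--intent-to-add" does not store the content.

```shell
#!/bin/sh
# Sketch only: file names and contents are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "hello" > file.c
# Compute the object ID the content would get, without writing the object.
oid=$(git hash-object file.c)

if git cat-file -e "$oid" 2>/dev/null; then before=present; else before=absent; fi

# "git add" writes the blob and records its ID in the index.
git add file.c
type_after=$(git cat-file -t "$oid")
index_oid=$(git ls-files -s file.c | awk '{print $2}')

# "--intent-to-add" only records that the path will be added later.
echo "later" > other.c
other_oid=$(git hash-object other.c)
git add --intent-to-add other.c
if git cat-file -e "$other_oid" 2>/dev/null; then ita=present; else ita=absent; fi

echo "before add: $before, after add: $type_after, intent-to-add: $ita"
```

The index entry printed by "git ls-files -s" is the reference that keeps
the freshly written blob reachable until a commit is made.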
>>> Can you mention other situations in which the pattern "files to be added"
>>> is either mandatory or really helpful?
>>
>> Note that any version control system must have a kind of proto-staging
>> area to know which files are to be added in next commit.
>>
>> If you do
>>
>> $ scm add file.c
>>
>> then version control system must save somewhere that 'file.c' is to be
>> tracked (to be added in next commit).
>
> Yes, the fictional vcs just tracked all the files in the working dir.
> Being selective on which file to track is of course another interesting
> feature.

IRL it is a _necessary_ feature. One of the more common, if not the most
common, applications of a version control system is managing source files
for a computer program. There you have object files, executables and other
_generated_ files which shouldn't be put under version control, not to
mention backups created by your editor / IDE (e.g. "*~" files in the Unix
world, "*.bak" files in the MS Windows world).

There are also files which you have added to the working directory, but
which are not ready to be included in the next commit.
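
As a rough illustration with Git itself (the file names and ignore patterns
below are invented for the example), generated files and editor backups can
be excluded with ignore patterns, and only the files that are ready get
staged:

```shell
#!/bin/sh
# Sketch only: file names and ignore patterns are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# A source file, a generated object file, and two editor backups.
touch main.c main.o main.c~ notes.bak
printf '%s\n' '*.o' '*~' '*.bak' > .gitignore

# Untracked files that are not ignored: only these are candidates to add.
untracked=$(git ls-files --others --exclude-standard | sort | tr '\n' ' ')
echo "untracked: $untracked"

# Stage just the file that is ready; .gitignore stays untracked for now.
git add main.c
staged=$(git diff --cached --name-only)
echo "staged: $staged"
```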
--
Jakub Narebski
Poland