From: "Pēteris Kļaviņš" <klavins@netspace.net.au>
To: git@vger.kernel.org
Subject: Re: An interaction with ce_match_stat_basic() and autocrlf
Date: Tue, 8 Jan 2008 18:12:18 +0100
Message-ID: <fm0au5$i65$1@ger.gmane.org>
In-Reply-To: <7vfxx8tt1z.fsf@gitster.siamese.dyndns.org>

> At this point, the index records a blob with LF line ending,
> while the work tree file has the same content with CRLF line
> ending.

I think this needs more than just sleeping on.

There are two separate problems related to crlf treatment in git that 
manifest themselves in the quirks you see in the current implementation:

(1) The fact that the index may be misaligned with the work tree. Junio's 
example demonstrates this well. I have resorted to

$ rm -rf *
$ git reset --hard

in the past to get a work tree that passes

$ git status

without false positives after changing the value of autocrlf.

(2) The fact that repository content may be mangled in an indeterminate way 
because of the current work tree <-> repository transformation algorithm. 
While criticism in the past has mainly been levelled at not knowing whether 
a truly binary file will be correctly determined as such, content can be 
lost in the round trip work tree -> repository -> work tree much more 
simply:

$ git init
$ git config core.autocrlf true
$ echo ab | tr ab \\r\\n >a.txt
$ od -t a a.txt
0000000  cr  nl  nl
0000003
$ git add a.txt
$ git commit -m 'crlf round trip'
$ rm a.txt
$ git reset --hard
$ od -t a a.txt
0000000  cr  nl  cr  nl
0000004

The file went in as three bytes (CR LF LF) and came back as four 
(CR LF CR LF): the bare LF was rewritten as CRLF on checkout.

In summary, it irks me that autocrlf true mode is a poor cousin of 
autocrlf false, and I think that there *should* be an acceptable 
deterministic solution to this.

The solution to (2) seems easier than (1): could the transformation 
algorithm be made deterministic and changed to something like "convert all 
crlf pairs to lf if and only if no singleton cr or lf exist in the file 
before conversion"? If a binary file gets mangled in error, it would be an 
easy transformation with standard tools to get the file back again. If an 
otherwise text file has mixed lf and crlf endings, or additional cr or lf 
sprinkled randomly through it, the file is not transformed.
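
A minimal sketch of the rule proposed above, in Python rather than git's own C, 
and not taken from any git source. The function name `to_repository` and the 
boolean it returns are assumptions made for illustration. The rule is 
all-or-nothing: every CR must be part of a CRLF pair and every LF likewise, 
otherwise the content is left untouched.

```python
def to_repository(worktree: bytes):
    """Return (blob, converted) under the proposed all-or-nothing rule.

    Hypothetical helper: convert CRLF to LF only when the content holds
    no lone CR and no lone LF.
    """
    pairs = worktree.count(b"\r\n")
    # A lone CR or lone LF exists exactly when a raw count exceeds the pair count.
    pure_crlf = worktree.count(b"\r") == pairs and worktree.count(b"\n") == pairs
    if pairs == 0 or not pure_crlf:
        return worktree, False  # mixed, LF-only, or no line endings: keep as-is
    return worktree.replace(b"\r\n", b"\n"), True


if __name__ == "__main__":
    print(to_repository(b"\r\n\n"))          # (b'\r\n\n', False)
    print(to_repository(b"one\r\ntwo\r\n"))  # (b'one\ntwo\n', True)
```

Under this rule the three-byte a.txt from the example would be stored 
unchanged. Note that a converted blob is indistinguishable from a blob that 
was LF-only to begin with, so the reverse direction needs to know whether 
the conversion happened.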

Given a deterministic transformation algorithm, the solution to (1) boils 
down to recording for each file in the work tree whether the transformation 
algorithm was used or not in arriving at the file's current contents, 
together with a way of telling git to force the use of the transformation 
algorithm or not for a particular file. It seems to me the place that this 
information *should* be recorded is the index, given that both .git/config 
and .gitattributes can be changed independently of the work tree. Recording 
the information in the index would mean that both autocrlf true and autocrlf 
false clones of the same repository would produce equally valid work trees 
with no loss of information. I am however not well versed enough in git 
internals at the moment to know whether this is an acceptable solution or 
not. 
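
To make the idea concrete, here is a toy Python model of an index entry that 
remembers whether the conversion was applied. It is only a sketch: 
`IndexEntry`, `crlf_converted`, `add` and `checkout` are hypothetical names, 
git's real index has no such field, and the forced-conversion override 
mentioned above is left out.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    blob: bytes            # content as stored in the repository
    crlf_converted: bool   # hypothetical per-file flag kept in the index


def add(worktree: bytes) -> IndexEntry:
    """Store work tree content, converting CRLF to LF only if it is pure CRLF."""
    pairs = worktree.count(b"\r\n")
    pure_crlf = worktree.count(b"\r") == pairs and worktree.count(b"\n") == pairs
    if pairs > 0 and pure_crlf:
        return IndexEntry(worktree.replace(b"\r\n", b"\n"), True)
    return IndexEntry(worktree, False)


def checkout(entry: IndexEntry) -> bytes:
    """Recreate the work tree content, undoing the conversion only if recorded."""
    if entry.crlf_converted:
        return entry.blob.replace(b"\n", b"\r\n")
    return entry.blob


if __name__ == "__main__":
    samples = [b"\r\n\n", b"one\r\ntwo\r\n", b"one\ntwo\n", b"\x00\r\n\x01"]
    for original in samples:
        # Every sample survives work tree -> repository -> work tree unchanged.
        assert checkout(add(original)) == original
    print("round trip preserved all samples")
```

Because the flag travels with the entry rather than with .git/config or 
.gitattributes, changing either of those later does not alter how an 
already checked-out file is interpreted in this model.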
