From: Sam Vilain <sam.vilain@catalyst.net.nz>
To: git@vger.kernel.org
Cc: Finn Arne Gangstad <finnag@pvv.org>,
	Finn Arne Gangstad <finag@pvv.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH 084/104] autocrlf: Make it work also for un-normalized repositories
Date: Wed, 26 May 2010 18:00:54 +1200
Message-ID: <1274853674-18521-84-git-send-email-sam.vilain@catalyst.net.nz>
In-Reply-To: <1274853674-18521-1-git-send-email-sam.vilain@catalyst.net.nz>

From: Finn Arne Gangstad <finnag@pvv.org>

Previously, autocrlf would only work well for normalized
repositories. Any text files that contained CRLF in the repository
would cause problems, and would be modified when handled with
core.autocrlf set.

Change autocrlf to not convert files that already contain a CR in the
repository. git with autocrlf set will never create such a file, or
change an LF-only file to contain CRs, so the (new) assumption is that
if a file contains a CR, it is intentional, and autocrlf should not
change that.

The following sequence should now always be a NOP even with autocrlf
set (assuming a clean working directory):

git checkout <something>
touch *
git add -A .    (will add nothing)
git commit      (nothing to commit)

Previously this would break for any text file containing a CR.
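
The checkout half of that guarantee can be sketched roughly as below.
This is a minimal illustration under stated assumptions, not the code
in convert.c: lf_to_crlf_unless_cr is a hypothetical name, and it works
on plain caller-supplied buffers rather than git's strbuf. It expands
LF to CRLF only when the input contains no CR at all, so a blob stored
with CRLF or mixed line endings reaches the working tree byte for byte.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: copy src to dst, expanding
 * each LF to CRLF, unless src already contains a CR, in which case
 * src is copied unchanged.  dst must have room for 2 * len bytes.
 * Returns the number of bytes written to dst.
 */
static size_t lf_to_crlf_unless_cr(const char *src, size_t len, char *dst)
{
	size_t i, out = 0;

	/* Any CR at all means "leave this content alone". */
	if (len && memchr(src, '\r', len)) {
		memcpy(dst, src, len);
		return len;
	}

	/* LF-only content: emit a CR before every LF. */
	for (i = 0; i < len; i++) {
		if (src[i] == '\n')
			dst[out++] = '\r';
		dst[out++] = src[i];
	}
	return out;
}
```

Because content with a CR is never rewritten on the way out, the
"touch *; git add -A ." step sees the same bytes that are recorded in
the repository, provided the add direction applies the matching rule.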

Some of you may have been following Eyvind's excellent thread about
trying to make end-of-line translation in git a bit smoother.

I decided to attack the problem from a different angle: Is it possible
to make autocrlf behave non-destructively for all the previous problem cases?

Stealing the problem from Eyvind's initial mail (paraphrased and
summarized a bit):

1. Setting autocrlf globally is a pain since autocrlf does not work well
   with CRLF in the repo
2. Setting it in individual repos is hard since you do it "too late"
   (the clone will get it wrong)
3. If someone checks in a file with CRLF later, you get into problems again
4. If a repository once has contained CRLF, you can't tell autocrlf
   at which commit everything is sane again
5. autocrlf does needless work if you know that all your users want
   the same EOL style.

I believe that this patch makes autocrlf a safe (and good) default
setting for Windows, and that it solves problems 1-4 (it solves 2 by
being set by default, which is early enough for clone).

I implemented it by looking for CR characters in the version of the
file in the index, and aborting any conversion attempt if one is found.
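
As a rough sketch of that check for the add direction, assuming the
indexed content has already been loaded into memory: may_convert_to_git
is a hypothetical name, and passing NULL for a path that is not in the
index is a convention of this sketch only. The patch itself looks the
path up in the index and reads the blob before scanning it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: decide whether CRLF -> LF
 * conversion may run when a file is added.  index_blob is the content
 * currently recorded for the path, or NULL if the path is not in the
 * index (an assumption of this sketch).
 * Returns 1 if conversion may proceed, 0 if it must be skipped.
 */
static int may_convert_to_git(const char *index_blob, size_t index_len)
{
	/* New or empty file: nothing to preserve, normalize as before. */
	if (!index_blob || !index_len)
		return 1;

	/* A single CR in the indexed version vetoes the conversion. */
	return memchr(index_blob, '\r', index_len) == NULL;
}
```

A path with no index entry is still converted, which is why a newly
added CRLF file ends up with LF in the repository.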

Signed-off-by: Finn Arne Gangstad <finag@pvv.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 convert.c       |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 t/t0020-crlf.sh |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+), 0 deletions(-)

diff --git a/convert.c b/convert.c
index 27acce5..a54c5fc 100644
--- a/convert.c
+++ b/convert.c
@@ -120,6 +120,43 @@ static void check_safe_crlf(const char *path, int action,
 	}
 }
 
+static int has_cr_in_index(const char *path)
+{
+	int pos, len;
+	unsigned long sz;
+	enum object_type type;
+	void *data;
+	int has_cr;
+	struct index_state *istate = &the_index;
+
+	len = strlen(path);
+	pos = index_name_pos(istate, path, len);
+	if (pos < 0) {
+		/*
+		 * We might be in the middle of a merge, in which
+		 * case we would read stage #2 (ours).
+		 */
+		int i;
+		for (i = -pos - 1;
+		     (pos < 0 && i < istate->cache_nr &&
+		      !strcmp(istate->cache[i]->name, path));
+		     i++)
+			if (ce_stage(istate->cache[i]) == 2)
+				pos = i;
+	}
+	if (pos < 0)
+		return 0;
+	data = read_sha1_file(istate->cache[pos]->sha1, &type, &sz);
+	if (!data || type != OBJ_BLOB) {
+		free(data);
+		return 0;
+	}
+
+	has_cr = memchr(data, '\r', sz) != NULL;
+	free(data);
+	return has_cr;
+}
+
 static int crlf_to_git(const char *path, const char *src, size_t len,
                        struct strbuf *buf, int action, enum safe_crlf checksafe)
 {
@@ -145,6 +182,13 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 		 */
 		if (is_binary(len, &stats))
 			return 0;
+
+		/*
+		 * If the file in the index has any CR in it, do not convert.
+		 * This is the new safer autocrlf handling.
+		 */
+		if (has_cr_in_index(path))
+			return 0;
 	}
 
 	check_safe_crlf(path, action, &stats, checksafe);
@@ -203,6 +247,11 @@ static int crlf_to_worktree(const char *path, const char *src, size_t len,
 		return 0;
 
 	if (action == CRLF_GUESS) {
+		/* If we have any CR or CRLF line endings, we do not touch it */
+		/* This is the new safer autocrlf-handling */
+		if (stats.cr > 0 || stats.crlf > 0)
+			return 0;
+
 		/* If we have any bare CR characters, we're not going to touch it */
 		if (stats.cr != stats.crlf)
 			return 0;
diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
index c3e7e32..234a94f 100755
--- a/t/t0020-crlf.sh
+++ b/t/t0020-crlf.sh
@@ -453,5 +453,57 @@ test_expect_success 'invalid .gitattributes (must not crash)' '
 	git diff
 
 '
+# Some more tests here to add new autocrlf functionality.
+# We want to have a known state here, so start a bit from scratch
+
+test_expect_success 'setting up for new autocrlf tests' '
+	git config core.autocrlf false &&
+	git config core.safecrlf false &&
+	rm -rf .????* * &&
+	for w in I am all LF; do echo $w; done >alllf &&
+	for w in Oh here is CRLFQ in text; do echo $w; done | q_to_cr >mixed &&
+	for w in I am all CRLF; do echo $w; done | append_cr >allcrlf &&
+	git add -A . &&
+	git commit -m "alllf, allcrlf and mixed only" &&
+	git tag -a -m "message" autocrlf-checkpoint
+'
+
+test_expect_success 'report no change after setting autocrlf' '
+	git config core.autocrlf true &&
+	touch * &&
+	git diff --exit-code
+'
+
+test_expect_success 'files are clean after checkout' '
+	rm * &&
+	git checkout -f &&
+	git diff --exit-code
+'
+
+cr_to_Q_no_NL () {
+    tr '\015' Q | tr -d '\012'
+}
+
+test_expect_success 'LF only file gets CRLF with autocrlf' '
+	test "$(cr_to_Q_no_NL < alllf)" = "IQamQallQLFQ"
+'
+
+test_expect_success 'Mixed file is still mixed with autocrlf' '
+	test "$(cr_to_Q_no_NL < mixed)" = "OhhereisCRLFQintext"
+'
+
+test_expect_success 'CRLF only file has CRLF with autocrlf' '
+	test "$(cr_to_Q_no_NL < allcrlf)" = "IQamQallQCRLFQ"
+'
+
+test_expect_success 'New CRLF file gets LF in repo' '
+	tr -d "\015" < alllf | append_cr > alllf2 &&
+	git add alllf2 &&
+	git commit -m "alllf2 added" &&
+	git config core.autocrlf false &&
+	rm * &&
+	git checkout -f &&
+	test_cmp alllf alllf2
+'
 
 test_done
-- 
1.7.1.rc2.333.gb2668
