git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] pre-commit hook should ignore carriage returns at EOL
@ 2008-06-24 19:21 Christian Holtje
  2008-06-25 18:14 ` Alex Riesen
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Holtje @ 2008-06-24 19:21 UTC (permalink / raw)
  To: git; +Cc: gitster

When commit files that use DOS style CRLF end-of-lines, the pre-commit
hook would raise an error.  When combined with the fact that the hooks
get activated by default on windows, it makes life difficult for
people working with DOS files.

This patch causes the pre-commit hook to deal with crlf files
correctly.

Signed-off-by: Christian Höltje <docwhat@gmail.com>
---
  t/t7503-template-hook--pre-commit.sh |   75 +++++++++++++++++++++++++ 
+++++++++
  templates/hooks--pre-commit          |   10 ++++-
  2 files changed, 83 insertions(+), 2 deletions(-)
  create mode 100755 t/t7503-template-hook--pre-commit.sh

diff --git a/t/t7503-template-hook--pre-commit.sh b/t/t7503-template- 
hook--pre-commit.sh
new file mode 100755
index 0000000..c78a507
--- /dev/null
+++ b/t/t7503-template-hook--pre-commit.sh
@@ -0,0 +1,75 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Christian Höltje
+#
+
+test_description='t7503 templates-hooks--pre-commit
+
+This test verifies that the pre-commit hook shipped with
+git by default works correctly.
+'
+
+. ./test-lib.sh
+
+test_expect_success 'verify that autocrlf is unset' '
+   if git config core.autocrlf
+   then
+     false
+   else
+     test $? -eq 1
+   fi
+'
+
+test_expect_success 'lf without hook' '
+
+	printf "foo" > lf.txt &&
+	git add lf.txt &&
+	git commit -m "lf without hook" lf.txt
+
+'
+
+test_expect_success 'crlf without hook' '
+
+	printf "foo\r" > crlf.txt &&
+	git add crlf.txt &&
+	git commit -m "crlf without hook" crlf.txt
+
+'
+
+# Set up the pre-commit hook.
+HOOKDIR="$(git rev-parse --git-dir)/hooks"
+mkdir -p "${HOOKDIR}"
+cp -r "${HOOKDIR}-disabled/pre-commit" "${HOOKDIR}/pre-commit"
+chmod +x "${HOOKDIR}/pre-commit"
+
+test_expect_success 'lf with hook' '
+
+	printf "bar" >> lf.txt &&
+	git add lf.txt &&
+	git commit -m "lf with hook" lf.txt
+
+'
+test_expect_success 'crlf with hook' '
+
+	printf "bar\r" >> crlf.txt &&
+	git add crlf.txt &&
+	git commit -m "crlf with hook" crlf.txt
+
+'
+
+test_expect_success 'lf with hook white-space failure' '
+
+	printf "bar " >> lf.txt &&
+	git add lf.txt &&
+	! git commit -m "lf with hook" lf.txt
+
+'
+test_expect_success 'crlf with hook white-space failure' '
+
+	printf "bar \r" >> crlf.txt &&
+	git add crlf.txt &&
+	! git commit -m "crlf with hook" crlf.txt
+
+'
+
+test_done
diff --git a/templates/hooks--pre-commit b/templates/hooks--pre-commit
index b25dce6..335ca09 100644
--- a/templates/hooks--pre-commit
+++ b/templates/hooks--pre-commit
@@ -55,8 +55,14 @@ perl -e '
  	if (s/^\+//) {
  	    $lineno++;
  	    chomp;
-	    if (/\s$/) {
-		bad_line("trailing whitespace", $_);
+	    if (/\r$/) {
+		if (/\s\r$/) {
+		    bad_line("trailing whitespace", $_);
+		}
+	    } else {
+		if (/\s$/) {
+		    bad_line("trailing whitespace", $_);
+		}
  	    }
  	    if (/^\s* \t/) {
  		bad_line("indent SP followed by a TAB", $_);
-- 
1.5.5.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-24 19:21 [PATCH v2] pre-commit hook should ignore carriage returns at EOL Christian Holtje
@ 2008-06-25 18:14 ` Alex Riesen
  2008-06-25 18:47   ` Christian Holtje
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Riesen @ 2008-06-25 18:14 UTC (permalink / raw)
  To: Christian Holtje; +Cc: git, Junio C Hamano

Christian Holtje, Tue, Jun 24, 2008 21:21:22 +0200:
> diff --git a/t/t7503-template-hook--pre-commit.sh b/t/t7503-template- 
> hook--pre-commit.sh

Your patch has long lines wrapped.

> diff --git a/templates/hooks--pre-commit b/templates/hooks--pre-commit
> index b25dce6..335ca09 100644
> --- a/templates/hooks--pre-commit
> +++ b/templates/hooks--pre-commit
> @@ -55,8 +55,14 @@ perl -e '
>  	if (s/^\+//) {
>  	    $lineno++;
>  	    chomp;
> -	    if (/\s$/) {
> -		bad_line("trailing whitespace", $_);
> +	    if (/\r$/) {
> +		if (/\s\r$/) {
> +		    bad_line("trailing whitespace", $_);
> +		}
> +	    } else {
> +		if (/\s$/) {
> +		    bad_line("trailing whitespace", $_);
> +		}

You coud just strip the trailing (cr)lf, instead of chomp:

  	if (s/^\+//) {
  	    $lineno++;
- 	    chomp;
+	    s/\r?\n$//so;
 	    if (/\s$/) {
 		bad_line("trailing whitespace", $_);

Makes for a shorter patch and less code.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-25 18:14 ` Alex Riesen
@ 2008-06-25 18:47   ` Christian Holtje
  2008-06-25 19:14     ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Holtje @ 2008-06-25 18:47 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Junio C Hamano

On Jun 25, 2008, at 2:14 PM, Alex Riesen wrote:
> Christian Holtje, Tue, Jun 24, 2008 21:21:22 +0200:
>> diff --git a/t/t7503-template-hook--pre-commit.sh b/t/t7503-template-
>> hook--pre-commit.sh
>
> Your patch has long lines wrapped.
>
>> diff --git a/templates/hooks--pre-commit b/templates/hooks--pre- 
>> commit
>> index b25dce6..335ca09 100644
>> --- a/templates/hooks--pre-commit
>> +++ b/templates/hooks--pre-commit
>> @@ -55,8 +55,14 @@ perl -e '
>> 	if (s/^\+//) {
>> 	    $lineno++;
>> 	    chomp;
>> -	    if (/\s$/) {
>> -		bad_line("trailing whitespace", $_);
>> +	    if (/\r$/) {
>> +		if (/\s\r$/) {
>> +		    bad_line("trailing whitespace", $_);
>> +		}
>> +	    } else {
>> +		if (/\s$/) {
>> +		    bad_line("trailing whitespace", $_);
>> +		}
>
> You coud just strip the trailing (cr)lf, instead of chomp:
>
>  	if (s/^\+//) {
>  	    $lineno++;
> - 	    chomp;
> +	    s/\r?\n$//so;
> 	    if (/\s$/) {
> 		bad_line("trailing whitespace", $_);
>
> Makes for a shorter patch and less code.

That's a good idea!  However, this patch is not going anyplace, I  
think.  Junio submitted a different patch to disable the pre-commit  
example.

Junio, do you want me to make this change anyway?  It does make  
sense.  The unittests for the pre-commit hook may or may not still be  
useful.

Ciao!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-25 18:47   ` Christian Holtje
@ 2008-06-25 19:14     ` Junio C Hamano
  2008-06-26  2:41       ` Christian Holtje
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-06-25 19:14 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

Christian Holtje <docwhat@gmail.com> writes:

> On Jun 25, 2008, at 2:14 PM, Alex Riesen wrote:
>> Christian Holtje, Tue, Jun 24, 2008 21:21:22 +0200:
>>> diff --git a/t/t7503-template-hook--pre-commit.sh b/t/t7503-template-
>>> hook--pre-commit.sh
>>
>> Your patch has long lines wrapped.
>>
>>> diff --git a/templates/hooks--pre-commit b/templates/hooks--pre-
>>> commit
>>> index b25dce6..335ca09 100644
>>> --- a/templates/hooks--pre-commit
>>> +++ b/templates/hooks--pre-commit
>>> @@ -55,8 +55,14 @@ perl -e '
>>> 	if (s/^\+//) {
>>> 	    $lineno++;
>>> 	    chomp;
>>> -	    if (/\s$/) {
>>> -		bad_line("trailing whitespace", $_);
>>> +	    if (/\r$/) {
>>> +		if (/\s\r$/) {
>>> +		    bad_line("trailing whitespace", $_);
>>> +		}
>>> +	    } else {
>>> +		if (/\s$/) {
>>> +		    bad_line("trailing whitespace", $_);
>>> +		}
>>
>> You coud just strip the trailing (cr)lf, instead of chomp:
>>
>>  	if (s/^\+//) {
>>  	    $lineno++;
>> - 	    chomp;
>> +	    s/\r?\n$//so;
>> 	    if (/\s$/) {
>> 		bad_line("trailing whitespace", $_);
>>
>> Makes for a shorter patch and less code.
>
> That's a good idea!  However, this patch is not going anyplace, I
> think.  Junio submitted a different patch to disable the pre-commit
> example.
>
> Junio, do you want me to make this change anyway?  It does make sense.
> The unittests for the pre-commit hook may or may not still be  useful.

"disable" is not an issue.  The intention has always been that these are
samples, and it was an accident that some packaging shipped them enabled
by mistake.  The patch was to make that mistake harder to make.

The issue now is about keeping the example hooks _relevant_.  The one we
have does not work well with projects that want to check in files with
CRLF line endings (iow, without using autocrlf to attempt to make the
project files cross-platform), so it is irrelevant for such projects with
Windows origin.

The "solution" you are proposing to strip out \r makes the check less
useful for projects that want to keep files with LF line endings in the
commited history, because your patch would stop catching such a mistake of
adding an CR before LF.  It is robbing from Peter to pay Paul, and I am
afraid it would make the sample even more irrelevant in the end.  I do not
think we would want to go there.

I suggested using "diff --check" (and possibly teaching "diff --check"
other things the scripted example checks, such as conflict markers),
which would know to honor the line endings specified per path via
gitattributes(5), instead of building on top of the big Perl script, and I
had an impression that you agreed to the approach.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-25 19:14     ` Junio C Hamano
@ 2008-06-26  2:41       ` Christian Holtje
  2008-06-26 22:33         ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Holtje @ 2008-06-26  2:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Alex Riesen, git

On Jun 25, 2008, at 3:14 PM, Junio C Hamano wrote:
> Christian Holtje <docwhat@gmail.com> writes:
>>
>> Junio, do you want me to make this change anyway?  It does make  
>> sense.
>> The unittests for the pre-commit hook may or may not still be   
>> useful.
>
> "disable" is not an issue.  The intention has always been that these  
> are
> samples, and it was an accident that some packaging shipped them  
> enabled
> by mistake.  The patch was to make that mistake harder to make.
>
> The issue now is about keeping the example hooks _relevant_.  The  
> one we
> have does not work well with projects that want to check in files with
> CRLF line endings (iow, without using autocrlf to attempt to make the
> project files cross-platform), so it is irrelevant for such projects  
> with
> Windows origin.
>
> The "solution" you are proposing to strip out \r makes the check less
> useful for projects that want to keep files with LF line endings in  
> the
> commited history, because your patch would stop catching such a  
> mistake of
> adding an CR before LF.  It is robbing from Peter to pay Paul, and I  
> am
> afraid it would make the sample even more irrelevant in the end.  I  
> do not
> think we would want to go there.
>
> I suggested using "diff --check" (and possibly teaching "diff --check"
> other things the scripted example checks, such as conflict markers),
> which would know to honor the line endings specified per path via
> gitattributes(5), instead of building on top of the big Perl script,  
> and I
> had an impression that you agreed to the approach.

I'm completely confused how gitattributes and core.autocrlf interact,  
etc.

I'm expecting the default behavior is that git will leave my files  
alone.  This seems to be the case.

In order for the crlf stuff to work, does it need to have  
core.autocrlf set to true?  If so, that seems wrong.  gitattributes(5)  
is supposed to let you have fine grain control over files.  In  
addition, .git/config isn't passed around via clone so .gitattributes  
works differently depending on who clones it and their settings.

Furthermore, how would 'git diff --check' know what the line endings  
are for a file?  I may have a mostly unix repository but I may have a  
few crlf text files I need to have checked out as crlf on windows (but  
not unix).

Shouldn't the crlf stuff be something like:
.gitattributes:
   eol=crlf -- dos files. no conversion, but diff --check will know  
what is what.
   eol=lf -- unix (otherwise same)
   eol=convert -- stores internally as lf, converts on the fly to lf  
or crlf on filesystem based on system preference. diff --check won't  
worry about new lines matching system-eol if they don't match the eol  
in the file.
   eol=binary -- binary

core.eol=crlf -- ony files with all crlf endings are text.
core.eol=lf -- only files with all lf (no crlf) endings are text.
core.eol=convert -- crlf changes will be converted to lf as per system- 
eol
core.eol=false -- do nothing (default) (binary)

core.system-eol=crlf -- pretend the system is windows
core.system-eol=lf -- pretend the system is unix
core.system-eol=auto -- determine from the system (default)

Otherwise, I have no clue what's going on.  Once someone writes this  
in english, I'll try to write unittests and I'll submit them.  Then  
someone who knows what he/she is doing can write the actual code to  
make the tests pass.  Or maybe I'll try, who knows.

Ciao!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-26  2:41       ` Christian Holtje
@ 2008-06-26 22:33         ` Junio C Hamano
  2008-06-26 22:34           ` [PATCH 1/5] diff --check: explain why we do not care whether old side is binary Junio C Hamano
                             ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 22:33 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

Christian Holtje <docwhat@gmail.com> writes:

>> I suggested using "diff --check" (and possibly teaching "diff --check"
>> other things the scripted example checks, such as conflict markers),
>> which would know to honor the line endings specified per path via
>> gitattributes(5), instead of building on top of the big Perl script,
>> and I
>> had an impression that you agreed to the approach.
>
> I'm completely confused how gitattributes and core.autocrlf interact,
> etc.

Here is a series I just cooked up so that we can remove the whole Perl
script and replace it by adding --check to "diff-index" used there. 

The first three are code clean-ups and the last two implements necessary
new features to "diff --check".  The whole series somewhat depend on the
fix to 'maint' not to lose the exit status I sent earlier.

[PATCH 1/5] diff --check: explain why we do not care whether old side is binary
[PATCH 2/5] check_and_emit_line(): rename and refactor
[PATCH 3/5] checkdiff: pass diff_options to the callback
[PATCH 4/5] Teach "diff --check" about a new blank lines at end
[PATCH 5/5] diff --check: detect leftover conflict markers

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] diff --check: explain why we do not care whether old side is binary
  2008-06-26 22:33         ` Junio C Hamano
@ 2008-06-26 22:34           ` Junio C Hamano
  2008-06-26 22:35           ` [PATCH 2/5] check_and_emit_line(): rename and refactor Junio C Hamano
                             ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 22:34 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

All other codepaths refrain from running textual diff when either the old
or the new side is binary, but this function only checks the new side.  I
was almost going to change it to check both, but that would be a bad
change.  Explain why to prevent future mistakes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index 8939423..c00d633 100644
--- a/diff.c
+++ b/diff.c
@@ -1544,8 +1544,9 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 
 static void builtin_checkdiff(const char *name_a, const char *name_b,
 			      const char *attr_path,
-			     struct diff_filespec *one,
-			     struct diff_filespec *two, struct diff_options *o)
+			      struct diff_filespec *one,
+			      struct diff_filespec *two,
+			      struct diff_options *o)
 {
 	mmfile_t mf1, mf2;
 	struct checkdiff_t data;
@@ -1564,6 +1565,12 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 	if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0)
 		die("unable to read files to diff");
 
+	/*
+	 * All the other codepaths check both sides, but not checking
+	 * the "old" side here is deliberate.  We are checking the newly
+	 * introduced changes, and as long as the "new" side is text, we
+	 * can and should check what it introduces.
+	 */
 	if (diff_filespec_is_binary(two))
 		goto free_and_return;
 	else {
-- 
1.5.6.1.78.gde8d9

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] check_and_emit_line(): rename and refactor
  2008-06-26 22:33         ` Junio C Hamano
  2008-06-26 22:34           ` [PATCH 1/5] diff --check: explain why we do not care whether old side is binary Junio C Hamano
@ 2008-06-26 22:35           ` Junio C Hamano
  2008-06-26 22:36           ` [PATCH 3/5] checkdiff: pass diff_options to the callback Junio C Hamano
                             ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 22:35 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

The function name was too bland and not explicit enough as to what it is
checking.  Split it into two, and call the one that checks if there is a
whitespace breakage "ws_check()", and call the other one that checks and
emits the line after color coding "ws_check_emit()".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |    5 ++---
 cache.h         |    5 ++---
 diff.c          |   13 ++++++-------
 ws.c            |   18 +++++++++++++++---
 4 files changed, 25 insertions(+), 16 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index c497889..92f0047 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -979,8 +979,7 @@ static int find_header(char *line, unsigned long size, int *hdrsize, struct patc
 static void check_whitespace(const char *line, int len, unsigned ws_rule)
 {
 	char *err;
-	unsigned result = check_and_emit_line(line + 1, len - 1, ws_rule,
-	    NULL, NULL, NULL, NULL);
+	unsigned result = ws_check(line + 1, len - 1, ws_rule);
 	if (!result)
 		return;
 
@@ -991,7 +990,7 @@ static void check_whitespace(const char *line, int len, unsigned ws_rule)
 	else {
 		err = whitespace_error_string(result);
 		fprintf(stderr, "%s:%d: %s.\n%.*s\n",
-		     patch_input_file, linenr, err, len - 2, line + 1);
+			patch_input_file, linenr, err, len - 2, line + 1);
 		free(err);
 	}
 }
diff --git a/cache.h b/cache.h
index 64ef86e..3dfa53c 100644
--- a/cache.h
+++ b/cache.h
@@ -819,9 +819,8 @@ void shift_tree(const unsigned char *, const unsigned char *, unsigned char *, i
 extern unsigned whitespace_rule_cfg;
 extern unsigned whitespace_rule(const char *);
 extern unsigned parse_whitespace_rule(const char *);
-extern unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,
-    FILE *stream, const char *set,
-    const char *reset, const char *ws);
+extern unsigned ws_check(const char *line, int len, unsigned ws_rule);
+extern void ws_check_emit(const char *line, int len, unsigned ws_rule, FILE *stream, const char *set, const char *reset, const char *ws);
 extern char *whitespace_error_string(unsigned ws);
 extern int ws_fix_copy(char *, const char *, int, unsigned, int *);
 
diff --git a/diff.c b/diff.c
index c00d633..52a34ee 100644
--- a/diff.c
+++ b/diff.c
@@ -535,9 +535,9 @@ static void emit_add_line(const char *reset, struct emit_callback *ecbdata, cons
 	else {
 		/* Emit just the prefix, then the rest. */
 		emit_line(ecbdata->file, set, reset, line, ecbdata->nparents);
-		(void)check_and_emit_line(line + ecbdata->nparents,
-		    len - ecbdata->nparents, ecbdata->ws_rule,
-		    ecbdata->file, set, reset, ws);
+		ws_check_emit(line + ecbdata->nparents,
+			      len - ecbdata->nparents, ecbdata->ws_rule,
+			      ecbdata->file, set, reset, ws);
 	}
 }
 
@@ -1153,8 +1153,7 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
 	if (line[0] == '+') {
 		unsigned bad;
 		data->lineno++;
-		bad = check_and_emit_line(line + 1, len - 1,
-		    data->ws_rule, NULL, NULL, NULL, NULL);
+		bad = ws_check(line + 1, len - 1, data->ws_rule);
 		if (!bad)
 			return;
 		data->status |= bad;
@@ -1162,8 +1161,8 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
 		fprintf(data->file, "%s:%d: %s.\n", data->filename, data->lineno, err);
 		free(err);
 		emit_line(data->file, set, reset, line, 1);
-		(void)check_and_emit_line(line + 1, len - 1, data->ws_rule,
-		    data->file, set, reset, ws);
+		ws_check_emit(line + 1, len - 1, data->ws_rule,
+			      data->file, set, reset, ws);
 	} else if (line[0] == ' ')
 		data->lineno++;
 	else if (line[0] == '@') {
diff --git a/ws.c b/ws.c
index ba7e834..24d3e3d 100644
--- a/ws.c
+++ b/ws.c
@@ -117,9 +117,9 @@ char *whitespace_error_string(unsigned ws)
 }
 
 /* If stream is non-NULL, emits the line after checking. */
-unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,
-			     FILE *stream, const char *set,
-			     const char *reset, const char *ws)
+static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
+				FILE *stream, const char *set,
+				const char *reset, const char *ws)
 {
 	unsigned result = 0;
 	int written = 0;
@@ -213,6 +213,18 @@ unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,
 	return result;
 }
 
+void ws_check_emit(const char *line, int len, unsigned ws_rule,
+		   FILE *stream, const char *set,
+		   const char *reset, const char *ws)
+{
+	(void)ws_check_emit_1(line, len, ws_rule, stream, set, reset, ws);
+}
+
+unsigned ws_check(const char *line, int len, unsigned ws_rule)
+{
+	return ws_check_emit_1(line, len, ws_rule, NULL, NULL, NULL, NULL);
+}
+
 /* Copy the line to the buffer while fixing whitespaces */
 int ws_fix_copy(char *dst, const char *src, int len, unsigned ws_rule, int *error_count)
 {
-- 
1.5.6.1.78.gde8d9

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] checkdiff: pass diff_options to the callback
  2008-06-26 22:33         ` Junio C Hamano
  2008-06-26 22:34           ` [PATCH 1/5] diff --check: explain why we do not care whether old side is binary Junio C Hamano
  2008-06-26 22:35           ` [PATCH 2/5] check_and_emit_line(): rename and refactor Junio C Hamano
@ 2008-06-26 22:36           ` Junio C Hamano
  2008-06-26 22:36           ` [PATCH 4/5] Teach "diff --check" about a new blank lines at end Junio C Hamano
                             ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 22:36 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

We could later use more information from the diff_options.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c |   21 +++++++++++----------
 1 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/diff.c b/diff.c
index 52a34ee..6bcbe20 100644
--- a/diff.c
+++ b/diff.c
@@ -1136,18 +1136,19 @@ static void free_diffstat_info(struct diffstat_t *diffstat)
 struct checkdiff_t {
 	struct xdiff_emit_state xm;
 	const char *filename;
-	int lineno, color_diff;
+	int lineno;
+	struct diff_options *o;
 	unsigned ws_rule;
 	unsigned status;
-	FILE *file;
 };
 
 static void checkdiff_consume(void *priv, char *line, unsigned long len)
 {
 	struct checkdiff_t *data = priv;
-	const char *ws = diff_get_color(data->color_diff, DIFF_WHITESPACE);
-	const char *reset = diff_get_color(data->color_diff, DIFF_RESET);
-	const char *set = diff_get_color(data->color_diff, DIFF_FILE_NEW);
+	int color_diff = DIFF_OPT_TST(data->o, COLOR_DIFF);
+	const char *ws = diff_get_color(color_diff, DIFF_WHITESPACE);
+	const char *reset = diff_get_color(color_diff, DIFF_RESET);
+	const char *set = diff_get_color(color_diff, DIFF_FILE_NEW);
 	char *err;
 
 	if (line[0] == '+') {
@@ -1158,11 +1159,12 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
 			return;
 		data->status |= bad;
 		err = whitespace_error_string(bad);
-		fprintf(data->file, "%s:%d: %s.\n", data->filename, data->lineno, err);
+		fprintf(data->o->file, "%s:%d: %s.\n",
+			data->filename, data->lineno, err);
 		free(err);
-		emit_line(data->file, set, reset, line, 1);
+		emit_line(data->o->file, set, reset, line, 1);
 		ws_check_emit(line + 1, len - 1, data->ws_rule,
-			      data->file, set, reset, ws);
+			      data->o->file, set, reset, ws);
 	} else if (line[0] == ' ')
 		data->lineno++;
 	else if (line[0] == '@') {
@@ -1557,9 +1559,8 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 	data.xm.consume = checkdiff_consume;
 	data.filename = name_b ? name_b : name_a;
 	data.lineno = 0;
-	data.color_diff = DIFF_OPT_TST(o, COLOR_DIFF);
+	data.o = o;
 	data.ws_rule = whitespace_rule(attr_path);
-	data.file = o->file;
 
 	if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0)
 		die("unable to read files to diff");
-- 
1.5.6.1.78.gde8d9

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] Teach "diff --check" about a new blank lines at end
  2008-06-26 22:33         ` Junio C Hamano
                             ` (2 preceding siblings ...)
  2008-06-26 22:36           ` [PATCH 3/5] checkdiff: pass diff_options to the callback Junio C Hamano
@ 2008-06-26 22:36           ` Junio C Hamano
  2008-06-26 22:37           ` [PATCH 5/5] diff --check: detect leftover conflict markers Junio C Hamano
  2008-06-26 23:01           ` [PATCH v2] pre-commit hook should ignore carriage returns at EOL Junio C Hamano
  5 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 22:36 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

When a patch adds new blank lines at the end, "git apply --whitespace"
warns.  This teaches "diff --check" to do the same.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 cache.h                    |    1 +
 diff.c                     |   17 +++++++++++++++--
 t/t4015-diff-whitespace.sh |    6 ++++++
 ws.c                       |   15 +++++++++++++++
 4 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 3dfa53c..188428d 100644
--- a/cache.h
+++ b/cache.h
@@ -823,6 +823,7 @@ extern unsigned ws_check(const char *line, int len, unsigned ws_rule);
 extern void ws_check_emit(const char *line, int len, unsigned ws_rule, FILE *stream, const char *set, const char *reset, const char *ws);
 extern char *whitespace_error_string(unsigned ws);
 extern int ws_fix_copy(char *, const char *, int, unsigned, int *);
+extern int ws_blank_line(const char *line, int len, unsigned ws_rule);
 
 /* ls-files */
 int pathspec_match(const char **spec, char *matched, const char *filename, int skiplen);
diff --git a/diff.c b/diff.c
index 6bcbe20..f31c721 100644
--- a/diff.c
+++ b/diff.c
@@ -1140,6 +1140,7 @@ struct checkdiff_t {
 	struct diff_options *o;
 	unsigned ws_rule;
 	unsigned status;
+	int trailing_blanks_start;
 };
 
 static void checkdiff_consume(void *priv, char *line, unsigned long len)
@@ -1154,6 +1155,10 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
 	if (line[0] == '+') {
 		unsigned bad;
 		data->lineno++;
+		if (!ws_blank_line(line + 1, len - 1, data->ws_rule))
+			data->trailing_blanks_start = 0;
+		else if (!data->trailing_blanks_start)
+			data->trailing_blanks_start = data->lineno;
 		bad = ws_check(line + 1, len - 1, data->ws_rule);
 		if (!bad)
 			return;
@@ -1165,14 +1170,16 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
 		emit_line(data->o->file, set, reset, line, 1);
 		ws_check_emit(line + 1, len - 1, data->ws_rule,
 			      data->o->file, set, reset, ws);
-	} else if (line[0] == ' ')
+	} else if (line[0] == ' ') {
 		data->lineno++;
-	else if (line[0] == '@') {
+		data->trailing_blanks_start = 0;
+	} else if (line[0] == '@') {
 		char *plus = strchr(line, '+');
 		if (plus)
 			data->lineno = strtol(plus, NULL, 10) - 1;
 		else
 			die("invalid diff");
+		data->trailing_blanks_start = 0;
 	}
 }
 
@@ -1584,6 +1591,12 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 		ecb.outf = xdiff_outf;
 		ecb.priv = &data;
 		xdi_diff(&mf1, &mf2, &xpp, &xecfg, &ecb);
+
+		if (data.trailing_blanks_start) {
+			fprintf(o->file, "%s:%d: ends with blank lines.\n",
+				data.filename, data.trailing_blanks_start);
+			data.status = 1; /* report errors */
+		}
 	}
  free_and_return:
 	diff_free_filespec_data(one);
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index b7cc6b2..0922c70 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -335,4 +335,10 @@ test_expect_success 'line numbers in --check output are correct' '
 
 '
 
+test_expect_success 'checkdiff detects trailing blank lines' '
+	echo "foo();" >x &&
+	echo "" >>x &&
+	git diff --check | grep "ends with blank"
+'
+
 test_done
diff --git a/ws.c b/ws.c
index 24d3e3d..7a7ff13 100644
--- a/ws.c
+++ b/ws.c
@@ -225,6 +225,21 @@ unsigned ws_check(const char *line, int len, unsigned ws_rule)
 	return ws_check_emit_1(line, len, ws_rule, NULL, NULL, NULL, NULL);
 }
 
+int ws_blank_line(const char *line, int len, unsigned ws_rule)
+{
+	/*
+	 * We _might_ want to treat CR differently from other
+	 * whitespace characters when ws_rule has WS_CR_AT_EOL, but
+	 * for now we just use this stupid definition.
+	 */
+	while (len-- > 0) {
+		if (!isspace(*line))
+			return 0;
+		line++;
+	}
+	return 1;
+}
+
 /* Copy the line to the buffer while fixing whitespaces */
 int ws_fix_copy(char *dst, const char *src, int len, unsigned ws_rule, int *error_count)
 {
-- 
1.5.6.1.78.gde8d9

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] diff --check: detect leftover conflict markers
  2008-06-26 22:33         ` Junio C Hamano
                             ` (3 preceding siblings ...)
  2008-06-26 22:36           ` [PATCH 4/5] Teach "diff --check" about a new blank lines at end Junio C Hamano
@ 2008-06-26 22:37           ` Junio C Hamano
  2008-06-26 23:01           ` [PATCH v2] pre-commit hook should ignore carriage returns at EOL Junio C Hamano
  5 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 22:37 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

This teaches "diff --check" to detect and complain if new lines
have lines that look like leftover conflict markers.

We should be able to remove the old Perl script used in the sample
pre-commit hook and modernize the script with this facility.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c                 |   35 +++++++++++++++++++++++++++++++++++
 t/t4017-diff-retval.sh |   14 ++++++++++++++
 2 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/diff.c b/diff.c
index f31c721..d515b06 100644
--- a/diff.c
+++ b/diff.c
@@ -1143,6 +1143,35 @@ struct checkdiff_t {
 	int trailing_blanks_start;
 };
 
+static int is_conflict_marker(const char *line, unsigned long len)
+{
+	char firstchar;
+	int cnt;
+
+	if (len < 8)
+		return 0;
+	firstchar = line[0];
+	switch (firstchar) {
+	case '=': case '>': case '<':
+		break;
+	default:
+		return 0;
+	}
+	for (cnt = 1; cnt < 7; cnt++)
+		if (line[cnt] != firstchar)
+			return 0;
+	/* line[0] thru line[6] are same as firstchar */
+	if (firstchar == '=') {
+		/* divider between ours and theirs? */
+		if (len != 8 || line[7] != '\n')
+			return 0;
+	} else if (len < 8 || !isspace(line[7])) {
+		/* not divider before ours nor after theirs */
+		return 0;
+	}
+	return 1;
+}
+
 static void checkdiff_consume(void *priv, char *line, unsigned long len)
 {
 	struct checkdiff_t *data = priv;
@@ -1159,6 +1188,12 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
 			data->trailing_blanks_start = 0;
 		else if (!data->trailing_blanks_start)
 			data->trailing_blanks_start = data->lineno;
+		if (is_conflict_marker(line + 1, len - 1)) {
+			data->status |= 1;
+			fprintf(data->o->file,
+				"%s:%d: leftover conflict marker\n",
+				data->filename, data->lineno);
+		}
 		bad = ws_check(line + 1, len - 1, data->ws_rule);
 		if (!bad)
 			return;
diff --git a/t/t4017-diff-retval.sh b/t/t4017-diff-retval.sh
index 0d0fb87..d748d45 100755
--- a/t/t4017-diff-retval.sh
+++ b/t/t4017-diff-retval.sh
@@ -113,4 +113,18 @@ test_expect_success 'check should test not just the last line' '
 
 '
 
+test_expect_success 'check detects leftover conflict markers' '
+	git reset --hard &&
+	git checkout HEAD^ &&
+	echo binary >>b &&
+	git commit -m "side" b &&
+	test_must_fail git merge master &&
+	git add b && (
+		git --no-pager diff --cached --check >test.out
+		test $? = 2
+	) &&
+	test "$(grep "conflict marker" test.out | wc -l)" = 3 &&
+	git reset --hard
+'
+
 test_done
-- 
1.5.6.1.78.gde8d9

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-26 22:33         ` Junio C Hamano
                             ` (4 preceding siblings ...)
  2008-06-26 22:37           ` [PATCH 5/5] diff --check: detect leftover conflict markers Junio C Hamano
@ 2008-06-26 23:01           ` Junio C Hamano
  2008-06-26 23:06             ` Junio C Hamano
  5 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 23:01 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

Junio C Hamano <gitster@pobox.com> writes:

> Christian Holtje <docwhat@gmail.com> writes:
>
>>> I suggested using "diff --check" (and possibly teaching "diff --check"
>>> other things the scripted example checks, such as conflict markers),
>>> which would know to honor the line endings specified per path via
>>> gitattributes(5), instead of building on top of the big Perl script,
>>> and I
>>> had an impression that you agreed to the approach.
>>
>> I'm completely confused how gitattributes and core.autocrlf interact,
>> etc.
>
> Here is a series I just cooked up so that we can remove the whole Perl
> script and replace it by adding --check to "diff-index" used there. 
>
> The first three are code clean-ups and the last two implements necessary
> new features to "diff --check".  The whole series somewhat depend on the
> fix to 'maint' not to lose the exit status I sent earlier.
>
> [PATCH 1/5] diff --check: explain why we do not care whether old side is binary
> [PATCH 2/5] check_and_emit_line(): rename and refactor
> [PATCH 3/5] checkdiff: pass diff_options to the callback
> [PATCH 4/5] Teach "diff --check" about a new blank lines at end
> [PATCH 5/5] diff --check: detect leftover conflict markers

With these enhancements in place, I think the pre-commit hook to find
problematic change would become essentially a one-liner, something like:

	git diff-index --check -M --cached

and the checking will obey what you configured with core.whitespace, which
globally defines what kind of whitespace breakages are "problematic",
and/or whitespace attribute which determines the same per path.

If you have for example Python source files that you would want all the
default whitespace checks (that is, trailing whitespaces are not allowed,
initial indentation part should not have SP followed by HT), you would
have

	*.py whitespace=trail,space-before-tab

in your .gitattributes, and the above command would catch such a
breakage.  If you further want to catch indentation with more than 8
SPs that can be replaced with HTs in your C sources, you would say:

	*.[ch] whitespace=indent-with-no-tab,trail,space-before-tab

You could choose to have CRLF line endings in the repository [*1*], and
for such projects, diff output would have tons of lines that end with
CRs.  To consider these CRs part of the line terminator, add cr-at-eol
to the value of whitespace attribute, like so:

	*.py whitespace=trail,space,cr-at-eol
	*.[ch] whitespace=indent,trail,space,cr-at-eol

[Footnote]

*1* I do not do Windows, but my understanding is that this practice is not
recommended because it would hurt cross-platform use of the project.  You
would instead keep your repository copy with LF line endings, and make
your checkouts have CRLF line endings by core.autocrlf configuration.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-26 23:01           ` [PATCH v2] pre-commit hook should ignore carriage returns at EOL Junio C Hamano
@ 2008-06-26 23:06             ` Junio C Hamano
  2008-06-27  4:24               ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-06-26 23:06 UTC (permalink / raw)
  To: Christian Holtje; +Cc: Alex Riesen, git

Junio C Hamano <gitster@pobox.com> writes:

> With these enhancements in place, I think the pre-commit hook to find
> problematic change would become essentially a one-liner, something like:
>
> 	git diff-index --check -M --cached

More specifically, it would be like this.

 templates/hooks--pre-commit.sample |   89 ++++++++----------------------------
 1 files changed, 19 insertions(+), 70 deletions(-)

diff --git a/templates/hooks--pre-commit.sample b/templates/hooks--pre-commit.sample
dissimilarity index 79%
index 71c10f2..c302ed1 100755
--- a/templates/hooks--pre-commit.sample
+++ b/templates/hooks--pre-commit.sample
@@ -1,70 +1,19 @@
-#!/bin/sh
-#
-# An example hook script to verify what is about to be committed.
-# Called by git-commit with no arguments.  The hook should
-# exit with non-zero status after issuing an appropriate message if
-# it wants to stop the commit.
-#
-# To enable this hook, rename this file to "pre-commit".
-
-# This is slightly modified from Andrew Morton's Perfect Patch.
-# Lines you introduce should not have trailing whitespace.
-# Also check for an indentation that has SP before a TAB.
-
-if git-rev-parse --verify HEAD 2>/dev/null
-then
-	git-diff-index -p -M --cached HEAD --
-else
-	# NEEDSWORK: we should produce a diff with an empty tree here
-	# if we want to do the same verification for the initial import.
-	:
-fi |
-perl -e '
-    my $found_bad = 0;
-    my $filename;
-    my $reported_filename = "";
-    my $lineno;
-    sub bad_line {
-	my ($why, $line) = @_;
-	if (!$found_bad) {
-	    print STDERR "*\n";
-	    print STDERR "* You have some suspicious patch lines:\n";
-	    print STDERR "*\n";
-	    $found_bad = 1;
-	}
-	if ($reported_filename ne $filename) {
-	    print STDERR "* In $filename\n";
-	    $reported_filename = $filename;
-	}
-	print STDERR "* $why (line $lineno)\n";
-	print STDERR "$filename:$lineno:$line\n";
-    }
-    while (<>) {
-	if (m|^diff --git a/(.*) b/\1$|) {
-	    $filename = $1;
-	    next;
-	}
-	if (/^@@ -\S+ \+(\d+)/) {
-	    $lineno = $1 - 1;
-	    next;
-	}
-	if (/^ /) {
-	    $lineno++;
-	    next;
-	}
-	if (s/^\+//) {
-	    $lineno++;
-	    chomp;
-	    if (/\s$/) {
-		bad_line("trailing whitespace", $_);
-	    }
-	    if (/^\s* \t/) {
-		bad_line("indent SP followed by a TAB", $_);
-	    }
-	    if (/^([<>])\1{6} |^={7}$/) {
-		bad_line("unresolved merge conflict", $_);
-	    }
-	}
-    }
-    exit($found_bad);
-'
+#!/bin/sh
+#
+# An example hook script to verify what is about to be committed.
+# Called by git-commit with no arguments.  The hook should
+# exit with non-zero status after issuing an appropriate message if
+# it wants to stop the commit.
+#
+# To enable this hook, rename this file to "pre-commit".
+
+if git-rev-parse --verify HEAD 2>/dev/null
+then
+	against=
+else
+	# Initial commit: diff against an empty tree object
+	against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+fi
+
+exec git diff-index --check --cached $against
+

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] pre-commit hook should ignore carriage returns at EOL
  2008-06-26 23:06             ` Junio C Hamano
@ 2008-06-27  4:24               ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2008-06-27  4:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christian Holtje, Alex Riesen, git

On Thu, Jun 26, 2008 at 04:06:56PM -0700, Junio C Hamano wrote:

> +if git-rev-parse --verify HEAD 2>/dev/null
> +then
> +	against=
> +else
> +	# Initial commit: diff against an empty tree object
> +	against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
> +fi

Heh, I am happy that the virtual empty tree object is coming in handy
again.

>From quick review, the series looks good to me (and it was, btw, very
easy to read -- I think that you, as somebody who reviews a lot of
patches, have gotten very good at splitting up your own sensibly. :) ).

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2008-06-27  4:25 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-24 19:21 [PATCH v2] pre-commit hook should ignore carriage returns at EOL Christian Holtje
2008-06-25 18:14 ` Alex Riesen
2008-06-25 18:47   ` Christian Holtje
2008-06-25 19:14     ` Junio C Hamano
2008-06-26  2:41       ` Christian Holtje
2008-06-26 22:33         ` Junio C Hamano
2008-06-26 22:34           ` [PATCH 1/5] diff --check: explain why we do not care whether old side is binary Junio C Hamano
2008-06-26 22:35           ` [PATCH 2/5] check_and_emit_line(): rename and refactor Junio C Hamano
2008-06-26 22:36           ` [PATCH 3/5] checkdiff: pass diff_options to the callback Junio C Hamano
2008-06-26 22:36           ` [PATCH 4/5] Teach "diff --check" about a new blank lines at end Junio C Hamano
2008-06-26 22:37           ` [PATCH 5/5] diff --check: detect leftover conflict markers Junio C Hamano
2008-06-26 23:01           ` [PATCH v2] pre-commit hook should ignore carriage returns at EOL Junio C Hamano
2008-06-26 23:06             ` Junio C Hamano
2008-06-27  4:24               ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).