Git development
 help / color / mirror / Atom feed
* [PATCH] pre-commit hook: less easily-tripped conflict marker detection
@ 2006-06-27  9:06 Eric Wong
  2006-06-27 17:24 ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Wong @ 2006-06-27  9:06 UTC (permalink / raw)
  To: git, Junio C Hamano; +Cc: Eric Wong

This should make adding asciidoc files to Documentation easier.

Only complain about conflict markers if we see that we have
some combination of '<<<<<<< ', '>>>>>>> ', and '======='.

Also add a NO_VERIFY environment check to this hook, in case
there's something that we want to force in but still gets
tripped by this hook.  It'd be a lot more work to add
--no-verify flags to all things that could potentially call this
hook.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 templates/hooks--pre-commit |   32 ++++++++++++++++++++++++++++----
 1 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/templates/hooks--pre-commit b/templates/hooks--pre-commit
index 723a9ef..1657f96 100644
--- a/templates/hooks--pre-commit
+++ b/templates/hooks--pre-commit
@@ -1,4 +1,5 @@
 #!/bin/sh
+test -n "$NO_VERIFY" && exit 0
 #
 # An example hook script to verify what is about to be committed.
 # Called by git-commit with no arguments.  The hook should
@@ -24,8 +25,9 @@ perl -e '
     my $filename;
     my $reported_filename = "";
     my $lineno;
-    sub bad_line {
-	my ($why, $line) = @_;
+    my $in_unresolved;
+    my @unresolved;
+    sub bad_common {
 	if (!$found_bad) {
 	    print STDERR "*\n";
 	    print STDERR "* You have some suspicious patch lines:\n";
@@ -36,12 +38,34 @@ perl -e '
 	    print STDERR "* In $filename\n";
 	    $reported_filename = $filename;
 	}
+    }
+    sub bad_line {
+	my ($why, $line) = @_;
+	bad_common();
 	print STDERR "* $why (line $lineno)\n";
 	print STDERR "$filename:$lineno:$line\n";
     }
+    sub show_unresolved {
+	# if we want even less easily-tripped checks,
+	# change the "||" to "&&" here.  Right now, we can deal with
+	# the case where somebody removed one of the <{7} or >{7} lines
+	# but left the other one (as well as ={7}) in there.
+	if (($unresolved[0]->[0] =~ /^<{7} / ||
+				$unresolved[-1]->[0] =~ /^>{7} /) &&
+				grep { $_->[0] =~ /^={7}$/ } @unresolved) {
+	    bad_common();
+	    foreach my $l (@unresolved) {
+		print STDERR "* unresolved merge conflict (line $l->[1])\n";
+		print STDERR "$filename:$l->[1]:$l->[0]\n"
+	    }
+	}
+	@unresolved = ();
+    }
+
     while (<>) {
 	if (m|^diff --git a/(.*) b/\1$|) {
 	    $filename = $1;
+	    show_unresolved() if @unresolved;
 	    next;
 	}
 	if (/^@@ -\S+ \+(\d+)/) {
@@ -61,8 +85,8 @@ perl -e '
 	    if (/^\s* 	/) {
 		bad_line("indent SP followed by a TAB", $_);
 	    }
-	    if (/^(?:[<>=]){7}/) {
-		bad_line("unresolved merge conflict", $_);
+	    if (/^[<>]{7} / || /^={7}$/) {
+		push @unresolved, [ $_, $lineno ];
 	    }
 	}
     }
-- 
1.4.1.rc1.g2faf-dirty

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] pre-commit hook: less easily-tripped conflict marker detection
  2006-06-27  9:06 [PATCH] pre-commit hook: less easily-tripped conflict marker detection Eric Wong
@ 2006-06-27 17:24 ` Junio C Hamano
  2006-06-27 22:32   ` Eric Wong
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2006-06-27 17:24 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

Eric Wong <normalperson@yhbt.net> writes:

> This should make adding asciidoc files to Documentation easier.
>
> Only complain about conflict markers if we see that we have
> some combination of '<<<<<<< ', '>>>>>>> ', and '======='.

Are you sure everybody uses exactly seven?  I did not have SP
after a run of '<' and '>' because I didn't know.

> Also add a NO_VERIFY environment check to this hook, in case
> there's something that we want to force in but still gets
> tripped by this hook.

Hmm.  Undecided.

> @@ -24,8 +25,9 @@ perl -e '
>...
> +    my $in_unresolved;

Unused as far as I can see.

> +    sub show_unresolved {
> +	# if we want even less easily-tripped checks,
> +	# change the "||" to "&&" here.  Right now, we can deal with
> +	# the case where somebody removed one of the <{7} or >{7} lines
> +	# but left the other one (as well as ={7}) in there.

I think '||' is fine as is -- I think <<< and === removed with
>>> left is a common mistake (think of superseding a smallish
"our change" between <<< and === with much larger and polished
upstream "their change" after ===).  But I am not sure about the
code around here:

> +	if (($unresolved[0]->[0] =~ /^<{7} / ||
> +				$unresolved[-1]->[0] =~ /^>{7} /) &&
> +				grep { $_->[0] =~ /^={7}$/ } @unresolved) {
> +	    bad_common();
> +	    foreach my $l (@unresolved) {
> +		print STDERR "* unresolved merge conflict (line $l->[1])\n";
> +		print STDERR "$filename:$l->[1]:$l->[0]\n"
> +	    }
> +	}

 - why check only the first and last element in @unresolved but
   use all of them without checking the ones in-between?

 - if you are keeping the range in an array, maybe the error
   message can point at the range.

Printing $l->[0] is not so useful (the user sees only the
conflict marker) but the original code did so only because it
operated on one-line at a time.

> +	@unresolved = ();
> +    }
> +
>      while (<>) {
>  	if (m|^diff --git a/(.*) b/\1$|) {
>  	    $filename = $1;
> +	    show_unresolved() if @unresolved;
>  	    next;
>  	}
>  	if (/^@@ -\S+ \+(\d+)/) {
> @@ -61,8 +85,8 @@ perl -e '
>  	    if (/^\s* 	/) {
>  		bad_line("indent SP followed by a TAB", $_);
>  	    }
> -	    if (/^(?:[<>=]){7}/) {
> -		bad_line("unresolved merge conflict", $_);
> +	    if (/^[<>]{7} / || /^={7}$/) {
> +		push @unresolved, [ $_, $lineno ];
>  	    }
>  	}
>      }

Maybe you are missing show_unresolved() for the last patch here?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] pre-commit hook: less easily-tripped conflict marker detection
  2006-06-27 17:24 ` Junio C Hamano
@ 2006-06-27 22:32   ` Eric Wong
  2006-06-28  1:58     ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Wong @ 2006-06-27 22:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
> 
> > This should make adding asciidoc files to Documentation easier.
> >
> > Only complain about conflict markers if we see that we have
> > some combination of '<<<<<<< ', '>>>>>>> ', and '======='.
> 
> Are you sure everybody uses exactly seven?  I did not have SP
> after a run of '<' and '>' because I didn't know.

That's what rerere checks for.  I'm not sure about other programs
leave 3-way merge markers besides merge(1).

> > @@ -24,8 +25,9 @@ perl -e '
> >...
> > +    my $in_unresolved;
> 
> Unused as far as I can see.

Oops, I was planning to store the entire hunk in @unresolved, but
decided line numbers were enough.

> > +    sub show_unresolved {
> > +	# if we want even less easily-tripped checks,
> > +	# change the "||" to "&&" here.  Right now, we can deal with
> > +	# the case where somebody removed one of the <{7} or >{7} lines
> > +	# but left the other one (as well as ={7}) in there.
> 
> I think '||' is fine as is -- I think <<< and === removed with
> >>> left is a common mistake (think of superseding a smallish
> "our change" between <<< and === with much larger and polished
> upstream "their change" after ===).  But I am not sure about the
> code around here:

My code won't detect when <<< and === are removed with only >>>
left.  I didn't think it was a common mistake at all.

> > +	if (($unresolved[0]->[0] =~ /^<{7} / ||
> > +				$unresolved[-1]->[0] =~ /^>{7} /) &&
> > +				grep { $_->[0] =~ /^={7}$/ } @unresolved) {
> > +	    bad_common();
> > +	    foreach my $l (@unresolved) {
> > +		print STDERR "* unresolved merge conflict (line $l->[1])\n";
> > +		print STDERR "$filename:$l->[1]:$l->[0]\n"
> > +	    }
> > +	}
> 
>  - why check only the first and last element in @unresolved but
>    use all of them without checking the ones in-between?

I assume that any partially finished resolutions would either start with
<<< or end with >>>.  I don't actually know which are the most oftenly
made mistakes when accidentally committing merge-conflicts.

>  - if you are keeping the range in an array, maybe the error
>    message can point at the range.
> 
> Printing $l->[0] is not so useful (the user sees only the
> conflict marker) but the original code did so only because it
> operated on one-line at a time.

I agree.  I was thinking about adding entire hunks with $in_unresolved,
but forgot or decided against it (probably out of laziness, as it was
late when I did this).

> > +	@unresolved = ();
> > +    }
> > +
> >      while (<>) {
> >  	if (m|^diff --git a/(.*) b/\1$|) {
> >  	    $filename = $1;
> > +	    show_unresolved() if @unresolved;
> >  	    next;
> >  	}
> >  	if (/^@@ -\S+ \+(\d+)/) {
> > @@ -61,8 +85,8 @@ perl -e '
> >  	    if (/^\s* 	/) {
> >  		bad_line("indent SP followed by a TAB", $_);
> >  	    }
> > -	    if (/^(?:[<>=]){7}/) {
> > -		bad_line("unresolved merge conflict", $_);
> > +	    if (/^[<>]{7} / || /^={7}$/) {
> > +		push @unresolved, [ $_, $lineno ];
> >  	    }
> >  	}
> >      }
> 
> Maybe you are missing show_unresolved() for the last patch here?

Oops, good catch.

<finally, moved from above>:
> > Also add a NO_VERIFY environment check to this hook, in case
> > there's something that we want to force in but still gets
> > tripped by this hook.
> 
> Hmm.  Undecided.

At this point, I think this is probably the best change to make.  There
are many things that a user could do that an automated checker could
miss, and there are also many things that it could be overchecking for.


-	    if (/^(?:[<>=]){7}/) {
+	    if (/^[<>]{7} / || /^={7}$/) {

I would also make this change, because I'm pretty certain 7 characters
(and one space for [<>]) is standard for merge(1).  We already rely on
that for rerere.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] pre-commit hook: less easily-tripped conflict marker detection
  2006-06-27 22:32   ` Eric Wong
@ 2006-06-28  1:58     ` Junio C Hamano
  0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2006-06-28  1:58 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

Eric Wong <normalperson@yhbt.net> writes:

>> Hmm.  Undecided.
>
> At this point, I think this is probably the best change to make.  There
> are many things that a user could do that an automated checker could
> miss, and there are also many things that it could be overchecking for.

Agreed to the latter part of the last sentence.  Undecided about
the rest and the implementation.

>
> -	    if (/^(?:[<>=]){7}/) {
> +	    if (/^[<>]{7} / || /^={7}$/) {
>
> I would also make this change, because I'm pretty certain 7 characters
> (and one space for [<>]) is standard for merge(1).  We already rely on
> that for rerere.

One of the things we might want is to use diff3 instead of merge
but I presume the latter is a thin wrapper around the former so
that would be OK.  I am however not enthused about the @unresolved
array approach.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2006-06-28  1:58 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-27  9:06 [PATCH] pre-commit hook: less easily-tripped conflict marker detection Eric Wong
2006-06-27 17:24 ` Junio C Hamano
2006-06-27 22:32   ` Eric Wong
2006-06-28  1:58     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox