git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] userdiff: allow * between cpp funcname words
@ 2011-12-06 16:35 Thomas Rast
  2011-12-06 19:02 ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Rast @ 2011-12-06 16:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

The cpp pattern, used for C and C++, would not match the start of a
declaration such as

  static char *prepare_index(int argc,

because it did not allow for * anywhere between the various words that
constitute the modifiers, type and function name.  Fix it.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---

This is a really sneaky one-character bug that I cannot believe went
unnoticed for so long, seeing as there are plenty of instances within
git itself where it matters.


 userdiff.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/userdiff.c b/userdiff.c
index 7c983c1..76109da 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -118,7 +118,7 @@
 	 /* Jump targets or access declarations */
 	 "!^[ \t]*[A-Za-z_][A-Za-z_0-9]*:.*$\n"
 	 /* C/++ functions/methods at top level */
-	 "^([A-Za-z_][A-Za-z_0-9]*([ \t]+[A-Za-z_][A-Za-z_0-9]*([ \t]*::[ \t]*[^[:space:]]+)?){1,}[ \t]*\\([^;]*)$\n"
+	 "^([A-Za-z_][A-Za-z_0-9]*([ \t*]+[A-Za-z_][A-Za-z_0-9]*([ \t]*::[ \t]*[^[:space:]]+)?){1,}[ \t]*\\([^;]*)$\n"
 	 /* compound type at top level */
 	 "^((struct|class|enum)[^;]*)$",
 	 /* -- */
-- 
1.7.8.424.gcb840

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-06 16:35 [PATCH] userdiff: allow * between cpp funcname words Thomas Rast
@ 2011-12-06 19:02 ` Jeff King
  2011-12-06 20:17   ` Thomas Rast
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff King @ 2011-12-06 19:02 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, git

On Tue, Dec 06, 2011 at 05:35:08PM +0100, Thomas Rast wrote:

> The cpp pattern, used for C and C++, would not match the start of a
> declaration such as
> 
>   static char *prepare_index(int argc,
> 
> because it did not allow for * anywhere between the various words that
> constitute the modifiers, type and function name.  Fix it.
> 
> Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> ---
> 
> This is a really sneaky one-character bug that I cannot believe went
> unnoticed for so long, seeing as there are plenty of instances within
> git itself where it matters.

Looks reasonable to me. You can see the difference, for instance, with:

  git show -U1 3c73a1d

(The -U1 is because of the annoying "we will start looking for the
header at the top of context, not the top of changes" behavior I
mentioned last week).

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-06 19:02 ` Jeff King
@ 2011-12-06 20:17   ` Thomas Rast
  2011-12-06 20:19     ` Jeff King
  2011-12-06 21:15     ` Junio C Hamano
  0 siblings, 2 replies; 9+ messages in thread
From: Thomas Rast @ 2011-12-06 20:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

Jeff King wrote:
> On Tue, Dec 06, 2011 at 05:35:08PM +0100, Thomas Rast wrote:
> 
> > The cpp pattern, used for C and C++, would not match the start of a
> > declaration such as
> > 
> >   static char *prepare_index(int argc,
> > 
> > because it did not allow for * anywhere between the various words that
> > constitute the modifiers, type and function name.  Fix it.
> > 
> > Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> > ---
> > 
> > This is a really sneaky one-character bug that I cannot believe went
> > unnoticed for so long, seeing as there are plenty of instances within
> > git itself where it matters.
> 
> Looks reasonable to me. You can see the difference, for instance, with:
> 
>   git show -U1 3c73a1d
> 
> (The -U1 is because of the annoying "we will start looking for the
> header at the top of context, not the top of changes" behavior I
> mentioned last week).

Actually (sadly) I'll have to revise it.  It doesn't match much of C++
either, and I haven't yet come up with a reasonable regex that
matches, say,

  foo::Bar<int>::t& Baz::operator<<(

which I would call ludicrous, but it's valid C++.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-06 20:17   ` Thomas Rast
@ 2011-12-06 20:19     ` Jeff King
  2011-12-06 20:52       ` Johannes Sixt
  2011-12-06 21:15     ` Junio C Hamano
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff King @ 2011-12-06 20:19 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, git

On Tue, Dec 06, 2011 at 09:17:56PM +0100, Thomas Rast wrote:

> > Looks reasonable to me. You can see the difference, for instance, with:
> > 
> >   git show -U1 3c73a1d
> > 
> > (The -U1 is because of the annoying "we will start looking for the
> > header at the top of context, not the top of changes" behavior I
> > mentioned last week).
> 
> Actually (sadly) I'll have to revise it.  It doesn't match much of C++
> either, and I haven't yet come up with a reasonable regex that
> matches, say,
> 
>   foo::Bar<int>::t& Baz::operator<<(
> 
> which I would call ludicrous, but it's valid C++.

Ick, yeah. Maybe it is worth doing the "*" thing for now, and then
worrying about advanced C++ stuff on top as another patch. AFAICT, your
original patch is a strict improvement.

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-06 20:19     ` Jeff King
@ 2011-12-06 20:52       ` Johannes Sixt
  2011-12-06 21:07         ` René Scharfe
  0 siblings, 1 reply; 9+ messages in thread
From: Johannes Sixt @ 2011-12-06 20:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Thomas Rast, Junio C Hamano, git

Am 06.12.2011 21:19, schrieb Jeff King:
> On Tue, Dec 06, 2011 at 09:17:56PM +0100, Thomas Rast wrote:
> 
>>> Looks reasonable to me. You can see the difference, for instance, with:
>>>
>>>   git show -U1 3c73a1d
>>>
>>> (The -U1 is because of the annoying "we will start looking for the
>>> header at the top of context, not the top of changes" behavior I
>>> mentioned last week).
>>
>> Actually (sadly) I'll have to revise it.  It doesn't match much of C++
>> either, and I haven't yet come up with a reasonable regex that
>> matches, say,
>>
>>   foo::Bar<int>::t& Baz::operator<<(
>>
>> which I would call ludicrous, but it's valid C++.
> 
> Ick, yeah. Maybe it is worth doing the "*" thing for now, and then
> worrying about advanced C++ stuff on top as another patch. AFAICT, your
> original patch is a strict improvement.

Excuse me, where's the problem? The above example shows this

@@ -105,8 +105,8 @@ char *url_decode(const char *url)
        struct strbuf out = STRBUF_INIT;
-       const char *slash = strchr(url, '/');
+       const char *colon = strchr(url, ':');
...

with current 4cb5d10b. This looks quite correct, no?

-- Hannes

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-06 20:52       ` Johannes Sixt
@ 2011-12-06 21:07         ` René Scharfe
  0 siblings, 0 replies; 9+ messages in thread
From: René Scharfe @ 2011-12-06 21:07 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jeff King, Thomas Rast, Junio C Hamano, git

Am 06.12.2011 21:52, schrieb Johannes Sixt:
> Am 06.12.2011 21:19, schrieb Jeff King:
>> On Tue, Dec 06, 2011 at 09:17:56PM +0100, Thomas Rast wrote:
>>
>>>> Looks reasonable to me. You can see the difference, for instance, with:
>>>>
>>>>    git show -U1 3c73a1d
>>>>
>>>> (The -U1 is because of the annoying "we will start looking for the
>>>> header at the top of context, not the top of changes" behavior I
>>>> mentioned last week).
>>>
>>> Actually (sadly) I'll have to revise it.  It doesn't match much of C++
>>> either, and I haven't yet come up with a reasonable regex that
>>> matches, say,
>>>
>>>    foo::Bar<int>::t&  Baz::operator<<(
>>>
>>> which I would call ludicrous, but it's valid C++.
>>
>> Ick, yeah. Maybe it is worth doing the "*" thing for now, and then
>> worrying about advanced C++ stuff on top as another patch. AFAICT, your
>> original patch is a strict improvement.
>
> Excuse me, where's the problem? The above example shows this
>
> @@ -105,8 +105,8 @@ char *url_decode(const char *url)
>          struct strbuf out = STRBUF_INIT;
> -       const char *slash = strchr(url, '/');
> +       const char *colon = strchr(url, ':');
> ...
>
> with current 4cb5d10b. This looks quite correct, no?

That's with the default heuristic; try something like this first to turn 
on userdiff:

	$ echo url.c diff=cpp >>.gitattributes

René

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-06 20:17   ` Thomas Rast
  2011-12-06 20:19     ` Jeff King
@ 2011-12-06 21:15     ` Junio C Hamano
  2011-12-07  8:04       ` Thomas Rast
  1 sibling, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2011-12-06 21:15 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Jeff King, git

Thomas Rast <trast@student.ethz.ch> writes:

> Actually (sadly) I'll have to revise it.  It doesn't match much of C++
> either, and I haven't yet come up with a reasonable regex that
> matches, say,
>
>   foo::Bar<int>::t& Baz::operator<<(
>
> which I would call ludicrous, but it's valid C++.

Heh, I'd rather not see us go that route, which would either end up
implementing a C++ parser or reverting the heuristics back to "non-blank
at the beginning of the line" that was already reasonably useful.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-06 21:15     ` Junio C Hamano
@ 2011-12-07  8:04       ` Thomas Rast
  2011-12-07 21:13         ` Johannes Sixt
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Rast @ 2011-12-07  8:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git

Junio C Hamano wrote:
> Thomas Rast <trast@student.ethz.ch> writes:
> 
> > Actually (sadly) I'll have to revise it.  It doesn't match much of C++
> > either, and I haven't yet come up with a reasonable regex that
> > matches, say,
> >
> >   foo::Bar<int>::t& Baz::operator<<(
> >
> > which I would call ludicrous, but it's valid C++.
> 
> Heh, I'd rather not see us go that route, which would either end up
> implementing a C++ parser or reverting the heuristics back to "non-blank
> at the beginning of the line" that was already reasonably useful.

Well, there are many things that we deliberately do not match right
now and for which that's a good thing:

  label:
  public:
  void declaration_only(...);
  int global_variable;

At some point I was wondering whether it would be better to just
declare a non-match for '.*;' and '^[A-Za-z_][A-Za-z_0-9]+:', and
otherwise match all '^[A-Za-z].*\(' but I may be missing something.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] userdiff: allow * between cpp funcname words
  2011-12-07  8:04       ` Thomas Rast
@ 2011-12-07 21:13         ` Johannes Sixt
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Sixt @ 2011-12-07 21:13 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, Jeff King, git

Am 07.12.2011 09:04, schrieb Thomas Rast:
> Junio C Hamano wrote:
>> Thomas Rast <trast@student.ethz.ch> writes:
>>
>>> Actually (sadly) I'll have to revise it.  It doesn't match much of C++
>>> either, and I haven't yet come up with a reasonable regex that
>>> matches, say,
>>>
>>>   foo::Bar<int>::t& Baz::operator<<(
>>>
>>> which I would call ludicrous, but it's valid C++.
>>
>> Heh, I'd rather not see us go that route, which would either end up
>> implementing a C++ parser or reverting the heuristics back to "non-blank
>> at the beginning of the line" that was already reasonably useful.
> 
> Well, there are many things that we deliberately do not match right
> now and for which that's a good thing:
> 
>   label:
>   public:
>   void declaration_only(...);
>   int global_variable;
> 
> At some point I was wondering whether it would be better to just
> declare a non-match for '.*;' and '^[A-Za-z_][A-Za-z_0-9]+:', and
> otherwise match all '^[A-Za-z].*\(' but I may be missing something.

The current cpp pattern doesn't work that well with C++. Since it
requires a blank before a name before the opening parentheses, it
doesn't catch constructors:

Foo::Foo()

and it should fail for GNU style C function definitions as well (I
didn't test):

void
do_the_foo()

I'll run this pattern for a while:

diff.cpp.xfuncname=!^[a-zA-Z_][a-zA-Z_0-9]*[[:space:]]*:[[:space:]]*$
^[a-zA-Z_].*

BTW, your match pattern requires an opening parenthesis; it would not
catch class definitions.

-- Hannes

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-12-07 21:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-06 16:35 [PATCH] userdiff: allow * between cpp funcname words Thomas Rast
2011-12-06 19:02 ` Jeff King
2011-12-06 20:17   ` Thomas Rast
2011-12-06 20:19     ` Jeff King
2011-12-06 20:52       ` Johannes Sixt
2011-12-06 21:07         ` René Scharfe
2011-12-06 21:15     ` Junio C Hamano
2011-12-07  8:04       ` Thomas Rast
2011-12-07 21:13         ` Johannes Sixt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).