* [PATCH] userdiff: allow * between cpp funcname words
@ 2011-12-06 16:35 Thomas Rast
2011-12-06 19:02 ` Jeff King
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Rast @ 2011-12-06 16:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
The cpp pattern, used for C and C++, would not match the start of a
declaration such as
static char *prepare_index(int argc,
because it did not allow for * anywhere between the various words that
constitute the modifiers, type and function name. Fix it.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
This is a really sneaky one-character bug that I cannot believe went
unnoticed for so long, seeing as there are plenty of instances within
git itself where it matters.
userdiff.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/userdiff.c b/userdiff.c
index 7c983c1..76109da 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -118,7 +118,7 @@
/* Jump targets or access declarations */
"!^[ \t]*[A-Za-z_][A-Za-z_0-9]*:.*$\n"
/* C/++ functions/methods at top level */
- "^([A-Za-z_][A-Za-z_0-9]*([ \t]+[A-Za-z_][A-Za-z_0-9]*([ \t]*::[ \t]*[^[:space:]]+)?){1,}[ \t]*\\([^;]*)$\n"
+ "^([A-Za-z_][A-Za-z_0-9]*([ \t*]+[A-Za-z_][A-Za-z_0-9]*([ \t]*::[ \t]*[^[:space:]]+)?){1,}[ \t]*\\([^;]*)$\n"
/* compound type at top level */
"^((struct|class|enum)[^;]*)$",
/* -- */
--
1.7.8.424.gcb840
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-06 16:35 [PATCH] userdiff: allow * between cpp funcname words Thomas Rast
@ 2011-12-06 19:02 ` Jeff King
2011-12-06 20:17 ` Thomas Rast
0 siblings, 1 reply; 9+ messages in thread
From: Jeff King @ 2011-12-06 19:02 UTC (permalink / raw)
To: Thomas Rast; +Cc: Junio C Hamano, git
On Tue, Dec 06, 2011 at 05:35:08PM +0100, Thomas Rast wrote:
> The cpp pattern, used for C and C++, would not match the start of a
> declaration such as
>
> static char *prepare_index(int argc,
>
> because it did not allow for * anywhere between the various words that
> constitute the modifiers, type and function name. Fix it.
>
> Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> ---
>
> This is a really sneaky one-character bug that I cannot believe went
> unnoticed for so long, seeing as there are plenty of instances within
> git itself where it matters.
Looks reasonable to me. You can see the difference, for instance, with:
git show -U1 3c73a1d
(The -U1 is because of the annoying "we will start looking for the
header at the top of context, not the top of changes" behavior I
mentioned last week).
-Peff
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-06 19:02 ` Jeff King
@ 2011-12-06 20:17 ` Thomas Rast
2011-12-06 20:19 ` Jeff King
2011-12-06 21:15 ` Junio C Hamano
0 siblings, 2 replies; 9+ messages in thread
From: Thomas Rast @ 2011-12-06 20:17 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
Jeff King wrote:
> On Tue, Dec 06, 2011 at 05:35:08PM +0100, Thomas Rast wrote:
>
> > The cpp pattern, used for C and C++, would not match the start of a
> > declaration such as
> >
> > static char *prepare_index(int argc,
> >
> > because it did not allow for * anywhere between the various words that
> > constitute the modifiers, type and function name. Fix it.
> >
> > Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> > ---
> >
> > This is a really sneaky one-character bug that I cannot believe went
> > unnoticed for so long, seeing as there are plenty of instances within
> > git itself where it matters.
>
> Looks reasonable to me. You can see the difference, for instance, with:
>
> git show -U1 3c73a1d
>
> (The -U1 is because of the annoying "we will start looking for the
> header at the top of context, not the top of changes" behavior I
> mentioned last week).
Actually (sadly) I'll have to revise it. It doesn't match much of C++
either, and I haven't yet come up with a reasonable regex that
matches, say,
foo::Bar<int>::t& Baz::operator<<(
which I would call ludicrous, but it's valid C++.
--
Thomas Rast
trast@{inf,student}.ethz.ch
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-06 20:17 ` Thomas Rast
@ 2011-12-06 20:19 ` Jeff King
2011-12-06 20:52 ` Johannes Sixt
2011-12-06 21:15 ` Junio C Hamano
1 sibling, 1 reply; 9+ messages in thread
From: Jeff King @ 2011-12-06 20:19 UTC (permalink / raw)
To: Thomas Rast; +Cc: Junio C Hamano, git
On Tue, Dec 06, 2011 at 09:17:56PM +0100, Thomas Rast wrote:
> > Looks reasonable to me. You can see the difference, for instance, with:
> >
> > git show -U1 3c73a1d
> >
> > (The -U1 is because of the annoying "we will start looking for the
> > header at the top of context, not the top of changes" behavior I
> > mentioned last week).
>
> Actually (sadly) I'll have to revise it. It doesn't match much of C++
> either, and I haven't yet come up with a reasonable regex that
> matches, say,
>
> foo::Bar<int>::t& Baz::operator<<(
>
> which I would call ludicrous, but it's valid C++.
Ick, yeah. Maybe it is worth doing the "*" thing for now, and then
worrying about advanced C++ stuff on top as another patch. AFAICT, your
original patch is a strict improvement.
-Peff
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-06 20:19 ` Jeff King
@ 2011-12-06 20:52 ` Johannes Sixt
2011-12-06 21:07 ` René Scharfe
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Sixt @ 2011-12-06 20:52 UTC (permalink / raw)
To: Jeff King; +Cc: Thomas Rast, Junio C Hamano, git
Am 06.12.2011 21:19, schrieb Jeff King:
> On Tue, Dec 06, 2011 at 09:17:56PM +0100, Thomas Rast wrote:
>
>>> Looks reasonable to me. You can see the difference, for instance, with:
>>>
>>> git show -U1 3c73a1d
>>>
>>> (The -U1 is because of the annoying "we will start looking for the
>>> header at the top of context, not the top of changes" behavior I
>>> mentioned last week).
>>
>> Actually (sadly) I'll have to revise it. It doesn't match much of C++
>> either, and I haven't yet come up with a reasonable regex that
>> matches, say,
>>
>> foo::Bar<int>::t& Baz::operator<<(
>>
>> which I would call ludicrous, but it's valid C++.
>
> Ick, yeah. Maybe it is worth doing the "*" thing for now, and then
> worrying about advanced C++ stuff on top as another patch. AFAICT, your
> original patch is a strict improvement.
Excuse me, where's the problem? The above example shows this
@@ -105,8 +105,8 @@ char *url_decode(const char *url)
struct strbuf out = STRBUF_INIT;
- const char *slash = strchr(url, '/');
+ const char *colon = strchr(url, ':');
...
with current 4cb5d10b. This looks quite correct, no?
-- Hannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-06 20:52 ` Johannes Sixt
@ 2011-12-06 21:07 ` René Scharfe
0 siblings, 0 replies; 9+ messages in thread
From: René Scharfe @ 2011-12-06 21:07 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Jeff King, Thomas Rast, Junio C Hamano, git
Am 06.12.2011 21:52, schrieb Johannes Sixt:
> Am 06.12.2011 21:19, schrieb Jeff King:
>> On Tue, Dec 06, 2011 at 09:17:56PM +0100, Thomas Rast wrote:
>>
>>>> Looks reasonable to me. You can see the difference, for instance, with:
>>>>
>>>> git show -U1 3c73a1d
>>>>
>>>> (The -U1 is because of the annoying "we will start looking for the
>>>> header at the top of context, not the top of changes" behavior I
>>>> mentioned last week).
>>>
>>> Actually (sadly) I'll have to revise it. It doesn't match much of C++
>>> either, and I haven't yet come up with a reasonable regex that
>>> matches, say,
>>>
>>> foo::Bar<int>::t& Baz::operator<<(
>>>
>>> which I would call ludicrous, but it's valid C++.
>>
>> Ick, yeah. Maybe it is worth doing the "*" thing for now, and then
>> worrying about advanced C++ stuff on top as another patch. AFAICT, your
>> original patch is a strict improvement.
>
> Excuse me, where's the problem? The above example shows this
>
> @@ -105,8 +105,8 @@ char *url_decode(const char *url)
> struct strbuf out = STRBUF_INIT;
> - const char *slash = strchr(url, '/');
> + const char *colon = strchr(url, ':');
> ...
>
> with current 4cb5d10b. This looks quite correct, no?
That's with the default heuristic; try something like this first to turn
on userdiff:
$ echo url.c diff=cpp >>.gitattributes
René
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-06 20:17 ` Thomas Rast
2011-12-06 20:19 ` Jeff King
@ 2011-12-06 21:15 ` Junio C Hamano
2011-12-07 8:04 ` Thomas Rast
1 sibling, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2011-12-06 21:15 UTC (permalink / raw)
To: Thomas Rast; +Cc: Jeff King, git
Thomas Rast <trast@student.ethz.ch> writes:
> Actually (sadly) I'll have to revise it. It doesn't match much of C++
> either, and I haven't yet come up with a reasonable regex that
> matches, say,
>
> foo::Bar<int>::t& Baz::operator<<(
>
> which I would call ludicrous, but it's valid C++.
Heh, I'd rather not see us go that route, which would either end up
implementing a C++ parser or reverting the heuristics back to "non-blank
at the beginning of the line" that was already reasonably useful.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-06 21:15 ` Junio C Hamano
@ 2011-12-07 8:04 ` Thomas Rast
2011-12-07 21:13 ` Johannes Sixt
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Rast @ 2011-12-07 8:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git
Junio C Hamano wrote:
> Thomas Rast <trast@student.ethz.ch> writes:
>
> > Actually (sadly) I'll have to revise it. It doesn't match much of C++
> > either, and I haven't yet come up with a reasonable regex that
> > matches, say,
> >
> > foo::Bar<int>::t& Baz::operator<<(
> >
> > which I would call ludicrous, but it's valid C++.
>
> Heh, I'd rather not see us go that route, which would either end up
> implementing a C++ parser or reverting the heuristics back to "non-blank
> at the beginning of the line" that was already reasonably useful.
Well, there are many things that we deliberately do not match right
now and for which that's a good thing:
label:
public:
void declaration_only(...);
int global_variable;
At some point I was wondering whether it would be better to just
declare a non-match for '.*;' and '^[A-Za-z_][A-Za-z_0-9]+:', and
otherwise match all '^[A-Za-z].*\(' but I may be missing something.
--
Thomas Rast
trast@{inf,student}.ethz.ch
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] userdiff: allow * between cpp funcname words
2011-12-07 8:04 ` Thomas Rast
@ 2011-12-07 21:13 ` Johannes Sixt
0 siblings, 0 replies; 9+ messages in thread
From: Johannes Sixt @ 2011-12-07 21:13 UTC (permalink / raw)
To: Thomas Rast; +Cc: Junio C Hamano, Jeff King, git
Am 07.12.2011 09:04, schrieb Thomas Rast:
> Junio C Hamano wrote:
>> Thomas Rast <trast@student.ethz.ch> writes:
>>
>>> Actually (sadly) I'll have to revise it. It doesn't match much of C++
>>> either, and I haven't yet come up with a reasonable regex that
>>> matches, say,
>>>
>>> foo::Bar<int>::t& Baz::operator<<(
>>>
>>> which I would call ludicrous, but it's valid C++.
>>
>> Heh, I'd rather not see us go that route, which would either end up
>> implementing a C++ parser or reverting the heuristics back to "non-blank
>> at the beginning of the line" that was already reasonably useful.
>
> Well, there are many things that we deliberately do not match right
> now and for which that's a good thing:
>
> label:
> public:
> void declaration_only(...);
> int global_variable;
>
> At some point I was wondering whether it would be better to just
> declare a non-match for '.*;' and '^[A-Za-z_][A-Za-z_0-9]+:', and
> otherwise match all '^[A-Za-z].*\(' but I may be missing something.
The current cpp pattern doesn't work that well with C++. Since it
requires a blank before a name before the opening parentheses, it
doesn't catch constructors:
Foo::Foo()
and it should fail for GNU style C function definitions as well (I
didn't test):
void
do_the_foo()
I'll run this pattern for a while:
diff.cpp.xfuncname=!^[a-zA-Z_][a-zA-Z_0-9]*[[:space:]]*:[[:space:]]*$
^[a-zA-Z_].*
BTW, your match pattern requires an opening parenthesis; it would not
catch class definitions.
-- Hannes
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2011-12-07 21:13 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-06 16:35 [PATCH] userdiff: allow * between cpp funcname words Thomas Rast
2011-12-06 19:02 ` Jeff King
2011-12-06 20:17 ` Thomas Rast
2011-12-06 20:19 ` Jeff King
2011-12-06 20:52 ` Johannes Sixt
2011-12-06 21:07 ` René Scharfe
2011-12-06 21:15 ` Junio C Hamano
2011-12-07 8:04 ` Thomas Rast
2011-12-07 21:13 ` Johannes Sixt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).